Linux: Rebuild RAID After Disk Failure

August 17, 2012 by  
Filed under Every thing you Need to Know

We start by listing the partition tables present on the server to locate the new hard disk:

[root@mail ]# fdisk -l

Disk /dev/hda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot start end Blocks Id System
/dev/hda1 1 13 104391 fd Linux raid autodetect
/dev/hda2 14 6387 51199155 fd Linux raid autodetect
/dev/hda3 6388 6769 3068415 fd Linux raid autodetect
/dev/hda4 6770 60801 434012040 fd Linux raid autodetect

Disk /dev/hdc: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/hdc doesn’t contain a valid partition table

Disk /dev/md1: 52.4 GB, 52427816960 bytes
2 heads, 4 sectors/track, 12799760 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md1 doesn’t contain a valid partition table

Disk /dev/md3: 444.4 GB, 444428255232 bytes
2 heads, 4 sectors/track, 108502992 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md3 doesn’t contain a valid partition table

Disk /dev/md2: 3141 MB, 3141926912 bytes
2 heads, 4 sectors/track, 767072 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md2 doesn’t contain a valid partition table

Disk /dev/md0: 106 MB, 106823680 bytes
2 heads, 4 sectors/track, 26080 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md0 doesn’t contain a valid partition table

Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot start end Blocks Id System
/dev/sda1 1 121601 976760001 83 Linux
[root@mail ]#

The output above shows two IDE hard disks, hda and hdc. (A third disk, sda, holds a single ordinary Linux partition and is not part of the RAID.) Partitions are visible on hda, but hdc does not contain a valid partition table: it is the replacement for the faulty disk.
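
On a server with more disks, the replacement can be picked out of the fdisk -l output mechanically. The following is a minimal sketch, assuming the older fdisk message format shown above; the function name unpartitioned_disks is made up for this example. It skips md devices because on this server they hold filesystems directly and always trigger the same message.

```shell
# List whole disks that fdisk reports as having no partition table.
# Reads `fdisk -l` text on stdin and prints one device path per line.
# Assumption: the old message "Disk /dev/X doesn't contain a valid
# partition table" (newer fdisk versions word this differently).
unpartitioned_disks() {
    # "doesn[^ ]*t" accepts both a plain and a typographic apostrophe.
    sed -n 's|^Disk \(/dev/[^ ]*\) doesn[^ ]*t contain a valid partition table.*|\1|p' |
        grep -v '^/dev/md'   # md arrays are not candidate replacement disks
}
```

Possible usage: fdisk -l 2>&1 | unpartitioned_disks (the redirect is there because some fdisk versions print this message on stderr).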

To view the health of the software RAID, run cat /proc/mdstat:

[root@mail ]# cat /proc/mdstat
Personalities: [raid1]
md0: active raid1 hda1[1]
104320 blocks [2/1] [_U]

md2: active raid1 hda3[1]
3068288 blocks [2/1] [_U]

md3: active raid1 hda4[1]
434011968 blocks [2/1] [_U]

md1: active raid1 hda2[1]
51199040 blocks [2/1] [_U]

unused devices:
[root@mail ]#

Note that [2/1] shows that each array is meant to have 2 disks but only 1 is active. The symbols [_U] mean that one of the two disks is missing from the array and must be added back.
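
The same check can be scripted. The sketch below is one possible way to do it, not part of the original procedure: it reads mdstat-style text and prints every array whose status field contains an underscore. The function name degraded_arrays is hypothetical, and the optional file argument exists only so the function can be tried on a saved copy of the output.

```shell
# Print "name status" for every md array with a missing member.
# $1 = file to read (defaults to /proc/mdstat).
degraded_arrays() {
    awk '
        # Array header, e.g. "md0 : active raid1 hda1[1]" (or "md0: ...").
        /^md[0-9]+ *:/ { name = $1; sub(/:.*/, "", name) }
        # Status line, e.g. "104320 blocks [2/1] [_U]".
        /\[[0-9]+\/[0-9]+\] +\[[U_]+\]/ {
            status = $NF
            if (status ~ /_/) print name, status
        }
    ' "${1:-/proc/mdstat}"
}
```

Run with no argument on the server shown here, it would be expected to list all four arrays, since each of them reports [_U].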

Since the new disk is /dev/hdc, we need to create the partitions required for the RAID on it. Use fdisk to create the boot partition on the new disk.
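
Because the replacement has to mirror the layout of hda, a possible shortcut (not what this walkthrough does) is to copy the whole partition table with sfdisk instead of typing it in. The sketch below assumes sfdisk is installed and that both disks are the same size, as they are here. The function name clone_partition_table and the RUN variable are made up for this example; by default the function only prints the command so it can be checked before anything is written.

```shell
# Print (or, with RUN=1, execute) the command that copies the partition
# table from a healthy disk to its replacement.
# $1 = source disk, $2 = target disk. The target's table is overwritten.
clone_partition_table() {
    src=$1
    dst=$2
    if [ -z "$src" ] || [ -z "$dst" ] || [ "$src" = "$dst" ]; then
        echo "usage: clone_partition_table SOURCE_DISK TARGET_DISK" >&2
        return 1
    fi
    if [ "${RUN:-0}" = 1 ]; then
        # sfdisk -d dumps the table in a format sfdisk can read back.
        sfdisk -d "$src" | sfdisk "$dst"
    else
        echo "sfdisk -d $src | sfdisk $dst"
    fi
}
```

Calling clone_partition_table /dev/hda /dev/hdc prints the command for this server. The rest of this walkthrough takes the manual route with fdisk.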

[root@mail ]# fdisk /dev/hdc
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won’t be recoverable.

The number of cylinders for this disk is set to 60801.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
(e.g., DOS FDISK, OS/2 FDISK)
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): p

Disk /dev/hdc: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot start end Blocks Id System

Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-60801, default 1): 1
Last cylinder or +size or +sizeM or +sizeK (1-60801, default 60801): +100M

Command (m for help): p

Disk /dev/hdc: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot start end Blocks Id System
/dev/hdc1 1 13 104391 83 Linux

Command (m for help): t
Selected partition 1
Hex code (type L to list codes): l

0 Empty 1e Hidden W95 FAT1 80 Old Minix be Solaris boot
1 FAT12 24 NEC DOS 81 Minix / old Lin bf Solaris
2 XENIX root 39 plan 9 82 Linux swap / so c1 DRDOS/sec (FAT-
3 XENIX usr 3c PartitionMagic 83 Linux c4 DRDOS/sec (FAT-
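4 FAT16 <32M [...]

Set the partition type to fd (Linux raid autodetect) so that it matches hda1, create the remaining partitions the same way to mirror the layout of hda, and write the table with w. Then add the new partition to its array with mdadm -a /dev/md0 /dev/hdc1 and watch the rebuild in /proc/mdstat:

[=======>.............] recovery = 37.2% (39808/104320) finish=0.5min speed=1809K/sec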

md2: active raid1 hda3[1]
3068288 blocks [2/1] [_U]

md3: active raid1 hda4[1]
434011968 blocks [2/1] [_U]

md1: active raid1 hda2[1]
51199040 blocks [2/1] [_U]

unused devices:
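
To follow a rebuild without reading the whole listing, the progress figures can be pulled out of /proc/mdstat. This is a rough sketch under the assumption that the status lines look like the ones in this article; the function name rebuild_progress is hypothetical, and the optional file argument is only there for trying it on saved output.

```shell
# Print "name percent" for arrays that are rebuilding and
# "name DELAYED" for arrays waiting their turn.
# $1 = file to read (defaults to /proc/mdstat).
rebuild_progress() {
    awk '
        # Remember which array the following lines belong to.
        /^md[0-9]+ *:/ { name = $1; sub(/:.*/, "", name) }
        # e.g. "[=>...]  recovery = 37.2% (39808/104320) finish=0.5min ..."
        /(recovery|resync) *= *[0-9.]+%/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[0-9.]+%$/) print name, $i
        }
        # Arrays queued behind another rebuild on the same disks.
        /resync=DELAYED/ { print name, "DELAYED" }
    ' "${1:-/proc/mdstat}"
}
```

One way to use it is watch -n 10 cat /proc/mdstat for the raw view, or a loop that calls rebuild_progress and sleeps until it prints nothing.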

The output below shows the state once md0 has finished re-syncing. Note the [UU] on md0, which confirms that both of its members are active:

[root@mail ]# cat /proc/mdstat
Personalities: [raid1]
md0: active raid1 hdc1[0] hda1[1]
104320 blocks [2/2] [UU]

md2: active raid1 hda3[1]
3068288 blocks [2/1] [_U]

md3: active raid1 hda4[1]
434011968 blocks [2/1] [_U]

md1: active raid1 hda2[1]
51199040 blocks [2/1] [_U]

unused devices:
[root@mail ]#

The same process has to be performed on the other partitions:

[root@mail ]# mdadm -a /dev/md1 /dev/hdc2
mdadm: added /dev/hdc2
[root@mail ]# mdadm -a /dev/md2 /dev/hdc3
mdadm: added /dev/hdc3
[root@mail ]# mdadm -a /dev/md3 /dev/hdc4
mdadm: added /dev/hdc4
[root@mail ]# cat /proc/mdstat
Personalities: [raid1]
md0: active raid1 hdc1[0] hda1[1]
104320 blocks [2/2] [UU]

md2: active raid1 hdc3[2] hda3[1]
3068288 blocks [2/1] [_U]
resync=DELAYED

md3: active raid1 hdc4[2] hda4[1]
434011968 blocks [2/1] [_U]
resync=DELAYED

md1: active raid1 hdc2[2] hda2[1]
51199040 blocks [2/1] [_U]
[>....................] recovery = 0.0% (37376/51199040) finish=501.5min speed=1698K/sec

unused devices:
[root@mail ]#
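
The mdadm -a commands above follow a pattern: each degraded array gets the partition on the new disk whose number matches its surviving member. The sketch below derives those commands from mdstat-style text and only prints them. It assumes hdX/sdX-style names where the partition number is the trailing digits, and that it is run before the new partitions are added; plan_readd is a hypothetical name.

```shell
# For every degraded array, print the mdadm command that would add the
# same-numbered partition of the replacement disk. Nothing is executed.
# $1 = replacement disk (e.g. /dev/hdc), $2 = file (default /proc/mdstat).
plan_readd() {
    awk -v disk="$1" '
        /^md[0-9]+ *:/ {
            name = $1; sub(/:.*/, "", name)
            part = $NF                  # last listed member, e.g. hda2[1]
            sub(/\[.*/, "", part)       # drop the role suffix -> hda2
            sub(/^.*[^0-9]/, "", part)  # keep trailing digits -> 2
        }
        # Status field with at least one "_" means a member is missing.
        /\[[0-9]+\/[0-9]+\] +\[[U_]*_[U_]*\]/ {
            print "mdadm -a /dev/" name " " disk part
        }
    ' "${2:-/proc/mdstat}"
}
```

The printed lines can be reviewed and then pasted into a root shell one at a time.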

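Finally, we need to install GRUB on the second disk’s MBR so that the server can still boot if the first disk fails. Start the GRUB shell by running grub as root: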

GNU GRUB version 0.97 (640K lower / 3072K upper memory)

[ Minimal BASH-like line editing is supported. for the first word, TAB
lists possible command completions. anywhere else TAB lists the possible
completions of a device/filename.]

grub> find /grub/grub.conf
(hd0,0)
(hd1,0)

grub> root (hd1,0)
Filesystem type is ext2fs, partition type 0xfd

grub> setup (hd1)
Checking if /boot/grub/stage1 exists. no
Checking if /grub/stage1 exists. yes
Checking if /grub/stage2 exists. yes
Checking if /grub/e2fs_stage1_5 exists. yes
Running embed /grub/e2fs_stage1_5 (hd1). 15 sectors are embedded.
succeeded
Running install /grub/stage1 (hd1) (hd1)1+15 p (hd1,0)/grub/stage2 /grub/grub.conf. succeeded
Done.

grub> quit
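
The same three commands can be fed to GRUB Legacy non-interactively through its --batch option. The sketch below only generates the command text; it assumes, as in the session above, that /boot is the first partition of the disk. The function name grub_setup_script is made up for this example.

```shell
# Emit the GRUB Legacy commands from the session above.
# $1 = BIOS disk number as GRUB counts it (1 = second disk).
grub_setup_script() {
    if [ -z "$1" ]; then
        echo "usage: grub_setup_script DISK_NUMBER" >&2
        return 1
    fi
    # root selects the /boot partition, setup writes stage1 to the MBR.
    printf 'root (hd%s,0)\nsetup (hd%s)\nquit\n' "$1" "$1"
}
```

Possible usage: grub_setup_script 1 | grub --batch, after checking with find /grub/grub.conf that (hd1,0) is really the new disk.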
