RAID 1 Repair on Linux

The disks in a RAID 1 array may fail at any time.

Situation:

One morning, a message like the following may show up (typically an email sent by the mdadm monitor):

A Fail event had been detected on md device /dev/md0. It could be related to component device /dev/sda1.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[2](F)
204736 blocks super 1.0 [2/1] [_U]

md2 : active raid1 sda3[2](F) sdb3[1]
483855168 blocks super 1.1 [2/1] [_U]
bitmap: 3/4 pages [12KB], 65536KB chunk

md1 : active raid1 sda2[2](F) sdb2[1]
4192192 blocks super 1.1 [2/1] [_U]

unused devices: <none>

It shows that sda has an issue (which may not be a hardware failure). Before deciding to replace the hard drive, you can try repairing the RAID 1 array: remove the failed device from each array with mdadm first, and then re-add it to start the rebuild/sync.
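A sketch of the repair, assuming the layout shown above (sda1, sda2 and sda3 are the failed members of md0, md1 and md2). Since the partitions are already marked failed (F), they can be removed and then re-added to trigger the resync; if a member is not yet marked (F), fail it first with mdadm /dev/mdX --fail /dev/sdaY.

mdadm /dev/md0 --remove /dev/sda1
mdadm /dev/md1 --remove /dev/sda2
mdadm /dev/md2 --remove /dev/sda3

mdadm /dev/md0 --add /dev/sda1
mdadm /dev/md1 --add /dev/sda2
mdadm /dev/md2 --add /dev/sda3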

Check the status
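You can watch the rebuild progress with:

cat /proc/mdstat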

Personalities : [raid1]
md0 : active raid1 sda1[2] sdb1[1]
204736 blocks super 1.0 [2/2] [UU]

md2 : active raid1 sda3[2] sdb3[1]
483855168 blocks super 1.1 [2/1] [_U]
[==========>..........] recovery = 51.6% (249879168/483855168) finish=272.1min speed=14328K/sec
bitmap: 3/4 pages [12KB], 65536KB chunk

md1 : active raid1 sda2[2] sdb2[1]
4192192 blocks super 1.1 [2/2] [UU]

Replace the failed drive with a new hard drive
If you are going to replace the hard drive, mirror the partition table structure of the existing (healthy) drive onto the new drive.
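A minimal sketch, assuming the healthy drive is /dev/sdb and the replacement is /dev/sda, using sfdisk for an MBR partition table (for GPT disks, sgdisk -R /dev/sda /dev/sdb followed by sgdisk -G /dev/sda is the equivalent):

sfdisk -d /dev/sdb | sfdisk /dev/sda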

Then use the mdadm commands mentioned before to add the partitions back into the RAID arrays.
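On a freshly partitioned replacement drive the members are new, so only the --add step is needed, for example:

mdadm /dev/md0 --add /dev/sda1
mdadm /dev/md1 --add /dev/sda2
mdadm /dev/md2 --add /dev/sda3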

Install GRUB on the new hard drive's MBR:

We need to install GRUB on the MBR of the newly installed hard drive, so that if the other drive fails, the new drive will still be able to boot the OS.

Enter the Grub command line:
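Assuming GRUB Legacy (as the setup/stage1 steps below imply), running the grub shell as root drops you at a grub> prompt:

grub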

Locate grub setup files:
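At the grub> prompt (assuming /boot is the separate partition on md0, so the path inside it is /grub/stage1; use /boot/grub/stage1 if /boot lives on the root filesystem):

grub> find /grub/stage1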

On a RAID 1 with two drives present, you should expect to get:

(hd0,0)
(hd1,0)

Install grub on the MBR:
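A sketch of the GRUB Legacy shell commands, targeting the second drive /dev/sdb as explained below (repeat with the other drive mapped to (hd0) if you also need to reinstall GRUB there):

grub> device (hd0) /dev/sdb
grub> root (hd0,0)
grub> setup (hd0)
grub> quit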

We made the second drive /dev/sdb device (hd0) because installing GRUB this way puts a bootable MBR on the second drive, so when the first drive is missing the second drive will boot.

This will ensure that if the first drive in the RAID array fails, or has already failed, you can still boot the operating system from the second drive.

How can I detect whether GRUB is installed in the MBR of /dev/sda and /dev/sdb?

You can issue a command such as:
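(An assumed reconstruction: dump the first 512-byte sector and grep it for the GRUB signature; repeat with /dev/sdb.)

dd if=/dev/sda bs=512 count=1 | xxd | grep -i grub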


1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.00103986 s, 492 kB/s
0000180: 4752 5542 2000 4765 6f6d 0048 6172 6420 GRUB .Geom.Hard

To check the failed hard drive's information:
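For example, with smartctl from the smartmontools package (assuming the failed drive is /dev/sda):

smartctl -a /dev/sda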

or
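with hdparm:

hdparm -I /dev/sda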

Check Disk Temperature
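For example, with hddtemp (an assumption; smartctl -a /dev/sdX also reports the temperature on most drives):

hddtemp /dev/sda /dev/sdb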

To replace the failed hard drive and rebuild the RAID 1 array, you can check these links:
http://www.kernelhardware.org/replacing-failed-raid-drive/
http://wiki.contribs.org/Raid#Resynchronising_a_Failed_RAID
http://serverfault.com/questions/481774/degradedarray-event-on-dev-md1
https://www.centos.org/forums/viewtopic.php?t=24641
http://www.ducea.com/2009/03/08/mdadm-cheat-sheet/
http://serverfault.com/questions/97565/raid1-how-do-i-fail-a-drive-thats-marked-as-removed
http://techblog.tgharold.com/2009/01/removing-failed-non-existent-drive-from.shtml
https://bbs.archlinux.org/viewtopic.php?id=106919
