1. Check RAID Status
Use the following command to inspect the current state of the array:
sudo mdadm --detail /dev/md0
In a degraded RAID 1, the output will include lines like:
State : clean, degraded
Active Devices : 1
Failed Devices : 1
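For a quicker glance, /proc/mdstat reports the same condition; a degraded two-disk mirror shows [2/1] in the status line, with an underscore marking the missing member ([U_] or [_U], depending on which slot failed):
cat /proc/mdstat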

2. Mark and Remove the Faulty Disk
If the faulty disk is still present but failing, run:
sudo mdadm --fail /dev/md0 /dev/nvme1n1p1
sudo mdadm --remove /dev/md0 /dev/nvme1n1p1
If the disk has already been physically removed, you can skip this step.
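If a stale device node lingers after the disk itself is gone, mdadm also accepts the keyword detached, which removes any member whose underlying device no longer responds:
sudo mdadm /dev/md0 --remove detached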
3. Replace the Faulty Disk
Power down the server (if required), physically remove the faulty disk, and install a new one. Boot back into the system.
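If it is unclear which physical drive corresponds to /dev/nvme1n1, match serial numbers before pulling anything (this assumes smartmontools is installed; ls -l /dev/disk/by-id is an alternative):
sudo smartctl -i /dev/nvme1n1 | grep -i serial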
4. Partition the New Disk
RAID 1 mirrors the data inside its member partitions; it does not replicate the partition table itself, so the new disk must be given the same layout as the healthy one.
Clone the partition table from the working disk:
sudo sgdisk -R=/dev/nvme1n1 /dev/nvme0n1
sudo sgdisk -G /dev/nvme1n1
-R=/dev/nvme1n1 /dev/nvme0n1: Replicates the partition table from the source disk (/dev/nvme0n1, the final argument) onto the target (/dev/nvme1n1). Note that the target comes first; reversing the order would overwrite the healthy disk's partition table.
-G: Randomizes the disk GUID and all partition GUIDs so the clone does not conflict with the original.
Check the layout:
lsblk
Make sure /dev/nvme1n1p1 exists and matches the partition size of /dev/nvme0n1p1.
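For a side-by-side size comparison of the two disks, something like:
lsblk -o NAME,SIZE,TYPE /dev/nvme0n1 /dev/nvme1n1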
5. Add the New Disk to the RAID Array
Once partitioned, add the new partition to the array:
sudo mdadm --add /dev/md0 /dev/nvme1n1p1
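The rebuild starts automatically once the partition is added. To confirm the new member was accepted, one option is:
sudo mdadm --detail /dev/md0 | grep -E 'State|Rebuild Status'
During the rebuild, the state typically reads clean, degraded, recovering and a Rebuild Status line reports the percentage complete.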
6. Monitor the Rebuild Process
Use this command to check the rebuild progress:
cat /proc/mdstat
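To keep it refreshing automatically:
watch -n 5 cat /proc/mdstat
A recovering RAID 1 shows a progress line similar to the following (the figures here are illustrative, not from this array):
[=>...................]  recovery =  8.1% (79168448/976630336) finish=74.9min speed=199544K/sec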

7. Confirm Rebuild Completion
When the rebuild is done, confirm the RAID status:
sudo mdadm --detail /dev/md0
Expected output:
State : clean
Active Devices : 2
Failed Devices : 0
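In /proc/mdstat the slot map should now read [2/2] [UU], meaning both mirrors are active:
grep -A 2 md0 /proc/mdstat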

Optional: Update mdadm Config for Boot Persistence
Run the following to record the array in mdadm's configuration so it is assembled correctly on future boots. Note that this overwrites /etc/mdadm/mdadm.conf; if your existing file contains other settings (such as MAILADDR), append with tee -a or merge the ARRAY line by hand:
sudo mdadm --detail --scan | sudo tee /etc/mdadm/mdadm.conf
Then rebuild the initramfs so the updated config is available at boot (update-initramfs applies to Debian and Ubuntu):
sudo update-initramfs -u
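To double-check that the array line was written:
grep '^ARRAY' /etc/mdadm/mdadm.conf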
With these steps, you’ve successfully replaced a failed RAID 1 disk and restored redundancy.


