Hi,

Firstly, thanks for all the development on btrfs. I've been using it on quite a few systems without any significant issues and have found features like snapshots very useful.
I recently decided that raid6 was stable enough to use for data with raid1 metadata, and set about converting an existing raid1 array. However, I ran into a curious problem when converting the raid1 data to raid6 on a six-drive array.

From what I can tell, balance does not select the raid1 chunks it converts evenly from all drives. As a result, one pair of drives can be fully converted first, which causes the remaining drives to fill up with raid6 data before they have had a chance to free any raid1 data. When this happens, the 6-way raid6 can degrade to 5-way, 4-way, or even 3-way raid6.

What selection strategy does btrfs balance use when picking chunks, e.g. starting from the first chunk/drive, by file creation/modification time, or random? Is there a reason for this behaviour, and can it be changed?

As a workaround, I'm now forcing an even balance by polling device usage and adding the devid filter to the balance. Here's that script for anyone else in a similar situation (a simplified sketch of the approach also appears below, after the device usage details):

https://gist.github.com/rsanger/ad384c0fd3f4a003d3fbd9f097db8443

Below are the full details of what I've seen.

I'm running a recent kernel from stretch backports:
Debian 4.19.16-1~bpo9+1 (2019-02-07) x86_64 GNU/Linux

I have a 6x8TB btrfs array, mounted at /media/Kilo. It has been added to over time, originally starting with 1 or 2 drives. Before the conversion, 'btrfs device usage' looked something like this:

------------------------------------------------------------------
5 drives, approx.:
   Device size:         7.28TiB
   Data,RAID1:          6.9TiB
   ...
   Unallocated:         500GiB

1 drive, replaced a 4TB with an 8TB before the conversion:
   Device size:         7.28TiB
   Data,RAID1:          3.8TiB
   ...
   Unallocated:         3.8TiB
------------------------------------------------------------------

I started the conversion with:

~ btrfs balance start -dconvert=raid6 /media/Kilo/

Nearing 50% completion, I noticed the device usage looked like this (annotated with my interpretation):

------------------------------------------------------------------
/dev/sda1, ID: 1
   Device size:         7.28TiB
   Device slack:          0.00B
   Data,RAID1:          3.88TiB
   Data,RAID6:          1.76TiB   <- 6-way
   Data,RAID6:          7.63GiB   <- ?
   Data,RAID6:          1.62TiB   <- 4-way
   Metadata,RAID1:     12.00GiB
   System,RAID1:       32.00MiB
   Unallocated:         2.92GiB

/dev/sdb1, ID: 6
   Device size:         7.28TiB
   Device slack:        3.00KiB
   Data,RAID1:          1.86TiB
   Data,RAID6:          1.76TiB   <- 6-way
   Data,RAID6:          8.37GiB   <- ??? 2-way
   Data,RAID6:          1.62TiB   <- 4-way
   Data,RAID6:        320.00KiB   <- 3-way
   Metadata,RAID1:      2.00GiB
   Unallocated:         2.02TiB

/dev/sdd1, ID: 2
   Device size:         7.28TiB
   Device slack:        3.50KiB
   Data,RAID1:          3.88TiB
   Data,RAID6:          1.76TiB   <- 6-way
   Data,RAID6:          8.37GiB   <- ??? 2-way
   Data,RAID6:          1.62TiB   <- 4-way
   Data,RAID6:        320.00KiB   <- 3-way
   Metadata,RAID1:      7.00GiB
   System,RAID1:       32.00MiB
   Unallocated:         5.18GiB

/dev/sdf1, ID: 3
   Device size:         7.28TiB
   Device slack:        3.50KiB
   Data,RAID1:          2.27TiB
   Data,RAID6:          1.76TiB   <- 6-way
   Data,RAID6:          6.57GiB   <- ?
   Data,RAID6:          1.62TiB   <- 4-way
   Metadata,RAID1:      7.00GiB
   Unallocated:         1.62TiB

/dev/sdg1, ID: 5
   Device size:         7.28TiB
   Device slack:        3.50KiB
   Data,RAID1:          5.49TiB
   Data,RAID6:          1.76TiB   <- 6-way
   Data,RAID6:         12.66GiB   <- ?
   Data,RAID6:        320.00KiB   <- 3-way
   Data,RAID6:          3.77GiB   <- ?
   Metadata,RAID1:      6.00GiB
   Unallocated:         1.00MiB

/dev/sdh1, ID: 4
   Device size:         7.28TiB
   Device slack:        3.50KiB
   Data,RAID1:          5.49TiB
   Data,RAID6:          1.76TiB   <- 6-way
   Data,RAID6:          7.12GiB   <- ?
   Data,RAID6:          6.31GiB   <- ?
   Metadata,RAID1:      8.00GiB
   Unallocated:         1.00MiB
------------------------------------------------------------------

From what I can see, disks 4 and 5 have filled up, and the balance convert is now creating 4-way raid6 rather than 6-way.
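Roughly, the workaround script does something like the following. This is a simplified sketch rather than the exact contents of the gist; the mount point, the limit of 10 chunks per pass, and the use of the 'soft' filter are just example values:

#!/bin/bash
# Sketch of the workaround: repeatedly pick the device with the most remaining
# raid1 data and convert a limited number of chunks that have a stripe on it,
# so no single pair of drives is drained ahead of the others.
MNT=/media/Kilo

while true; do
    # Find the devid holding the most raid1 data, using raw byte output (-b).
    devid=$(btrfs device usage -b "$MNT" | awk '
        / ID: /       { id = $NF }
        /Data,RAID1:/ { raid1[id] += $2 }
        END           { for (d in raid1)
                            if (raid1[d] > max) { max = raid1[d]; best = d }
                        print best }')

    # No raid1 data chunks left anywhere: the conversion is finished.
    [ -z "$devid" ] && break

    # Convert up to 10 chunks with a stripe on this device; 'soft' skips
    # chunks that are already raid6.
    btrfs balance start -dconvert=raid6,soft,devid="$devid",limit=10 "$MNT" || break
done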
At this point, I cancelled the balance and started using the script mentioned above.

Now that I'm in this situation, is there any way to filter the balance so it only fixes the chunks that are not 6-way raid6, or will I have to run a full balance again?

It would be great to see the balance operation improved in the future to avoid such situations. I can lodge a proper bug report if that would be best.

Thanks,
Richard Sanger