On Jan 31, 2013, at 2:45 AM, Adam Ryczkowski <adam.ryczkow...@statystyka.net> wrote:

> Yes, you are right. It is an important contributing factor in why the
> relatime mount option killed my performance so badly.

So is this what was causing the problem?

> The dedup chunk size isn't clearly stated, but from the README I infer it
> deduplicates files as a whole; here is an excerpt from the README
> (https://github.com/g2p/bedup/blob/master/README.rst)

I wouldn't expect reading file metadata,

> This is a summary of the granularity of the allocation pieces in the storage
> hierarchy.
> On mdadm I have chunk size of 512K,

It's quite large for your use case. It's large for most any use case,
actually.

> I couldn't find any command that tells me the leaf size of already created
> btrfs system. Maybe you can tell me?

I don't know that it's easily determined after mkfs time; maybe someone else
can answer. The default is 4KB. Otherwise you set it with flags at mkfs time.

> I will also check, if there is an alignment problem as well. When I was
> reading a manual for each of the layers I came to the conclusion that each
> layer is supposed to align to the underlying one automatically. But I can
> try to check it.

I'm not thinking of an alignment problem, but of a chunk size poorly chosen
for the usage. Changing 50 bytes (could be metadata or data) means, in your
case, at least 2MB of read-modify-write (RMW) with a 512KB chunk: four disks
times 512KB. And this gets worse with more disks, because you have more
chunks to read. The whole stripe is read, modified, and written on md raid6
currently. You're planning to add four more disks, so that's now 8 disks, and
a 4MB full stripe RMW for 50 bytes of changed data.

Depending on which tool GPT-partitioned these 3TB disks, it's remotely
possible they aren't aligned to 4K sectors, however. gdisk should do this
correctly by starting the first partition at LBA 2048, and aligning to 16
sector boundaries. Recent versions of parted do something similar, but I
forget the details. Older versions can misalign by starting at LBA 63, as can
other older non-Linux tools. OS X's Disk Utility starts the first partition
at LBA 40, which is OK.

> Can you tell me more?
> Because I have only learned that btrfs multi-device
> support cannot join two volumes without striping. And striping in this case
> is equivalent to fragmentation, which we want to avoid. In contrast to what
> LVM can do. LVM can concatenate the underlying storage together, without
> striping.

When you create a btrfs file system, by default the data profile is single
and the metadata profile is dup. When you add another device to the volume,
it stays this way. The single data profile behaves similarly to LVM linear,
except that btrfs will alternate chunk allocations between devices, so that
one isn't just sitting there spinning for a month without being used at all.
So it's not striping. But even if it were striping, that would help you, on
write performance in particular, because now it's effectively RAID 60. I
don't see why striping is considered fragmentation.

To change the profile for the volume, you use -dconvert and/or -mconvert with
a rebalance operation.

Chris Murphy
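
A rough sketch, in Python, of the arithmetic from the chunk-size and
alignment discussion earlier in this message. The function names and the
512-byte logical sector size are assumptions made for illustration, not
anything md or gdisk reports. It only reproduces the 2MB and 4MB full-stripe
figures and checks the LBA 2048, 63, and 40 starting points against a 4K
boundary.

```python
# Sketch only: models the full-stripe RMW size and 4K alignment check
# discussed in the message. Names and constants here are hypothetical.

SECTOR_BYTES = 512      # assumption: LBAs count 512-byte logical sectors
PHYSICAL_BYTES = 4096   # 4K physical sector size


def full_stripe_bytes(disks: int, chunk_kib: int) -> int:
    """Bytes in one full stripe, counting every disk's chunk (data and parity)."""
    return disks * chunk_kib * 1024


def is_4k_aligned(start_lba: int) -> bool:
    """True if a partition starting at this LBA begins on a 4K boundary."""
    return (start_lba * SECTOR_BYTES) % PHYSICAL_BYTES == 0


# Full stripe for the current 4-disk array and the planned 8-disk array.
for disks in (4, 8):
    mib = full_stripe_bytes(disks, 512) / 2**20
    print(f"{disks} disks x 512KB chunk: {mib:g}MB full stripe")

# Starting LBAs mentioned in the message: gdisk, old tools, OS X Disk Utility.
for lba in (2048, 63, 40):
    state = "aligned" if is_4k_aligned(lba) else "misaligned"
    print(f"first partition at LBA {lba}: {state} to 4K")
```

The stripe size counts all member disks, parity included, which matches how
the 2MB and 4MB figures above are derived.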