On 25/09/17 22:02, Eric Robinson wrote:
Problem:
Under high write load, DRBD exhibits data corruption. In repeated tests
over a month-long period, file corruption occurred after 700-900 GB of
data had been written to the DRBD volume.
Testing Platform:
2 x Dell PowerEdge R610 servers
32GB RAM
6 x Samsung SSD 840 Pro 512GB (latest firmware)
Dell H200 JBOD Controller
SUSE Linux Enterprise Server 12 SP2 (kernel 4.4.74-92.32)
Gigabit network, 900 Mbps throughput, < 1ms latency, 0 packet loss
Initial Setup:
Create 2 RAID-0 software arrays using either mdadm or LVM:
  - On Array 1: sda5 through sdf5, create DRBD replicated
    volume (drbd0) with an ext4 filesystem
  - On Array 2: sda6 through sdf6, create LVM logical
    volume with an ext4 filesystem
Procedure:
1. Download and build the TrimTester SSD burn-in and TRIM
   verification tool from Algolia (https://github.com/algolia/trimtester).
2. Run TrimTester against the filesystem on drbd0, wait
   for corruption to occur
3. Run TrimTester against the non-drbd backed filesystem,
   wait for corruption to occur
Results:
In multiple tests over a period of a month, TrimTester would report file
corruption when run against the DRBD volume after 700-900 GB of data had
been written. The error would usually appear within an hour or two.
However, when running it against the non-DRBD volume on the same
physical drives, no corruption would occur. We could let the burn-in run
for 15+ hours and write 20+ TB of data without a problem. Results were
the same with DRBD 8.4 and 9.0. We also tried disabling the TRIM-testing
part of TrimTester and using it as a simple burn-in tool, just to make
sure that SSD TRIM was not a factor.
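As a rough sanity check on the figures above, the arithmetic can be worked through in a few lines of Python. This is only a back-of-the-envelope sketch: it assumes decimal units (1 GB = 1e9 bytes), treats "15+ hours" and "20+ TB" as exactly 15 h and 20 TB, and the function names are made up for illustration.

```python
# Back-of-the-envelope check using only the figures quoted in the report.
# Assumption: decimal units (1 GB = 1e9 bytes, 1 TB = 1e12 bytes).

LINK_BYTES_PER_S = 900e6 / 8  # 900 Mbps of measured throughput = 112.5 MB/s


def hours_to_write(total_bytes, bytes_per_second):
    """Hours needed to write total_bytes at a steady rate."""
    return total_bytes / bytes_per_second / 3600


def average_rate_mb_s(total_bytes, hours):
    """Average write rate in MB/s for total_bytes written over `hours`."""
    return total_bytes / (hours * 3600) / 1e6


if __name__ == "__main__":
    for gb in (700, 900):
        h = hours_to_write(gb * 1e9, LINK_BYTES_PER_S)
        print(f"{gb} GB at 900 Mbps: {h:.2f} h")
    rate = average_rate_mb_s(20e12, 15)
    print(f"20 TB in 15 h: {rate:.0f} MB/s average")
```

If writes to drbd0 were limited by the replication link, 700-900 GB would take roughly 1.7 to 2.2 hours, which is in the same range as the reported "hour or two". The non-DRBD run averaged on the order of 370 MB/s, so the two tests were probably not running at the same write rate; that is an inference from the quoted numbers, not something that was measured.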
Conclusion:
We are aware of some controversy surrounding the Samsung SSD 8XX series
drives; however, the issues related to that controversy were resolved
and no longer exist as of kernel 4.2. The 840 Pro drives are confirmed
to support RZAT. Also, the data corruption would only occur when writing
through the DRBD layer. It never occurred when bypassing the DRBD layer
and writing directly to the drives, so we must conclude that DRBD has a
data corruption bug under high write load. However, we would be more
than happy to be proved wrong.
I think the conclusion you've arrived at is not quite accurate. It could
be described more accurately as a possible data corruption issue
specific to drbd and/or the kernel, and trim commands as issued by the
TrimTester software. It appears TrimTester was written to debug a very
specific SSD trim issue when accessing SSDs directly. It would be
necessary to look at exactly what TrimTester is doing to determine
whether this is a real bug or an anomaly caused by the tool's own
access pattern. If there is an issue, it could be in either drbd or
the kernel. The fact that the issue only appears when drbd is used
does not exclude the kernel as the cause; it may be a kernel issue
that only arises when drbd is in use. Have you tried a newer kernel
such as 4.9, and does the same corruption occur there? Also, would the
specific conditions created by TrimTester ever be reproduced in a real
world scenario, or is it something that will only ever show up when
running TrimTester?
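One way to separate "TrimTester-specific" from "any heavy write load" would be to repeat the test with an independent write-and-verify tool. The following is a minimal sketch of such a check in Python, not a reimplementation of TrimTester: the file names, sizes and the burn_in() helper are all made up for illustration, and it issues no TRIM commands at all. Note that the read-back may be served from the page cache, so for a real test the caches would need to be dropped (or the filesystem remounted) between the write and verify phases.

```python
import hashlib
import os
import sys


def write_file(path, size, chunk=1 << 20):
    """Write `size` random bytes to `path`, fsync, and return their SHA-256."""
    digest = hashlib.sha256()
    with open(path, "wb") as f:
        remaining = size
        while remaining > 0:
            block = os.urandom(min(chunk, remaining))
            f.write(block)
            digest.update(block)
            remaining -= len(block)
        f.flush()
        os.fsync(f.fileno())  # push the data down to the block layer
    return digest.hexdigest()


def read_digest(path, chunk=1 << 20):
    """Read `path` back and return the SHA-256 of its contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            digest.update(block)
    return digest.hexdigest()


def burn_in(directory, files=8, size=4 << 20, passes=1):
    """Write files, read them back, and return the paths that do not match."""
    bad = []
    for p in range(passes):
        expected = {}
        for i in range(files):
            path = os.path.join(directory, f"burnin_{p}_{i}.bin")
            expected[path] = write_file(path, size)
        for path, want in expected.items():
            if read_digest(path) != want:
                bad.append(path)  # keep the bad file for later inspection
            else:
                os.remove(path)
    return bad


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print("mismatches:", burn_in(target))
```

Run against a directory on drbd0 and then against one on the plain LVM volume, with files, size and passes raised until several hundred GB have been written. If this also reports mismatches only on the drbd-backed filesystem, the TrimTester access pattern is less likely to be the explanation.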
_______________________________________________
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user