Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-23 Thread Roland Kammerer
On Thu, Oct 19, 2017 at 05:54:35PM -0500, Robert L wrote: > Greetings, I am a new member on this list and I've been looking into > this problem also. I understand that this is not a bug in DRBD, but > that the resource usage patterns of DRBD can affect the outcome of this > trimtester program. My

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-23 Thread Robert L
Greetings, I am a new member on this list and I've been looking into this problem also. I understand that this is not a bug in DRBD, but that the resource usage patterns of DRBD can affect the outcome of this trimtester program. My main goal is to eliminate false positives. I've implemented a di

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-19 Thread Eric Robinson
> > the source code of the TrimTester tool, and that it had even been > > reviewed by Samsung engineers before, and that is the part that I find > most worrying. I missed your comment about Samsung. I have no explanation for why Samsung would have missed that bug. --Eric

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-19 Thread Eric Robinson
> > However, there are no zeroes anywhere else in the file except at the > > end of every sequence. > > > > I guess it is safe to conclude that your theory is right. The file > > does not appear to really be corrupted. The TrimTester tool is > > reporting a false positive. > > > > --Eric > > > I th

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-19 Thread Robert Altnoeder
On 10/18/2017 07:32 PM, Eric Robinson wrote: > > Okay, here’s the latest. > > [...] > > However, there are no zeroes anywhere else in the file except at the > end of every sequence. > > I guess it is safe to conclude that your theory is right. The file > does not appear to really be corrupted. The

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-18 Thread Eric Robinson
[mailto:drbd-user-boun...@lists.linbit.com] On Behalf Of Lars Ellenberg Sent: Tuesday, October 17, 2017 9:24 AM To: drbd-user@lists.linbit.com Subject: Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0 > Most importantly: once the trimtester (or *any* "corruption d

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-17 Thread Lars Ellenberg
> Most importantly: once the trimtester (or *any* "corruption detecting" > tool) claims that a certain corruption is found, you look at what supposedly is > corrupt, and double check if it in fact is. > > Before doing anything else. > I did that, but I don't know what a "good" file is supposed to

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-17 Thread Eric Robinson
> On Tue, Oct 17, 2017 at 02:46:37PM +, Eric Robinson wrote: > > Guys, I think we have an important new development. Last night I put > > both nodes into standalone mode and promoted the secondary to primary. > > This effectively gave me two identical test platforms, running > > disconnected, b

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-17 Thread Lars Ellenberg
On Tue, Oct 17, 2017 at 02:46:37PM +, Eric Robinson wrote: > Guys, I think we have an important new development. Last night I put > both nodes into standalone mode and promoted the secondary to primary. > This effectively gave me two identical test platforms, running > disconnected, both writin

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-17 Thread Eric Robinson
PM > To: jan.baku...@gmail.com; drbd-user@lists.linbit.com > Subject: Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD > 8.4 and 9.0 > > > > Well, damn. As the program was supposedly reviewed by Samsung > > engineers as part of their efforts to diagnose the ro

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-16 Thread Eric Robinson
> > Well, damn. As the program was supposedly reviewed by Samsung > engineers as part of their efforts to diagnose the root cause of TRIM errors, it never occurred to me that it was that buggy. I can't thank you enough for > finding that! The rollout of some new DRBD clusters has been on hold

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-16 Thread Jan Bakuwel
Hi Eric & Lars, Well, damn. As the program was supposedly reviewed by Samsung engineers as part of their efforts to diagnose the root cause of TRIM errors, it never occurred to me that it was that buggy. I can't thank you enough for finding that! The rollout of some new DRBD clusters has been

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-16 Thread Lars Ellenberg
On Mon, Oct 16, 2017 at 09:35:40PM +, Eric Robinson wrote: > Well, damn. As the program was supposedly reviewed by Samsung > engineers as part of their efforts to diagnose the root cause of TRIM > errors, it never occurred to me that it was that buggy. I can't thank > you enough for finding tha

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-16 Thread Eric Robinson
> -Original Message- > From: drbd-user-boun...@lists.linbit.com [mailto:drbd-user- > boun...@lists.linbit.com] On Behalf Of Lars Ellenberg > Sent: Saturday, October 14, 2017 1:05 PM > To: drbd-user@lists.linbit.com > Subject: Re: [DRBD-user] Warning: Data Corruption I

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-14 Thread Lars Ellenberg
On Thu, Oct 12, 2017 at 11:14:55AM +0200, Robert Altnoeder wrote: > On 10/11/2017 11:30 PM, Eric Robinson wrote: > > The TrimTester program consists of three parts. The main executable > > (TrimTester) just writes loads of data to the drive and tests for file > > corruption. My C++ consultant says,

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-13 Thread Eric Robinson
Robinson > Sent: Friday, October 13, 2017 11:31 AM > To: Lars Ellenberg ; drbd-user@lists.linbit.com > Subject: Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD > 8.4 and 9.0 > > > First, to "all of you", > > if someone has some sp

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-13 Thread Eric Robinson
> First, to "all of you", > if someone has some spare hardware and is willing to run the test as > suggested by Eric, please do so. > Both "no corruption reported after X iterations" and "corruption reported > after X iterations" are important feedback. > (State the platform and hardware and storag

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-13 Thread Lars Ellenberg
First, to "all of you", if someone has some spare hardware and is willing to run the test as suggested by Eric, please do so. Both "no corruption reported after X iterations" and "corruption reported after X iterations" are important feedback. (State the platform and hardware and storage subsystem

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-12 Thread Eric Robinson
> Are you referring to this program? > https://github.com/algolia/trimtester/blob/master/trimtester.cpp > Yes, that is the program. > One thing that I can tell you right away is that this program does not appear to be very trustworthy, because it may malfunction due to the use of incorrect

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-12 Thread Gandalf Corvotempesta
Do you have any suggestions for software that checks for data corruption by continuously stressing storage? (For example, creating files, reading them back, moving them around, renaming, deleting, and so on.) On 12 Oct 2017 at 11:15 AM, "Robert Altnoeder" wrote: > On 10/11/2017 11:30 PM, Eric Robinson wrote:

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-12 Thread Robert Altnoeder
On 10/11/2017 11:30 PM, Eric Robinson wrote: > The TrimTester program consists of three parts. The main executable > (TrimTester) just writes loads of data to the drive and tests for file > corruption. My C++ consultant says, "It writes sequential numbers > wrapped at 256, spanning multiple files.
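
The write pattern quoted above ("sequential numbers wrapped at 256, spanning multiple files") can be illustrated with a small sketch. This is not the TrimTester source; the function names, the file naming scheme, and the file sizes are all hypothetical, chosen only to show what such a pattern and a matching check might look like.

```python
import os
import tempfile


def pattern(start, length):
    """Return `length` bytes counting up from `start`, wrapping at 256."""
    return bytes((start + i) % 256 for i in range(length))


def write_files(directory, file_count, file_size):
    """Write one continuous counter stream split across several files.

    The file names (seq_0000.bin, ...) are an assumption for this sketch.
    """
    offset = 0
    paths = []
    for n in range(file_count):
        path = os.path.join(directory, f"seq_{n:04d}.bin")
        with open(path, "wb") as f:
            f.write(pattern(offset, file_size))
        paths.append(path)
        offset += file_size  # the next file continues where this one stopped
    return paths


def first_mismatch(paths):
    """Return (path, byte_index) of the first unexpected byte, or None."""
    offset = 0
    for path in paths:
        with open(path, "rb") as f:
            data = f.read()
        expected = pattern(offset, len(data))
        if data != expected:
            # Same length on both sides, so a differing position must exist.
            index = next(
                i for i, (a, b) in enumerate(zip(data, expected)) if a != b
            )
            return path, index
        offset += len(data)
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        files = write_files(d, file_count=3, file_size=300)
        print(first_mismatch(files))
```

In this sketch a zero byte is a valid value (it follows every 255), so the presence of zero bytes alone does not indicate corruption; only a byte that differs from the expected counter value does.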

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-11 Thread Eric Robinson
> Subject: RE: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD > 8.4 and 9.0 > > Hi Lars - > > I'm finally back from my trip and eager to get rolling on this. > > >Interesting. > >Actually, alarming. > > Glad we agree on that! > >

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-11 Thread Eric Robinson
Hi Lars - I'm finally back from my trip and eager to get rolling on this. >Interesting. >Actually, alarming. Glad we agree on that! > Which *exact* DRBD module versions, identified by their git commit ids? Does this answer your question? ha11a:~ # modinfo drbd filename: /lib/modules/4.

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-05 Thread Eric Robinson
-user-boun...@lists.linbit.com] On Behalf Of Lars Ellenberg Sent: Tuesday, October 3, 2017 12:43 AM To: drbd-user@lists.linbit.com Subject: Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0 On Mon, Sep 25, 2017 at 09:02:57PM +, Eric Robinson wrote: > Problem: > > Un

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-03 Thread Lars Ellenberg
On Mon, Sep 25, 2017 at 09:02:57PM +, Eric Robinson wrote: > Problem: > > Under high write load, DRBD exhibits data corruption. In repeated > tests over a month-long period, file corruption occurred after 700-900 > GB of data had been written to the DRBD volume. Interesting. Actually, alarmin

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-09-26 Thread Eric Robinson
> I think the conclusion you've arrived at is not quite accurate. It could be > described more accurately as a possible data corruption issue specific to drbd > and/or the kernel, and trim commands as issued by the TrimTester software. It > appears TrimTester was written to debug a very specific SS

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-09-25 Thread Eddie Chapman
On 25/09/17 22:02, Eric Robinson wrote: Problem: Under high write load, DRBD exhibits data corruption. In repeated tests over a month-long period, file corruption occurred after 700-900 GB of data had been written to the DRBD volume. Testing Platform: 2 x Dell PowerEdge R610 servers 32GB

[DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-09-25 Thread Eric Robinson
Problem: Under high write load, DRBD exhibits data corruption. In repeated tests over a month-long period, file corruption occurred after 700-900 GB of data had been written to the DRBD volume. Testing Platform: 2 x Dell PowerEdge R610 servers 32GB RAM 6 x Samsung SSD 840 Pro 512GB (latest fir