Re: [Gluster-users] gnfs split brain when 1 server in 3x1 down (high load) - help request

2020-04-15 Thread Erik Jacobson
It is important to note that our testing has shown zero split-brain errors since the patch... And that it is significantly harder to hit the seg fault than it was to hit split-brain before. It's still sufficiently frequent that we can't let it out the door. In my intensive test case (found elsewhe

Re: [Gluster-users] gnfs split brain when 1 server in 3x1 down (high load) - help request

2020-04-15 Thread Erik Jacobson
> The new split-brain issue is much harder to reproduce, but after several
(correcting to say new seg fault issue, the split brain is gone!!)
> intense runs, it usually hits once.
>
> We switched to pure gluster74 plus your patch so we're apples to apples
> now.
>
> I'm going to see if Scott ca

Re: [Gluster-users] gnfs split brain when 1 server in 3x1 down (high load) - help request

2020-04-15 Thread Erik Jacobson
The new split-brain issue is much harder to reproduce, but after several intense runs, it usually hits once. We switched to pure gluster74 plus your patch so we're apples to apples now. I'm going to see if Scott can help debug it. Here is the back trace info from the core dump: -rw-r- 1 ro

Re: [Gluster-users] gnfs split brain when 1 server in 3x1 down (high load) - help request

2020-04-15 Thread Erik Jacobson
After several successful runs of the test case, we thought the problem was solved. Indeed, split-brain is gone. But we're triggering a seg fault now, even in a less loaded case. We're going to switch to gluster74, which was your intention, and report back. On Wed, Apr 15, 2020 at 10:33:01AM -0500, Erik

Re: [Gluster-users] gnfs split brain when 1 server in 3x1 down (high load) - help request

2020-04-15 Thread Erik Jacobson
> Attached the wrong patch by mistake in my previous mail. Sending the correct
> one now.
Early results look GREAT!! We'll keep beating on it. We applied it to gluster72 as that is what we have to ship with. It applied fine with some line moves. If you would like us to also run a test with glu

Re: [Gluster-users] gnfs split brain when 1 server in 3x1 down (high load) - help request

2020-04-15 Thread Ravishankar N
Attached the wrong patch by mistake in my previous mail. Sending the correct one now. -Ravi On 15/04/20 2:05 pm, Ravishankar N wrote: On 10/04/20 2:06 am, Erik Jacobson wrote: Once again thanks for sticking with us. Here is a reply from Scott Titus. If you have something for us to try, we'd

Re: [Gluster-users] gnfs split brain when 1 server in 3x1 down (high load) - help request

2020-04-15 Thread Ravishankar N
On 10/04/20 2:06 am, Erik Jacobson wrote: Once again thanks for sticking with us. Here is a reply from Scott Titus. If you have something for us to try, we'd love it. The code had your patch applied when gdb was run: Here is the addr2line output for those addresses. Very interesting command,