Re: [Gluster-devel] changelog bug

2016-02-08 Thread Emmanuel Dreyfus
On Mon, Feb 08, 2016 at 12:53:33AM -0500, Manikandan Selvaganesh wrote:
> Thanks and as you have mentioned, I have no clue how my changes 
> produced a core due to a NULL pointer in changelog. 

It is probably an unrelated bug that was nice enough to pop up here.

Too often people disregard NetBSD failures and just retrigger without
looking at the cause, but NetBSD has already proven its ability to expose
bugs that do not surface in Linux regression runs yet still exist on
Linux.

-- 
Emmanuel Dreyfus
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Rebalance data migration and corruption

2016-02-08 Thread Raghavendra Gowdappa


- Original Message -
> From: "Joe Julian" 
> To: gluster-devel@gluster.org
> Sent: Monday, February 8, 2016 12:20:27 PM
> Subject: Re: [Gluster-devel] Rebalance data migration and corruption
> 
> Is this in current release versions?

Yes. This bug is present in currently released versions. However, it can happen 
only if an application is writing to a file while that file is being 
migrated. So, broadly speaking, the probability is low.
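
To make the window concrete, here is a minimal sketch of the non-atomic
read/write described in the quoted thread below. This is not the actual dht
rebalance code; the descriptors, offsets and chunk handling are assumptions
made only for illustration.

    /* Illustrative sketch only; not the actual dht rebalance code. */
    #include <unistd.h>

    static ssize_t
    migrate_chunk (int src_fd, int dst_fd, off_t offset, size_t chunk)
    {
            char    buf[65536];
            ssize_t nread;

            if (chunk > sizeof (buf))
                    chunk = sizeof (buf);

            /* step 1: read the region from src */
            nread = pread (src_fd, buf, chunk, offset);
            if (nread <= 0)
                    return nread;

            /* Race window: an application write w2 to this region can reach
             * dst here, after we read the old data but before we write it
             * back out. */

            /* step 2: the stale data (w1) now overwrites w2 on dst */
            return pwrite (dst_fd, buf, nread, offset);
    }

A mandatory lock held across steps 1 and 2 (or a read delegation, as discussed
later in the thread) closes this window.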

> 
> On 02/07/2016 07:43 PM, Shyam wrote:
> > On 02/06/2016 06:36 PM, Raghavendra Gowdappa wrote:
> >>
> >>
> >> - Original Message -
> >>> From: "Raghavendra Gowdappa" 
> >>> To: "Sakshi Bansal" , "Susant Palai"
> >>> 
> >>> Cc: "Gluster Devel" , "Nithya
> >>> Balachandran" , "Shyamsundar
> >>> Ranganathan" 
> >>> Sent: Friday, February 5, 2016 4:32:40 PM
> >>> Subject: Re: Rebalance data migration and corruption
> >>>
> >>> +gluster-devel
> >>>
> 
>  Hi Sakshi/Susant,
> 
>  - There is a data corruption issue in migration code. Rebalance
>  process,
> 1. Reads data from src
> 2. Writes (say w1) it to dst
> 
> However, 1 and 2 are not atomic, so another write (say w2) to
>  same region
> can happen between 1 and 2. But these two writes can reach dst in the
>  order
> (w2,
> w1) resulting in a subtle corruption. This issue is not fixed
>  yet and can
> cause subtle data corruptions. The fix is simple and involves
>  rebalance
> process acquiring a mandatory lock to make 1 and 2 atomic.
> >>>
> >>> We can make use of compound fop framework to make sure we don't
> >>> suffer a
> >>> significant performance hit. Following will be the sequence of
> >>> operations
> >>> done by rebalance process:
> >>>
> >>> 1. issues a compound (mandatory lock, read) operation on src.
> >>> 2. writes this data to dst.
> >>> 3. issues unlock of lock acquired in 1.
> >>>
> >>> Please co-ordinate with Anuradha for implementation of this compound
> >>> fop.
> >>>
> >>> Following are the issues I see with this approach:
> >>> 1. features/locks provides mandatory lock functionality only for
> >>> posix-locks
> >>> (flock and fcntl based locks). So, mandatory locks will be
> >>> posix-locks which
> >>> will conflict with locks held by application. So, if an application
> >>> has held
> >>> an fcntl/flock, migration cannot proceed.
> >>
> >> We can implement a "special" domain for mandatory internal locks.
> >> These locks will behave similar to posix mandatory locks in that
> >> conflicting fops (like write, read) are blocked/failed if they are
> >> done while a lock is held.
> >>
> >>> 2. data migration will be less efficient because of an extra unlock
> >>> (with
> >>> compound lock + read) or extra lock and unlock (for non-compound fop
> >>> based
> >>> implementation) for every read it does from src.
> >>
> >> Can we use delegations here? Rebalance process can acquire a
> >> mandatory-write-delegation (an exclusive lock with a functionality
> >> that delegation is recalled when a write operation happens). In that
> >> case rebalance process, can do something like:
> >>
> >> 1. Acquire a read delegation for entire file.
> >> 2. Migrate the entire file.
> >> 3. Remove/unlock/give-back the delegation it has acquired.
> >>
> >> If a recall is issued from brick (when a write happens from mount),
> >> it completes the current write to dst (or throws away the read from
> >> src) to maintain atomicity. Before doing next set of (read, src) and
> >> (write, dst) tries to reacquire lock.
> >
> > With delegations this simplifies the normal path, when a file is
> > exclusively handled by rebalance. It also improves the case where a
> > client and rebalance are conflicting on a file, to degrade to
> > mandatory locks by either parties.
> >
> > I would prefer we take the delegation route for such needs in the future.
> >
> >>
> >> @Soumyak, can something like this be done with delegations?
> >>
> >> @Pranith,
> >> Afr does transactions for writing to its subvols. Can you suggest any
> >> optimizations here so that rebalance process can have a transaction
> >> for (read, src) and (write, dst) with minimal performance overhead?
> >>
> >> regards,
> >> Raghavendra.
> >>
> >>>
> >>> Comments?
> >>>
> 
>  regards,
>  Raghavendra.
> >>>
> > ___
> > Gluster-devel mailing list
> > Gluster-devel@gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-devel
> 
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [FAILED] NetBSD-regression for ./tests/basic/afr/self-heald.t

2016-02-08 Thread Emmanuel Dreyfus
On Mon, Feb 08, 2016 at 03:26:54PM +0530, Milind Changire wrote:
> https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/14089/consoleFull
> 
> 
> [08:44:20] ./tests/basic/afr/self-heald.t ..
> not ok 37 Got "0" instead of "1"
> not ok 52 Got "0" instead of "1"
> not ok 67
> Failed 4/83 subtests

There is a core but it is from NetBSD FUSE subsystem. The trace is
not helpful but suggests an abort() call because of unexpected 
situation:

Core was generated by `perfused'.
Program terminated with signal SIGABRT, Aborted.
#0  0xbb7574b7 in _lwp_kill () from /usr/lib/libc.so.12
(gdb) bt
#0  0xbb7574b7 in _lwp_kill () from /usr/lib/libc.so.12

/var/log/messages has a hint:
Feb  8 08:43:15 nbslave7c perfused: file write grow without resize

Indeed I have this assertion in NetBSD FUSE to catch a race condition. 
I think it is the first time I have seen it raised, but I am unable to 
conclude on the cause. Let us retrigger (I did it) and see if someone 
else ever hits it again. The bug is more likely in NetBSD FUSE than 
in glusterfs.
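
For what it is worth, the shape of such a debug check is roughly the
following. The struct and field names are assumptions for illustration only,
not the actual perfused/libperfuse source:

    /* Sketch of the idea behind "file write grow without resize":
     * a completed write extends past the size we are tracking for the
     * node although no resize (SETATTR/truncate) was seen, which points
     * at a race, so a -DDEBUG build aborts. */
    #include <stdlib.h>
    #include <sys/types.h>

    struct node_size_state {
            off_t known_size;      /* size last learned for the file    */
            int   resize_pending;  /* a SETATTR/truncate is in flight   */
    };

    static void
    check_write_grow (struct node_size_state *n, off_t offset, ssize_t written)
    {
            if (offset + (off_t)written > n->known_size && !n->resize_pending)
                    abort ();   /* corresponds to the "file write grow
                                   without resize" abort */
    }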

-- 
Emmanuel Dreyfus
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [FAILED] NetBSD-regression for ./tests/basic/afr/self-heald.t

2016-02-08 Thread Michael Scherer
Le lundi 08 février 2016 à 16:22 +0530, Pranith Kumar Karampuri a
écrit :
> 
> On 02/08/2016 04:16 PM, Ravishankar N wrote:
> > [Removing Milind, adding Pranith]
> >
> > On 02/08/2016 04:09 PM, Emmanuel Dreyfus wrote:
> >> On Mon, Feb 08, 2016 at 04:05:44PM +0530, Ravishankar N wrote:
> >>> The patch to add it to bad tests has already been merged, so I guess 
> >>> this
> >>> .t's failure won't pop up again.
> >> IMO that was a bit too quick.
> > I guess Pranith merged it because of last week's complaint for the 
> > same .t and not wanting to block other patches from being merged.
> 
> Yes, two people came to my desk and said their patches are blocked 
> because of this. So I had to merge it until we figure out the problem.

I suspect it would be better if people did use the list rather than
going to the desk, as it would help others who are either absent, in
another office or even not working in the same company be aware of the
issue.

Next time this happens, can you direct people to gluster-devel?

-- 
Michael Scherer
Sysadmin, Community Infrastructure and Platform, OSAS




signature.asc
Description: This is a digitally signed message part
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Rebalance data migration and corruption

2016-02-08 Thread Soumya Koduri



On 02/08/2016 09:13 AM, Shyam wrote:

On 02/06/2016 06:36 PM, Raghavendra Gowdappa wrote:



- Original Message -

From: "Raghavendra Gowdappa" 
To: "Sakshi Bansal" , "Susant Palai"

Cc: "Gluster Devel" , "Nithya
Balachandran" , "Shyamsundar
Ranganathan" 
Sent: Friday, February 5, 2016 4:32:40 PM
Subject: Re: Rebalance data migration and corruption

+gluster-devel



Hi Sakshi/Susant,

- There is a data corruption issue in migration code. Rebalance
process,
   1. Reads data from src
   2. Writes (say w1) it to dst

   However, 1 and 2 are not atomic, so another write (say w2) to
same region
   can happen between 1 and 2. But these two writes can reach dst in the
order
   (w2,
   w1) resulting in a subtle corruption. This issue is not fixed yet
and can
   cause subtle data corruptions. The fix is simple and involves
rebalance
   process acquiring a mandatory lock to make 1 and 2 atomic.


We can make use of compound fop framework to make sure we don't suffer a
significant performance hit. Following will be the sequence of
operations
done by rebalance process:

1. issues a compound (mandatory lock, read) operation on src.
2. writes this data to dst.
3. issues unlock of lock acquired in 1.

Please co-ordinate with Anuradha for implementation of this compound
fop.

Following are the issues I see with this approach:
1. features/locks provides mandatory lock functionality only for
posix-locks
(flock and fcntl based locks). So, mandatory locks will be
posix-locks which
will conflict with locks held by application. So, if an application
has held
an fcntl/flock, migration cannot proceed.


What if the file is opened with O_NONBLOCK? Can't the rebalance process skip 
the file and continue if mandatory lock acquisition fails?




We can implement a "special" domain for mandatory internal locks.
These locks will behave similar to posix mandatory locks in that
conflicting fops (like write, read) are blocked/failed if they are
done while a lock is held.


So is the only difference between mandatory internal locks and posix 
mandatory locks that internal locks shall not conflict with other 
application locks (advisory/mandatory)?





2. data migration will be less efficient because of an extra unlock
(with
compound lock + read) or extra lock and unlock (for non-compound fop
based
implementation) for every read it does from src.


Can we use delegations here? Rebalance process can acquire a
mandatory-write-delegation (an exclusive lock with a functionality
that delegation is recalled when a write operation happens). In that
case rebalance process, can do something like:

1. Acquire a read delegation for entire file.
2. Migrate the entire file.
3. Remove/unlock/give-back the delegation it has acquired.

If a recall is issued from brick (when a write happens from mount), it
completes the current write to dst (or throws away the read from src)
to maintain atomicity. Before doing next set of (read, src) and
(write, dst) tries to reacquire lock.


With delegations this simplifies the normal path, when a file is
exclusively handled by rebalance. It also improves the case where a
client and rebalance are conflicting on a file, to degrade to mandatory
locks by either parties.

I would prefer we take the delegation route for such needs in the future.

Right. But if there is simultaneous access to the same file from any 
other client and the rebalance process, delegations will not be granted, or 
will be revoked if already granted, even though they are operating at 
different offsets. So if you rely only on delegations, migration may not 
proceed if an application holds a lock or is doing any I/O.


Also, ideally the rebalance process has to take a write delegation, as it 
would end up writing the data on the destination brick, which shall affect 
READ I/Os (though of course we can have special checks/hacks for internally 
generated fops).


That said, having delegations shall definitely ensure correctness with 
respect to exclusive file access.


Thanks,
Soumya



@Soumyak, can something like this be done with delegations?

@Pranith,
Afr does transactions for writing to its subvols. Can you suggest any
optimizations here so that rebalance process can have a transaction
for (read, src) and (write, dst) with minimal performance overhead?

regards,
Raghavendra.



Comments?



regards,
Raghavendra.



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Reviewers needed for NSR client and server patches

2016-02-08 Thread Avra Sengupta

Hi,

We have two patches(mentioned below) for NSR client and NSR server 
available. These patches provide the basic client and server 
functionality as described in the design 
(https://docs.google.com/document/d/1bbxwjUmKNhA08wTmqJGkVd_KNCyaAMhpzx4dswokyyA/edit?usp=sharing). 
It would be great if people interested, could have a look at the patches 
and review them.


NSR Client patch : http://review.gluster.org/#/c/12388/
NSR Server patch : http://review.gluster.org/#/c/12705/

Regards,
Avra
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] [FAILED] NetBSD-regression for ./tests/basic/quota-anon-fd-nfs.t, ./tests/basic/tier/fops-during-migration.t, ./tests/basic/tier/record-metadata-heat.t

2016-02-08 Thread Milind Changire
http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/14096/consoleFull

[11:56:33] ./tests/basic/quota-anon-fd-nfs.t ..
not ok 21
not ok 22
not ok 24
not ok 26
not ok 28
not ok 30
not ok 32
not ok 34
not ok 36
Failed 9/40 subtests



[12:10:07] ./tests/basic/tier/fops-during-migration.t ..
not ok 22
Failed 1/22 subtests



[12:14:30] ./tests/basic/tier/record-metadata-heat.t ..
not ok 16 Got "no" instead of "yes"
Failed 1/18 subtests


Looks like some cores are available as well.


Please advise.


--

Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] [FAILED] NetBSD-regression for ./tests/basic/afr/self-heald.t

2016-02-08 Thread Milind Changire
https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/14089/consoleFull


[08:44:20] ./tests/basic/afr/self-heald.t ..
not ok 37 Got "0" instead of "1"
not ok 52 Got "0" instead of "1"
not ok 67
Failed 4/83 subtests


Please advise.

--

Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [FAILED] NetBSD-regression for ./tests/basic/afr/self-heald.t

2016-02-08 Thread Pranith Kumar Karampuri



On 02/08/2016 04:16 PM, Ravishankar N wrote:

[Removing Milind, adding Pranith]

On 02/08/2016 04:09 PM, Emmanuel Dreyfus wrote:

On Mon, Feb 08, 2016 at 04:05:44PM +0530, Ravishankar N wrote:
The patch to add it to bad tests has already been merged, so I guess 
this

.t's failure won't pop up again.

IMO that was a bit too quick.
I guess Pranith merged it because of last week's complaint for the 
same .t and not wanting to block other patches from being merged.


Yes, two people came to my desk and said their patches are blocked 
because of this. So I had to merge it until we figure out the problem.


Pranith

  What is the procedure to get out of the
list?

Usually, you just fix the problem with the testcase and send a patch 
with the fix and removing it from bad_tests. (For example 
http://review.gluster.org/13233)




___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [FAILED] NetBSD-regression for ./tests/basic/afr/self-heald.t

2016-02-08 Thread Emmanuel Dreyfus
On Mon, Feb 08, 2016 at 04:05:44PM +0530, Ravishankar N wrote:
> The patch to add it to bad tests has already been merged, so I guess this
> .t's failure won't pop up again.

IMO that was a bit too quick. What is the procedure to get out of the
list?

-- 
Emmanuel Dreyfus
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] How to cope with spurious regression failures

2016-02-08 Thread Raghavendra Talur
On Tue, Jan 19, 2016 at 8:33 PM, Emmanuel Dreyfus  wrote:

> On Tue, Jan 19, 2016 at 07:08:03PM +0530, Raghavendra Talur wrote:
> > a. Allowing re-running to tests to make them pass leads to complacency
> with
> > how tests are written.
> > b. A test is bad if it is not deterministic and running a bad test has
> *no*
> > value. We are wasting time even if the test runs for a few seconds.
>
> I agree with your vision for the long term, but my proposal addresses the
> short term situation. But we could use the retry approach to fuel your
> blacklist approach:
>
> We could imagine a system where the retry feature would cast votes on
> individual tests: each time we fail once and succeed on retry, cast
> a +1 unreliable for the test.
>
> After a few days, we will have a wall of shame for unreliable tests,
> which could either be fixed or go to the blacklist.
>
> I do not know what software to use to collect and display the results,
> though. Should we have a gerrit change for each test?
>

This should be the process for adding tests to the bad tests list. However, I
have run out of time on this one.
If someone would like to implement it, go ahead. I don't see myself trying this
any time soon.


>
> --
> Emmanuel Dreyfus
> m...@netbsd.org



Thanks for the inputs.

I have refactored run-tests.sh to support a retry option.
If run-tests.sh is started with the -r flag, failed tests are run once
more and are not counted as failed if they pass on retry. Note: adding the -r
flag to the Jenkins config is not done yet.

I have also implemented a better version of the blacklist which complies with
Manu's requirement that bad tests be tracked per OS.
Here is the patch: http://review.gluster.org/#/c/13393/
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [FAILED] NetBSD-regression for ./tests/basic/afr/self-heald.t

2016-02-08 Thread Ravishankar N

On 02/08/2016 03:37 PM, Emmanuel Dreyfus wrote:

On Mon, Feb 08, 2016 at 03:26:54PM +0530, Milind Changire wrote:

https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/14089/consoleFull


[08:44:20] ./tests/basic/afr/self-heald.t ..
not ok 37 Got "0" instead of "1"
not ok 52 Got "0" instead of "1"
not ok 67
Failed 4/83 subtests

There is a core but it is from NetBSD FUSE subsystem. The trace is
not helpful but suggests an abort() call because of unexpected
situation:

Core was generated by `perfused'.
Program terminated with signal SIGABRT, Aborted.
#0  0xbb7574b7 in _lwp_kill () from /usr/lib/libc.so.12
(gdb) bt
#0  0xbb7574b7 in _lwp_kill () from /usr/lib/libc.so.12

/var/log/messages has a hint:
Feb  8 08:43:15 nbslave7c perfused: file write grow without resize

Indeed I have this assertion in NetBSD FUSE to catch a race condition.
I think it is the first time I have seen it raised, but I am unable to
conclude on the cause. Let us retrigger (I did it) and see if someone
else ever hits it again. The bug is more likely in NetBSD FUSE than
in glusterfs.

The .t has been added to bad tests for now @ 
http://review.gluster.org/#/c/13344/, so you can probably rebase your patch.
I'm not sure this is a problem with the test case; the same issue was 
reported by Manikandan last week: 
https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/13895/consoleFull 

Is it one of those vndconfig errors? The .t seems to have skipped a few 
tests:


---
./tests/basic/afr/self-heald.t (Wstat: 0 Tests: 82 Failed: 3)
  Failed tests:  37, 52, 67
  Parse errors: Tests out of sequence.  Found (31) but expected (30)
Tests out of sequence.  Found (32) but expected (31)
Tests out of sequence.  Found (33) but expected (32)
Tests out of sequence.  Found (34) but expected (33)
Tests out of sequence.  Found (35) but expected (34)
Displayed the first 5 of 54 TAP syntax errors.
Re-run prove with the -p option to see them all.



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [FAILED] NetBSD-regression for ./tests/basic/afr/self-heald.t

2016-02-08 Thread Emmanuel Dreyfus
On Mon, Feb 08, 2016 at 10:26:22AM +, Emmanuel Dreyfus wrote:
> Indeed, same problem. But unfortunately it is not very reproducible since
> we need to make a full week of runs to see it again. I am tempted to
> just remove the assertion.

NB: this does not fail on a stock NetBSD release: the assertion is only there
because FUSE is built with -DDEBUG on the NetBSD slave VMs. 

OTOH if it happens only in tests/basic/afr/self-heal.t I may be able to 
get it by looping on the test for a while. I will try this on nbslave70.

In the meantime, if that one pops up too often and gets annoying, I can get
rid of it by just disabling debug mode.

-- 
Emmanuel Dreyfus
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [FAILED] NetBSD-regression for ./tests/basic/afr/self-heald.t

2016-02-08 Thread Pranith Kumar Karampuri



On 02/08/2016 04:22 PM, Pranith Kumar Karampuri wrote:



On 02/08/2016 04:16 PM, Ravishankar N wrote:

[Removing Milind, adding Pranith]

On 02/08/2016 04:09 PM, Emmanuel Dreyfus wrote:

On Mon, Feb 08, 2016 at 04:05:44PM +0530, Ravishankar N wrote:
The patch to add it to bad tests has already been merged, so I 
guess this

.t's failure won't pop up again.

IMO that was a bit too quick.
I guess Pranith merged it because of last week's complaint for the 
same .t and not wanting to block other patches from being merged.


Yes, two people came to my desk and said their patches are blocked 
because of this. So I had to merge it until we figure out the problem.


Patch is from last week though.

Pranith


Pranith

  What is the procedure to get out of the
list?

Usually, you just fix the problem with the testcase and send a patch 
with the fix and removing it from bad_tests. (For example 
http://review.gluster.org/13233)




___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [FAILED] NetBSD-regression for ./tests/basic/afr/self-heald.t

2016-02-08 Thread Emmanuel Dreyfus
On Mon, Feb 08, 2016 at 03:44:43PM +0530, Ravishankar N wrote:
> The .t has been added to bad tests for now @

I am not sure this is relevant: does it fail again? I am very interested
if it is reproducible.

> http://review.gluster.org/#/c/13344/, so you can probably rebase your patch.
> I'm not sure this is a problem with the case, the same issue was reported by
> Manikandan last week : 
> https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/13895/consoleFull

Indeed, same problem. But unfortunately it is not very reproducible since
we need to make a full week of runs to see it again. I am tempted to
just remove the assertion.

> Is it one of those vndconfig errors? The .t seems to have skipped a few
> tests:

This is because FUSE went away during the test.
The vnconfig problems are fixed now and should not happen anymore.
> 

-- 
Emmanuel Dreyfus
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [FAILED] NetBSD-regression for ./tests/basic/afr/self-heald.t

2016-02-08 Thread Ravishankar N

On 02/08/2016 04:00 PM, Emmanuel Dreyfus wrote:

On Mon, Feb 08, 2016 at 10:26:22AM +, Emmanuel Dreyfus wrote:

Indeed, same problem. But unfortunately it is not very reproducible since
we need to make a full week of runs to see it again. I am tempted to
just remove the assertion.

NB: this does not fail on a stock NetBSD release: the assertion is only there
because FUSE is built with -DDEBUG on the NetBSD slave VMs.

OTOH if it happens only in tests/basic/afr/self-heal.t I may be able to
get it by looping on the test for a while. I will try this on nbslave70.

Thanks Emmanuel!


In the meantime, if that one pops up too often and gets annoying, I can get
rid of it by just disabling debug mode.

The patch to add it to bad tests has already been merged, so I guess 
this .t's failure won't pop up again.


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [FAILED] NetBSD-regression for ./tests/basic/afr/self-heald.t

2016-02-08 Thread Ravishankar N

[Removing Milind, adding Pranith]

On 02/08/2016 04:09 PM, Emmanuel Dreyfus wrote:

On Mon, Feb 08, 2016 at 04:05:44PM +0530, Ravishankar N wrote:

The patch to add it to bad tests has already been merged, so I guess this
.t's failure won't pop up again.

IMO that was a bit too quick.
I guess Pranith merged it because of last week's complaint for the same 
.t and not wanting to block other patches from being merged.

  What is the procedure to get out of the
list?

Usually, you just fix the problem with the testcase and send a patch 
with the fix and removing it from bad_tests. (For example 
http://review.gluster.org/13233)


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Rebalance data migration and corruption

2016-02-08 Thread Raghavendra G
On Mon, Feb 8, 2016 at 4:31 PM, Soumya Koduri  wrote:

>
>
> On 02/08/2016 09:13 AM, Shyam wrote:
>
>> On 02/06/2016 06:36 PM, Raghavendra Gowdappa wrote:
>>
>>>
>>>
>>> - Original Message -
>>>
 From: "Raghavendra Gowdappa" 
 To: "Sakshi Bansal" , "Susant Palai"
 
 Cc: "Gluster Devel" , "Nithya
 Balachandran" , "Shyamsundar
 Ranganathan" 
 Sent: Friday, February 5, 2016 4:32:40 PM
 Subject: Re: Rebalance data migration and corruption

 +gluster-devel


> Hi Sakshi/Susant,
>
> - There is a data corruption issue in migration code. Rebalance
> process,
>1. Reads data from src
>2. Writes (say w1) it to dst
>
>However, 1 and 2 are not atomic, so another write (say w2) to
> same region
> >can happen between 1 and 2. But these two writes can reach dst in the
> order
>(w2,
>w1) resulting in a subtle corruption. This issue is not fixed yet
> and can
>cause subtle data corruptions. The fix is simple and involves
> rebalance
>process acquiring a mandatory lock to make 1 and 2 atomic.
>

 We can make use of compound fop framework to make sure we don't suffer a
 significant performance hit. Following will be the sequence of
 operations
 done by rebalance process:

 1. issues a compound (mandatory lock, read) operation on src.
 2. writes this data to dst.
 3. issues unlock of lock acquired in 1.

 Please co-ordinate with Anuradha for implementation of this compound
 fop.

 Following are the issues I see with this approach:
 1. features/locks provides mandatory lock functionality only for
 posix-locks
 (flock and fcntl based locks). So, mandatory locks will be
 posix-locks which
 will conflict with locks held by application. So, if an application
 has held
 an fcntl/flock, migration cannot proceed.

>>>
> What if the file is opened with O_NONBLOCK? Can't the rebalance process skip
> the file and continue if mandatory lock acquisition fails?


Similar functionality can be achieved by acquiring non-blocking inodelk
like SETLK (as opposed to SETLKW). However whether rebalance process should
block or not depends on the use case. In some use-cases (like remove-brick)
rebalance process _has_ to migrate all the files. Even for other scenarios
skipping too many files is not a good idea as it beats the purpose of
running rebalance. So one of the design goals is to migrate as many files
as possible without making design too complex.
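
For reference, the blocking vs. non-blocking distinction maps onto plain
fcntl(2) as below. This is only the POSIX analogue, not the gluster inodelk
fop itself; the fd, offset and length here are assumptions for illustration:

    /* Plain fcntl(2) illustration of blocking vs. non-blocking byte-range
     * locking; the inodelk fop is analogous but is served by the locks
     * translator on the brick rather than by the kernel. */
    #include <fcntl.h>
    #include <unistd.h>
    #include <errno.h>

    static int
    lock_region (int fd, off_t start, off_t len, int blocking)
    {
            struct flock fl = {
                    .l_type   = F_WRLCK,
                    .l_whence = SEEK_SET,
                    .l_start  = start,
                    .l_len    = len,
            };

            /* F_SETLK fails immediately (EACCES/EAGAIN) if the region is
             * already locked; F_SETLKW waits for the holder to release it. */
            if (fcntl (fd, blocking ? F_SETLKW : F_SETLK, &fl) == -1)
                    return -errno;
            return 0;
    }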


>
>
>>> We can implement a "special" domain for mandatory internal locks.
>>> These locks will behave similar to posix mandatory locks in that
>>> conflicting fops (like write, read) are blocked/failed if they are
>>> done while a lock is held.
>>>
>>
> So is the only difference between mandatory internal locks and posix
> mandatory locks that internal locks shall not conflict with other
> application locks (advisory/mandatory)?


Yes. Mandatory internal locks (aka Mandatory inodelk for this discussion)
will conflict only in their domain. They also conflict with any fops that
might change the file (primarily write here, but different fops can be
added based on requirement). So in a fop like writev we need to check in
two lists - external lock (posix lock) list _and_ mandatory inodelk list.

The reason (if not clear) for using mandatory locks by rebalance process is
that clients need not be bothered with acquiring a lock (which will
unnecessarily degrade performance of I/O when there is no rebalance going
on). Thanks to Raghavendra Talur for suggesting this idea (though in a
different context of lock migration, but the use-cases are similar).
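
A sketch of the write-path check described above follows. Structure and helper
names are assumptions, not the actual features/locks translator code, and real
posix-lock conflicts also depend on the lock owner and on whether mandatory
locking semantics are enabled:

    /* Sketch only: a write is allowed just when its byte range conflicts
     * with neither the application (posix) lock list nor the internal
     * mandatory-inodelk list of the rebalance domain. */
    #include <stdbool.h>
    #include <sys/types.h>

    struct byte_range { off_t start; off_t end; };

    struct lock_entry {
            struct byte_range  range;
            struct lock_entry *next;
    };

    struct inode_locks {
            struct lock_entry *posix_locks;        /* application locks  */
            struct lock_entry *mandatory_inodelks; /* rebalance domain   */
    };

    static bool
    ranges_overlap (struct byte_range a, struct byte_range b)
    {
            return a.start <= b.end && b.start <= a.end;
    }

    static bool
    write_allowed (const struct inode_locks *l, struct byte_range w)
    {
            const struct lock_entry *e;

            for (e = l->posix_locks; e; e = e->next)
                    if (ranges_overlap (e->range, w))
                            return false;
            for (e = l->mandatory_inodelks; e; e = e->next)
                    if (ranges_overlap (e->range, w))
                            return false;
            return true;
    }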


>
>
>>> 2. data migration will be less efficient because of an extra unlock
 (with
 compound lock + read) or extra lock and unlock (for non-compound fop
 based
 implementation) for every read it does from src.

>>>
>>> Can we use delegations here? Rebalance process can acquire a
>>> mandatory-write-delegation (an exclusive lock with a functionality
>>> that delegation is recalled when a write operation happens). In that
>>> case rebalance process, can do something like:
>>>
>>> 1. Acquire a read delegation for entire file.
>>> 2. Migrate the entire file.
>>> 3. Remove/unlock/give-back the delegation it has acquired.
>>>
>>> If a recall is issued from brick (when a write happens from mount), it
>>> completes the current write to dst (or throws away the read from src)
>>> to maintain atomicity. Before doing next set of (read, src) and
>>> (write, dst) tries to reacquire lock.
>>>
>>
>> With delegations this simplifies the normal path, when a file is
>> exclusively handled by rebalance. It also improves the case where a
>> client and rebalance are 

Re: [Gluster-devel] Cores on NetBSD of brick https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/14100/consoleFull

2016-02-08 Thread Pranith Kumar Karampuri



On 02/08/2016 08:20 PM, Emmanuel Dreyfus wrote:

On Mon, Feb 08, 2016 at 07:27:46PM +0530, Pranith Kumar Karampuri wrote:

   I don't see any logs in the archive. Did we change something?

I think they are in a different tarball, in /archives/logs
I think the regression run is not giving that link anymore when the 
crash happens? Could you please add that also as a link in the regression run?


Pranith
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Rebalance data migration and corruption

2016-02-08 Thread Raghavendra G
>> Right. But if there are simultaneous access to the same file from
>> any other client and rebalance process, delegations shall not be
>> granted or revoked if granted even though they are operating at
>> different offsets. So if you rely only on delegations, migration may
>> not proceed if an application has held a lock or doing any I/Os.
>>
>>
>> Does the brick process wait for the response of delegation holder
>> (rebalance process here) before it wipes out the delegation/locks? If
>> that's the case, rebalance process can complete one transaction of
>> (read, src) and (write, dst) before responding to a delegation recall.
>> That way there is no starvation for both applications and rebalance
>> process (though this makes both of them slower, but that cannot be helped I
>> think).
>>
>
> Yes. The brick process should wait for a certain period before revoking the
> delegations forcefully in case they are not returned by the client. Also, if
> required (as NFS servers do), we can choose to increase this timeout
> value at run time if the client is diligently flushing the data.


Hmm.. I would prefer an infinite timeout. The only scenario where the brick
process can forcefully flush leases would be loss of the connection with the
rebalance process. The more scenarios in which the brick can flush leases
without the rebalance process's knowledge, the more race windows we open up
for this bug to occur.

In fact, to be correct at least in theory, the rebalance process should replay
all the transactions that happened under a lease which got flushed out
by the brick (after re-acquiring that lease). So, we would like to avoid any
such scenarios.

Btw, what is the necessity of timeouts? Is it insurance against rogue
clients who won't respond to lease recalls?
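
For concreteness, here is a rough sketch of the migration loop with the recall
handling discussed above. Every helper name below is hypothetical (none of
them exists in glusterfs under these names); the point is only the atomicity
rule: on recall, finish (or discard) the in-flight read/write pair, give the
delegation back, and reacquire it before touching the next chunk.

    #include <stdbool.h>

    enum migrate_status { CHUNK_DONE, FILE_DONE, FAILED };

    /* Hypothetical helpers, assumed for this sketch. */
    bool delegation_acquire (const char *path);   /* blocks until granted  */
    void delegation_return (const char *path);
    bool delegation_recalled (const char *path);  /* set by recall upcall  */
    enum migrate_status migrate_next_chunk (const char *path);

    int
    migrate_file (const char *path)
    {
            enum migrate_status st;

            for (;;) {
                    if (!delegation_acquire (path))
                            return -1;

                    do {
                            /* one (read, src) + (write, dst) transaction */
                            st = migrate_next_chunk (path);
                    } while (st == CHUNK_DONE && !delegation_recalled (path));

                    /* give it back so the client's write can proceed */
                    delegation_return (path);

                    if (st == FILE_DONE)
                            return 0;
                    if (st == FAILED)
                            return -1;
                    /* recalled mid-file: loop and reacquire before the
                     * next chunk */
            }
    }
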
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Rebalance data migration and corruption

2016-02-08 Thread Raghavendra Gowdappa


- Original Message -
> From: "Joe Julian" 
> To: "Raghavendra Gowdappa" 
> Cc: gluster-devel@gluster.org
> Sent: Monday, February 8, 2016 9:08:45 PM
> Subject: Re: [Gluster-devel] Rebalance data migration and corruption
> 
> 
> 
> On 02/08/2016 12:18 AM, Raghavendra Gowdappa wrote:
> >
> > - Original Message -
> >> From: "Joe Julian" 
> >> To: gluster-devel@gluster.org
> >> Sent: Monday, February 8, 2016 12:20:27 PM
> >> Subject: Re: [Gluster-devel] Rebalance data migration and corruption
> >>
> >> Is this in current release versions?
> > Yes. This bug is present in currently released versions. However, it can
> > happen only if an application is writing to a file when it is
> > being migrated. So, broadly speaking, the probability is low.
> 
> Probability is quite high when the volume is used for VM images, which
> many are.

The primary requirement for this corruption is that the file should be under 
migration. Given that rebalance is done only during add/remove-brick scenarios 
(or maybe as routine housekeeping to make lookups faster), I said the 
probability is lower. However, this will not be the case with tier, where files 
can be under constant promotion/demotion because of access patterns. If there 
is constant migration, dht too is susceptible to this bug with similar 
probability.

> 
> >
> >> On 02/07/2016 07:43 PM, Shyam wrote:
> >>> On 02/06/2016 06:36 PM, Raghavendra Gowdappa wrote:
> 
>  - Original Message -
> > From: "Raghavendra Gowdappa" 
> > To: "Sakshi Bansal" , "Susant Palai"
> > 
> > Cc: "Gluster Devel" , "Nithya
> > Balachandran" , "Shyamsundar
> > Ranganathan" 
> > Sent: Friday, February 5, 2016 4:32:40 PM
> > Subject: Re: Rebalance data migration and corruption
> >
> > +gluster-devel
> >
> >> Hi Sakshi/Susant,
> >>
> >> - There is a data corruption issue in migration code. Rebalance
> >> process,
> >> 1. Reads data from src
> >> 2. Writes (say w1) it to dst
> >>
> >> However, 1 and 2 are not atomic, so another write (say w2) to
> >> same region
> >> can happen between 1 and 2. But these two writes can reach dst in the
> >> order
> >> (w2,
> >> w1) resulting in a subtle corruption. This issue is not fixed
> >> yet and can
> >> cause subtle data corruptions. The fix is simple and involves
> >> rebalance
> >> process acquiring a mandatory lock to make 1 and 2 atomic.
> > We can make use of compound fop framework to make sure we don't
> > suffer a
> > significant performance hit. Following will be the sequence of
> > operations
> > done by rebalance process:
> >
> > 1. issues a compound (mandatory lock, read) operation on src.
> > 2. writes this data to dst.
> > 3. issues unlock of lock acquired in 1.
> >
> > Please co-ordinate with Anuradha for implementation of this compound
> > fop.
> >
> > Following are the issues I see with this approach:
> > 1. features/locks provides mandatory lock functionality only for
> > posix-locks
> > (flock and fcntl based locks). So, mandatory locks will be
> > posix-locks which
> > will conflict with locks held by application. So, if an application
> > has held
> > an fcntl/flock, migration cannot proceed.
>  We can implement a "special" domain for mandatory internal locks.
>  These locks will behave similar to posix mandatory locks in that
>  conflicting fops (like write, read) are blocked/failed if they are
>  done while a lock is held.
> 
> > 2. data migration will be less efficient because of an extra unlock
> > (with
> > compound lock + read) or extra lock and unlock (for non-compound fop
> > based
> > implementation) for every read it does from src.
>  Can we use delegations here? Rebalance process can acquire a
>  mandatory-write-delegation (an exclusive lock with a functionality
>  that delegation is recalled when a write operation happens). In that
>  case rebalance process, can do something like:
> 
>  1. Acquire a read delegation for entire file.
>  2. Migrate the entire file.
>  3. Remove/unlock/give-back the delegation it has acquired.
> 
>  If a recall is issued from brick (when a write happens from mount),
>  it completes the current write to dst (or throws away the read from
>  src) to maintain atomicity. Before doing next set of (read, src) and
>  (write, dst) tries to reacquire lock.
> >>> With delegations this simplifies the normal path, when a file is
> >>> exclusively handled by rebalance. It also improves the case where a
> >>> client and rebalance are conflicting on a file, to 

Re: [Gluster-devel] Rebalance data migration and corruption

2016-02-08 Thread Soumya Koduri



On 02/09/2016 10:27 AM, Raghavendra G wrote:



On Mon, Feb 8, 2016 at 4:31 PM, Soumya Koduri > wrote:



On 02/08/2016 09:13 AM, Shyam wrote:

On 02/06/2016 06:36 PM, Raghavendra Gowdappa wrote:



- Original Message -

From: "Raghavendra Gowdappa" >
To: "Sakshi Bansal" >, "Susant Palai"
>
Cc: "Gluster Devel" >, "Nithya
Balachandran" >, "Shyamsundar
Ranganathan" >
Sent: Friday, February 5, 2016 4:32:40 PM
Subject: Re: Rebalance data migration and corruption

+gluster-devel


Hi Sakshi/Susant,

- There is a data corruption issue in migration
code. Rebalance
process,
1. Reads data from src
2. Writes (say w1) it to dst

However, 1 and 2 are not atomic, so another
write (say w2) to
same region
can happen between 1 and 2. But these two writes can
reach dst in the
order
(w2,
w1) resulting in a subtle corruption. This issue
is not fixed yet
and can
cause subtle data corruptions. The fix is simple
and involves
rebalance
process acquiring a mandatory lock to make 1 and
2 atomic.


We can make use of compound fop framework to make sure
we don't suffer a
significant performance hit. Following will be the
sequence of
operations
done by rebalance process:

1. issues a compound (mandatory lock, read) operation on
src.
2. writes this data to dst.
3. issues unlock of lock acquired in 1.

Please co-ordinate with Anuradha for implementation of
this compound
fop.

Following are the issues I see with this approach:
1. features/locks provides mandatory lock functionality
only for
posix-locks
(flock and fcntl based locks). So, mandatory locks will be
posix-locks which
will conflict with locks held by application. So, if an
application
has held
an fcntl/flock, migration cannot proceed.


What if the file is opened with O_NONBLOCK? Can't the rebalance process
skip the file and continue if mandatory lock acquisition fails?


Similar functionality can be achieved by acquiring non-blocking inodelk
like SETLK (as opposed to SETLKW). However whether rebalance process
should block or not depends on the use case. In some use-cases (like
remove-brick) rebalance process _has_ to migrate all the files. Even for
other scenarios skipping too many files is not a good idea as it beats
the purpose of running rebalance. So one of the design goals is to
migrate as many files as possible without making design too complex.




We can implement a "special" domain for mandatory internal
locks.
These locks will behave similar to posix mandatory locks in that
conflicting fops (like write, read) are blocked/failed if
they are
done while a lock is held.


So is the only difference between mandatory internal locks and posix
mandatory locks that internal locks shall not conflict with other
application locks (advisory/mandatory)?


Yes. Mandatory internal locks (aka Mandatory inodelk for this
discussion) will conflict only in their domain. They also conflict with
any fops that might change the file (primarily write here, but different
fops can be added based on requirement). So in a fop like writev we need
to check in two lists - external lock (posix lock) list _and_ mandatory
inodelk list.

The reason (if not clear) for using mandatory locks by rebalance process
is that clients need not be bothered with acquiring a lock (which will
unnecessarily degrade performance of I/O when there is no rebalance
going on). Thanks to Raghavendra Talur for suggesting this idea (though
in a different context of lock migration, but the use-cases are similar).




2. 

[Gluster-devel] Cores on NetBSD of brick https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/14100/consoleFull

2016-02-08 Thread Pranith Kumar Karampuri

Emmanuel,
  I don't see any logs in the archive. Did we change something?

Pranith
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [FAILED] NetBSD-regression for ./tests/basic/afr/self-heald.t

2016-02-08 Thread Pranith Kumar Karampuri



On 02/08/2016 05:04 PM, Michael Scherer wrote:

Le lundi 08 février 2016 à 16:22 +0530, Pranith Kumar Karampuri a
écrit :

On 02/08/2016 04:16 PM, Ravishankar N wrote:

[Removing Milind, adding Pranith]

On 02/08/2016 04:09 PM, Emmanuel Dreyfus wrote:

On Mon, Feb 08, 2016 at 04:05:44PM +0530, Ravishankar N wrote:

The patch to add it to bad tests has already been merged, so I guess
this
.t's failure won't pop up again.

IMO that was a bit too quick.

I guess Pranith merged it because of last week's complaint for the
same .t and not wanting to block other patches from being merged.

Yes, two people came to my desk and said their patches are blocked
because of this. So I had to merge it until we figure out the problem.

I suspect it would be better if people did use the list rather than
going to the desk, as it would help others who are either absent, in
another office or even not working in the same company be aware of the
issue.

Next time this happens, can you direct people to gluster-devel?

Will do :-).

Pranith




___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [FAILED] NetBSD-regression for ./tests/basic/quota-anon-fd-nfs.t, ./tests/basic/tier/fops-during-migration.t, ./tests/basic/tier/record-metadata-heat.t

2016-02-08 Thread Emmanuel Dreyfus
On Mon, Feb 08, 2016 at 06:25:09PM +0530, Milind Changire wrote:
> Looks like some cores are available as well.
> Please advise.

#0  0xb99912b4 in gf_changelog_reborp_rpcsvc_notify (rpc=0xb7b160f0, 
mydata=0xb7b1a830, event=RPCSVC_EVENT_ACCEPT, data=0xb76a4030)
at 
/home/jenkins/root/workspace/rackspace-netbsd7-regression-triggered/xlators/features/changelog/lib/src/gf-changelog-reborp.c:110
110 return 0;

Crash on return: that smells like stack corruption.



-- 
Emmanuel Dreyfus
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Cores on NetBSD of brick https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/14100/consoleFull

2016-02-08 Thread Emmanuel Dreyfus
On Mon, Feb 08, 2016 at 07:27:46PM +0530, Pranith Kumar Karampuri wrote:
>   I don't see any logs in the archive. Did we change something?

I think they are in a different tarball, in /archives/logs
-- 
Emmanuel Dreyfus
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Rebalance data migration and corruption

2016-02-08 Thread Joe Julian



On 02/08/2016 12:18 AM, Raghavendra Gowdappa wrote:


- Original Message -

From: "Joe Julian" 
To: gluster-devel@gluster.org
Sent: Monday, February 8, 2016 12:20:27 PM
Subject: Re: [Gluster-devel] Rebalance data migration and corruption

Is this in current release versions?

Yes. This bug is present in currently released versions. However, it can happen 
only if an application is writing to a file while that file is being 
migrated. So, broadly speaking, the probability is low.


Probability is quite high when the volume is used for VM images, which 
many are.





On 02/07/2016 07:43 PM, Shyam wrote:

On 02/06/2016 06:36 PM, Raghavendra Gowdappa wrote:


- Original Message -

From: "Raghavendra Gowdappa" 
To: "Sakshi Bansal" , "Susant Palai"

Cc: "Gluster Devel" , "Nithya
Balachandran" , "Shyamsundar
Ranganathan" 
Sent: Friday, February 5, 2016 4:32:40 PM
Subject: Re: Rebalance data migration and corruption

+gluster-devel


Hi Sakshi/Susant,

- There is a data corruption issue in migration code. Rebalance
process,
1. Reads data from src
2. Writes (say w1) it to dst

However, 1 and 2 are not atomic, so another write (say w2) to
same region
can happen between 1 and 2. But these two writes can reach dst in the
order
(w2,
w1) resulting in a subtle corruption. This issue is not fixed
yet and can
cause subtle data corruptions. The fix is simple and involves
rebalance
process acquiring a mandatory lock to make 1 and 2 atomic.

We can make use of compound fop framework to make sure we don't
suffer a
significant performance hit. Following will be the sequence of
operations
done by rebalance process:

1. issues a compound (mandatory lock, read) operation on src.
2. writes this data to dst.
3. issues unlock of lock acquired in 1.

Please co-ordinate with Anuradha for implementation of this compound
fop.

Following are the issues I see with this approach:
1. features/locks provides mandatory lock functionality only for
posix-locks
(flock and fcntl based locks). So, mandatory locks will be
posix-locks which
will conflict with locks held by application. So, if an application
has held
an fcntl/flock, migration cannot proceed.

We can implement a "special" domain for mandatory internal locks.
These locks will behave similar to posix mandatory locks in that
conflicting fops (like write, read) are blocked/failed if they are
done while a lock is held.


2. data migration will be less efficient because of an extra unlock
(with
compound lock + read) or extra lock and unlock (for non-compound fop
based
implementation) for every read it does from src.

Can we use delegations here? Rebalance process can acquire a
mandatory-write-delegation (an exclusive lock with a functionality
that delegation is recalled when a write operation happens). In that
case rebalance process, can do something like:

1. Acquire a read delegation for entire file.
2. Migrate the entire file.
3. Remove/unlock/give-back the delegation it has acquired.

If a recall is issued from brick (when a write happens from mount),
it completes the current write to dst (or throws away the read from
src) to maintain atomicity. Before doing next set of (read, src) and
(write, dst) tries to reacquire lock.

With delegations this simplifies the normal path, when a file is
exclusively handled by rebalance. It also improves the case where a
client and rebalance are conflicting on a file, to degrade to
mandatory locks by either parties.

I would prefer we take the delegation route for such needs in the future.


@Soumyak, can something like this be done with delegations?

@Pranith,
Afr does transactions for writing to its subvols. Can you suggest any
optimizations here so that rebalance process can have a transaction
for (read, src) and (write, dst) with minimal performance overhead?

regards,
Raghavendra.


Comments?


regards,
Raghavendra.

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel