Re: [Gluster-devel] Gerrit review, submit type and Jenkins testing

2016-01-11 Thread Michael Adam
On 2016-01-08 at 12:03 +0530, Raghavendra Talur wrote:
> Top posting, this is a very old thread.
> 
> Keeping in view the recent NetBSD problems and the number of bugs creeping
> in, I suggest we do these things right now:
> 
> a. Change the gerrit merge type to fast forward only.
> As explained below in the thread, with our current setup even if both
> PatchA and PatchB pass regression separately when both are merged it is
> possible that a functional bug creeps in.
> This is the only solution to prevent that from happening.
> I will work with Kaushal to get this done.
> 
> b. In Jenkins, remove gerrit trigger and make it a manual operation
> 
> Too many developers use the upstream infra as a test cluster and it is
> *not*.
> It is a verification mechanism for maintainers to ensure that the patch
> does not cause regression.
>
> It is required that all developers run full regression on their machines
> before asking for reviews.

Hmm, I am not 100% sure I would underwrite that.
I am coming from the Samba process, where we have exactly
that: A developer should have run full selftest before
submitting the change for review. Then after two samba
team developers have given their review+ (counting the
author), it can be pushed to our automatism that keeps
rebasing on current upstream and running selftest until
either selftest succeeds and is pushed as a fast forward
or selftest fails.

The reality is that people are lazy and think they know
when they can skip selftest. But people are deceived and
overlook problems.  Hence either reviewers run into failures
or the automatic pre-push selftest fails. The problem
I see with this is that it wastes the precious time of
the reviewers.

When I started contributing to Gluster, I found it to
be a big, big plus to have automatic regression runs
as a first step after submission, so that a reviewer
has the option to only start looking at the patch once
automatic tests have passed.

I completely agree that the fast-forward-only and
post-review-pre-merge-regression-run approach
is the way to go; only this way can the original problem
described by Talur be avoided.
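As a concrete illustration of why fast-forward-only helps: the moment another patch lands first, a pending patch can no longer be merged as-is and must be rebased (and therefore re-tested) against the new HEAD. A throwaway-repo sketch with plain git (repo layout, branch and commit names are all made up for illustration):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
git config user.email dev@example.com
git config user.name dev
git commit -q --allow-empty -m "base"
git checkout -q -b patchA                  # PatchA: regression-tested against "base"
git commit -q --allow-empty -m "patchA"
git checkout -q -                          # back to the main branch
git commit -q --allow-empty -m "patchB"    # PatchB gets merged first
# With fast-forward-only submission, PatchA is now refused until rebased:
if git merge --ff-only patchA >/dev/null 2>&1; then
    result="merged untested combination"
else
    result="rebase and re-test required"
fi
echo "$result"
```

The refusal is exactly the point: the rebased PatchA must pass regression against the tree that actually contains PatchB before it can go in.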

But would it be possible to keep and even require some
amount of automatic pre-review testing (build and at
least some amount of runtime tests)?
It really prevents wasting the time of reviewers/maintainers.

The problem with this is of course that it can increase
the (real) time needed to complete a review from submission
until upstream merge.

Just a few thoughts...

Cheers - Michael



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] NetBSD hang in quota-anon-fd-nfs.t

2016-01-11 Thread Emmanuel Dreyfus
On Mon, Jan 11, 2016 at 11:51:25AM +0530, Vijaikumar Mallikarjuna wrote:
> All quota test-cases uses 'tests/basic/quota.c' to write data
> Does sync flags have any impact?

It seems to change internal behavior but not the result, I can still
see write calls taking e.g.: 1169s

For the sake of completeness: instead of waiting for a page lock
(probably held by the NFS subsystem), it now waits for NFS RPC replies
from the server. Example kernel backtrace:
sleepq_block
cv_timedwait
nfs_rcvlock
nfs_request
nfs_writerpc
nfs_doio
VOP_STRATEGY
genfs_do_io
genfs_gop_write
genfs_do_putpages
genfs_putpages
VOP_PUTPAGES
nfs_write
VOP_WRITE
vn_write
dofilewrite
sys_write
syscall

I note we mount with -o noac,soft,nolock,vers=3,
which the scripts turn into -o tcp,-R=2,soft,nfs3 for NetBSD.
-R is the retry count. There is no timeout. Do we need one?

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client

2016-01-11 Thread Soumya Koduri
I have made changes to fix the lookup leak in a different way (as 
discussed with Pranith) and uploaded them in the latest patch set #4

- http://review.gluster.org/#/c/13096/

Please check if it resolves the mem leak and hopefully doesn't result in 
any assertion :)


Thanks,
Soumya

On 01/08/2016 05:04 PM, Soumya Koduri wrote:

I could reproduce it while testing deep directories within the mount
point. I root-caused the issue and had a discussion with Pranith to
understand the purpose and the recommended way of taking nlookup on
inodes.

I shall make changes to my existing fix and post the patch soon.
Thanks for your patience!

-Soumya

On 01/07/2016 07:34 PM, Oleksandr Natalenko wrote:

OK, I've patched GlusterFS v3.7.6 with 43570a01 and 5cffb56b (the most
recent revisions) and NFS-Ganesha v2.3.0 with 8685abfc (most recent
revision too).

On traversing a GlusterFS volume with many files in one folder via NFS
mount I get an assertion:

===
ganesha.nfsd: inode.c:716: __inode_forget: Assertion `inode->nlookup >=
nlookup' failed.
===

I used GDB on NFS-Ganesha process to get appropriate stacktraces:

1. short stacktrace of failed thread:

https://gist.github.com/7f63bb99c530d26ded18

2. full stacktrace of failed thread:

https://gist.github.com/d9bc7bc8f6a0bbff9e86

3. short stacktrace of all threads:

https://gist.github.com/f31da7725306854c719f

4. full stacktrace of all threads:

https://gist.github.com/65cbc562b01211ea5612

GlusterFS volume configuration:

https://gist.github.com/30f0129d16e25d4a5a52

ganesha.conf:

https://gist.github.com/9b5e59b8d6d8cb84c85d

How I mount NFS share:

===
mount -t nfs4 127.0.0.1:/mail_boxes /mnt/tmp -o
defaults,_netdev,minorversion=2,noac,noacl,lookupcache=none,timeo=100
===

On Thursday, 7 January 2016 at 12:06:42 EET, Soumya Koduri wrote:

Entries_HWMark = 500;




___
Gluster-users mailing list
gluster-us...@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-devel] Gerrit review, submit type and Jenkins testing

2016-01-11 Thread Raghavendra Talur
On Jan 12, 2016 2:50 AM, "Niels de Vos"  wrote:
>
> On Fri, Jan 08, 2016 at 12:46:09PM +0530, Raghavendra Talur wrote:
> > On Fri, Jan 8, 2016 at 12:28 PM, Kaushal M  wrote:
> >
> > > On Fri, Jan 8, 2016 at 12:10 PM, Kaushal M 
wrote:
> > > > On Fri, Jan 8, 2016 at 12:03 PM, Raghavendra Talur <
rta...@redhat.com>
> > > wrote:
> > > >> Top posting, this is a very old thread.
> > > >>
> > > >> Keeping in view the recent NetBSD problems and the number of bugs
> > > creeping
> > > >> in, I suggest we do these things right now:
> > > >>
> > > >> a. Change the gerrit merge type to fast forward only.
> > > >> As explained below in the thread, with our current setup even if
both
> > > PatchA
> > > >> and PatchB pass regression separately when both are merged it is
> > > possible
> > > >> that a functional bug creeps in.
> > > >> This is the only solution to prevent that from happening.
> > > >> I will work with Kaushal to get this done.
> > > >>
> > > >> b. In Jenkins, remove gerrit trigger and make it a manual operation
> > > >
> > > > Making it manual might be too much work for maintainers. I suggest
(as
> > > > I've suggested before) we make regressions trigger when a change has
> > > > been reviewed +2 by a maintainer.
> > > >
> > >
> >
> > Makes sense. I have disabled it completely for now; let's keep it that
> > way till developers realize it (a day should be enough). We will change
> > this trigger to fire on Code Review +2 by tomorrow.
>
> Ah! And I have been wondering why patches don't get verified
> anymore :-/
>
> An email to the maintainers list as a heads up would have been welcome.
>
> How would we handle patches that get sent by maintainers? Most
> developers that do code reviews will only +1 those changes. Those will
> never get automatically regression tested then. I don't think a
> maintainer should +2 their own patch immediately either, that suggests
> no further reviews are needed.
>
> Niels

After realising this we configured Jenkins to be triggered either on code
review +2 or a verified +1. Even if it is the maintainer who sent the
patch, he/she can certainly give a +1 verified.

There seems to be some problem with both these types of events though. I
tried various combinations yesterday, yet the events don't reach Jenkins.
I am afraid we will have to go back to patch-set triggers until we update
our plugins.

Re: [Gluster-devel] Gerrit review, submit type and Jenkins testing

2016-01-11 Thread Emmanuel Dreyfus
Niels de Vos  wrote:

> How would we handle patches that get sent by maintainers? Most
> developers that do code reviews will only +1 those changes. Those will
> never get automatically regression tested then. I don't think a
> maintainer should +2 their own patch immediately either, that suggests
> no further reviews are needed.

Indeed it is a bit odd, but I just CR +2 my own changes...

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


[Gluster-devel] freebsd smoke failure

2016-01-11 Thread Atin Mukherjee
I've been observing FreeBSD smoke failures for all patches for the last
few days, with the following error:

mkdir: /usr/local/lib/python2.7/site-packages/gluster: Permission denied
mkdir: /usr/local/lib/python2.7/site-packages/gluster: Permission denied

Can anyone from the infra team help here?

~Atin


Re: [Gluster-devel] freebsd smoke failure

2016-01-11 Thread Jiffin Tony Thottan



On 12/01/16 11:19, Atin Mukherjee wrote:

I've been observing FreeBSD smoke failures for all patches for the last
few days, with the following error:

mkdir: /usr/local/lib/python2.7/site-packages/gluster: Permission denied
mkdir: /usr/local/lib/python2.7/site-packages/gluster: Permission denied

Can anyone from the infra team help here?

Niels sent a patch for the same: http://review.gluster.org/#/c/13208/
--
Jiffin



~Atin


Re: [Gluster-devel] Gerrit review, submit type and Jenkins testing

2016-01-11 Thread Raghavendra Talur
On Jan 12, 2016 3:44 AM, "Michael Adam"  wrote:
>
> On 2016-01-08 at 12:03 +0530, Raghavendra Talur wrote:
> > Top posting, this is a very old thread.
> >
> > Keeping in view the recent NetBSD problems and the number of bugs
creeping
> > in, I suggest we do these things right now:
> >
> > a. Change the gerrit merge type to fast forward only.
> > As explained below in the thread, with our current setup even if both
> > PatchA and PatchB pass regression separately when both are merged it is
> > possible that a functional bug creeps in.
> > This is the only solution to prevent that from happening.
> > I will work with Kaushal to get this done.
> >
> > b. In Jenkins, remove gerrit trigger and make it a manual operation
> >
> > Too many developers use the upstream infra as a test cluster and it is
> > *not*.
> > It is a verification mechanism for maintainers to ensure that the patch
> > does not cause regression.
> >
> > It is required that all developers run full regression on their machines
> > before asking for reviews.
>
> Hmm, I am not 100% sure I would underwrite that.
> I am coming from the Samba process, where we have exactly
> that: A developer should have run full selftest before
> submitting the change for review. Then after two samba
> team developers have given their review+ (counting the
> author), it can be pushed to our automatism that keeps
> rebasing on current upstream and running selftest until
> either selftest succeeds and is pushed as a fast forward
> or selftest fails.
>
> The reality is that people are lazy and think they know
> when they can skip selftest. But people are deceived and
> overlook problems.  Hence either reviewers run into failures
> or the automatic pre-push selftest fails. The problem
> I see with this is that it wastes the precious time of
> the reviewers.
>
> When I started contributing to Gluster, I found it to
> be a big, big plus to have automatic regression runs
> as a first step after submission, so that a reviewer
> has the option to only start looking at the patch once
> automatic tests have passed.
>
> I completely agree that the fast-forward-only and
> post-review-pre-merge-regression-run approach
> is the way to go, only this way the original problem
> described by Talur can be avoided.
>
> But would it be possible to keep and even require some
> amount of automatic pre-review test run (build and at
> least some amount of runtime tests)?
> It really prevents waste of time of reviewers/maintainers.
>
> The problem with this is of course that it can increase
> the (real) time needed to complete a review from submission
> until upstream merge.
>
> Just a few thoughts...
>
> Cheers - Michael
>

We had the same concern from many other maintainers. I guess it would be
better if tests run both before and after review. With these changes we
would at least have eliminated test runs for work-in-progress patches.

Re: [Gluster-devel] GFID to Path Conversion

2016-01-11 Thread Aravinda


regards
Aravinda

On 01/11/2016 08:00 PM, Shyam wrote:

On 01/06/2016 04:46 AM, Aravinda wrote:


regards
Aravinda

On 01/06/2016 02:49 AM, Shyam wrote:

On 12/09/2015 12:47 AM, Aravinda wrote:

Hi,

Sharing a draft design for GFID to Path conversion. (Directory GFID to
path is very easy in DHT v1; this design may not work in the case of
DHT 2.0.)


(current thought) DHT2 would extend to directories the manner in which
(name, pGFID) is stored for files. So reverse path walking would
leverage the same mechanism as explained below.

Of course, as this would involve MDS hopping, the intention would be
to *not* use this in IO critical paths, and rather use this in the
tool set that needs reverse path walks to provide information to 
admins.




Performance and Storage space impact yet to be analyzed.

Storing the required information
---
Metadata related to the parent GFID and basename will reside with the
file. The PGFID and a hash of the basename become part of the xattr key
name, and the basename is saved as the value.

 Xattr Key = meta.<PGFID>.<HASH(Basename)>
 Xattr Value = <Basename>


I would think we should keep the xattr name constant and specialize the
value, instead of encoding data in the xattr name itself. The issue, of
course, is that multiple xattr name:value pairs with a constant name are
not feasible, and this needs some thought.

If we use a single xattr for multiple values, then updating one
basename will require parsing the existing xattr value before the
update (in the case of hardlinks).

I wrote about other experiments done to update and read xattrs:
http://www.gluster.org/pipermail/gluster-devel/2015-December/047380.html


Agreed and understood. I am thinking more about how we will enumerate
all such xattrs when we only know the name prefix. We would presumably
do a listxattr in that case; would that be right?
To create or update the xattr, no search is required. For example,
create d1/d2/f1:


pgfid = get_gfid(d1/d2)
xattr_name = "meta." + pgfid + "." + HASH(f1)
value = "f1"
setxattr(d1/d2/f1, xattr_name, value)

In case of Rename(d1/d2/f1 => d1/d3/f3),
pgfid_old = get_gfid(d1/d2)
pgfid_new = get_gfid(d1/d3)
xattr_name_old = "meta." + pgfid_old + "." + HASH(f1)
xattr_name_new = "meta." + pgfid_new + "." + HASH(f3)
value_new = "f3"
removexattr(d1/d2/f1, xattr_name_old)
setxattr(d1/d3/f3, xattr_name_new, value_new)

Populate xattrs example, 
https://gist.github.com/aravindavk/5307489f68cbcfb37d3d


Each xattr can be handled independently (thread safe), since an xattr
key/value does not depend on the other basenames.


To read the xattrs and convert them to paths (Python example:
https://gist.github.com/aravindavk/d1d0ca9c874b7d3d8d86):


paths = []
all_xattrs = listxattr(PATH)
for xattr_name in all_xattrs:
    if xattr_name.startswith("meta."):
        paths.append(getxattr(PATH, xattr_name))
print(paths)
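To sanity-check the recording and read-back scheme end to end, here is a self-contained simulation. The xattr store is a plain dict standing in for the backend, and zlib.crc32 merely stands in for the (unspecified) non-crypto hash; all GFIDs and names here are invented:

```python
import zlib

def xattr_name(pgfid, basename):
    # Key = "meta.<PGFID>.<HASH(Basename)>"; the value is the basename itself
    return "meta.%s.%08x" % (pgfid, zlib.crc32(basename.encode()) & 0xFFFFFFFF)

def record_link(xattrs, pgfid, basename):
    # MKNOD/CREATE/LINK/SYMLINK: add a new xattr(PGFID, BN)
    xattrs[xattr_name(pgfid, basename)] = basename

def record_rename(xattrs, old_pgfid, old_bn, new_pgfid, new_bn):
    # RENAME: remove old xattr(PGFID1, BN1), add new xattr(PGFID2, BN2)
    del xattrs[xattr_name(old_pgfid, old_bn)]
    xattrs[xattr_name(new_pgfid, new_bn)] = new_bn

def basenames(xattrs):
    # Read-back: every "meta." key yields one (parent GFID, basename) pair
    out = []
    for key, value in xattrs.items():
        if key.startswith("meta."):
            out.append((key.split(".")[1], value))
    return out

# Create d1/d2/f1, then rename it to d1/d3/f3 (GFIDs are made up):
xattrs = {}
record_link(xattrs, "pgfid-d2", "f1")
record_rename(xattrs, "pgfid-d2", "f1", "pgfid-d3", "f3")
print(basenames(xattrs))   # -> [('pgfid-d3', 'f3')]
```

Note that each hardlink adds one more xattr, matching "Number of xattrs on a file = number of links" below.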





A non-crypto hash is suitable for this purpose.
Number of Xattrs on a file = Number of Links

Converting GFID to Path
---
Example GFID: 78e8bce0-a8c9-4e67-9ffb-c4c4c7eff038


Here is where we get into a bit of a problem, if a file has links.
Which path to follow would be a dilemma. We could return all paths,
but tools like glusterfind or backup related, would prefer a single
file. One of the thoughts is, if we could feed a pGFID:GFID pair as
input, this still does not solve a file having links within the same
pGFID.

Anyway, something to note or consider.



1. List all xattrs of the GFID file in the brick backend
($BRICK_ROOT/.glusterfs/78/e8/78e8bce0-a8c9-4e67-9ffb-c4c4c7eff038).
2. If the xattr key starts with “meta”, split it to get the parent GFID
and collect the xattr value.
3. Convert the parent GFID to a path using recursive readlink until the
full path is obtained.

This is the part which should/would change with DHT2 in my opinion.
Sort of repeating step (2) here instead of a readlink.


4. Join the converted parent directory path and the xattr value (basename).
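Steps 3 and 4 can be sketched as below. The parents map is only a stand-in for resolving a directory GFID's parent link in the backend (the recursive readlink of step 3); all GFIDs are invented:

```python
ROOT = "00000000-0000-0000-0000-000000000001"

# Stand-in for the backend: directory GFID -> (its parent's GFID, its basename)
parents = {
    "gfid-d1": (ROOT, "d1"),
    "gfid-d2": ("gfid-d1", "d2"),
}

def gfid_to_dir_path(gfid):
    # Step 3: walk parent links until the volume root is reached
    parts = []
    while gfid != ROOT:
        gfid, name = parents[gfid]
        parts.append(name)
    return "/" + "/".join(reversed(parts))

def gfid_to_path(pgfid, basename):
    # Step 4: join the converted parent path with the recorded basename
    return gfid_to_dir_path(pgfid).rstrip("/") + "/" + basename

print(gfid_to_path("gfid-d2", "f1"))   # -> /d1/d2/f1
```

With hardlinks, step 1 simply yields several (pgfid, basename) pairs, and this walk runs once per pair.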

Recording
-
MKNOD/CREATE/LINK/SYMLINK: Add new Xattr(PGFID, BN)


Most of these operations as they exist today are not atomic, i.e. we
create the file, then add the xattrs, and then possibly hardlink the
GFID, so by the time the GFID makes its appearance the file is fully
prepared and (maybe) hence consistent.

The other way to look at this is that we get the GFID representation
ready and then hard link the name into the name tree. Alternately we
could leverage O_TMPFILE to create the file, encode all its inode
information, and then bring it to life in the namespace. This is
orthogonal to this design, but it brings in the need to be consistent on
failures.

Either way, if a failure occurs midway, we have no way to recover the
information for the inode and set it right. Thoughts?


RENAME: Remove old xattr(PGFID1, BN1), Add new xattr(PGFID2, BN2)
UNLINK: If Link count > 1 then Remove xattr(PGFID, BN)

Heal on Lookup
--
Healing on lookup can be enabled if required, by default we can
disable this option since this may have performance implications
during read.

Enabling the logging

Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client

2016-01-11 Thread Soumya Koduri



On 01/11/2016 05:11 PM, Oleksandr Natalenko wrote:

Brief test shows that Ganesha stopped leaking and crashing, so it seems
to be good for me.


Thanks for checking.


Nevertheless, back to my original question: what about FUSE client? It
is still leaking despite all the fixes applied. Should it be considered
another issue?


For the FUSE client, I tried vfs drop_caches as suggested by Vijay in
an earlier mail. Though all the inodes get purged, I still don't see
much of a drop in the memory footprint. I need to investigate what else
is consuming so much memory here.


Thanks,
Soumya



11.01.2016 12:26, Soumya Koduri wrote:

I have made changes to fix the lookup leak in a different way (as
discussed with Pranith) and uploaded them in the latest patch set #4
- http://review.gluster.org/#/c/13096/

Please check if it resolves the mem leak and hopefully doesn't result
in any assertion :)

Thanks,
Soumya

On 01/08/2016 05:04 PM, Soumya Koduri wrote:

I could reproduce it while testing deep directories within the mount
point. I root-caused the issue and had a discussion with Pranith to
understand the purpose and the recommended way of taking nlookup on inodes.

I shall make changes to my existing fix and post the patch soon.
Thanks for your patience!

-Soumya

On 01/07/2016 07:34 PM, Oleksandr Natalenko wrote:

OK, I've patched GlusterFS v3.7.6 with 43570a01 and 5cffb56b (the most
recent revisions) and NFS-Ganesha v2.3.0 with 8685abfc (most recent
revision too).

On traversing a GlusterFS volume with many files in one folder via NFS
mount I get an assertion:

===
ganesha.nfsd: inode.c:716: __inode_forget: Assertion `inode->nlookup >=
nlookup' failed.
===

I used GDB on NFS-Ganesha process to get appropriate stacktraces:

1. short stacktrace of failed thread:

https://gist.github.com/7f63bb99c530d26ded18

2. full stacktrace of failed thread:

https://gist.github.com/d9bc7bc8f6a0bbff9e86

3. short stacktrace of all threads:

https://gist.github.com/f31da7725306854c719f

4. full stacktrace of all threads:

https://gist.github.com/65cbc562b01211ea5612

GlusterFS volume configuration:

https://gist.github.com/30f0129d16e25d4a5a52

ganesha.conf:

https://gist.github.com/9b5e59b8d6d8cb84c85d

How I mount NFS share:

===
mount -t nfs4 127.0.0.1:/mail_boxes /mnt/tmp -o
defaults,_netdev,minorversion=2,noac,noacl,lookupcache=none,timeo=100
===

On Thursday, 7 January 2016 at 12:06:42 EET, Soumya Koduri wrote:

Entries_HWMark = 500;





Re: [Gluster-devel] Gerrit review, submit type and Jenkins testing

2016-01-11 Thread Raghavendra Gowdappa


- Original Message -
> From: "Raghavendra Talur" 
> To: "Michael Adam" 
> Cc: "Gluster Devel" 
> Sent: Tuesday, January 12, 2016 8:46:05 AM
> Subject: Re: [Gluster-devel] Gerrit review, submit type and Jenkins testing
> 
> 
> 
> 
> On Jan 12, 2016 3:44 AM, "Michael Adam" < ob...@samba.org > wrote:
> > 
> > On 2016-01-08 at 12:03 +0530, Raghavendra Talur wrote:
> > > Top posting, this is a very old thread.
> > > 
> > > Keeping in view the recent NetBSD problems and the number of bugs
> > > creeping
> > > in, I suggest we do these things right now:
> > > 
> > > a. Change the gerrit merge type to fast forward only.
> > > As explained below in the thread, with our current setup even if both
> > > PatchA and PatchB pass regression separately when both are merged it is
> > > possible that a functional bug creeps in.
> > > This is the only solution to prevent that from happening.
> > > I will work with Kaushal to get this done.
> > > 
> > > b. In Jenkins, remove gerrit trigger and make it a manual operation
> > > 
> > > Too many developers use the upstream infra as a test cluster and it is
> > > *not*.
> > > It is a verification mechanism for maintainers to ensure that the patch
> > > does not cause regression.
> > > 
> > > It is required that all developers run full regression on their machines
> > > before asking for reviews.
> > 
> > Hmm, I am not 100% sure I would underwrite that.
> > I am coming from the Samba process, where we have exactly
> > that: A developer should have run full selftest before
> > submitting the change for review. Then after two samba
> > team developers have given their review+ (counting the
> > author), it can be pushed to our automatism that keeps
> > rebasing on current upstream and running selftest until
> > either selftest succeeds and is pushed as a fast forward
> > or selftest fails.
> > 
> > The reality is that people are lazy and think they know
> > when they can skip selftest. But people are deceived and
> > overlook problems. Hence either reviewers run into failures
> > or the automatic pre-push selftest fails. The problem
> > I see with this is that it wastes the precious time of
> > the reviewers.
> > 
> > When I started contributing to Gluster, I found it to
> > be a big, big plus to have automatic regression runs
> > as a first step after submission, so that a reviewer
> > has the option to only start looking at the patch once
> > automatic tests have passed.
> > 
> > I completely agree that the fast-forward-only and
> > post-review-pre-merge-regression-run approach
> > is the way to go, only this way the original problem
> > described by Talur can be avoided.
> > 
> > But would it be possible to keep and even require some
> > amount of automatic pre-review test run (build and at
> > least some amount of runtime tests)?
> > It really prevents waste of time of reviewers/maintainers.
> > 
> > The problem with this is of course that it can increase
> > the (real) time needed to complete a review from submission
> > until upstream merge.
> > 
> > Just a few thoughts...
> > 
> > Cheers - Michael
> > 
> 
> We had the same concern from many other maintainers. I guess it would be
> better if tests run both before and after review. With these changes we
> would at least have eliminated test runs for work-in-progress patches.

Yes, though I think it would be better for one set of regressions to run
before a +1. It normally takes a couple of iterations to +1 a patch, so
running regressions only after review would unnecessarily serialize
regression runs and review. If regressions run in parallel with review,
developers can work on regression failures while others do the review.

> 


Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client

2016-01-11 Thread Oleksandr Natalenko
Just in case, here is Valgrind output on FUSE client with 3.7.6 + 
API-related patches we discussed before:


https://gist.github.com/cd6605ca19734c1496a4

12.01.2016 08:24, Soumya Koduri wrote:

For the FUSE client, I tried vfs drop_caches as suggested by Vijay in an
earlier mail. Though all the inodes get purged, I still don't see much
of a drop in the memory footprint. I need to investigate what else is
consuming so much memory here.


Re: [Gluster-devel] Gerrit review, submit type and Jenkins testing

2016-01-11 Thread Niels de Vos
On Fri, Jan 08, 2016 at 12:46:09PM +0530, Raghavendra Talur wrote:
> On Fri, Jan 8, 2016 at 12:28 PM, Kaushal M  wrote:
> 
> > On Fri, Jan 8, 2016 at 12:10 PM, Kaushal M  wrote:
> > > On Fri, Jan 8, 2016 at 12:03 PM, Raghavendra Talur 
> > wrote:
> > >> Top posting, this is a very old thread.
> > >>
> > >> Keeping in view the recent NetBSD problems and the number of bugs
> > creeping
> > >> in, I suggest we do these things right now:
> > >>
> > >> a. Change the gerrit merge type to fast forward only.
> > >> As explained below in the thread, with our current setup even if both
> > PatchA
> > >> and PatchB pass regression separately when both are merged it is
> > possible
> > >> that a functional bug creeps in.
> > >> This is the only solution to prevent that from happening.
> > >> I will work with Kaushal to get this done.
> > >>
> > >> b. In Jenkins, remove gerrit trigger and make it a manual operation
> > >
> > > Making it manual might be too much work for maintainers. I suggest (as
> > > I've suggested before) we make regressions trigger when a change has
> > > been reviewed +2 by a maintainer.
> > >
> >
> 
> Makes sense. I have disabled it completely for now; let's keep it that
> way till developers realize it (a day should be enough). We will change
> this trigger to fire on Code Review +2 by tomorrow.

Ah! And I have been wondering why patches don't get verified
anymore :-/

An email to the maintainers list as a heads up would have been welcome.

How would we handle patches that get sent by maintainers? Most
developers that do code reviews will only +1 those changes. Those will
never get automatically regression tested then. I don't think a
maintainer should +2 their own patch immediately either, that suggests
no further reviews are needed.

Niels



[Gluster-devel] 3.8 Plan changes - proposal

2016-01-11 Thread Vijay Bellur

Hi All,

We discussed the following proposal for 3.8 in the maintainers mailing 
list and there was general consensus about the changes being a step in 
the right direction. I would like to hear your thoughts as well.


Changes to 3.8 Plan:


1. Include 4.0 features such as NSR, dht2, glusterd2.0 & eventing as 
experimental features in 3.8. 4.0 features are shaping up reasonably 
well and it would be nice to have them packaged in 3.8 so that we can 
get more meaningful feedback early on. As 4.0 features mature, we can 
push them out through subsequent 3.8.x releases to derive iterative 
feedback.


2. Ensure that most of our components have tests in distaf by 3.8. I 
would like us to have more deterministic pre-release testing for 3.8 and 
having tests in distaf should help us in accomplishing that goal.


3. Add "forward compatibility" section to all feature pages proposed for 
3.8 so that we carefully review the impact of a feature on all upcoming 
Gluster.next features.


4. Have Niels de Vos as the maintainer for 3.8 with immediate effect. 
This is a change from the past where we have had release maintainers 
after a .0 release is in place. I think Niels' diligence as a release 
manager will help us in having a more polished .0 release.


5. Move out 3.8 by 2-3 months (end of May or early June 2016) to 
accomplish these changes.


Appreciate your feedback about this proposal!

Thanks,
Vijay


[Gluster-devel] Force lookup if inode ctx is not present for master xlators.

2016-01-11 Thread Mohammed Rafi K C
Hi All,

We are facing an issue where the inode_ctx is missing for some xlators
even though the inode was already linked into the inode table. This can
happen if the inode was linked by an xlator loaded below the master
xlator. A similar problem was reported in bug 1297311. To solve this
problem, we are thinking of forcing a lookup if the inode_ctx is not
present in the master xlator. We assume that after the first successful
lookup, every linked inode should have an inode_ctx in every xlator that
stores one; i.e., as part of resolving an entry, if the inode_ctx is not
present for the master xlator, we plan to send a lookup.
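A toy model of the proposed behavior may make this clearer. The real code paths live in C and use inode_ctx_get/inode_ctx_put; this only illustrates the decision being proposed, and all names (the "fuse" xlator, the GFID, the ctx payload) are made up:

```python
class Inode:
    def __init__(self, gfid):
        self.gfid = gfid
        self.ctx = {}          # per-xlator context, keyed by xlator name

lookups = []                   # record of forced lookups, for illustration

def lookup(inode, xlator):
    # A successful lookup gives the xlator a chance to set its ctx
    lookups.append(inode.gfid)
    inode.ctx[xlator] = {"type": "resolved"}

def resolve_entry(inode, master_xlator):
    # Proposed fix: if the master xlator has no ctx for this inode
    # (e.g. it was linked by a lower xlator), force a lookup first.
    if master_xlator not in inode.ctx:
        lookup(inode, master_xlator)
    return inode.ctx[master_xlator]

inode = Inode("1234")          # linked below the master, so no master ctx yet
resolve_entry(inode, "fuse")   # ctx missing: lookup is forced
resolve_entry(inode, "fuse")   # ctx present now: no second lookup
print(lookups)                 # -> ['1234']
```

The key property being proposed is the second call: once the forced lookup has populated the ctx, subsequent resolutions take the fast path.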

Your suggestions and comments are most welcome.

Regards
Rafi KC

 

Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client

2016-01-11 Thread Oleksandr Natalenko
Brief test shows that Ganesha stopped leaking and crashing, so it seems 
to be good for me.


Nevertheless, back to my original question: what about FUSE client? It 
is still leaking despite all the fixes applied. Should it be considered 
another issue?


11.01.2016 12:26, Soumya Koduri wrote:

I have made changes to fix the lookup leak in a different way (as
discussed with Pranith) and uploaded them in the latest patch set #4
- http://review.gluster.org/#/c/13096/

Please check if it resolves the mem leak and hopefully doesn't result
in any assertion :)

Thanks,
Soumya

On 01/08/2016 05:04 PM, Soumya Koduri wrote:

I could reproduce it while testing deep directories within the mount
point. I root-caused the issue and had a discussion with Pranith to
understand the purpose and the recommended way of taking nlookup on
inodes.


I shall make changes to my existing fix and post the patch soon.
Thanks for your patience!

-Soumya

On 01/07/2016 07:34 PM, Oleksandr Natalenko wrote:
OK, I've patched GlusterFS v3.7.6 with 43570a01 and 5cffb56b (the most
recent revisions) and NFS-Ganesha v2.3.0 with 8685abfc (most recent
revision too).

On traversing a GlusterFS volume with many files in one folder via NFS
mount I get an assertion:

===
ganesha.nfsd: inode.c:716: __inode_forget: Assertion `inode->nlookup >=
nlookup' failed.
===

I used GDB on NFS-Ganesha process to get appropriate stacktraces:

1. short stacktrace of failed thread:

https://gist.github.com/7f63bb99c530d26ded18

2. full stacktrace of failed thread:

https://gist.github.com/d9bc7bc8f6a0bbff9e86

3. short stacktrace of all threads:

https://gist.github.com/f31da7725306854c719f

4. full stacktrace of all threads:

https://gist.github.com/65cbc562b01211ea5612

GlusterFS volume configuration:

https://gist.github.com/30f0129d16e25d4a5a52

ganesha.conf:

https://gist.github.com/9b5e59b8d6d8cb84c85d

How I mount NFS share:

===
mount -t nfs4 127.0.0.1:/mail_boxes /mnt/tmp -o
defaults,_netdev,minorversion=2,noac,noacl,lookupcache=none,timeo=100
===

On Thursday, 7 January 2016 12:06:42 EET Soumya Koduri wrote:

Entries_HWMark = 500;




___
Gluster-users mailing list
gluster-us...@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-devel] GFID to Path Conversion

2016-01-11 Thread Shyam

On 01/06/2016 04:46 AM, Aravinda wrote:


regards
Aravinda

On 01/06/2016 02:49 AM, Shyam wrote:

On 12/09/2015 12:47 AM, Aravinda wrote:

Hi,

Sharing a draft design for GFID to Path conversion. (Directory GFID to
path is very easy in DHT v1; this design may not work in the case of
DHT 2.0.)


(current thought) DHT2 would extend the manner in which name,pGFID is
stored for files, for directories. So reverse path walking would
leverage the same mechanism as explained below.

Of course, as this would involve MDS hopping, the intention would be
to *not* use this in IO critical paths, and rather use this in the
tool set that needs reverse path walks to provide information to admins.



Performance and Storage space impact yet to be analyzed.

Storing the required information
---
Metadata information related to Parent GFID and Basename will reside
with the file. PGFID and hash of Basename will become part of Xattr
Key name and Basename will be saved as Value.

 Xattr Key = meta..
 Xattr Value = 
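
The angle-bracketed parts of the key/value layout above were lost in the
archive, but the scheme can be sketched roughly as follows (the
"trusted.meta" key prefix and CRC32 as the non-crypto hash are my
assumptions, not the draft's exact choices):

```python
import zlib  # CRC32 stands in for the draft's unspecified non-crypto hash

def record_entry_xattr(pgfid, basename):
    """Build the (key, value) pair set on the file's GFID handle,
    one pair per hardlink: the key carries PGFID + hash(basename),
    the value carries the basename itself."""
    bn_hash = format(zlib.crc32(basename.encode()) & 0xFFFFFFFF, "08x")
    key = "trusted.meta.%s.%s" % (pgfid, bn_hash)  # assumed key shape
    return key, basename
```

Since the hash only appears in the key, looking up a record for a known
(pgfid, basename) pair needs no parsing of other records.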


I would think we should keep the xattr name constant, and specialize
the value, instead of encoding data in the xattr value itself. The
issue is of course multiple xattr name:value pairs where name is
constant is not feasible and needs some thought.

If we use a single xattr for multiple values, then updating one
basename would have to parse the existing xattr before the update (in
the case of hardlinks). I wrote about other experiments done to update
and read xattrs here:
http://www.gluster.org/pipermail/gluster-devel/2015-December/047380.html


Agreed and understood; I am more thinking about how we will enumerate
all such xattrs when we just know the name. We would possibly do a
listxattr in that case, would that be right?






Non-crypto hash is suitable for this purpose.
Number of Xattrs on a file = Number of Links

Converting GFID to Path
---
Example GFID: 78e8bce0-a8c9-4e67-9ffb-c4c4c7eff038


Here is where we get into a bit of a problem, if a file has links.
Which path to follow would be a dilemma. We could return all paths,
but tools like glusterfind or backup related, would prefer a single
file. One of the thoughts is, if we could feed a pGFID:GFID pair as
input, this still does not solve a file having links within the same
pGFID.

Anyway, something to note or consider.



1. List all xattrs of GFID file in the brick backend.
($BRICK_ROOT/.glusterfs/78/e8/78e8bce0-a8c9-4e67-9ffb-c4c4c7eff038)
2. If Xattr Key starts with “meta”, Split to get parent GFID and collect
xattr value
3. Convert the parent GFID to a path using recursive readlink until the full path is built.


This is the part which should/would change with DHT2 in my opinion.
Sort of repeating step (2) here instead of a readlink.


4. Join Converted parent dir path and xattr value(basename)
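
Steps 1-4 can be sketched as a small recursive walk. Here the backend
xattr listing is replaced by an injected lookup function, and the
"meta.<pgfid>.<hash>" key shape is an assumption since the exact format
is elided above:

```python
ROOT_GFID = "00000000-0000-0000-0000-000000000001"

def gfid_to_path(gfid, xattrs_of):
    """Resolve one path for `gfid` by recursively resolving parent GFIDs."""
    if gfid == ROOT_GFID:
        return ""
    for key, basename in sorted(xattrs_of(gfid).items()):
        if key.startswith("meta."):               # step 2: filter meta xattrs
            pgfid = key.split(".")[1]             # parent GFID from the key
            parent = gfid_to_path(pgfid, xattrs_of)  # step 3: recurse upward
            return parent + "/" + basename        # step 4: join
    raise LookupError("no parent record for %s" % gfid)

# Toy backend: /dir1 (gfid "d1") contains file "a.txt" (gfid "f1").
store = {
    "d1": {"meta.%s.h1" % ROOT_GFID: "dir1"},
    "f1": {"meta.d1.h2": "a.txt"},
}
print(gfid_to_path("f1", store.get))  # -> /dir1/a.txt
```

The real implementation would list xattrs on the
`.glusterfs/<gg>/<ff>/<gfid>` handle and readlink parent GFID symlinks,
but the control flow is the same.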

Recording
-
MKNOD/CREATE/LINK/SYMLINK: Add new Xattr(PGFID, BN)


Most of these operations as they exist today are not atomic, i.e. we
create the file and then add the xattrs and then possibly hardlink the
GFID, so by the time the GFID makes its presence, the file is all
ready and (maybe) hence consistent.

The other way to look at this is that we get the GFID representation
ready, and then hard link the name into the name tree. Alternately we
could leverage O_TMPFILE to create the file, encode all its inode
information, and then bring it to life in the namespace. This is
orthogonal to this design, but brings in a need to be consistent on
failures.

Either way, if a failure occurs midway, we have no way to recover the
information for the inode and set it right. Thoughts?


RENAME: Remove old xattr(PGFID1, BN1), Add new xattr(PGFID2, BN2)
UNLINK: If Link count > 1 then Remove xattr(PGFID, BN)
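
The recording rules above can be sketched with an in-memory dict
standing in for real xattr calls (the key shape and the toy hash are
assumptions, as before):

```python
def xkey(pgfid, bn):
    # toy non-crypto hash of the basename; the real scheme is hinted above
    return "meta.%s.%04x" % (pgfid, hash(bn) & 0xFFFF)

def on_link(xattrs, pgfid, bn):
    """MKNOD/CREATE/LINK/SYMLINK: add a new (PGFID, BN) record."""
    xattrs[xkey(pgfid, bn)] = bn

def on_rename(xattrs, pgfid1, bn1, pgfid2, bn2):
    """RENAME: remove the old record, add the new one."""
    xattrs.pop(xkey(pgfid1, bn1), None)
    on_link(xattrs, pgfid2, bn2)

def on_unlink(xattrs, pgfid, bn, nlink):
    """UNLINK: only remove the record while other links remain;
    the last unlink drops the inode (and its xattrs) anyway."""
    if nlink > 1:
        xattrs.pop(xkey(pgfid, bn), None)
```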

Heal on Lookup
--
Healing on lookup can be enabled if required, by default we can
disable this option since this may have performance implications
during read.

Enabling the logging
-
This can be enabled using Volume set option. Option name TBD.

Rebuild Index
-
Offline activity, crawls the backend filesystem and builds all the
required xattrs.


Frequency of the rebuild? I would assume this would be run when the
option is enabled, and later almost never, unless we want to recover
from some inconsistency in the data (how to detect the same would be
an open question).

Also I think once this option is enabled, we should prevent disabling
the same (or at least till the packages are downgraded), as this would
be a hinge that multiple other features may depend on, and so we
consider this an on-disk change that is made once, and later
maintained for the volume, rather than turn on/off.

Which means the initial index rebuild would be a volume version
conversion from current to this representation and may need additional
thoughts on how we maintain volume versions.



Comments and Suggestions Welcome.

regards
Aravinda


Re: [Gluster-devel] GFID to Path Conversion

2016-01-11 Thread Shyam

On 01/05/2016 08:24 PM, Venky Shankar wrote:



Shyam wrote:

On 12/09/2015 12:47 AM, Aravinda wrote:

Hi,

Sharing a draft design for GFID to Path conversion. (Directory GFID to
path is very easy in DHT v1; this design may not work in the case of
DHT 2.0.)


(current thought) DHT2 would extend the manner in which name,pGFID is
stored for files, for directories. So reverse path walking would
leverage the same mechanism as explained below.

Of course, as this would involve MDS hopping, the intention would be to
*not* use this in IO critical paths, and rather use this in the tool set
that needs reverse path walks to provide information to admins.



Performance and Storage space impact yet to be analyzed.

Storing the required information
---
Metadata information related to Parent GFID and Basename will reside
with the file. PGFID and hash of Basename will become part of Xattr
Key name and Basename will be saved as Value.

Xattr Key = meta..
Xattr Value = 


I would think we should keep the xattr name constant, and specialize the
value, instead of encoding data in the xattr value itself. The issue is
of course multiple xattr name:value pairs where name is constant is not
feasible and needs some thought.


With DHT2, the "multi-value key" could possibly be stored efficiently in
some kvdb rather than xattrs (when it does move there). With current
DHT, we're still stuck with using xattrs, where having a compounded
value would rather be inefficient.


With DHT2, if we get to the point of the on-disk xlator controlling the
d_off, then we could/may need that information as well for optimization
purposes (just stating).








Non-crypto hash is suitable for this purpose.
Number of Xattrs on a file = Number of Links

Converting GFID to Path
---
Example GFID: 78e8bce0-a8c9-4e67-9ffb-c4c4c7eff038


Here is where we get into a bit of a problem, if a file has links. Which
path to follow would be a dilemma. We could return all paths, but tools
like glusterfind or backup related, would prefer a single file. One of
the thoughts is, if we could feed a pGFID:GFID pair as input, this still
does not solve a file having links within the same pGFID.


Why not just list all possible paths? I think that might be the correct
thing to do. In most cases, this would just be dealing with a single
link count. For other cases (nlink > 1), the higher level code would
need to do some sort of juggling - utilities such as glusterfind (or
other backup tools) could possibly perform additional checks before
doing their job, but in most cases, they would be dealing with a single
link count.
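
The "list all paths" behaviour could be sketched as a variant that
returns every recorded parent/basename pair instead of stopping at the
first (key prefix and layout assumed, as above; parent resolution is
injected to keep the sketch self-contained):

```python
def gfid_to_paths(gfid, xattrs_of, resolve_parent):
    """Return every recorded path for `gfid`, one per hardlink."""
    paths = []
    for key, basename in sorted(xattrs_of(gfid).items()):
        if key.startswith("meta."):
            pgfid = key.split(".")[1]
            paths.append(resolve_parent(pgfid) + "/" + basename)
    return paths

# Two hardlinks to the same inode under /dir1:
links = {"f1": {"meta.d1.h1": "a.txt", "meta.d1.h2": "b.txt"}}
print(gfid_to_paths("f1", links.get, lambda pgfid: "/dir1"))
# -> ['/dir1/a.txt', '/dir1/b.txt']
```

A caller with nlink == 1 gets a single-element list; callers like backup
tools would need to decide how to handle the multi-path case.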


Hmmm... that is one way, for sure. But let's say it is a backup
application that needs the files that changed; in the above example we
would list a file with hardlinks twice. Hence the thought about what we
could do here.



Maybe we could pass the  back as the handle, where the 
index is a number uniquely identifying the base name in the inode KV 
list. This way at least change log and others based on the handle can 
arrive at which bname/index was being operated upon etc.







Anyway, something to note or consider.



1. List all xattrs of GFID file in the brick backend.
($BRICK_ROOT/.glusterfs/78/e8/78e8bce0-a8c9-4e67-9ffb-c4c4c7eff038)
2. If Xattr Key starts with “meta”, Split to get parent GFID and collect
xattr value
3. Convert the parent GFID to a path using recursive readlink until the full path is built.


This is the part which should/would change with DHT2 in my opinion. Sort
of repeating step (2) here instead of a readlink.


4. Join Converted parent dir path and xattr value(basename)

Recording
-
MKNOD/CREATE/LINK/SYMLINK: Add new Xattr(PGFID, BN)


Most of these operations as they exist today are not atomic, i.e. we
create the file and then add the xattrs and then possibly hardlink the
GFID, so by the time the GFID makes its presence, the file is all ready
and (maybe) hence consistent.

The other way to look at this is that we get the GFID representation
ready, and then hard link the name into the name tree. Alternately we
could leverage O_TMPFILE to create the file, encode all its inode
information, and then bring it to life in the namespace. This is
orthogonal to this design, but brings in a need to be consistent on
failures.


IIRC, last time I checked, using O_TMPFILE was not portable, but this
can still be used wherever it's available.



Either way, if a failure occurs midway, we have no way to recover the
information for the inode and set it right. Thoughts?


RENAME: Remove old xattr(PGFID1, BN1), Add new xattr(PGFID2, BN2)
UNLINK: If Link count > 1 then Remove xattr(PGFID, BN)

Heal on Lookup
--
Healing on lookup can be enabled if required, by default we can
disable this option since this may have performance implications
during read.

Enabling the logging
-
This can be enabled using Volume set option. Option name TBD.

Rebuild Index
-
Offline activity, crawls the