Re: [Gluster-devel] [Gluster-users] Trashcan issue with vim editor

2016-01-28 Thread Anoop C S
On Wed, 2016-01-27 at 15:25 +0530, PankaJ Singh wrote:
> 
> Hi,
> 
> We are using gluster 3.7.6 on ubuntu 14.04. We are facing an issue
> with trashcan feature.
> Our scenario is as follow:
> 
> 1. 2 node server (ubuntu 14.04 with glusterfs 3.7.6)
> 2. 1 client node (ubuntu 14.04)
> 3. I have created one volume vol1 with 2 bricks in replica and with
> transport = tcp mode.
> 4. I have enabled quota on vol1
> 5. Now I have enabled trashcan feature on vol1 
> 6. Now I have mounted vol1 on client's home directory "mount -t
> glusterfs -o transport=tcp server-1:/vol1 /home/"
> 7. Now when I log in as any existing non-root user and perform
> any editing via the vim editor, I get this error: "E200: *ReadPre
> autocommands made the file unreadable", and my user's home
> directory permissions get changed to 000. After some time the
> permissions revert back automatically.
> 
> (NOTE: user's home directories are copied in mounted directory
> glusterfs volume vol1)
> 

As discussed over IRC, we will definitely look into this issue [1] and
get back ASAP. That said, I have some solid reasons for recommending
that you not use the swap/backup files created by Vim when trash is
enabled for a volume (assuming you have a basic vimrc config where
swap/backup files are enabled by default):

1. You will see a lot of foo.swpx/foo.swp files (with a timestamp
   appended to their filenames) inside the trashcan, since Vim creates
   and removes these swap files frequently.

2. Regarding backup files, you will notice a number of files named 4913
   inside .trashcan. Vim creates and deletes these files to make sure
   that it can create files in the current directory, and it does so
   every time you save with :w.

3. The same is true for undo files like .foo.un~.

4. Last but not least, every time you do a :w, Vim performs a truncate
   operation, which causes the previous version of the file to be moved
   to .trashcan.

Having said that, you can add the following lines to your vimrc file to
prevent the unnecessary files described in the first three points from
landing inside .trashcan:

set noundofile
set noswapfile
set nobackup
set nowritebackup

As per the current implementation, we cannot prevent previous versions
of a file from being created inside the trash directory. I think these
files can serve as backups in the future, which is a nice feature to have.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1302307

--Anoop C S

> 
> Thanks & Regards
> PankaJ Singh
> ___
> Gluster-users mailing list
> gluster-us...@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] 3.7.7. patch freeze

2016-01-28 Thread Pranith Kumar Karampuri
Unless the patches fix data loss or crashes, I will not take any more,
other than the ones which help make regressions consistent.


Final set:
http://review.gluster.org/#/c/12768/
http://review.gluster.org/#/c/13305/   << a user asked for this on gluster-users.

http://review.gluster.org/#/c/13119/
http://review.gluster.org/13292
http://review.gluster.org/13071
http://review.gluster.org/13312
http://review.gluster.org/#/c/13127/

Pranith

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Throttling xlator on the bricks

2016-01-28 Thread Shreyas Siravara
So the way our throttling works is (intentionally) very simplistic. 

(1) When someone mounts an NFS share, we tag the frame with a 32 bit hash of 
the export name they were authorized to mount.
(2) io-stats keeps track of the "current rate" of fops we're seeing for that 
particular mount, using a sampling of fops and a moving average over a short 
period of time.
(3) Based on whether the share violated its allowed rate (which is defined in a 
config file), we tag the FOP as "least-pri". Of course this makes the 
assumption that all NFS endpoints are receiving roughly the same # of FOPs. The 
rate defined in the config file is a *per* NFS endpoint number. So if your 
cluster has 10 NFS endpoints, and you've pre-computed that it can do roughly 
1000 FOPs per second, the rate in the config file would be 100.
(4) IO-Threads then shoves the FOP into the least-pri queue, rather than its 
default. The value is honored all the way down to the bricks.

The code is actually complete, and I'll put it up for review after we iron out 
a few minor issues.
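
For illustration, here is a minimal, self-contained sketch of the shape of
steps (2) and (3) above: keep a moving average of fops per second for a mount
and flag an op once the configured per-endpoint rate is exceeded. This is not
the actual io-stats/io-threads code; all names and the alpha value are
hypothetical.

#include <stdbool.h>
#include <time.h>

struct mount_rate {
        double ema_ops_per_sec;   /* exponentially weighted moving average */
        time_t last_sample;       /* when the average was last updated     */
        long   ops_since_sample;  /* fops counted since last_sample        */
};

/* Update the moving average roughly once a second and report whether this
 * mount is currently over its allowed per-endpoint rate. */
bool
over_allowed_rate (struct mount_rate *mr, double allowed_ops_per_sec)
{
        time_t now = time (NULL);

        mr->ops_since_sample++;

        if (now > mr->last_sample) {
                double elapsed = (double)(now - mr->last_sample);
                double rate    = mr->ops_since_sample / elapsed;

                /* alpha = 0.5: half of the weight on the newest sample */
                mr->ema_ops_per_sec  = 0.5 * rate + 0.5 * mr->ema_ops_per_sec;
                mr->ops_since_sample = 0;
                mr->last_sample      = now;
        }

        return mr->ema_ops_per_sec > allowed_ops_per_sec;
}

/* Caller sketch: if over_allowed_rate(&mounts[hash], cfg_rate) returns true,
 * tag the fop so that io-threads queues it at its lowest ("least-pri")
 * priority instead of the default. */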

> On Jan 27, 2016, at 9:48 PM, Ravishankar N  wrote:
> 
> On 01/26/2016 08:41 AM, Richard Wareing wrote:
>> In any event, it might be worth having Shreyas detail his throttling feature 
>> (that can throttle any directory hierarchy no less) to illustrate how a 
>> simpler design can achieve similar results to these more complicated (and, it 
>> follows, bug-prone) approaches.
>> 
>> Richard
> Hi Shreyas,
> 
> Wondering if you can share the details of the throttling feature you're 
> working on. Even if there's no code, a description of what it is trying to 
> achieve and how will be great.
> 
> Thanks,
> Ravi

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] 3.7 pending patches

2016-01-28 Thread Nithya Balachandran
Sorry - forgot to provide the link:
http://review.gluster.org/#/c/13262/

Regards,
Nithya

- Original Message -
> From: "Nithya Balachandran" 
> To: "Pranith Kumar Karampuri" 
> Cc: "Gluster Devel" 
> Sent: Thursday, 28 January, 2016 8:47:11 PM
> Subject: Re: [Gluster-devel] 3.7 pending patches
> 
> Sakshi has a fix for this:
> 
> 
> Regards,
> Nithya
> 
> - Original Message -
> > From: "Pranith Kumar Karampuri" 
> > To: "Venky Shankar" , "Gluster Devel"
> > 
> > Cc: "Vijay Bellur" , "Raghavendra Gowdappa"
> > , "Nithya Balachandran"
> > 
> > Sent: Thursday, 28 January, 2016 8:29:16 PM
> > Subject: Re: 3.7 pending patches
> > 
> > 
> > 
> > On 01/28/2016 07:05 PM, Venky Shankar wrote:
> > > Hey folks,
> > >
> > > I just merged patch #13302 (and it's 3.7 equivalent) which fixes a
> > > scrubber
> > > crash.
> > > This was causing other patches to fail regression.
> > >
> > > Requesting a rebase of patches (especially 3.7 pending) that were blocked
> > > due to
> > > this.
> > Thanks a lot for this venky, kotresh, Emmanuel. I re-triggered the builds.
> > 
> > I observed the following crash in one of the runs for
> > https://build.gluster.org/job/rackspace-regression-2GB-triggered/17819/console
> > (3.7):
> > (gdb) bt
> > #0  0x0040ecff in glusterfs_rebalance_event_notify_cbk (
> >  req=0x7f0e58006dbc, iov=0x7f0e6cadb5d0, count=1,
> > myframe=0x7f0e58003a7c)
> >  at
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/glusterfsd/src/glusterfsd-mgmt.c:1812
> > #1  0x7f0e79a1274b in saved_frames_unwind (saved_frames=0x19ffe70)
> >  at
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:366
> > #2  0x7f0e79a127ea in saved_frames_destroy (frames=0x19ffe70)
> >  at
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:383
> > #3  0x7f0e79a12c41 in rpc_clnt_connection_cleanup (conn=0x19fea20)
> >  at
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:532
> > #4  0x7f0e79a136cb in rpc_clnt_notify (trans=0x19fee70,
> > mydata=0x19fea20,
> >  event=RPC_TRANSPORT_DISCONNECT, data=0x19fee70)
> >  at
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:854
> > #5  0x7f0e79a0fb76 in rpc_transport_notify (this=0x19fee70,
> >  event=RPC_TRANSPORT_DISCONNECT, data=0x19fee70)
> >  at
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-transport.c:546
> > #6  0x7f0e6f1fd621 in socket_event_poll_err (this=0x19fee70)
> >  at
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:1151
> > #7  0x7f0e6f20234c in socket_event_handler (fd=9, idx=1,
> > data=0x19fee70,
> >  poll_in=1, poll_out=0, poll_err=24)
> >  at
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:2356
> > #8  0x7f0e79cc386c in event_dispatch_epoll_handler
> > (event_pool=0x19c3c90,
> >  event=0x7f0e6cadbe70)
> >  at
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:575
> > #9  0x7f0e79cc3c5a in event_dispatch_epoll_worker (data=0x7f0e68014970)
> >  at
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:678
> > #10 0x7f0e78f2aa51 in start_thread () from ./lib64/libpthread.so.0
> > #11 0x7f0e7889493d in clone () from ./lib64/libc.so.6
> > (gdb) fr 0
> > #0  0x0040ecff in glusterfs_rebalance_event_notify_cbk (
> >  req=0x7f0e58006dbc, iov=0x7f0e6cadb5d0, count=1,
> > myframe=0x7f0e58003a7c)
> >  at
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/glusterfsd/src/glusterfsd-mgmt.c:1812
> > 1812in
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/glusterfsd/src/glusterfsd-mgmt.c
> > (gdb) info locals
> > rsp = {op_ret = 0, op_errno = 0, dict = {dict_len = 0, dict_val = 0x0}}
> > frame = 0x7f0e58003a7c
> > ctx = 0x0
> > ret = 0
> > __FUNCTION__ = "glusterfs_rebalance_event_notify_cbk"
> > (gdb) p frame->this
> > $1 = (xlator_t *) 0x3a6000
> > (gdb) p frame->this->name
> > Cannot access memory at address 0x3a6000
> > 
> > Pranith
> > >
> > > Thanks,
> > >
> > >  Venky
> > 
> > 
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] 3.7 pending patches

2016-01-28 Thread Raghavendra Gowdappa
+ Sakshi

- Original Message -
> From: "Raghavendra Gowdappa" 
> To: "Pranith Kumar Karampuri" 
> Cc: "Venky Shankar" , "Gluster Devel" 
> , "Vijay Bellur"
> , "Nithya Balachandran" 
> Sent: Thursday, January 28, 2016 8:47:51 PM
> Subject: Re: 3.7 pending patches
> 
> 
> 
> - Original Message -
> > From: "Pranith Kumar Karampuri" 
> > To: "Venky Shankar" , "Gluster Devel"
> > 
> > Cc: "Vijay Bellur" , "Raghavendra Gowdappa"
> > , "Nithya Balachandran"
> > 
> > Sent: Thursday, January 28, 2016 8:29:16 PM
> > Subject: Re: 3.7 pending patches
> > 
> > 
> > 
> > On 01/28/2016 07:05 PM, Venky Shankar wrote:
> > > Hey folks,
> > >
> > > I just merged patch #13302 (and it's 3.7 equivalent) which fixes a
> > > scrubber
> > > crash.
> > > This was causing other patches to fail regression.
> > >
> > > Requesting a rebase of patches (especially 3.7 pending) that were blocked
> > > due to
> > > this.
> > Thanks a lot for this venky, kotresh, Emmanuel. I re-triggered the builds.
> > 
> > I observed the following crash in one of the runs for
> > https://build.gluster.org/job/rackspace-regression-2GB-triggered/17819/console
> > (3.7):
> > (gdb) bt
> > #0  0x0040ecff in glusterfs_rebalance_event_notify_cbk (
> >  req=0x7f0e58006dbc, iov=0x7f0e6cadb5d0, count=1,
> > myframe=0x7f0e58003a7c)
> >  at
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/glusterfsd/src/glusterfsd-mgmt.c:1812
> > #1  0x7f0e79a1274b in saved_frames_unwind (saved_frames=0x19ffe70)
> >  at
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:366
> > #2  0x7f0e79a127ea in saved_frames_destroy (frames=0x19ffe70)
> >  at
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:383
> > #3  0x7f0e79a12c41 in rpc_clnt_connection_cleanup (conn=0x19fea20)
> >  at
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:532
> > #4  0x7f0e79a136cb in rpc_clnt_notify (trans=0x19fee70,
> > mydata=0x19fea20,
> >  event=RPC_TRANSPORT_DISCONNECT, data=0x19fee70)
> >  at
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:854
> > #5  0x7f0e79a0fb76 in rpc_transport_notify (this=0x19fee70,
> >  event=RPC_TRANSPORT_DISCONNECT, data=0x19fee70)
> >  at
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-transport.c:546
> > #6  0x7f0e6f1fd621 in socket_event_poll_err (this=0x19fee70)
> >  at
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:1151
> > #7  0x7f0e6f20234c in socket_event_handler (fd=9, idx=1,
> > data=0x19fee70,
> >  poll_in=1, poll_out=0, poll_err=24)
> >  at
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:2356
> > #8  0x7f0e79cc386c in event_dispatch_epoll_handler
> > (event_pool=0x19c3c90,
> >  event=0x7f0e6cadbe70)
> >  at
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:575
> > #9  0x7f0e79cc3c5a in event_dispatch_epoll_worker (data=0x7f0e68014970)
> >  at
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:678
> > #10 0x7f0e78f2aa51 in start_thread () from ./lib64/libpthread.so.0
> > #11 0x7f0e7889493d in clone () from ./lib64/libc.so.6
> > (gdb) fr 0
> > #0  0x0040ecff in glusterfs_rebalance_event_notify_cbk (
> >  req=0x7f0e58006dbc, iov=0x7f0e6cadb5d0, count=1,
> > myframe=0x7f0e58003a7c)
> >  at
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/glusterfsd/src/glusterfsd-mgmt.c:1812
> > 1812in
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/glusterfsd/src/glusterfsd-mgmt.c
> > (gdb) info locals
> > rsp = {op_ret = 0, op_errno = 0, dict = {dict_len = 0, dict_val = 0x0}}
> > frame = 0x7f0e58003a7c
> > ctx = 0x0
> > ret = 0
> > __FUNCTION__ = "glusterfs_rebalance_event_notify_cbk"
> > (gdb) p frame->this
> > $1 = (xlator_t *) 0x3a6000
> > (gdb) p frame->this->name
> > Cannot access memory at address 0x3a6000
> 
> There is a patch by sakshi on master at:
> http://review.gluster.org/#/c/13262/
> 
> It's blocked on NetBSD regression failures.
> 
> @sakshi,
> 
> Can you figure out what the issue is with the NetBSD regression failure? Also,
> can you send a backport for 3.7?
> 
> regards,
> Raghavendra.
> 
> > 
> > Pranith
> > >
> > > Thanks,
> > >
> > >  Venky
> > 
> >
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] 3.7 pending patches

2016-01-28 Thread Raghavendra Gowdappa


- Original Message -
> From: "Pranith Kumar Karampuri" 
> To: "Venky Shankar" , "Gluster Devel" 
> 
> Cc: "Vijay Bellur" , "Raghavendra Gowdappa" 
> , "Nithya Balachandran"
> 
> Sent: Thursday, January 28, 2016 8:29:16 PM
> Subject: Re: 3.7 pending patches
> 
> 
> 
> On 01/28/2016 07:05 PM, Venky Shankar wrote:
> > Hey folks,
> >
> > I just merged patch #13302 (and it's 3.7 equivalent) which fixes a scrubber
> > crash.
> > This was causing other patches to fail regression.
> >
> > Requesting a rebase of patches (especially 3.7 pending) that were blocked
> > due to
> > this.
> Thanks a lot for this venky, kotresh, Emmanuel. I re-triggered the builds.
> 
> I observed the following crash in one of the runs for
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/17819/console
> (3.7):
> (gdb) bt
> #0  0x0040ecff in glusterfs_rebalance_event_notify_cbk (
>  req=0x7f0e58006dbc, iov=0x7f0e6cadb5d0, count=1,
> myframe=0x7f0e58003a7c)
>  at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/glusterfsd/src/glusterfsd-mgmt.c:1812
> #1  0x7f0e79a1274b in saved_frames_unwind (saved_frames=0x19ffe70)
>  at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:366
> #2  0x7f0e79a127ea in saved_frames_destroy (frames=0x19ffe70)
>  at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:383
> #3  0x7f0e79a12c41 in rpc_clnt_connection_cleanup (conn=0x19fea20)
>  at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:532
> #4  0x7f0e79a136cb in rpc_clnt_notify (trans=0x19fee70,
> mydata=0x19fea20,
>  event=RPC_TRANSPORT_DISCONNECT, data=0x19fee70)
>  at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:854
> #5  0x7f0e79a0fb76 in rpc_transport_notify (this=0x19fee70,
>  event=RPC_TRANSPORT_DISCONNECT, data=0x19fee70)
>  at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-transport.c:546
> #6  0x7f0e6f1fd621 in socket_event_poll_err (this=0x19fee70)
>  at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:1151
> #7  0x7f0e6f20234c in socket_event_handler (fd=9, idx=1,
> data=0x19fee70,
>  poll_in=1, poll_out=0, poll_err=24)
>  at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:2356
> #8  0x7f0e79cc386c in event_dispatch_epoll_handler
> (event_pool=0x19c3c90,
>  event=0x7f0e6cadbe70)
>  at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:575
> #9  0x7f0e79cc3c5a in event_dispatch_epoll_worker (data=0x7f0e68014970)
>  at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:678
> #10 0x7f0e78f2aa51 in start_thread () from ./lib64/libpthread.so.0
> #11 0x7f0e7889493d in clone () from ./lib64/libc.so.6
> (gdb) fr 0
> #0  0x0040ecff in glusterfs_rebalance_event_notify_cbk (
>  req=0x7f0e58006dbc, iov=0x7f0e6cadb5d0, count=1,
> myframe=0x7f0e58003a7c)
>  at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/glusterfsd/src/glusterfsd-mgmt.c:1812
> 1812in
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/glusterfsd/src/glusterfsd-mgmt.c
> (gdb) info locals
> rsp = {op_ret = 0, op_errno = 0, dict = {dict_len = 0, dict_val = 0x0}}
> frame = 0x7f0e58003a7c
> ctx = 0x0
> ret = 0
> __FUNCTION__ = "glusterfs_rebalance_event_notify_cbk"
> (gdb) p frame->this
> $1 = (xlator_t *) 0x3a6000
> (gdb) p frame->this->name
> Cannot access memory at address 0x3a6000

There is a patch by sakshi on master at:
http://review.gluster.org/#/c/13262/

It's blocked on NetBSD regression failures.

@sakshi,

Can you figure out what the issue is with the NetBSD regression failure? Also, can
you send a backport for 3.7?

regards,
Raghavendra.

> 
> Pranith
> >
> > Thanks,
> >
> >  Venky
> 
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] 3.7 pending patches

2016-01-28 Thread Nithya Balachandran
Sakshi has a fix for this:


Regards,
Nithya

- Original Message -
> From: "Pranith Kumar Karampuri" 
> To: "Venky Shankar" , "Gluster Devel" 
> 
> Cc: "Vijay Bellur" , "Raghavendra Gowdappa" 
> , "Nithya Balachandran"
> 
> Sent: Thursday, 28 January, 2016 8:29:16 PM
> Subject: Re: 3.7 pending patches
> 
> 
> 
> On 01/28/2016 07:05 PM, Venky Shankar wrote:
> > Hey folks,
> >
> > I just merged patch #13302 (and it's 3.7 equivalent) which fixes a scrubber
> > crash.
> > This was causing other patches to fail regression.
> >
> > Requesting a rebase of patches (especially 3.7 pending) that were blocked
> > due to
> > this.
> Thanks a lot for this venky, kotresh, Emmanuel. I re-triggered the builds.
> 
> I observed the following crash in one of the runs for
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/17819/console
> (3.7):
> (gdb) bt
> #0  0x0040ecff in glusterfs_rebalance_event_notify_cbk (
>  req=0x7f0e58006dbc, iov=0x7f0e6cadb5d0, count=1,
> myframe=0x7f0e58003a7c)
>  at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/glusterfsd/src/glusterfsd-mgmt.c:1812
> #1  0x7f0e79a1274b in saved_frames_unwind (saved_frames=0x19ffe70)
>  at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:366
> #2  0x7f0e79a127ea in saved_frames_destroy (frames=0x19ffe70)
>  at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:383
> #3  0x7f0e79a12c41 in rpc_clnt_connection_cleanup (conn=0x19fea20)
>  at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:532
> #4  0x7f0e79a136cb in rpc_clnt_notify (trans=0x19fee70,
> mydata=0x19fea20,
>  event=RPC_TRANSPORT_DISCONNECT, data=0x19fee70)
>  at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:854
> #5  0x7f0e79a0fb76 in rpc_transport_notify (this=0x19fee70,
>  event=RPC_TRANSPORT_DISCONNECT, data=0x19fee70)
>  at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-transport.c:546
> #6  0x7f0e6f1fd621 in socket_event_poll_err (this=0x19fee70)
>  at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:1151
> #7  0x7f0e6f20234c in socket_event_handler (fd=9, idx=1,
> data=0x19fee70,
>  poll_in=1, poll_out=0, poll_err=24)
>  at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:2356
> #8  0x7f0e79cc386c in event_dispatch_epoll_handler
> (event_pool=0x19c3c90,
>  event=0x7f0e6cadbe70)
>  at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:575
> #9  0x7f0e79cc3c5a in event_dispatch_epoll_worker (data=0x7f0e68014970)
>  at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:678
> #10 0x7f0e78f2aa51 in start_thread () from ./lib64/libpthread.so.0
> #11 0x7f0e7889493d in clone () from ./lib64/libc.so.6
> (gdb) fr 0
> #0  0x0040ecff in glusterfs_rebalance_event_notify_cbk (
>  req=0x7f0e58006dbc, iov=0x7f0e6cadb5d0, count=1,
> myframe=0x7f0e58003a7c)
>  at
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/glusterfsd/src/glusterfsd-mgmt.c:1812
> 1812in
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/glusterfsd/src/glusterfsd-mgmt.c
> (gdb) info locals
> rsp = {op_ret = 0, op_errno = 0, dict = {dict_len = 0, dict_val = 0x0}}
> frame = 0x7f0e58003a7c
> ctx = 0x0
> ret = 0
> __FUNCTION__ = "glusterfs_rebalance_event_notify_cbk"
> (gdb) p frame->this
> $1 = (xlator_t *) 0x3a6000
> (gdb) p frame->this->name
> Cannot access memory at address 0x3a6000
> 
> Pranith
> >
> > Thanks,
> >
> >  Venky
> 
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] Trashcan issue with vim editor

2016-01-28 Thread Pranith Kumar Karampuri

+Anoop, Jiffin

On 01/27/2016 03:25 PM, PankaJ Singh wrote:


Hi,

We are using gluster 3.7.6 on ubuntu 14.04. We are facing an issue 
with trashcan feature.

Our scenario is as follow:

1. 2 node server (ubuntu 14.04 with glusterfs 3.7.6)
2. 1 client node (ubuntu 14.04)
3. I have created one volume vol1 with 2 bricks in replica and with 
transport = tcp mode.

4. I have enabled quota on vol1
5. Now I have enabled trashcan feature on vol1
6. Now I have mounted vol1 on client's home directory "mount -t 
glusterfs -o transport=tcp server-1:/vol1 /home/"
7. Now when I log in as any existing non-root user and perform any 
editing via the vim editor, I get this error: "E200: *ReadPre 
autocommands made the file unreadable", and my user's home 
directory permissions get changed to 000. After some time the 
permissions revert back automatically.


(NOTE: user's home directories are copied in mounted directory 
glusterfs volume vol1)



Thanks & Regards
PankaJ Singh


___
Gluster-users mailing list
gluster-us...@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] 3.7 pending patches

2016-01-28 Thread Pranith Kumar Karampuri



On 01/28/2016 07:05 PM, Venky Shankar wrote:

Hey folks,

I just merged patch #13302 (and its 3.7 equivalent) which fixes a scrubber 
crash.
This was causing other patches to fail regression.

Requesting a rebase of patches (especially 3.7 pending) that were blocked due to
this.

Thanks a lot for this, Venky, Kotresh, Emmanuel. I re-triggered the builds.

I observed the following crash in one of the runs for 
https://build.gluster.org/job/rackspace-regression-2GB-triggered/17819/console 
(3.7):

(gdb) bt
#0  0x0040ecff in glusterfs_rebalance_event_notify_cbk (
req=0x7f0e58006dbc, iov=0x7f0e6cadb5d0, count=1, 
myframe=0x7f0e58003a7c)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/glusterfsd/src/glusterfsd-mgmt.c:1812

#1  0x7f0e79a1274b in saved_frames_unwind (saved_frames=0x19ffe70)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:366

#2  0x7f0e79a127ea in saved_frames_destroy (frames=0x19ffe70)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:383

#3  0x7f0e79a12c41 in rpc_clnt_connection_cleanup (conn=0x19fea20)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:532
#4  0x7f0e79a136cb in rpc_clnt_notify (trans=0x19fee70, 
mydata=0x19fea20,

event=RPC_TRANSPORT_DISCONNECT, data=0x19fee70)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-clnt.c:854

#5  0x7f0e79a0fb76 in rpc_transport_notify (this=0x19fee70,
event=RPC_TRANSPORT_DISCONNECT, data=0x19fee70)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-transport.c:546

#6  0x7f0e6f1fd621 in socket_event_poll_err (this=0x19fee70)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:1151
#7  0x7f0e6f20234c in socket_event_handler (fd=9, idx=1, 
data=0x19fee70,

poll_in=1, poll_out=0, poll_err=24)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:2356
#8  0x7f0e79cc386c in event_dispatch_epoll_handler 
(event_pool=0x19c3c90,

event=0x7f0e6cadbe70)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:575

#9  0x7f0e79cc3c5a in event_dispatch_epoll_worker (data=0x7f0e68014970)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:678

#10 0x7f0e78f2aa51 in start_thread () from ./lib64/libpthread.so.0
#11 0x7f0e7889493d in clone () from ./lib64/libc.so.6
(gdb) fr 0
#0  0x0040ecff in glusterfs_rebalance_event_notify_cbk (
req=0x7f0e58006dbc, iov=0x7f0e6cadb5d0, count=1, 
myframe=0x7f0e58003a7c)
at 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/glusterfsd/src/glusterfsd-mgmt.c:1812
1812in 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/glusterfsd/src/glusterfsd-mgmt.c

(gdb) info locals
rsp = {op_ret = 0, op_errno = 0, dict = {dict_len = 0, dict_val = 0x0}}
frame = 0x7f0e58003a7c
ctx = 0x0
ret = 0
__FUNCTION__ = "glusterfs_rebalance_event_notify_cbk"
(gdb) p frame->this
$1 = (xlator_t *) 0x3a6000
(gdb) p frame->this->name
Cannot access memory at address 0x3a6000

Pranith


Thanks,

 Venky


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Throttling xlator on the bricks

2016-01-28 Thread Jeff Darcy
> TBF isn't complicated at all - it's widely used for traffic shaping, cgroups,
> UML to rate limit disk I/O.

It's not complicated and it's widely used, but that doesn't mean it's
the right fit for our needs.  Token buckets are good to create a
*ceiling* on resource utilization, but what if you want to set a floor
or allocate fair shares instead?  Even if what you want is a ceiling,
there's a problem of how many tokens should be entering the system.
Ideally that number should match the actual number of operations the
resource can handle per time quantum, but for networks and disks that
number can be pretty variable.  That's why network QoS is a poorly
solved problem and disk QoS is even worse.

To create a floor using token buckets, you have to chain buckets
together.  Each user/activity draws first from its own bucket, setting
the floor.  When that bucket is exhausted, it starts drawing from the
next bucket, eventually from an infinite "best effort" bucket at the end
of the chain.  To allocate fair shares (which is probably closest to
what we want in this case) you need active monitoring of how much work
the resource is actually doing.  As that number fluctuates, so does the
number of tokens, which are then divided *proportionally* between
buckets.  Hybrid approaches - e.g. low and high watermarks,
bucket-filling priorities - are also possible.
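
To make the "ceiling" case concrete, here is a minimal, generic token-bucket
sketch. It is illustration only, not a GlusterFS interface; the struct and
function names are made up. A floor, as described above, would chain a
per-user bucket like this one in front of a shared "best effort" bucket.

#include <stdbool.h>
#include <time.h>

struct token_bucket {
        double tokens;        /* tokens currently available               */
        double capacity;      /* burst size (maximum stored tokens)       */
        double fill_rate;     /* tokens added per second -- the ceiling   */
        time_t last_refill;   /* when tokens were last added              */
};

/* Refill according to elapsed time, then try to spend one token.
 * Returns true if the operation may proceed now. */
bool
tb_take (struct token_bucket *tb)
{
        time_t now     = time (NULL);
        double elapsed = (double)(now - tb->last_refill);

        tb->tokens += elapsed * tb->fill_rate;
        if (tb->tokens > tb->capacity)
                tb->tokens = tb->capacity;
        tb->last_refill = now;

        if (tb->tokens >= 1.0) {
                tb->tokens -= 1.0;
                return true;      /* under the ceiling */
        }
        return false;             /* over the ceiling: queue, delay or drop */
}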

Then we get to the problem of how to distribute a resource fairly
*across nodes* when the limits are actually being applied locally on
each.  This is very similar to the problem we faced with quota over DHT,
and the same kind of approaches (e.g. a "balancing daemon") might apply.

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Reply: Re: Gluster AFR volume write performance has been seriously affected by GLUSTERFS_WRITE_IS_APPEND in afr_writev

2016-01-28 Thread li . ping288
Sorry for the late reply.

Pranith Kumar Karampuri wrote on 2016/01/25 17:48:06:

> From: Pranith Kumar Karampuri 
> To: li.ping...@zte.com.cn, 
> Cc: li.y...@zte.com.cn, zhou.shigan...@zte.com.cn, 
> liu.jianj...@zte.com.cn, yang.bi...@zte.com.cn
> Date: 2016/01/25 17:48
> Subject: Re: Reply: Re: [Gluster-devel] Gluster AFR volume write 
> performance has been seriously affected by GLUSTERFS_WRITE_IS_APPEND
> in afr_writev
> 
> 

> On 01/25/2016 03:09 PM, li.ping...@zte.com.cn wrote:
> Hi Pranith, 
> 
> I'd be willing to have a chance to do my contribution to open-source. 
> It's my first time to deliver a patch for GlusterFS, hence I'm not 
> quite familiar with the code review and submitting procedures. 
> 
> I'll try to make it ASAP. By the way is there any guidelines to do this 
work?
> http://www.gluster.org/community/documentation/index.php/
> Simplified_dev_workflow may be helpful. Feel free to ask any doubt 
> you may have.
> 
> How do you guys use glusterfs?
> 
> Pranith

Thanks for your warm tips.  We currently use glusterfs to build the shared 
storage for distributed cluster nodes.

Here are the solutions I pondered over these days:

1. Reverting the AFR GLUSTERFS_WRITE_IS_APPEND modifications, because this
   optimization only plays a part for appending write fops, and most writes
   are not of this kind. Hence I think it is not worth doing an optimization
   for a low-probability situation at the cost of a write performance drop
   for the vast majority of AFR writes.
2. Making the fixed GLUSTERFS_WRITE_IS_APPEND dictionary option in afr_writev
   dynamic, i.e. adding a new configurable option "write_is_append", just
   like the existing "ensure-durability" option for AFR. It could be turned
   on when AFR write performance is not the main concern and off when
   performance is required.

I have been trying to find a way in posix_writev to predict an appending
write in advance and then lock (or skip locking) accordingly as early as
possible, but I have had no luck so far.
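
As a minimal standalone sketch of that append-detection idea (not the actual
posix_writev code; the function name and layout here are made up), a write
looks like an append when the fd was opened with O_APPEND or when the write
offset equals the current file size, which is the check the existing
GLUSTERFS_WRITE_IS_APPEND handling is described as using:

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

/* Return 1 if a write at 'offset' on 'fd' would append to the file. */
int
looks_like_append (int fd, off_t offset)
{
        struct stat st;
        int         flags = fcntl (fd, F_GETFL);

        if (flags != -1 && (flags & O_APPEND))
                return 1;                       /* O_APPEND always appends */

        if (fstat (fd, &st) == 0 && st.st_size == offset)
                return 1;                       /* writing exactly at EOF  */

        return 0;                               /* plain overwrite         */
}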

Anybody's other good ideas are appreciated.

Ping.Li

> 
> Thanks & Best Regards. 
> 
> Pranith Kumar Karampuri wrote on 2016/01/23 14:01:36:
> 
> > From: Pranith Kumar Karampuri  
> > To: li.ping...@zte.com.cn, gluster-devel@gluster.org, 
> > Cc: li.y...@zte.com.cn, liu.jianj...@zte.com.cn, 
> > zhou.shigan...@zte.com.cn, yang.bi...@zte.com.cn 
> > Date: 2016/01/23 14:02 
> > Subject: Re: Reply: Re: [Gluster-devel] Gluster AFR volume write 
> > performance has been seriously affected by GLUSTERFS_WRITE_IS_APPEND
> > in afr_writev 
> > 
> > 
> 
> > On 01/22/2016 07:14 AM, li.ping...@zte.com.cn wrote: 
> > Hi Pranith, it is appreciated for your reply. 
> > 
> > Pranith Kumar Karampuri wrote on 2016/01/20 18:51:19:
> > 
> > > From:  Pranith Kumar Karampuri  
> > > To:  li.ping...@zte.com.cn, gluster-devel@gluster.org, 
> > > Date:  2016/01/20 18:51 
> > > Subject: Re: [Gluster-devel] Gluster AFR volume write performance has 
> > > been seriously affected by GLUSTERFS_WRITE_IS_APPEND in afr_writev 
> > > 
> > > Sorry for the delay in response.
> > 
> > > On 01/15/2016 02:34 PM, li.ping...@zte.com.cn wrote: 
> > > GLUSTERFS_WRITE_IS_APPEND Setting in afr_writev function at 
> > > glusterfs client end makes the posix_writev in the server end  deal 
> > > IO write fops from parallel  to serial in consequence. 
> > > 
> > > i.e.  multiple io-worker threads carrying out IO write fops are 
> > > blocked in posix_writev to execute final write fop pwrite/pwritev in
> > > __posix_writev function ONE AFTER ANOTHER. 
> > > 
> > > For example: 
> > > 
> > > thread1: iot_worker -> ...  -> posix_writev()   | 
> > > thread2: iot_worker -> ...  -> posix_writev()   | 
> > > thread3: iot_worker -> ...  -> posix_writev()   -> __posix_writev() 
> > > thread4: iot_worker -> ...  -> posix_writev()   | 
> > > 
> > > there are 4 iot_worker thread doing the 128KB IO write fops as 
> > > above, but only one can execute __posix_writev function and the 
> > > others have to wait. 
> > > 
> > > however, if the afr volume is configured on with storage.linux-aio 
> > > which is off in default,  the iot_worker will use posix_aio_writev 
> > > instead of posix_writev to write data. 
> > > the posix_aio_writev function won't be affected by 
> > > GLUSTERFS_WRITE_IS_APPEND, and the AFR volume write performance goes 
up. 
> > > I think this is a bug :-(. 
> > 
> > Yeah, I agree with you. I suppose the GLUSTERFS_WRITE_IS_APPEND is a
> > misuse in afr_writev. 
> > I checked the original intent of GLUSTERS_WRITE_IS_APPEND change at 
> > review website: 
> > http://review.gluster.org/#/c/5501/ 
> > 
> > The initial purpose seems to avoid an unnecessary fsync() in 
> > afr_changelog_post_op_safe function if the writing data position 
> > was currently at the end of the file, detected by 
> > (preop.ia_size == offset || (fd->flags & O_APPEND)) in posix_writev. 
> > 
> > In comparison with the afr write performance loss, I think 
> > it costs too much. 
> > 
> > 

[Gluster-devel] 3.7 pending patches

2016-01-28 Thread Venky Shankar
Hey folks,

I just merged patch #13302 (and its 3.7 equivalent) which fixes a scrubber 
crash.
This was causing other patches to fail regression.

Requesting a rebase of patches (especially 3.7 pending) that were blocked due to
this.

Thanks,

Venky
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Minutes: Gluster Community Bug Triage meeting 28th Jan 2016

2016-01-28 Thread Hari Gowtham
Hi All,

The minutes of today's meeting:

Meeting summary
---
* agenda https://public.pad.fsfe.org/p/gluster-bug-triage  (hgowtham_,
  12:01:59)
* roll call  (hgowtham_, 12:02:10)

* Group Triage  (hgowtham_, 12:06:22)
  * LINK: https://public.pad.fsfe.org/p/gluster-bugs-to-triage
(hgowtham_, 12:06:57)
  * LINK:
http://gluster.readthedocs.org/en/latest/Contributors-Guide/Bug-Triage/
(hgowtham_, 12:07:09)

* Open Floor  (hgowtham_, 12:20:22)

Meeting ended at 12:23:34 UTC.


Action Items

All the action items are carried over to next week as the assignees weren't available.



People Present (lines said)
---
* hgowtham_ (19)
* Manikandan (6)
* skoduri (3)
* zodbot (3)
* jiffin (2)


- Forwarded Message -
From: "Hari Gowtham" 
To: "Gluster Devel" 
Sent: Thursday, January 28, 2016 3:50:21 PM
Subject: [Gluster-devel] REMINDER: Gluster Community Bug Triage meeting (Today)

Hi all,

The weekly bug triage is about to take place in ~100 minutes.

Meeting details:
- location: #gluster-meeting on Freenode IRC
( https://webchat.freenode.net/?channels=gluster-meeting )
- date: every Tuesday
- time: 12:00 UTC  
(in your terminal, run: date -d "12:00 UTC")
- agenda: https://public.pad.fsfe.org/p/gluster-bug-triage

Currently the following items are listed:
* Roll Call
* Status of last weeks action items
* Group Triage
* Open Floor

Appreciate your participation.

-- 
Regards, 
Hari. 

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

-- 
Regards, 
Hari. 

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] make install throwing python error

2016-01-28 Thread Bipin Kunal
Hi,

I have downloaded glusterfs source rpm from :
http://download.gluster.org/pub/gluster/glusterfs/LATEST/Fedora/fedora-21/SRPMS/

I extracted the source and tried compiling and installing it. While
running "make install" I started getting an error.


Here are the steps performed:

1) ./autogen.sh
2) ./configure
3) make
4) make install

Steps 1, 2 and 3 were error-free.

Here is the error during "make install"

Making install in glupy
Making install in src
Making install in glupy
 /usr/bin/mkdir -p '/usr/lib/python2.7/site-packages/gluster/glupy'
 /usr/bin/install -c -m 644 __init__.py
'/usr/lib/python2.7/site-packages/gluster/glupy'
../../../../../py-compile: Missing argument to --destdir.
Makefile:414: recipe for target 'install-pyglupyPYTHON' failed
make[6]: *** [install-pyglupyPYTHON] Error 1
Makefile:511: recipe for target 'install-am' failed
make[5]: *** [install-am] Error 2
Makefile:658: recipe for target 'install-recursive' failed
make[4]: *** [install-recursive] Error 1
Makefile:445: recipe for target 'install-recursive' failed
make[3]: *** [install-recursive] Error 1
Makefile:448: recipe for target 'install-recursive' failed
make[2]: *** [install-recursive] Error 1
Makefile:447: recipe for target 'install-recursive' failed
make[1]: *** [install-recursive] Error 1
Makefile:576: recipe for target 'install-recursive' failed
make: *** [install-recursive] Error 1

Am I missing some binaries?

Please help me in installing.

Thanks,
Bipin Kunal
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] GlusterFS FUSE client hangs on rsyncing lots of file

2016-01-28 Thread Pranith Kumar Karampuri
With baul jianguo's help I am able to see that FLUSH fops are hanging 
for some reason.


pk1@localhost - ~/Downloads
17:02:13 :) ⚡ grep "unique=" client-dump1.txt
unique=3160758373
unique=2073075682
unique=1455047665
unique=0

pk1@localhost - ~/Downloads
17:02:21 :) ⚡ grep "unique=" client-dump-0.txt
unique=3160758373
unique=2073075682
unique=1455047665
unique=0

I will be debugging a bit more and post my findings.

Pranith
On 01/28/2016 03:18 PM, baul jianguo wrote:

The client glusterfs gdb info; the main thread ID is 70800.
In the top output, thread 70800 has CPU time 1263:30 and thread 70810
has 1321:10; the other threads' times are much smaller.
(gdb) thread apply all bt



Thread 9 (Thread 0x7fc21acaf700 (LWP 70801)):

#0  0x7fc21cc0c535 in sigwait () from /lib64/libpthread.so.0

#1  0x0040539b in glusterfs_sigwaiter (arg=) at glusterfsd.c:1653

#2  0x7fc21cc04a51 in start_thread () from /lib64/libpthread.so.0

#3  0x7fc21c56e93d in clone () from /lib64/libc.so.6



Thread 8 (Thread 0x7fc21a2ae700 (LWP 70802)):

#0  0x7fc21cc08a0e in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0

#1  0x7fc21ded02bf in syncenv_task (proc=0x121ee60) at syncop.c:493

#2  0x7fc21ded6300 in syncenv_processor (thdata=0x121ee60) at syncop.c:571

#3  0x7fc21cc04a51 in start_thread () from /lib64/libpthread.so.0

#4  0x7fc21c56e93d in clone () from /lib64/libc.so.6



Thread 7 (Thread 0x7fc2198ad700 (LWP 70803)):

#0  0x7fc21cc08a0e in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0

#1  0x7fc21ded02bf in syncenv_task (proc=0x121f220) at syncop.c:493

#2  0x7fc21ded6300 in syncenv_processor (thdata=0x121f220) at syncop.c:571

#3  0x7fc21cc04a51 in start_thread () from /lib64/libpthread.so.0

#4  0x7fc21c56e93d in clone () from /lib64/libc.so.6



Thread 6 (Thread 0x7fc21767d700 (LWP 70805)):

#0  0x7fc21cc0bfbd in nanosleep () from /lib64/libpthread.so.0

#1  0x7fc21deb16bc in gf_timer_proc (ctx=0x11f2010) at timer.c:170

#2  0x7fc21cc04a51 in start_thread () from /lib64/libpthread.so.0

#3  0x7fc21c56e93d in clone () from /lib64/libc.so.6



Thread 5 (Thread 0x7fc20fb1e700 (LWP 70810)):

#0  0x7fc21c566987 in readv () from /lib64/libc.so.6

#1  0x7fc21accbc55 in fuse_thread_proc (data=0x120f450) at
fuse-bridge.c:4752

#2  0x7fc21cc04a51 in start_thread () from /lib64/libpthread.so.0

#3  0x7fc21c56e93d in clone () from /lib64/libc.so.6   (most CPU time)



Thread 4 (Thread 0x7fc20f11d700 (LWP 70811)):   (a bit less CPU time)

#0  0x7fc21cc0b7dd in read () from /lib64/libpthread.so.0

#1  0x7fc21acc0e73 in read (data=) at
/usr/include/bits/unistd.h:45

#2  notify_kernel_loop (data=) at fuse-bridge.c:3786

#3  0x7fc21cc04a51 in start_thread () from /lib64/libpthread.so.0

#4  0x7fc21c56e93d in clone () from /lib64/libc.so.6



Thread 3 (Thread 0x7fc1b16fe700 (LWP 206224)):


#0  0x7fc21cc08a0e in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0

#1  0x7fc20e515e60 in iot_worker (data=0x19eeda0) at io-threads.c:157

#2  0x7fc21cc04a51 in start_thread () from /lib64/libpthread.so.0

#3  0x7fc21c56e93d in clone () from /lib64/libc.so.6



Thread 2 (Thread 0x7fc1b0bfb700 (LWP 214361)):

#0  0x7fc21cc08a0e in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0

#1  0x7fc20e515e60 in iot_worker (data=0x19eeda0) at io-threads.c:157

#2  0x7fc21cc04a51 in start_thread () from /lib64/libpthread.so.0

#3  0x7fc21c56e93d in clone () from /lib64/libc.so.6



Thread 1 (Thread 0x7fc21e31e700 (LWP 70800)):

#0  0x7fc21c56ef33 in epoll_wait () from /lib64/libc.so.6

#1  0x7fc21deea3e7 in event_dispatch_epoll (event_pool=0x120dec0)
at event-epoll.c:428

#2  0x004075e4 in main (argc=4, argv=0x7fff3dc93698) at
glusterfsd.c:1983

On Thu, Jan 28, 2016 at 5:29 PM, baul jianguo  wrote:

http://pastebin.centos.org/38941/
Client statedump; only PIDs 27419, 168030 and 208655 hang. You can search
for these PIDs in the statedump file.

On Wed, Jan 27, 2016 at 4:35 PM, Pranith Kumar Karampuri
 wrote:

Hi,
   If the hang appears on enabling client side io-threads then it could
be because of some race that is seen when io-threads is enabled on the
client side. 2 things will help us debug this issue:
1) thread apply all bt inside gdb (with debuginfo rpms/debs installed )
2) Complete statedump of the mount at two intervals preferably 10 seconds
apart. It becomes difficult to find out which ones are stuck vs the ones
that are on-going when we have just one statedump. If we have two, we can
find which frames are common in both of the statedumps and then take a
closer look there.

Feel free to ping me on #gluster-dev, nick: pranithk if you have the process
hung in that state and you guys don't mind me do a live debugging with you
guys. This option is the best of the lot!

Thanks a lot baul, Oleksandr for the debugging so far!

Pranith


On 01/25/2016 01:03 PM, baul 

[Gluster-devel] REMINDER: Gluster Community Bug Triage meeting (Today)

2016-01-28 Thread Hari Gowtham
Hi all,

The weekly bug triage is about to take place in ~100 minutes.

Meeting details:
- location: #gluster-meeting on Freenode IRC
( https://webchat.freenode.net/?channels=gluster-meeting )
- date: every Tuesday
- time: 12:00 UTC  
(in your terminal, run: date -d "12:00 UTC")
- agenda: https://public.pad.fsfe.org/p/gluster-bug-triage

Currently the following items are listed:
* Roll Call
* Status of last weeks action items
* Group Triage
* Open Floor

Appreciate your participation.

-- 
Regards, 
Hari. 

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] patch #10954

2016-01-28 Thread Sakshi Bansal
Just checking if the changes made to brick_up_status by patch #12913 are 
required in 3.7 as well (since it is not backported)

- Original Message -
From: "Ravishankar N" 
To: "Venky Shankar" , "Sakshi Bansal" 
Cc: "Ravishankar Narayanankutty" , "Gluster Devel" 

Sent: Thursday, January 28, 2016 2:21:18 PM
Subject: Re: [Gluster-devel] patch #10954

On 01/28/2016 12:50 PM, Venky Shankar wrote:
> Yes, that should be good. Better to have just one version of the routine. 
> Also, I
> think Ravi found a bug in brick_up_status() [or the _1 version?].
http://review.gluster.org/12913 fixed it upstream already. It wasn't 
sent to 3.7.
I think the patch http://review.gluster.org/13276 in 3.7 probably 
hand-copied brick_up_status() from an earlier git HEAD.


>   So, that should
> also be incorporated.
>
> You'll probably get a conflict during backport as the routine was hand copied.
>
>> >


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] GlusterFS FUSE client hangs on rsyncing lots of file

2016-01-28 Thread baul jianguo
The client glusterfs gdb info; the main thread ID is 70800.
In the top output, thread 70800 has CPU time 1263:30 and thread 70810
has 1321:10; the other threads' times are much smaller.
(gdb) thread apply all bt



Thread 9 (Thread 0x7fc21acaf700 (LWP 70801)):

#0  0x7fc21cc0c535 in sigwait () from /lib64/libpthread.so.0

#1  0x0040539b in glusterfs_sigwaiter (arg=) at glusterfsd.c:1653

#2  0x7fc21cc04a51 in start_thread () from /lib64/libpthread.so.0

#3  0x7fc21c56e93d in clone () from /lib64/libc.so.6



Thread 8 (Thread 0x7fc21a2ae700 (LWP 70802)):

#0  0x7fc21cc08a0e in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0

#1  0x7fc21ded02bf in syncenv_task (proc=0x121ee60) at syncop.c:493

#2  0x7fc21ded6300 in syncenv_processor (thdata=0x121ee60) at syncop.c:571

#3  0x7fc21cc04a51 in start_thread () from /lib64/libpthread.so.0

#4  0x7fc21c56e93d in clone () from /lib64/libc.so.6



Thread 7 (Thread 0x7fc2198ad700 (LWP 70803)):

#0  0x7fc21cc08a0e in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0

#1  0x7fc21ded02bf in syncenv_task (proc=0x121f220) at syncop.c:493

#2  0x7fc21ded6300 in syncenv_processor (thdata=0x121f220) at syncop.c:571

#3  0x7fc21cc04a51 in start_thread () from /lib64/libpthread.so.0

#4  0x7fc21c56e93d in clone () from /lib64/libc.so.6



Thread 6 (Thread 0x7fc21767d700 (LWP 70805)):

#0  0x7fc21cc0bfbd in nanosleep () from /lib64/libpthread.so.0

#1  0x7fc21deb16bc in gf_timer_proc (ctx=0x11f2010) at timer.c:170

#2  0x7fc21cc04a51 in start_thread () from /lib64/libpthread.so.0

#3  0x7fc21c56e93d in clone () from /lib64/libc.so.6



Thread 5 (Thread 0x7fc20fb1e700 (LWP 70810)):

#0  0x7fc21c566987 in readv () from /lib64/libc.so.6

#1  0x7fc21accbc55 in fuse_thread_proc (data=0x120f450) at
fuse-bridge.c:4752

#2  0x7fc21cc04a51 in start_thread () from /lib64/libpthread.so.0

#3  0x7fc21c56e93d in clone () from /lib64/libc.so.6   (most CPU time)



Thread 4 (Thread 0x7fc20f11d700 (LWP 70811)):   (a bit less CPU time)

#0  0x7fc21cc0b7dd in read () from /lib64/libpthread.so.0

#1  0x7fc21acc0e73 in read (data=) at
/usr/include/bits/unistd.h:45

#2  notify_kernel_loop (data=) at fuse-bridge.c:3786

#3  0x7fc21cc04a51 in start_thread () from /lib64/libpthread.so.0

#4  0x7fc21c56e93d in clone () from /lib64/libc.so.6



Thread 3 (Thread 0x7fc1b16fe700 (LWP 206224)):


#0  0x7fc21cc08a0e in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0

#1  0x7fc20e515e60 in iot_worker (data=0x19eeda0) at io-threads.c:157

#2  0x7fc21cc04a51 in start_thread () from /lib64/libpthread.so.0

#3  0x7fc21c56e93d in clone () from /lib64/libc.so.6



Thread 2 (Thread 0x7fc1b0bfb700 (LWP 214361)):

#0  0x7fc21cc08a0e in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0

#1  0x7fc20e515e60 in iot_worker (data=0x19eeda0) at io-threads.c:157

#2  0x7fc21cc04a51 in start_thread () from /lib64/libpthread.so.0

#3  0x7fc21c56e93d in clone () from /lib64/libc.so.6



Thread 1 (Thread 0x7fc21e31e700 (LWP 70800)):

#0  0x7fc21c56ef33 in epoll_wait () from /lib64/libc.so.6

#1  0x7fc21deea3e7 in event_dispatch_epoll (event_pool=0x120dec0)
at event-epoll.c:428

#2  0x004075e4 in main (argc=4, argv=0x7fff3dc93698) at
glusterfsd.c:1983

On Thu, Jan 28, 2016 at 5:29 PM, baul jianguo  wrote:
> http://pastebin.centos.org/38941/
> Client statedump; only PIDs 27419, 168030 and 208655 hang. You can search
> for these PIDs in the statedump file.
>
> On Wed, Jan 27, 2016 at 4:35 PM, Pranith Kumar Karampuri
>  wrote:
>> Hi,
>>   If the hang appears on enabling client side io-threads then it could
>> be because of some race that is seen when io-threads is enabled on the
>> client side. 2 things will help us debug this issue:
>> 1) thread apply all bt inside gdb (with debuginfo rpms/debs installed )
>> 2) Complete statedump of the mount at two intervals preferably 10 seconds
>> apart. It becomes difficult to find out which ones are stuck vs the ones
>> that are on-going when we have just one statedump. If we have two, we can
>> find which frames are common in both of the statedumps and then take a
>> closer look there.
>>
>> Feel free to ping me on #gluster-dev, nick: pranithk if you have the process
>> hung in that state and you guys don't mind me do a live debugging with you
>> guys. This option is the best of the lot!
>>
>> Thanks a lot baul, Oleksandr for the debugging so far!
>>
>> Pranith
>>
>>
>> On 01/25/2016 01:03 PM, baul jianguo wrote:
>>>
>>> 3.5.7 also hangs.only the flush op hung. Yes,off the
>>> performance.client-io-threads ,no hang.
>>>
>>> The hang does not relate the client kernel version.
>>>
>>> One client statdump about flush op,any abnormal?
>>>
>>> [global.callpool.stack.12]
>>>
>>> uid=0
>>>
>>> gid=0
>>>
>>> pid=14432
>>>
>>> unique=16336007098
>>>
>>> lk-owner=77cb199aa36f3641
>>>
>>> op=FLUSH
>>>
>>> type=

Re: [Gluster-devel] [Gluster-users] GlusterFS FUSE client hangs on rsyncing lots of file

2016-01-28 Thread Pranith Kumar Karampuri



On 01/28/2016 02:59 PM, baul jianguo wrote:

http://pastebin.centos.org/38941/
Client statedump; only PIDs 27419, 168030 and 208655 hang. You can search
for these PIDs in the statedump file.

Could you take one more statedump please?

Pranith


On Wed, Jan 27, 2016 at 4:35 PM, Pranith Kumar Karampuri
 wrote:

Hi,
   If the hang appears on enabling client side io-threads then it could
be because of some race that is seen when io-threads is enabled on the
client side. 2 things will help us debug this issue:
1) thread apply all bt inside gdb (with debuginfo rpms/debs installed )
2) Complete statedump of the mount at two intervals preferably 10 seconds
apart. It becomes difficult to find out which ones are stuck vs the ones
that are on-going when we have just one statedump. If we have two, we can
find which frames are common in both of the statedumps and then take a
closer look there.

Feel free to ping me on #gluster-dev, nick: pranithk if you have the process
hung in that state and you guys don't mind me do a live debugging with you
guys. This option is the best of the lot!

Thanks a lot baul, Oleksandr for the debugging so far!

Pranith


On 01/25/2016 01:03 PM, baul jianguo wrote:

3.5.7 also hangs; only the flush op hung. Yes, with
performance.client-io-threads off, there is no hang.

The hang does not relate to the client kernel version.

One client statedump about the flush op; anything abnormal?

[global.callpool.stack.12]

uid=0

gid=0

pid=14432

unique=16336007098

lk-owner=77cb199aa36f3641

op=FLUSH

type=1

cnt=6



[global.callpool.stack.12.frame.1]

ref_count=1

translator=fuse

complete=0



[global.callpool.stack.12.frame.2]

ref_count=0

translator=datavolume-write-behind

complete=0

parent=datavolume-read-ahead

wind_from=ra_flush

wind_to=FIRST_CHILD (this)->fops->flush

unwind_to=ra_flush_cbk



[global.callpool.stack.12.frame.3]

ref_count=1

translator=datavolume-read-ahead

complete=0

parent=datavolume-open-behind

wind_from=default_flush_resume

wind_to=FIRST_CHILD(this)->fops->flush

unwind_to=default_flush_cbk



[global.callpool.stack.12.frame.4]

ref_count=1

translator=datavolume-open-behind

complete=0

parent=datavolume-io-threads

wind_from=iot_flush_wrapper

wind_to=FIRST_CHILD(this)->fops->flush

unwind_to=iot_flush_cbk



[global.callpool.stack.12.frame.5]

ref_count=1

translator=datavolume-io-threads

complete=0

parent=datavolume

wind_from=io_stats_flush

wind_to=FIRST_CHILD(this)->fops->flush

unwind_to=io_stats_flush_cbk



[global.callpool.stack.12.frame.6]

ref_count=1

translator=datavolume

complete=0

parent=fuse

wind_from=fuse_flush_resume

wind_to=xl->fops->flush

unwind_to=fuse_err_cbk



On Sun, Jan 24, 2016 at 5:35 AM, Oleksandr Natalenko
 wrote:

With "performance.client-io-threads" set to "off" no hangs occurred in 3
rsync/rm rounds. Could that be some fuse-bridge lock race? Will bring
that
option to "on" back again and try to get full statedump.

On Thursday, 21 January 2016 14:54:47 EET Raghavendra G wrote:

On Thu, Jan 21, 2016 at 10:49 AM, Pranith Kumar Karampuri <

pkara...@redhat.com> wrote:

On 01/18/2016 02:28 PM, Oleksandr Natalenko wrote:

XFS. Server side works OK, I'm able to mount volume again. Brick is
30%
full.

Oleksandr,

Will it be possible to get the statedump of the client, bricks

output next time it happens?


https://github.com/gluster/glusterfs/blob/master/doc/debugging/statedump.m
d#how-to-generate-statedump

We also need to dump inode information. To do that you've to add
"all=yes"
to /var/run/gluster/glusterdump.options before you issue commands to get
statedump.


Pranith


On Monday, 18 January 2016 15:07:18 EET baul jianguo wrote:

What is your brick file system? And what is the status of the glusterfsd
process and all its threads?
I met the same issue when a client app such as rsync stays in D state; the
brick process and related threads are also in D state,
and the brick device disk utilization is 100%.

On Sun, Jan 17, 2016 at 6:13 AM, Oleksandr Natalenko

 wrote:

Wrong assumption, rsync hung again.

On Saturday, 16 January 2016 22:53:04 EET Oleksandr Natalenko wrote:

One possible reason:

cluster.lookup-optimize: on
cluster.readdir-optimize: on

I've disabled both optimizations, and at least as of now rsync
still
does
its job with no issues. I would like to find out what option causes
such
a
behavior and why. Will test more.

On Friday, 15 January 2016 16:09:51 EET Oleksandr Natalenko wrote:

Another observation: if rsyncing is resumed after hang, rsync
itself
hangs a lot faster because it does stat of already copied files.
So,
the
reason may be not writing itself, but massive stat on GlusterFS
volume
as well.

15.01.2016 09:40, Oleksandr Natalenko wrote:

While doing rsync over millions of files from ordinary partition
to
GlusterFS volume, just after approx. first 2 million rsync hang
happens, and the following info appears in dmesg:

===
[17075038.924481] INFO: task rsync:10310 blocked for more than
120
seconds.
[17075038.931948] "echo 0 >
/proc/sys/kernel/hun

Re: [Gluster-devel] [Gluster-users] GlusterFS FUSE client hangs on rsyncing lots of file

2016-01-28 Thread baul jianguo
http://pastebin.centos.org/38941/
Client statedump; only PIDs 27419, 168030 and 208655 hang. You can search
for these PIDs in the statedump file.

On Wed, Jan 27, 2016 at 4:35 PM, Pranith Kumar Karampuri
 wrote:
> Hi,
>   If the hang appears on enabling client side io-threads then it could
> be because of some race that is seen when io-threads is enabled on the
> client side. 2 things will help us debug this issue:
> 1) thread apply all bt inside gdb (with debuginfo rpms/debs installed )
> 2) Complete statedump of the mount at two intervals preferably 10 seconds
> apart. It becomes difficult to find out which ones are stuck vs the ones
> that are on-going when we have just one statedump. If we have two, we can
> find which frames are common in both of the statedumps and then take a
> closer look there.
>
> Feel free to ping me on #gluster-dev, nick: pranithk if you have the process
> hung in that state and you guys don't mind me do a live debugging with you
> guys. This option is the best of the lot!
>
> Thanks a lot baul, Oleksandr for the debugging so far!
>
> Pranith
>
>
> On 01/25/2016 01:03 PM, baul jianguo wrote:
>>
>> 3.5.7 also hangs.only the flush op hung. Yes,off the
>> performance.client-io-threads ,no hang.
>>
>> The hang does not relate the client kernel version.
>>
>> One client statdump about flush op,any abnormal?
>>
>> [global.callpool.stack.12]
>>
>> uid=0
>>
>> gid=0
>>
>> pid=14432
>>
>> unique=16336007098
>>
>> lk-owner=77cb199aa36f3641
>>
>> op=FLUSH
>>
>> type=1
>>
>> cnt=6
>>
>>
>>
>> [global.callpool.stack.12.frame.1]
>>
>> ref_count=1
>>
>> translator=fuse
>>
>> complete=0
>>
>>
>>
>> [global.callpool.stack.12.frame.2]
>>
>> ref_count=0
>>
>> translator=datavolume-write-behind
>>
>> complete=0
>>
>> parent=datavolume-read-ahead
>>
>> wind_from=ra_flush
>>
>> wind_to=FIRST_CHILD (this)->fops->flush
>>
>> unwind_to=ra_flush_cbk
>>
>>
>>
>> [global.callpool.stack.12.frame.3]
>>
>> ref_count=1
>>
>> translator=datavolume-read-ahead
>>
>> complete=0
>>
>> parent=datavolume-open-behind
>>
>> wind_from=default_flush_resume
>>
>> wind_to=FIRST_CHILD(this)->fops->flush
>>
>> unwind_to=default_flush_cbk
>>
>>
>>
>> [global.callpool.stack.12.frame.4]
>>
>> ref_count=1
>>
>> translator=datavolume-open-behind
>>
>> complete=0
>>
>> parent=datavolume-io-threads
>>
>> wind_from=iot_flush_wrapper
>>
>> wind_to=FIRST_CHILD(this)->fops->flush
>>
>> unwind_to=iot_flush_cbk
>>
>>
>>
>> [global.callpool.stack.12.frame.5]
>>
>> ref_count=1
>>
>> translator=datavolume-io-threads
>>
>> complete=0
>>
>> parent=datavolume
>>
>> wind_from=io_stats_flush
>>
>> wind_to=FIRST_CHILD(this)->fops->flush
>>
>> unwind_to=io_stats_flush_cbk
>>
>>
>>
>> [global.callpool.stack.12.frame.6]
>>
>> ref_count=1
>>
>> translator=datavolume
>>
>> complete=0
>>
>> parent=fuse
>>
>> wind_from=fuse_flush_resume
>>
>> wind_to=xl->fops->flush
>>
>> unwind_to=fuse_err_cbk
>>
>>
>>
>> On Sun, Jan 24, 2016 at 5:35 AM, Oleksandr Natalenko wrote:
>>>
>>> With "performance.client-io-threads" set to "off" no hangs occurred in 3
>>> rsync/rm rounds. Could that be some fuse-bridge lock race? Will bring
>>> that
>>> option to "on" back again and try to get full statedump.
>>>
>>> On Thursday, 21 January 2016 at 14:54:47 EET Raghavendra G wrote:

 On Thu, Jan 21, 2016 at 10:49 AM, Pranith Kumar Karampuri <pkara...@redhat.com> wrote:
>
> On 01/18/2016 02:28 PM, Oleksandr Natalenko wrote:
>>
>> XFS. Server side works OK, I'm able to mount volume again. Brick is 30% full.
>
> Oleksandr,
>
> Will it be possible to get the statedump output of the client and bricks next time it happens?
>
>
> https://github.com/gluster/glusterfs/blob/master/doc/debugging/statedump.md#how-to-generate-statedump

 We also need to dump inode information. To do that you have to add "all=yes"
 to /var/run/gluster/glusterdump.options before you issue the commands to get the statedump.

> Pranith
>
>> On Monday, 18 January 2016 at 15:07:18 EET baul jianguo wrote:
>>>
>>> What is your brick file system? And what is the status of the glusterfsd
>>> process and all its threads?
>>> I hit the same issue when a client app such as rsync stays in D state; the
>>> brick process and related threads are also in D state, and the brick
>>> device's disk utilization is 100%.
>>>
>>> On Sun, Jan 17, 2016 at 6:13 AM, Oleksandr Natalenko wrote:

 Wrong assumption, rsync hung again.

 On Saturday, 16 January 2016 at 22:53:04 EET Oleksandr Natalenko wrote:
>
> One possible reason:
>
> cluster.lookup-optimize: on
> cluster.readdir-optimize: on
>
> I've disabled both optimizations, and at least as of now rsync still does
> its job with no issues. I would like to find out what option causes such
> behavior and why.

Re: [Gluster-devel] patch #10954

2016-01-28 Thread Ravishankar N

On 01/28/2016 12:50 PM, Venky Shankar wrote:

Yes, that should be good. Better to have just one version of the routine. Also, I
think Ravi found a bug in brick_up_status() [or the _1 version?].

http://review.gluster.org/12913 fixed it upstream already; it wasn't sent to 3.7.
I think the patch http://review.gluster.org/13276 in 3.7 probably hand-copied
brick_up_status() from an earlier git HEAD.




  So, that should also be incorporated.

You'll probably get a conflict during backport as the routine was hand copied.


>



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-infra] Different version of run-tests.sh in jenkin slaves?

2016-01-28 Thread Niels de Vos
On Thu, Jan 28, 2016 at 12:00:41PM +0530, Raghavendra Talur wrote:
> Ok, RCA:
> 
> In NetBSD, cores are being generated as /d/backends/*/*.core, but
> run-tests.sh looks only for "/core*" when looking for cores.
> 
> So, at the end of the test run, when regression.sh looks for cores everywhere, it
> finds one and errors out.
> 
> Should think of a solution which is generic. Will update.
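
A generic check could simply search both locations; a rough sketch (the glob patterns follow the paths mentioned above and may need adjusting):

    # cores dumped at the filesystem root
    find / -maxdepth 1 -name 'core*' 2>/dev/null
    # cores dumped inside the brick backend directories
    find /d/backends -name '*.core' 2>/dev/null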

regression.sh is maintained on GitHub. All slaves should have this
repository checked out as /opt/qa. Please make sure any changes to this
script are pushed into the repo too:
https://github.com/gluster/glusterfs-patch-acceptance-tests/
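
For a slave that is missing the checkout or has drifted from it, re-checking it out would look roughly like this (the destination follows the /opt/qa convention above):

    git clone https://github.com/gluster/glusterfs-patch-acceptance-tests /opt/qa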

Niels

> 
> 
> On Thu, Jan 28, 2016 at 11:37 AM, Raghavendra Talur wrote:
> 
> >
> >
> > On Thu, Jan 28, 2016 at 11:17 AM, Atin Mukherjee wrote:
> >
> >> Are we running a different version of run-tests.sh on the Jenkins slaves? The
> >> reason for suspicion is because in the last couple of runs [1] & [2] on
> >> NetBSD I am seeing no failures apart from bad tests, but the regression
> >> voted failure and I cannot make out any valid reason for it.
> >>
> >> [1]
> >>
> >> https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/13756/consoleFull
> >> [2]
> >>
> >> https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/13755/consoleFull
> >
> >
> >
> > I checked the slave machine now.
> > The regression.sh file is different but the run-tests.sh script is the same.
> >
> > A wild guess here: is it possible that the core generation takes time, and
> > when we check for a core right after a test is run it is not present yet?
> > Does anyone know how to work around that?
> >
> >
> >>
> >> ~Atin
> >> ___
> >> Gluster-infra mailing list
> >> gluster-in...@gluster.org
> >> http://www.gluster.org/mailman/listinfo/gluster-infra
> >>
> >
> >

> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-infra] Different version of run-tests.sh in jenkin slaves?

2016-01-28 Thread Emmanuel Dreyfus
On Thu, Jan 28, 2016 at 12:17:58PM +0530, Raghavendra Talur wrote:
> Where do I find the config in NetBSD that decides which location cores are
> dumped in?

I crafted the patch below, but it is probably much simpler to just
set kern.defcorename to /%n-%p.core on all VM slaves. I will do it.
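
On the slaves that would be roughly the following (NetBSD sysctl syntax; persisting it via /etc/sysctl.conf is an assumption about how the VMs are managed):

    # core files from any process go to /<progname>-<pid>.core
    sysctl -w kern.defcorename='/%n-%p.core'
    # keep the setting across reboots
    echo 'kern.defcorename=/%n-%p.core' >> /etc/sysctl.conf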

diff --git a/xlators/storage/posix/src/posix.c 
b/xlators/storage/posix/src/posix.c
index 272d08f..2fd2d7d 100644
--- a/xlators/storage/posix/src/posix.c
+++ b/xlators/storage/posix/src/posix.c
@@ -29,6 +29,10 @@
 #include 
 #endif /* HAVE_LINKAT */
 
+#ifdef __NetBSD__
+#include <sys/sysctl.h>
+#endif /* __NetBSD__ */
+
 #include "glusterfs.h"
 #include "checksum.h"
 #include "dict.h"
@@ -6631,6 +6635,8 @@ init (xlator_t *this)
 _private->path_max = pathconf(_private->base_path, _PC_PATH_MAX);
 if (_private->path_max != -1 &&
 _XOPEN_PATH_MAX + _private->base_path_length > _private->path_max) {
+char corename[] = "/%n-%p.core";
+
 ret = chdir(_private->base_path);
 if (ret) {
 gf_msg (this->name, GF_LOG_ERROR, 0,
@@ -6639,7 +6645,15 @@ init (xlator_t *this)
 _private->base_path);
 goto out;
 }
+
 #ifdef __NetBSD__
+/* 
+ * Make sure cores go to the root and not in current 
+ * directory
+ */
+(void)sysctlbyname("proc.curproc.corename", NULL, NULL, 
+   corename, strlen(corename) + 1);
+
 /*
  * At least on NetBSD, the chdir() above uncovers a
  * race condition which cause file lookup to fail


-- 
Emmanuel Dreyfus
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-infra] Different version of run-tests.sh in jenkin slaves?

2016-01-28 Thread Emmanuel Dreyfus
On Thu, Jan 28, 2016 at 12:10:49PM +0530, Atin Mukherjee wrote:
> So does that mean we never analyzed any core reported by a NetBSD
> regression failure? That's strange.

We got the cores from / but not from /d/backends/*/, as I understand.

I am glad someone figured out the mystery. 

-- 
Emmanuel Dreyfus
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-infra] Different version of run-tests.sh in jenkin slaves?

2016-01-28 Thread Emmanuel Dreyfus
On Thu, Jan 28, 2016 at 12:17:58PM +0530, Raghavendra Talur wrote:
> Where do I find the config in NetBSD that decides which location cores are
> dumped in?

sysctl kern.defcorename gives the default location and name. It can be
overridden per process using sysctl proc.$$.corename.
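
For instance (the value is illustrative; $$ expands to the current shell's PID):

    # show the system-wide default core name pattern
    sysctl kern.defcorename
    # override the pattern for the process with PID $$ (the current shell)
    sysctl -w proc.$$.corename=/tmp/%n-%p.core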

> Any particular reason you added /d/backends/*/*.core to the list of paths to
> search for cores?

Yes, this is required for standard compliance of the exposed glusterfs
filesystem when the system PATH_MAX is low. See posix.c:

/*  
 * _XOPEN_PATH_MAX is the longest file path len we MUST 
 * support according to POSIX standard. When prepended
 * by the brick base path it may exceed backed filesystem
 * capacity (which MAY be bigger than _XOPEN_PATH_MAX). If
 * this is the case, chdir() to the brick base path and
 * use relative paths when they are too long. See also
 * MAKE_REAL_PATH in posix-handle.h   
  */  
_private->path_max = pathconf(_private->base_path, _PC_PATH_MAX);
if (_private->path_max != -1 &&   
_XOPEN_PATH_MAX + _private->base_path_length > _private->path_max) {
ret = chdir(_private->base_path); 
if (ret) {
gf_msg (this->name, GF_LOG_ERROR, 0,
P_MSG_BASEPATH_CHDIR_FAILED,
"chdir() to \"%s\" failed",
_private->base_path);
goto out;
}
And the core goes into the current directory by default. We could use
sysctl(3) to change that if we need to.


-- 
Emmanuel Dreyfus
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel