On Wed, 2014-06-18 at 13:09 +0530, Lalatendu Mohanty wrote:
> On 06/17/2014 02:25 PM, Susant Palai wrote:
> > Hi Franco:
> > The following patches address the ENOTEMPTY issue.
> > 
> > 1. http://review.gluster.org/#/c/7733/
> > 2. http://review.gluster.org/#/c/7599/
> > 
> > I think the above patches will be available in 3.5.1, which will be a minor
> > upgrade. (Need ack from Niels de Vos.)
> > 
> > Hi Lala,
> > Can you provide the steps to downgrade to 3.4 from 3.5?
> > 
> > Thanks :)
> 
> If you are using an RPM-based distribution, the "yum downgrade" command
> should work, provided yum has access to both the 3.5 and 3.4 repos. That
> said, I have not tested the downgrade scenario from 3.5 to 3.4 myself. I
> would suggest stopping your volume and killing the gluster processes while
> downgrading.
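[Editor's note] Lala's advice (stop the volume, kill the gluster processes, then yum downgrade) plus the op-version failure Franco hits in the log below can be sketched roughly as follows. This is an untested sketch: the package names, service commands, and the `operating-version=2` value assumed for 3.4 are my assumptions, not anything confirmed in this thread.

```shell
# Hypothetical downgrade sketch for an RPM-based system -- untested; verify
# package names, service commands, and the op-version value before running.
downgrade_gluster() {
    gluster volume stop data2              # stop the volume first
    service glusterd stop                  # then the management daemon
    pkill glusterfs; pkill glusterfsd      # kill any leftover brick/client processes
    # 3.5 raises the cluster op-version; a 3.4 glusterd refuses to start with
    # it ("wrong op-version (3) retreived"), so lower it in glusterd's working
    # directory before starting the old version (assumed value: 2 for 3.4).
    sed -i 's/^operating-version=.*/operating-version=2/' /var/lib/glusterd/glusterd.info
    yum downgrade glusterfs glusterfs-server glusterfs-fuse
    service glusterd start
}

# The op-version edit can be rehearsed safely on a sample file first
# (sed without -i prints the modified file to stdout, leaving the original intact):
printf 'UUID=example\noperating-version=3\n' > /tmp/glusterd.info.sample
sed 's/^operating-version=.*/operating-version=2/' /tmp/glusterd.info.sample
```

Lowering the op-version is risky if any 3.5-only volume options were set, so a backup of /var/lib/glusterd beforehand would be prudent.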
I did try installing 3.4 but the volume wouldn't start:

[2014-06-16 02:53:16.886995] I [glusterfsd.c:1910:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.4.3 (/usr/sbin/glusterd --pid-file=/var/run/glusterd.pid)
[2014-06-16 02:53:16.889605] I [glusterd.c:961:init] 0-management: Using /var/lib/glusterd as working directory
[2014-06-16 02:53:16.891580] I [socket.c:3480:socket_init] 0-socket.management: SSL support is NOT enabled
[2014-06-16 02:53:16.891600] I [socket.c:3495:socket_init] 0-socket.management: using system polling thread
[2014-06-16 02:53:16.891675] E [rpc-transport.c:253:rpc_transport_load] 0-rpc-transport: /usr/lib64/glusterfs/3.4.3/rpc-transport/rdma.so: cannot open shared object file: No such file or directory
[2014-06-16 02:53:16.891691] W [rpc-transport.c:257:rpc_transport_load] 0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid or not found on this machine
[2014-06-16 02:53:16.891700] W [rpcsvc.c:1389:rpcsvc_transport_create] 0-rpc-service: cannot create listener, initing the transport failed
[2014-06-16 02:53:16.892457] I [glusterd.c:354:glusterd_check_gsync_present] 0-glusterd: geo-replication module not installed in the system
[2014-06-16 02:53:17.087325] E [glusterd-store.c:1333:glusterd_restore_op_version] 0-management: wrong op-version (3) retreived
[2014-06-16 02:53:17.087352] E [glusterd-store.c:2510:glusterd_restore] 0-management: Failed to restore op_version
[2014-06-16 02:53:17.087365] E [xlator.c:390:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
[2014-06-16 02:53:17.087375] E [graph.c:292:glusterfs_graph_init] 0-management: initializing translator failed
[2014-06-16 02:53:17.087383] E [graph.c:479:glusterfs_graph_activate] 0-graph: init failed
[2014-06-16 02:53:17.087534] W [glusterfsd.c:1002:cleanup_and_exit] (-->/usr/sbin/glusterd(main+0x5d2) [0x406802] (-->/usr/sbin/glusterd(glusterfs_volumes_init+0xb7) [0x4051b7]
(-->/usr/sbin/glusterd(glusterfs_process_volfp+0x103) [0x4050c3]))) 0-: received signum (0), shutting down

> Thanks,
> Lala
> 
> > ----- Original Message -----
> > From: "Franco Broi" <franco.b...@iongeo.com>
> > To: "Susant Palai" <spa...@redhat.com>
> > Cc: "Pranith Kumar Karampuri" <pkara...@redhat.com>, gluster-users@gluster.org, "Raghavendra Gowdappa" <rgowd...@redhat.com>, kdhan...@redhat.com, vsomy...@redhat.com, nbala...@redhat.com
> > Sent: Monday, 16 June, 2014 5:47:55 AM
> > Subject: Re: [Gluster-users] glusterfsd process spinning
> > 
> > Is it possible to downgrade to 3.4 from 3.5? I can't afford to spend any
> > more time testing 3.5, and it doesn't seem to work as well as 3.4.
> > 
> > Cheers,
> > 
> > On Wed, 2014-06-04 at 01:51 -0400, Susant Palai wrote:
> >> From the logs it seems files are present on data(21,22,23,24), which are
> >> on nas6, while missing on data(17,18,19,20), which are on nas5
> >> (interesting). There is an existing issue where directories do not show
> >> up on the mount point if they are not present on the first_up_subvol
> >> (longest-living brick), and the current issue looks similar. Will look
> >> at the client logs for more information.
> >> 
> >> Susant.
> >> 
> >> ----- Original Message -----
> >> From: "Franco Broi" <franco.b...@iongeo.com>
> >> To: "Pranith Kumar Karampuri" <pkara...@redhat.com>
> >> Cc: "Susant Palai" <spa...@redhat.com>, gluster-users@gluster.org, "Raghavendra Gowdappa" <rgowd...@redhat.com>, kdhan...@redhat.com, vsomy...@redhat.com, nbala...@redhat.com
> >> Sent: Wednesday, 4 June, 2014 10:32:37 AM
> >> Subject: Re: [Gluster-users] glusterfsd process spinning
> >> 
> >> On Wed, 2014-06-04 at 10:19 +0530, Pranith Kumar Karampuri wrote:
> >>> On 06/04/2014 08:07 AM, Susant Palai wrote:
> >>>> Pranith, can you send the client and brick logs?
> >>> I have the logs. But I believe, for this issue of the directory not
> >>> listing entries, it would help more if we had the contents of that
> >>> directory on all the bricks, plus their hash values in the xattrs.
> >> Strange thing is, all the invisible files are on the one server (nas6);
> >> the other seems ok. I did rm -Rf of /data2/franco/dir* and was left with
> >> this one directory - there were many hundreds which were removed
> >> successfully.
> >> 
> >> I've attached listings and xattr dumps.
> >> 
> >> Cheers,
> >> 
> >> Volume Name: data2
> >> Type: Distribute
> >> Volume ID: d958423f-bd25-49f1-81f8-f12e4edc6823
> >> Status: Started
> >> Number of Bricks: 8
> >> Transport-type: tcp
> >> Bricks:
> >> Brick1: nas5-10g:/data17/gvol
> >> Brick2: nas5-10g:/data18/gvol
> >> Brick3: nas5-10g:/data19/gvol
> >> Brick4: nas5-10g:/data20/gvol
> >> Brick5: nas6-10g:/data21/gvol
> >> Brick6: nas6-10g:/data22/gvol
> >> Brick7: nas6-10g:/data23/gvol
> >> Brick8: nas6-10g:/data24/gvol
> >> Options Reconfigured:
> >> nfs.drc: on
> >> cluster.min-free-disk: 5%
> >> network.frame-timeout: 10800
> >> nfs.export-volumes: on
> >> nfs.disable: on
> >> cluster.readdir-optimize: on
> >> 
> >> Gluster process                                 Port    Online  Pid
> >> ------------------------------------------------------------------------------
> >> Brick nas5-10g:/data17/gvol                     49152   Y       6553
> >> Brick nas5-10g:/data18/gvol                     49153   Y       6564
> >> Brick nas5-10g:/data19/gvol                     49154   Y       6575
> >> Brick nas5-10g:/data20/gvol                     49155   Y       6586
> >> Brick nas6-10g:/data21/gvol                     49160   Y       20608
> >> Brick nas6-10g:/data22/gvol                     49161   Y       20613
> >> Brick nas6-10g:/data23/gvol                     49162   Y       20614
> >> Brick nas6-10g:/data24/gvol                     49163   Y       20621
> >> 
> >> Task Status of Volume data2
> >> ------------------------------------------------------------------------------
> >> There are no active volume tasks
> >> 
> >>> Pranith
> >>>> Thanks,
> >>>> Susant~
> >>>> 
> >>>> ----- Original Message -----
> >>>> From: "Pranith Kumar Karampuri" <pkara...@redhat.com>
> >>>> To: "Franco Broi" <franco.b...@iongeo.com>
> >>>> Cc: gluster-users@gluster.org, "Raghavendra Gowdappa" <rgowd...@redhat.com>, spa...@redhat.com, kdhan...@redhat.com, vsomy...@redhat.com, nbala...@redhat.com
> >>>> Sent: Wednesday, 4 June, 2014 7:53:41 AM
> >>>> Subject: Re: [Gluster-users] glusterfsd process spinning
> >>>> 
> >>>> hi Franco,
> >>>> CC Devs who work on DHT to comment.
> >>>> 
> >>>> Pranith
> >>>> 
> >>>> On 06/04/2014 07:39 AM, Franco Broi wrote:
> >>>>> On Wed, 2014-06-04 at 07:28 +0530, Pranith Kumar Karampuri wrote:
> >>>>>> Franco,
> >>>>>> Thanks for providing the logs. I just copied over the logs to my
> >>>>>> machine. Most of the logs I see are related to "No such File or
> >>>>>> Directory". I wonder what led to this. Do you have any idea?
> >>>>> No, but I'm just looking at my 3.5 Gluster volume and it has a directory
> >>>>> that looks empty but can't be deleted. When I look at the directories on
> >>>>> the servers there are definitely files in there.
> >>>>> 
> >>>>> [franco@charlie1 franco]$ rmdir /data2/franco/dir1226/dir25
> >>>>> rmdir: failed to remove `/data2/franco/dir1226/dir25': Directory not empty
> >>>>> [franco@charlie1 franco]$ ls -la /data2/franco/dir1226/dir25
> >>>>> total 8
> >>>>> drwxrwxr-x 2 franco support 60 May 21 03:58 .
> >>>>> drwxrwxr-x 3 franco support 24 Jun  4 09:37 ..
> >>>>> 
> >>>>> [root@nas6 ~]# ls -la /data*/gvol/franco/dir1226/dir25
> >>>>> /data21/gvol/franco/dir1226/dir25:
> >>>>> total 2081
> >>>>> drwxrwxr-x 13 1348 200 13 May 21 03:58 .
> >>>>> drwxrwxr-x  3 1348 200  3 May 21 03:58 ..
> >>>>> drwxrwxr-x  2 1348 200  2 May 16 12:05 dir13017
> >>>>> drwxrwxr-x  2 1348 200  2 May 16 12:05 dir13018
> >>>>> drwxrwxr-x  2 1348 200  3 May 16 12:05 dir13020
> >>>>> drwxrwxr-x  2 1348 200  3 May 16 12:05 dir13021
> >>>>> drwxrwxr-x  2 1348 200  3 May 16 12:05 dir13022
> >>>>> drwxrwxr-x  2 1348 200  2 May 16 12:05 dir13024
> >>>>> drwxrwxr-x  2 1348 200  2 May 16 12:05 dir13027
> >>>>> drwxrwxr-x  2 1348 200  3 May 16 12:05 dir13028
> >>>>> drwxrwxr-x  2 1348 200  2 May 16 12:06 dir13029
> >>>>> drwxrwxr-x  2 1348 200  2 May 16 12:06 dir13031
> >>>>> drwxrwxr-x  2 1348 200  3 May 16 12:06 dir13032
> >>>>> 
> >>>>> /data22/gvol/franco/dir1226/dir25:
> >>>>> total 2084
> >>>>> drwxrwxr-x 13 1348 200 13 May 21 03:58 .
> >>>>> drwxrwxr-x  3 1348 200  3 May 21 03:58 ..
> >>>>> drwxrwxr-x  2 1348 200  2 May 16 12:05 dir13017
> >>>>> drwxrwxr-x  2 1348 200  2 May 16 12:05 dir13018
> >>>>> drwxrwxr-x  2 1348 200  2 May 16 12:05 dir13020
> >>>>> drwxrwxr-x  2 1348 200  2 May 16 12:05 dir13021
> >>>>> drwxrwxr-x  2 1348 200  2 May 16 12:05 dir13022
> >>>>> .....
> >>>>> 
> >>>>> Maybe Gluster is losing track of the files??
> >>>>> 
> >>>>>> Pranith
> >>>>>> 
> >>>>>> On 06/02/2014 02:48 PM, Franco Broi wrote:
> >>>>>>> Hi Pranith
> >>>>>>> 
> >>>>>>> Here's a listing of the brick logs, looks very odd, especially the
> >>>>>>> size of the log for data10.
> >>>>>>> 
> >>>>>>> [root@nas3 bricks]# ls -ltrh
> >>>>>>> total 2.6G
> >>>>>>> -rw------- 1 root root 381K May 13 12:15 data12-gvol.log-20140511
> >>>>>>> -rw------- 1 root root 430M May 13 12:15 data11-gvol.log-20140511
> >>>>>>> -rw------- 1 root root 328K May 13 12:15 data9-gvol.log-20140511
> >>>>>>> -rw------- 1 root root 2.0M May 13 12:15 data10-gvol.log-20140511
> >>>>>>> -rw------- 1 root root    0 May 18 03:43 data10-gvol.log-20140525
> >>>>>>> -rw------- 1 root root    0 May 18 03:43 data11-gvol.log-20140525
> >>>>>>> -rw------- 1 root root    0 May 18 03:43 data12-gvol.log-20140525
> >>>>>>> -rw------- 1 root root    0 May 18 03:43 data9-gvol.log-20140525
> >>>>>>> -rw------- 1 root root    0 May 25 03:19 data10-gvol.log-20140601
> >>>>>>> -rw------- 1 root root    0 May 25 03:19 data11-gvol.log-20140601
> >>>>>>> -rw------- 1 root root    0 May 25 03:19 data9-gvol.log-20140601
> >>>>>>> -rw------- 1 root root  98M May 26 03:04 data12-gvol.log-20140518
> >>>>>>> -rw------- 1 root root    0 Jun  1 03:37 data10-gvol.log
> >>>>>>> -rw------- 1 root root    0 Jun  1 03:37 data11-gvol.log
> >>>>>>> -rw------- 1 root root    0 Jun  1 03:37 data12-gvol.log
> >>>>>>> -rw------- 1 root root    0 Jun  1 03:37 data9-gvol.log
> >>>>>>> -rw------- 1 root root 1.8G Jun  2 16:35 data10-gvol.log-20140518
> >>>>>>> -rw------- 1 root root 279M Jun  2 16:35 data9-gvol.log-20140518
> >>>>>>> -rw------- 1 root root 328K Jun  2 16:35 data12-gvol.log-20140601
> >>>>>>> -rw------- 1 root root 8.3M Jun  2 16:35 data11-gvol.log-20140518
> >>>>>>> 
> >>>>>>> Too big to post everything.
> >>>>>>> 
> >>>>>>> Cheers,
> >>>>>>> 
> >>>>>>> On Sun, 2014-06-01 at 22:00 -0400, Pranith Kumar Karampuri wrote:
> >>>>>>>> ----- Original Message -----
> >>>>>>>>> From: "Pranith Kumar Karampuri" <pkara...@redhat.com>
> >>>>>>>>> To: "Franco Broi" <franco.b...@iongeo.com>
> >>>>>>>>> Cc: gluster-users@gluster.org
> >>>>>>>>> Sent: Monday, June 2, 2014 7:01:34 AM
> >>>>>>>>> Subject: Re: [Gluster-users] glusterfsd process spinning
> >>>>>>>>> 
> >>>>>>>>> ----- Original Message -----
> >>>>>>>>>> From: "Franco Broi" <franco.b...@iongeo.com>
> >>>>>>>>>> To: "Pranith Kumar Karampuri" <pkara...@redhat.com>
> >>>>>>>>>> Cc: gluster-users@gluster.org
> >>>>>>>>>> Sent: Sunday, June 1, 2014 10:53:51 AM
> >>>>>>>>>> Subject: Re: [Gluster-users] glusterfsd process spinning
> >>>>>>>>>> 
> >>>>>>>>>> The volume is almost completely idle now and the CPU for the brick
> >>>>>>>>>> process has returned to normal. I've included the profile, and I
> >>>>>>>>>> think it shows the latency for the bad brick (data12) is unusually
> >>>>>>>>>> high, probably indicating the filesystem is at fault after all??
> >>>>>>>>> I am not sure we can believe the outputs now that you say the brick
> >>>>>>>>> returned to normal. Next time it is acting up, do the same procedure
> >>>>>>>>> and post the result.
> >>>>>>>> On second thought, maybe it's not a bad idea to inspect the log files
> >>>>>>>> of the bricks on nas3. Could you post them?
> >>>>>>>> 
> >>>>>>>> Pranith
> >>>>>>>> 
> >>>>>>>>> Pranith
> >>>>>>>>>> On Sun, 2014-06-01 at 01:01 -0400, Pranith Kumar Karampuri wrote:
> >>>>>>>>>>> Franco,
> >>>>>>>>>>> Could you do the following to get more information:
> >>>>>>>>>>> 
> >>>>>>>>>>> gluster volume profile <volname> start
> >>>>>>>>>>> 
> >>>>>>>>>>> Wait for some time; this will start gathering what operations are
> >>>>>>>>>>> coming to all the bricks. Now execute:
> >>>>>>>>>>> 
> >>>>>>>>>>> gluster volume profile <volname> info > /file/you/should/reply/to/this/mail/with
> >>>>>>>>>>> 
> >>>>>>>>>>> Then execute:
> >>>>>>>>>>> 
> >>>>>>>>>>> gluster volume profile <volname> stop
> >>>>>>>>>>> 
> >>>>>>>>>>> Let's see if this throws any light on the problem at hand.
> >>>>>>>>>>> 
> >>>>>>>>>>> Pranith
> >>>>>>>>>>> ----- Original Message -----
> >>>>>>>>>>>> From: "Franco Broi" <franco.b...@iongeo.com>
> >>>>>>>>>>>> To: gluster-users@gluster.org
> >>>>>>>>>>>> Sent: Sunday, June 1, 2014 9:02:48 AM
> >>>>>>>>>>>> Subject: [Gluster-users] glusterfsd process spinning
> >>>>>>>>>>>> 
> >>>>>>>>>>>> Hi
> >>>>>>>>>>>> 
> >>>>>>>>>>>> I've been suffering from continual problems with my gluster
> >>>>>>>>>>>> filesystem slowing down due to what I thought was congestion on a
> >>>>>>>>>>>> single brick, caused by the underlying filesystem running slow,
> >>>>>>>>>>>> but I've just noticed that the glusterfsd process for that
> >>>>>>>>>>>> particular brick is running at 100%+, even when the filesystem is
> >>>>>>>>>>>> almost idle.
> >>>>>>>>>>>> 
> >>>>>>>>>>>> I've done a couple of straces, one of the brick and another on
> >>>>>>>>>>>> the same server; does the high number of futex errors give any
> >>>>>>>>>>>> clues as to what might be wrong?
> >>>>>>>>>>>> 
> >>>>>>>>>>>> % time     seconds  usecs/call     calls    errors syscall
> >>>>>>>>>>>> ------ ----------- ----------- --------- --------- ----------------
> >>>>>>>>>>>>  45.58    0.027554           0    191665     20772 futex
> >>>>>>>>>>>>  28.26    0.017084           0    137133           readv
> >>>>>>>>>>>>  26.04    0.015743           0     66259           epoll_wait
> >>>>>>>>>>>>   0.13    0.000077           3        23           writev
> >>>>>>>>>>>>   0.00    0.000000           0         1           epoll_ctl
> >>>>>>>>>>>> ------ ----------- ----------- --------- --------- ----------------
> >>>>>>>>>>>> 100.00    0.060458                396081  (sic)    total
> >>>>>>>>>>>> 
> >>>>>>>>>>>> % time     seconds  usecs/call     calls    errors syscall
> >>>>>>>>>>>> ------ ----------- ----------- --------- --------- ----------------
> >>>>>>>>>>>>  99.25    0.334020         133      2516           epoll_wait
> >>>>>>>>>>>>   0.40    0.001347           0      4090        26 futex
> >>>>>>>>>>>>   0.35    0.001192           0      5064           readv
> >>>>>>>>>>>>   0.00    0.000000           0        20           writev
> >>>>>>>>>>>> ------ ----------- ----------- --------- --------- ----------------
> >>>>>>>>>>>> 100.00    0.336559                 11690        26 total
> >>>>>>>>>>>> 
> >>>>>>>>>>>> Cheers,
> >>>>>>>>>>>> 
> >>>>>>>>>>>> _______________________________________________
> >>>>>>>>>>>> Gluster-users mailing list
> >>>>>>>>>>>> Gluster-users@gluster.org
> >>>>>>>>>>>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
> >>>>>>>>>>>> 

_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users