Re: [Gluster-devel] Troubleshooting and Diagnostic tools for Gluster
I have a script written to analyze the log messages of a gluster process. It scans the log file and identifies the log messages with ERROR and WARNING levels. It lists the functions (with either ERROR or WARNING logs) and their percentage of occurrence. It also lists the MSGIDs for ERROR and WARNING logs and their percentage of occurrence. A sample output of the script:

[root@hal9000 ~]# ./log_analyzer.sh /var/log/glusterfs/mnt-glusterfs.log
Number  Percentage  Function
7       0.49        __socket_rwv
4       0.28        mgmt_getspec_cbk
4       0.28        gf_timer_call_after
3       0.21        rpc_clnt_reconfig
2       0.14        fuse_thread_proc
2       0.14        fini
2       0.14        cleanup_and_exit
1       0.07        _ios_dump_thread
1       0.07        fuse_init
1       0.07        fuse_graph_setup
= Error Functions
7       0.49        __socket_rwv
2       0.14        cleanup_and_exit
Number  Percentage  MSGID
958     67.99       109066
424     30.09       109036
3       0.21        114057
3       0.21        114047
3       0.21        114046
3       0.21        114035
3       0.21        114020
3       0.21        114018
3       0.21        108031
2       0.14        101190
1       0.07        7962
1       0.07        108006
1       0.07        108005
1       0.07        108001
1       0.07        100030
= Error MSGIDs
1       0.07        108006
1       0.07        108001

It can be found here: https://github.com/raghavendrabhat/threaded-io/blob/master/log_analyzer.sh. Do you think it can be added to the repo?

Regards, Raghavendra

On Wed, Jan 27, 2016 at 3:44 AM, Aravinda wrote:
> Hi,
>
> I am happy to share the `glustertool` project, which is an infrastructure for adding more tools for Gluster.
>
> https://github.com/aravindavk/glustertool
>
> The following tools are available with the initial release (`glustertool [ARGS..]`):
>
> 1. gfid - To get the GFID of a given path (Mount or Backend)
> 2. changelogparser - To parse the Gluster Changelog
> 3. xtime - To get Xtime from the brick backend
> 4. stime - To get Stime from the brick backend
> 5. volmark - To get Volmark details from a Gluster mount
>
> rpm/deb packages are not yet available; install this using `sudo python setup.py install`.
>
> Once installed, run `glustertool list` to see the list of tools available. `glustertool doc TOOLNAME` shows documentation about the tool and `glustertool --help` shows the usage of the tool.
>
> More tools can be added to this collection easily using the `newtool` utility available in this repo.
>
> # ./newtool
>
> Read more about adding tools here: https://github.com/aravindavk/glustertool/blob/master/CONTRIBUTING.md
>
> You can create an issue in github requesting more tools for Gluster: https://github.com/aravindavk/glustertool/issues
>
> Comments & Suggestions Welcome
>
> regards
> Aravinda
>
> On 10/23/2015 11:42 PM, Vijay Bellur wrote:
>> On Friday 23 October 2015 04:16 PM, Aravinda wrote:
>>> Hi Gluster developers,
>>>
>>> In this mail I am proposing troubleshooting documentation and a Gluster Tools infrastructure.
>>>
>>> Tool to search in documentation
>>> ===
>>> We recently added message IDs to each error message in Gluster. Some of the error messages are self-explanatory, but some error messages require manual intervention to fix the issue. How about identifying the error messages which require more explanation and creating documentation for them? Even though the information about some errors is available in the documentation, it is very difficult to search and relate it to the error message. It would be very useful if we create a tool which looks up the documentation and tells us exactly what to do.
>>>
>>> For example (illustrative purpose only):
>>> glusterdoc --explain GEOREP0003
>>>
>>> SSH configuration issue. This error is seen when Pem keys from all master nodes are not distributed properly to Slave nodes.
>>> Use the Geo-replication create command with the force option to redistribute the keys. If the issue still persists, look for any errors while running hook scripts in the Glusterd log file.
>>>
>>> Note: Inspired by the rustc --explain command
>>> https://twitter.com/jaredforsyth/status/626960244707606528
>>>
>>> If we don't know the message id, we can still search from the available documentation like,
>>>
>>> glusterdoc --search
>>>
>>> These commands can be programmatically consumed, for example `--json` will return the output in JSON format. This enables UI developers to automatically show help messages when they display errors.
>>>
>>> Gluster Tools infrastructure
>>>
>>> Are our Gluster log files sufficient for root-causing the issues? Is that error caused due to misconfiguration? Geo-replication status is showing faulty. Where to find the reason for Faulty?
>>>
>>> Sac (surs AT redhat.com) mentioned that he is working on gdeploy and many developers are using their own tools. How about providing a common infrastructure (say gtool/glustertool) to host all these tools?
>>>
>> Would this be a repository with individual tools being git submodules or something similar? Is there also a plan to bundle the set of tools into a binary
Re: [Gluster-devel] Throttling xlator on the bricks
There is already a patch submitted for moving TBF part to libglusterfs. It is under review. http://review.gluster.org/#/c/12413/ Regards, Raghavendra On Mon, Jan 25, 2016 at 2:26 AM, Venky Shankarwrote: > On Mon, Jan 25, 2016 at 11:06:26AM +0530, Ravishankar N wrote: > > Hi, > > > > We are planning to introduce a throttling xlator on the server (brick) > > process to regulate FOPS. The main motivation is to solve complaints > about > > AFR selfheal taking too much of CPU resources. (due to too many fops for > > entry > > self-heal, rchecksums for data self-heal etc.) > > > > The throttling is achieved using the Token Bucket Filter algorithm (TBF). > > TBF > > is already used by bitrot's bitd signer (which is a client process) in > > gluster to regulate the CPU intensive check-sum calculation. By putting > the > > logic on the brick side, multiple clients- selfheal, bitrot, rebalance or > > even the mounts themselves can avail the benefits of throttling. > > [Providing current TBF implementation link for completeness] > > > https://github.com/gluster/glusterfs/blob/master/xlators/features/bit-rot/src/bitd/bit-rot-tbf.c > > Also, it would be beneficial to have the core TBF implementation as part of > libglusterfs so as to be consumable by the server side xlator component to > throttle dispatched FOPs and for daemons to throttle anything that's > outside > "brick" boundary (such as cpu, etc..). > > > > > The TBF algorithm in a nutshell is as follows: There is a bucket which is > > filled > > at a steady (configurable) rate with tokens. Each FOP will need a fixed > > amount > > of tokens to be processed. If the bucket has that many tokens, the FOP is > > allowed and that many tokens are removed from the bucket. If not, the > FOP is > > queued until the bucket is filled. > > > > The xlator will need to reside above io-threads and can have different > > buckets, > > one per client. There has to be a communication mechanism between the > client > > and > > the brick (IPC?) to tell what FOPS need to be regulated from it, and the > no. > > of > > tokens needed etc. These need to be re configurable via appropriate > > mechanisms. > > Each bucket will have a token filler thread which will fill the tokens in > > it. > > The main thread will enqueue heals in a list in the bucket if there > aren't > > enough tokens. Once the token filler detects some FOPS can be serviced, > it > > will > > send a cond-broadcast to a dequeue thread which will process (stack wind) > > all > > the FOPS that have the required no. of tokens from all buckets. > > > > This is just a high level abstraction: requesting feedback on any aspect > of > > this feature. what kind of mechanism is best between the client/bricks > for > > tuning various parameters? What other requirements do you foresee? > > > > Thanks, > > Ravi > > > ___ > > Gluster-devel mailing list > > Gluster-devel@gluster.org > > http://www.gluster.org/mailman/listinfo/gluster-devel > > ___ > Gluster-devel mailing list > Gluster-devel@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-devel > ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
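To make the TBF description in the thread above concrete, here is a minimal token-bucket sketch. It is illustrative only, with hypothetical names -- it is not the actual bit-rot-tbf.c implementation referenced in the mail.

#include <stdint.h>
#include <time.h>

typedef struct {
        uint64_t tokens;     /* tokens currently available             */
        uint64_t capacity;   /* bucket size                            */
        uint64_t rate;       /* tokens added per second (configurable) */
        time_t   last_fill;  /* last refill time                       */
} tbf_bucket_t;

/* Refill the bucket at a steady rate, capped at its capacity. */
static void
tbf_refill (tbf_bucket_t *b)
{
        time_t   now = time (NULL);
        uint64_t add = (uint64_t)(now - b->last_fill) * b->rate;

        b->tokens = (b->tokens + add > b->capacity) ? b->capacity
                                                    : b->tokens + add;
        b->last_fill = now;
}

/* A FOP costing 'cost' tokens is wound only if enough tokens exist;
 * otherwise the caller queues it until the filler thread broadcasts. */
static int
tbf_try_consume (tbf_bucket_t *b, uint64_t cost)
{
        tbf_refill (b);
        if (b->tokens < cost)
                return 0;    /* not enough tokens: enqueue the FOP */
        b->tokens -= cost;
        return 1;            /* allow: stack-wind the FOP */
}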
Re: [Gluster-devel] distributed files/directories and [cm]time updates
Hi Xavier,

There is a patch sent for review which implements a metadata cache in the posix layer. What the changes do is this: whenever there is a fresh lookup on an object (file/directory/symlink), the posix xlator saves the stat attributes of that object in its cache. As of now, whenever there is a fop on an object, posix tries to build the HANDLE of the object by looking into the gfid-based backend (i.e. the .glusterfs directory) and doing a stat to check if the gfid exists. The patch makes changes to posix to check its own cache first and return if it can find the attributes there; if not, it looks into the actual gfid backend. But as of now, there is no cache invalidation. Whenever there is a setattr() fop to change the attributes of an object, the new stat info is saved in the cache once the fop is successful on disk.

The patch can be found here: http://review.gluster.org/#/c/12157/

Regards, Raghavendra

On Tue, Jan 26, 2016 at 2:51 AM, Xavier Hernandez wrote:
> Hi Pranith,
>
> On 26/01/16 03:47, Pranith Kumar Karampuri wrote:
>> hi,
>> Traditionally gluster has been using ctime/mtime of the files/dirs on the bricks as stat output. The problem we are seeing with this approach is that software which depends on it gets confused when there are differences in these times. Tar especially gives "file changed as we read it" whenever it detects ctime differences when stat is served from different bricks. The way we have been trying to solve it is to serve the stat structures from the same brick in afr, max-time in dht. But it doesn't avoid the problem completely, because there is no way to change ctime at the moment (lutimes() only allows mtime, atime); there is little we can do to make sure ctimes match after self-heals/xattr updates/rebalance. I am wondering if any one of you solved these problems before, and if yes, how did you go about doing it? It seems like applications which depend on this for backups get confused the same way. The only way out I see is to bring ctime to an xattr, but that will need more iops and gluster has to keep updating it on quite a few fops.
>
> I did think about this when I was writing ec at the beginning. The idea was that the point in time at which each fop is executed was controlled by the client by adding a special xattr to each regular fop. Of course this would require support inside the storage/posix xlator. At that time, adding the needed support to other xlators seemed too complex for me, so I decided to do something similar to afr.
>
> Anyway, the idea was like this: for example, when a write fop needs to be sent, dht/afr/ec sets the current time in a special xattr, for example 'glusterfs.time'. It can be done in a way that if the time is already set by a higher xlator, it's not modified. This way DHT could set the time in fops involving multiple afr subvolumes. For other fops, it would be afr who sets the time. It could also be set directly by the topmost xlator (fuse), but that time could be incorrect because lower xlators could delay the fop execution and reorder it. This would need more thinking.
>
> That xattr will be received by storage/posix. This xlator will determine what times need to be modified and will change them. In the case of a write, it can decide to modify mtime and, maybe, atime. For a mkdir or create, it will set the times of the new file/directory and also the mtime of the parent directory. It depends on the specific fop being processed.
> > mtime, atime and ctime (or even others) could be saved in a special posix > xattr instead of relying on the file system attributes that cannot be > modified (at least for ctime). > > This solution doesn't require extra fops, So it seems quite clean to me. > The additional I/O needed in posix could be minimized by implementing a > metadata cache in storage/posix that would read all metadata on lookup and > update it on disk only at regular intervals and/or on invalidation. All > fops would read/write into the cache. This would even reduce the number of > I/O we are currently doing for each fop. > > Xavi > > ___ > Gluster-devel mailing list > Gluster-devel@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-devel > ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
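To make the stat-cache idea discussed above concrete, here is a rough sketch of the behaviour (hypothetical names, not the code under review at http://review.gluster.org/#/c/12157/): lookup fills the cache, other fops consult it first, and a successful setattr refreshes it.

#include <string.h>
#include <sys/stat.h>

/* Hypothetical per-inode stat cache; in the real xlator this would
 * hang off the inode context in storage/posix. */
typedef struct {
        struct stat buf;    /* cached attributes of the object */
        int         valid;  /* 1 once populated                */
} md_cache_t;

/* Called on a fresh lookup or after a successful setattr. */
static void
md_cache_update (md_cache_t *c, const struct stat *st)
{
        memcpy (&c->buf, st, sizeof (*st));
        c->valid = 1;
}

/* Called by other fops: serve from cache when possible; on a miss the
 * caller falls back to stat() on the gfid backend (.glusterfs). */
static int
md_cache_get (md_cache_t *c, struct stat *out)
{
        if (!c->valid)
                return -1;   /* miss */
        memcpy (out, &c->buf, sizeof (*out));
        return 0;            /* hit */
}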
[Gluster-devel] glusterfs-3.6.7 released
Hi,

glusterfs-3.6.7 has been released and the packages for RHEL/Fedora/CentOS can be found here: http://download.gluster.org/pub/gluster/glusterfs/3.6/LATEST/

Requesting people running 3.6.x to please try it out and let us know if there are any issues. This release fixes the bugs listed below since 3.6.6 was made available. Thanks to all who submitted patches and reviewed the changes.

1283690 - core dump in protocol/client:client_submit_request
1283144 - glusterfs does not register with rpcbind on restart
1277823 - [upgrade] After upgrade from 3.5 to 3.6, probing a new 3.6 node is moving the peer to rejected state
1277822 - glusterd: probing a new node(>=3.6) from 3.5 cluster is moving the peer to rejected state

Regards, Raghavendra Bhat

___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] netbsd failures in 3.6 release
Hi,

We have been observing NetBSD failures on the 3.6 branch for a few months, and I have been merging patches while ignoring the NetBSD failures. The last few 3.6 releases were made without considering these failures. IIRC there was a discussion about this back when the NetBSD tests started failing, and it was decided that we would ignore NetBSD errors on 3.6. I am not sure if it was discussed over IRC or as part of some patch (over Gerrit). Emmanuel, do you recollect any discussions about it? I think it would be better to discuss it here and see what can be done. Please provide feedback.

Regards, Raghavendra Bhat

___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] REMINDER: Weekly gluster community meeting to start in 30 minutes
Hi All,

In 30 minutes from now we will have the regular weekly Gluster Community meeting.

Meeting details:
- location: #gluster-meeting on Freenode IRC
- date: every Wednesday
- time: 12:00 UTC, 14:00 CEST, 17:30 IST (in your terminal, run: date -d "12:00 UTC")
- agenda: https://public.pad.fsfe.org/p/gluster-community-meetings

Currently the following items are listed:
* Roll Call
* Status of last week's action items
* Gluster 3.7
* Gluster 3.8
* Gluster 3.6
* Gluster 3.5
* Gluster 4.0
* Open Floor - bring your own topic!

The last topic has space for additions. If you have a suitable topic to discuss, please add it to the agenda.

Regards, Raghavendra Bhat

___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client
Hi Oleksandr,

You are right. The description should have stated that it is the limit on the number of inodes in the LRU list of the inode cache. I have sent a patch for that: http://review.gluster.org/#/c/12242/

Regards, Raghavendra Bhat

On Thu, Sep 24, 2015 at 1:44 PM, Oleksandr Natalenko < oleksa...@natalenko.name> wrote:
> I've checked the statedump of the volume in question and haven't found lots of iobuf as mentioned in that bugreport.
>
> However, I've noticed that there are lots of LRU records like this:
>
> ===
> [conn.1.bound_xl./bricks/r6sdLV07_vd0_mail/mail.lru.1]
> gfid=c4b29310-a19d-451b-8dd1-b3ac2d86b595
> nlookup=1
> fd-count=0
> ref=0
> ia_type=1
> ===
>
> In fact, there are 16383 of them. I've checked "gluster volume set help" in order to find something LRU-related and have found this:
>
> ===
> Option: network.inode-lru-limit
> Default Value: 16384
> Description: Specifies the maximum megabytes of memory to be used in the inode cache.
> ===
>
> Is there an error in the description stating "maximum megabytes of memory"? Shouldn't it mean "maximum amount of LRU records"? If not, is it true that the inode cache could grow up to 16 GiB for a client, and one must lower the network.inode-lru-limit value?
>
> Another thought: we've enabled write-behind, and the default write-behind-window-size value is 1 MiB. So, one may conclude that with lots of small files written, the write-behind buffer could grow up to inode-lru-limit×write-behind-window-size=16 GiB? Who could explain that to me?
>
> 24.09.2015 10:42, Gabi C wrote:
>> oh, my bad...
>> could be this one?
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1126831 [2]
>> Anyway, on ovirt+gluster w I experienced similar behavior...

___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
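For clarity on what the corrected description means, here is a small sketch of a count-bounded LRU list (hypothetical names; the real logic lives in the libglusterfs inode table): the option caps how many unreferenced inodes are kept, it is not a size in megabytes.

#include <stdlib.h>

/* Hypothetical count-bounded LRU of unreferenced inodes. */
struct lru_node {
        struct lru_node *next;
        /* ... cached inode state ... */
};

struct lru_list {
        struct lru_node *oldest;  /* eviction happens from here           */
        size_t           count;   /* current number of entries            */
        size_t           limit;   /* network.inode-lru-limit, e.g. 16384  */
};

/* After adding an entry, prune the oldest ones down to the limit. */
static void
lru_prune (struct lru_list *l)
{
        while (l->count > l->limit && l->oldest) {
                struct lru_node *victim = l->oldest;

                l->oldest = victim->next;
                free (victim);   /* the real code would destroy the inode */
                l->count--;
        }
}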
[Gluster-devel] glusterfs 3.6.6 released
Hi,

glusterfs-3.6.6 has been released and the packages for RHEL/Fedora/CentOS can be found here: http://download.gluster.org/pub/gluster/glusterfs/3.6/LATEST/

Requesting people running 3.6.x to please try it out and let us know if there are any issues. This release fixes the bugs listed below since 3.6.5 was made available. Thanks to all who submitted patches and reviewed the changes.

1259578 - [3.6.x] quota usage gets miscalculated when loc->gfid is NULL
1247972 - quota/marker: lk_owner is null while acquiring inodelk in rename operation
1252072 - POSIX ACLs as used by a FUSE mount can not use more than 32 groups
1256245 - AFR: gluster v restart force or brick process restart doesn't heal the files
1258069 - gNFSd: NFS mount fails with "Remote I/O error"
1173437 - [RFE] changes needed in snapshot info command's xml output.

Regards, Raghavendra Bhat

___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] [posix-compliance] unlink and access to file through open fd
On 09/04/2015 12:43 PM, Raghavendra Gowdappa wrote:

All,

Posix allows access to a file through open fds even if the name associated with the file is deleted. While this works for glusterfs in most of the cases, there are some corner cases where we fail.

1. Reboot of brick:
===
With the reboot of a brick, the fd is lost. unlink would've deleted both the gfid and path links to the file and we would lose the file. As a solution, perhaps we should create a hardlink to the file (say in .glusterfs) which gets deleted only when the last fd is closed?

2. Graph switch:
=
The issue is captured in bz 1259995 [1]. Pasting the content from the bz verbatim:

Consider the following sequence of operations:
1. fd = open ("/mnt/glusterfs/file");
2. unlink ("/mnt/glusterfs/file");
3. Do a graph-switch, let's say by adding a new brick to the volume.
4. Migration of the fd to the new graph fails.

This is because as part of migration we do a lookup and open. But the lookup fails as the file is already deleted, and hence migration fails and the fd is marked bad. In fact this test case is already present in our regression tests, though the test only checks whether the fd is marked as bad. But the expectation of filing this bug is that migration should succeed. This is possible since there is an fd opened on the brick through the old graph, which can be duped using the dup syscall. Of course the solution outlined here doesn't cover the case where the file is not present on the brick at all. For e.g., a new brick was added to the replica set and that new brick doesn't contain the file. Now, since the file is deleted, how does replica heal that file to another brick, etc. But at least this can be solved for those cases where the file was present on a brick and an fd was already opened.

Du, for this 2nd example (where the file is opened, unlinked and a graph switch happens), there was a patch submitted long back: http://review.gluster.org/#/c/5428/

Regards, Raghavendra Bhat

3. Open-behind and unlink from a different client:
==
While open-behind handles unlink from the same client (through which open was performed), if unlink and open are done from two different clients, the file is lost. I cannot think of any good solution for this.

I wanted to know whether these problems are real enough to channel our efforts to fix these issues. Comments are welcome in terms of solutions or other possible scenarios which can lead to this issue.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1259995

regards, Raghavendra.

___ Gluster-users mailing list gluster-us...@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
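The POSIX behaviour being relied on above can be shown with a plain local-filesystem example (not gluster-specific): data stays reachable through an open fd even after the last name is unlinked.

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main (void)
{
        char buf[6] = {0};
        int  fd = open ("/tmp/unlink-demo", O_CREAT | O_RDWR | O_TRUNC, 0600);

        if (fd < 0)
                return 1;
        write (fd, "hello", 5);
        unlink ("/tmp/unlink-demo");  /* name gone, inode pinned by the fd */
        lseek (fd, 0, SEEK_SET);
        read (fd, buf, 5);            /* still reads "hello" */
        printf ("read after unlink: %s\n", buf);
        close (fd);                   /* inode is freed only now */
        return 0;
}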
[Gluster-devel] glusterfs-3.6.5 released
Hi,

glusterfs-3.6.5 has been released and the packages for RHEL/Fedora/CentOS can be found here: http://download.gluster.org/pub/gluster/glusterfs/3.6/LATEST/

The Ubuntu packages can be found here: https://launchpad.net/~gluster/+archive/ubuntu/glusterfs-3.6.

Requesting people running 3.6.x to please try it out and let us know if there are any issues. This release fixes the bugs listed below since 3.6.4 was made available. Thanks to all who submitted patches and reviewed the changes.

1247959 - Statfs is hung because of frame loss in quota
1247970 - huge mem leak in posix xattrop
1234096 - rmtab file is a bottleneck when lot of clients are accessing a volume through NFS
1254421 - glusterd fails to get the inode size for a brick
1247964 - Disperse volume: Huge memory leak of glusterfsd process
1218732 - gluster snapshot status --xml gives back unexpected non xml output
1250836 - [upgrade] After upgrade from 3.5 to 3.6 onwards version, bumping up op-version failed
1244117 - unix domain sockets on Gluster/NFS are created as fifo/pipe
1243700 - GlusterD crashes when management encryption is enabled
1235601 - tar on a glusterfs mount displays file changed as we read it even though the file was not changed

Regards, Raghavendra Bhat

___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [release-3.6] compile error: 'GF_REPLACE_OP_START' undeclared
On 08/18/2015 12:39 PM, Avra Sengupta wrote: + Adding Raghavendra Bhat. When is the next GA planned on this branch? And can we take patches in this branch while this is being investigated. Regards, Avra I am planning to make the release by the end of this week. I can accept the patches if it is fixing some critical bug. But it would be better if the issue being investigated is fixed. Regards, Raghavendra Bhat On 08/18/2015 12:07 PM, Avra Sengupta wrote: Still hitting this on freebsd and netbsd smoke runs on release 3.6 branch. Are we merging patches on release 3.6 branch for now even with these failures. I have two such patches that need to be merged. Regards, Avra On 07/06/2015 02:32 PM, Niels de Vos wrote: On Mon, Jul 06, 2015 at 02:19:07PM +0530, Raghavendra Bhat wrote: On 07/06/2015 01:39 PM, Niels de Vos wrote: On Mon, Jul 06, 2015 at 12:09:28PM +0530, Raghavendra Bhat wrote: On 07/06/2015 09:52 AM, Kaushal M wrote: I checked on NetBSD-7.0_BETA and FreeBSD-10.1. I couldn't reproduce this. I'll try on NetBSD-6 next. ~kaushal I think it has to be included before 3.6.4 is made G.A. I can wait till the fix for this issue is merged before making 3.6.4. Does it sound ok? Or should I go ahead with 3.6.4 and make a quick 3.6.5 with this fix? I only care about getting http://review.gluster.org/11335 merged :-) This is a patch I promised to take into release-3.5. It would be nicer to have this change included in the release-3.6 branch before I merge the 3.5 backport. At the moment, 3.5.5 is waiting on this patch. But I do not think you really need to delay 3.6.4 off for that one. It should be fine if it lands in 3.6.5. (The compile error looks more like a 3.6.4 blocker.) Niels Niels, The patch you mentioned has received the acks and also has passed the linux regression tests. But it seem to have failed netbsd regression tests. Yes, at least the smoke tests on NetBSD and FreeBSD fail with the compile error mentioned in the subject of this email :) Thanks, Niels Regards, Raghavendra Bhat Regards, Raghavendra Bhat On Mon, Jul 6, 2015 at 8:38 AM, Kaushal M kshlms...@gmail.com wrote: Krutika hit this last week, and let us (GlusterD maintiners) know of it. I volunteered to look into this, but couldn't find time. I'll do it now. ~kaushal On Sun, Jul 5, 2015 at 10:43 PM, Atin Mukherjee atin.mukherje...@gmail.com wrote: I remember Krutika reporting it few days back. So it seems like its not fixed yet. If there is no taker I will send a patch tomorrow. -Atin Sent from one plus one On Jul 5, 2015 9:58 PM, Niels de Vos nde...@redhat.com wrote: Hi, it seems that the current release-3.6 branch does not compile on FreedBSD and NetBSD (not sure why it compiles on CentOS-6). 
These errors are thrown: --- glusterd_la-glusterd-op-sm.lo --- CC glusterd_la-glusterd-op-sm.lo /home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c: In function 'glusterd_op_start_rb_timer': /home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c:3685:19: error: 'GF_REPLACE_OP_START' undeclared (first use in this function) /home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c:3685:19: note: each undeclared identifier is reported only once for each function it appears in /home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c: In function 'glusterd_bricks_select_status_volume': /home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c:5800:34: warning: unused variable 'snapd' *** [glusterd_la-glusterd-op-sm.lo] Error code 1 Could someone send a (pointer to the) backport that addresses this? Thanks, Niels On Sun, Jul 05, 2015 at 08:59:32AM -0700, Gluster Build System (Code Review) wrote: Gluster Build System has posted comments on this change. Change subject: nfs: make it possible to disable nfs.mount-rmtab .. Patch Set 1: -Verified Build Failed http://build.gluster.org/job/compare-bug-version-and-git-branch/9953/ : SUCCESS http://build.gluster.org/job/freebsd-smoke/8551/ : FAILURE http://build.gluster.org/job/smoke/19820/ : SUCCESS http://build.gluster.org/job/netbsd6-smoke/7808/ : FAILURE -- To view, visit http://review.gluster.org/11335 To unsubscribe, visit http://review.gluster.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I40c4d8d754932f86fb2b1b2588843390464c773d Gerrit-PatchSet: 1 Gerrit-Project: glusterfs Gerrit-Branch: release-3.6 Gerrit-Owner: Niels de Vos nde...@redhat.com Gerrit-Reviewer: Gluster Build System jenk...@build.gluster.com Gerrit-Reviewer: Kaleb KEITHLEY kkeit...@redhat.com Gerrit-Reviewer: NetBSD Build System jenk...@build.gluster.org Gerrit-Reviewer: Niels de Vos nde...@redhat.com Gerrit-Reviewer: Raghavendra Bhat rab...@redhat.com Gerrit-Reviewer: jiffin tony Thottan
Re: [Gluster-devel] v3.6.3 doesn't respect default ACLs?
On 08/10/2015 09:56 PM, Niels de Vos wrote: On Wed, Jul 29, 2015 at 04:00:48PM +0530, Raghavendra Bhat wrote: On 07/27/2015 08:30 PM, Glomski, Patrick wrote: I built a patched version of 3.6.4 and the problem does seem to be fixed on a test server/client when I mounted with those flags (acl, resolve-gids, and gid-timeout). Seeing as it was a test system, I can't really provide anything meaningful as to the performance hit seen without the gid-timeout option. Thank you for implementing it so quickly, though! Is there any chance of getting this fix incorporated in the upcoming 3.6.5 release? Patrick I am planning to include this fix in 3.6.5. This fix is still under review. Once it is accepted in master, it cab be backported to release-3.6 branch. I will wait till then and make 3.6.5. I dont think there is a tracker bug for 3.6.5 yet? Or at least I could not find it by an alias. https://bugzilla.redhat.com/show_bug.cgi?id=1252072 is used to get the backport in release-3.6.x, please review and merge :-) Thanks, Niels This is the 3.6.5 tracker bug. Will merge the patch once regression tests are passed. https://bugzilla.redhat.com/show_bug.cgi?id=1250544. Regards, Raghavendra Bhat Regards, Raghavendra Bhat On Thu, Jul 23, 2015 at 6:27 PM, Niels de Vos nde...@redhat.com mailto:nde...@redhat.com wrote: On Tue, Jul 21, 2015 at 10:30:04PM +0200, Niels de Vos wrote: On Wed, Jul 08, 2015 at 03:20:41PM -0400, Glomski, Patrick wrote: Gluster devs, I'm running gluster v3.6.3 (both server and client side). Since my application requires more than 32 groups, I don't mount with ACLs on the client. If I mount with ACLs between the bricks and set a default ACL on the server, I think I'm right in stating that the server should respect that ACL whenever a new file or folder is made. I would expect that the ACL gets in herited on the brick. When a new file is created without the default ACL, things seem to be wrong. You mention that creating the file directly on the brick has the correct ACL, so there must be some Gluster component interfering. You reminded me on IRC about this email, and that helped a lot. Its very easy to get distracted when trying to investigate things from the mailinglists. I had a brief look, and I think we could reach a solution. An ugly patch for initial testing is ready. Well... it compiles. I'll try to run some basic tests tomorrow and see if it improves things and does not crash immediately. The change can be found here: http://review.gluster.org/11732 It basically adds a resolve-gids mount option for the FUSE client. This causes the fuse daemon to call getgrouplist() and retrieve all the groups for the UID that accesses the mountpoint. Without this option, the behavior is not changed, and /proc/$PID/status is used to get up to 32 groups (the $PID is the process that accesses the mountpoint). You probably want to also mount with gid-timeout=N where N is seconds that the group cache is valid. In the current master branch this is set to 300 seconds (like the sssd default), but if the groups of a used rarely change, this value can be increased. Previous versions had a lower timeout which could cause resolving the groups on almost each network packet that arrives (HUGE performance impact). When using this option, you may also need to enable server.manage-gids. This option allows using more than ~93 groups on the bricks. 
The network packets can only contain ~93 groups, when server.manage-gids is enabled, the groups are not sent in the network packets, but are resolved on the bricks with getgrouplist(). The patch linked above had been tested, corrected and updated. The change works for me on a test-system. A backport that you should be able to include in a package for 3.6 can be found here: http://termbin.com/f3cj Let me know if you are not familiar with rebuilding patched packages, and I can build a test-version for you tomorrow. On glusterfs-3.6, you will want to pass a gid-timeout mount option too. The option enables caching of the resolved groups that the uid belongs too, if caching is not enebled (or expires quickly), you will probably notice a preformance hit. Newer version of GlusterFS set the timeout to 300 seconds (like the default timeout sssd uses). Please test and let me know if this fixes your use case. Thanks, Niels Cheers, Niels Maybe an example is in order: We first set up a test directory with setgid bit so that our new subdirectories inherit the group. [root@gfs01a hpc_shared]# mkdir test; cd test; chown pglomski.users .; chmod 2770 .; getfacl
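As a rough illustration of the group resolution that the resolve-gids option discussed above performs with getgrouplist() (instead of reading at most 32 groups from /proc/<pid>/status), assuming glibc and with error handling trimmed:

#include <grp.h>
#include <pwd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Resolve all groups of a uid, not just the 32 visible in /proc. */
static int
resolve_gids (uid_t uid)
{
        struct passwd *pw = getpwuid (uid);
        int            ngroups = 32;
        gid_t         *groups = NULL;

        if (!pw)
                return -1;
        groups = malloc (ngroups * sizeof (gid_t));
        if (getgrouplist (pw->pw_name, pw->pw_gid, groups, &ngroups) == -1) {
                /* buffer too small: ngroups now holds the real count */
                groups = realloc (groups, ngroups * sizeof (gid_t));
                getgrouplist (pw->pw_name, pw->pw_gid, groups, &ngroups);
        }
        printf ("uid %u belongs to %d groups\n", (unsigned) uid, ngroups);
        free (groups);
        return 0;
}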
[Gluster-devel] release schedule for glusterfs
Hi, In previous community meeting it was discussed to come up with a schedule for glusterfs releases. It was discussed that each of the supported release branches (3.5, 3.6 and 3.7) will make a new release every month. The previous releases of them happened at below dates. glusterfs-3.5.5 - 9th July glusterfs-3.6.4 - 13th July glusterfs-3.7.3 - 29th July. Is it ok to slightly align those dates? i.e. on 10th of every month 3.5 based release would happen (in general the oldest supported and most stable release branch). On 20th of every month 3.6 based release would happen (In general, the release branch which is being stabilized). And on 30th of every month 3.7 based release would happen (in general, the latest release branch). Please provide feedback. Once a schedule is finalized we can put that information in gluster.org. Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] release schedule for glusterfs
On 08/05/2015 05:57 PM, Humble Devassy Chirammal wrote: Hi Ragavendra, This LGTM . However Is there any guide line on : How many beta releases hit for each minor release ? and the gap between these releases ? --Humble I am not sure about the beta releases. As per my understanding there are no beta releases happening in release-3.5 branch and also the latest release-3.7 branch. I was doing beta releases for release-3.6 branch. But I am also thinking of moving away from it and make 3.6.5 directly (and also future release-3.6 releases). Regards, Raghavendra Bhat On Wed, Aug 5, 2015 at 5:12 PM, Raghavendra Bhat rab...@redhat.com mailto:rab...@redhat.com wrote: Hi, In previous community meeting it was discussed to come up with a schedule for glusterfs releases. It was discussed that each of the supported release branches (3.5, 3.6 and 3.7) will make a new release every month. The previous releases of them happened at below dates. glusterfs-3.5.5 - 9th July glusterfs-3.6.4 - 13th July glusterfs-3.7.3 - 29th July. Is it ok to slightly align those dates? i.e. on 10th of every month 3.5 based release would happen (in general the oldest supported and most stable release branch). On 20th of every month 3.6 based release would happen (In general, the release branch which is being stabilized). And on 30th of every month 3.7 based release would happen (in general, the latest release branch). Please provide feedback. Once a schedule is finalized we can put that information in gluster.org http://gluster.org. Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org mailto:Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] v3.6.3 doesn't respect default ACLs?
On 07/27/2015 08:30 PM, Glomski, Patrick wrote: I built a patched version of 3.6.4 and the problem does seem to be fixed on a test server/client when I mounted with those flags (acl, resolve-gids, and gid-timeout). Seeing as it was a test system, I can't really provide anything meaningful as to the performance hit seen without the gid-timeout option. Thank you for implementing it so quickly, though! Is there any chance of getting this fix incorporated in the upcoming 3.6.5 release? Patrick I am planning to include this fix in 3.6.5. This fix is still under review. Once it is accepted in master, it cab be backported to release-3.6 branch. I will wait till then and make 3.6.5. Regards, Raghavendra Bhat On Thu, Jul 23, 2015 at 6:27 PM, Niels de Vos nde...@redhat.com mailto:nde...@redhat.com wrote: On Tue, Jul 21, 2015 at 10:30:04PM +0200, Niels de Vos wrote: On Wed, Jul 08, 2015 at 03:20:41PM -0400, Glomski, Patrick wrote: Gluster devs, I'm running gluster v3.6.3 (both server and client side). Since my application requires more than 32 groups, I don't mount with ACLs on the client. If I mount with ACLs between the bricks and set a default ACL on the server, I think I'm right in stating that the server should respect that ACL whenever a new file or folder is made. I would expect that the ACL gets in herited on the brick. When a new file is created without the default ACL, things seem to be wrong. You mention that creating the file directly on the brick has the correct ACL, so there must be some Gluster component interfering. You reminded me on IRC about this email, and that helped a lot. Its very easy to get distracted when trying to investigate things from the mailinglists. I had a brief look, and I think we could reach a solution. An ugly patch for initial testing is ready. Well... it compiles. I'll try to run some basic tests tomorrow and see if it improves things and does not crash immediately. The change can be found here: http://review.gluster.org/11732 It basically adds a resolve-gids mount option for the FUSE client. This causes the fuse daemon to call getgrouplist() and retrieve all the groups for the UID that accesses the mountpoint. Without this option, the behavior is not changed, and /proc/$PID/status is used to get up to 32 groups (the $PID is the process that accesses the mountpoint). You probably want to also mount with gid-timeout=N where N is seconds that the group cache is valid. In the current master branch this is set to 300 seconds (like the sssd default), but if the groups of a used rarely change, this value can be increased. Previous versions had a lower timeout which could cause resolving the groups on almost each network packet that arrives (HUGE performance impact). When using this option, you may also need to enable server.manage-gids. This option allows using more than ~93 groups on the bricks. The network packets can only contain ~93 groups, when server.manage-gids is enabled, the groups are not sent in the network packets, but are resolved on the bricks with getgrouplist(). The patch linked above had been tested, corrected and updated. The change works for me on a test-system. A backport that you should be able to include in a package for 3.6 can be found here: http://termbin.com/f3cj Let me know if you are not familiar with rebuilding patched packages, and I can build a test-version for you tomorrow. On glusterfs-3.6, you will want to pass a gid-timeout mount option too. 
The option enables caching of the resolved groups that the uid belongs too, if caching is not enebled (or expires quickly), you will probably notice a preformance hit. Newer version of GlusterFS set the timeout to 300 seconds (like the default timeout sssd uses). Please test and let me know if this fixes your use case. Thanks, Niels Cheers, Niels Maybe an example is in order: We first set up a test directory with setgid bit so that our new subdirectories inherit the group. [root@gfs01a hpc_shared]# mkdir test; cd test; chown pglomski.users .; chmod 2770 .; getfacl . # file: . # owner: pglomski # group: users # flags: -s- user::rwx group::rwx other::--- New subdirectories share the group, but the umask leads to them being group read-only. [root@gfs01a test]# mkdir a; getfacl a # file: a # owner: root # group: users # flags: -s- user::rwx group::r-x other::r-x Setting default ACLs on the server allows group write to new directories made
Re: [Gluster-devel] gluster vol start is failing when glusterfs is compiled with debug enable .
On 07/22/2015 09:50 AM, Atin Mukherjee wrote: On 07/22/2015 12:50 AM, Anand Nekkunti wrote: Hi All gluster vol start is failing when glusterfs is compiled with debug enable . Link: :https://bugzilla.redhat.com/show_bug.cgi?id=1245331 *brick start is failing with fallowing error:* 2015-07-21 19:01:59.408729] I [MSGID: 100030] [glusterfsd.c:2296:main] 0-/usr/local/sbin/glusterfsd: Started running /usr/local/sbin/glusterfsd version 3.8dev (args: /usr/local/sbin/glusterfsd -s 192.168.0.4 --volfile-id VOL.192.168.0.4.tmp-BRICK1 -p /var/lib/glusterd/vols/VOL/run/192.168.0.4-tmp-BRICK1.pid -S /var/run/gluster/0a4faf3d8d782840484629176ecf307a.socket --brick-name /tmp/BRICK1 -l /var/log/glusterfs/bricks/tmp-BRICK1.log --xlator-option *-posix.glusterd-uuid=4ec09b0c-6043-40f0-bc1a-5cc312d49a78 --brick-port 49152 --xlator-option VOL-server.listen-port=49152) [2015-07-21 19:02:00.075574] I [MSGID: 101190] [event-epoll.c:627:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2015-07-21 19:02:00.078905] W [MSGID: 101095] [xlator.c:189:xlator_dynload] 0-xlator: /usr/local/lib/libgfdb.so.0: undefined symbol: gf_sql_str2sync_t [2015-07-21 19:02:00.078947] E [MSGID: 101002] [graph.y:211:volume_type] 0-parser: Volume 'VOL-changetimerecorder', line 16: type 'features/changetimerecorder' is not valid or not found on this machine [2015-07-21 19:02:00.079020] E [MSGID: 101019] [graph.y:319:volume_end] 0-parser: type not specified for volume VOL-changetimerecorder [2015-07-21 19:02:00.079150] E [MSGID: 100026] [glusterfsd.c:2151:glusterfs_process_volfp] 0-: failed to construct the graph [2015-07-21 19:02:00.079399] W [glusterfsd.c:1214:cleanup_and_exit] (--/usr/local/sbin/glusterfsd(mgmt_getspec_cbk+0x343) [0x40df64] --/usr/local/sbin/glusterfsd(glusterfs_process_volfp+0x1a2) [0x409b58] --/usr/local/sbin/glusterfsd(cleanup_and_exit+0x77) [0x407a6f] ) 0-: received signum (0), shutting down I am not able to hit this though. This seems to be the case of inline functions being considerd as undefined symblols. There has been a discussion about it in the mailing list. https://www.gluster.org/pipermail/gluster-devel/2015-June/045942.html Regards, Raghavendra Bhat ThanksRegards Anand.N ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] on patch #11553
On 07/07/2015 12:30 PM, Raghavendra G wrote: + vijay mallikarjuna for quotad has similar concerns + Raghavendra Bhat for snapd might've similar concerns. Snapd also uses protocol/server at the top of the graph. So the fix for protocol/server should be good enough. Regards, Raghavendra Bhat On Tue, Jul 7, 2015 at 12:02 PM, Raghavendra Gowdappa rgowd...@redhat.com mailto:rgowd...@redhat.com wrote: +gluster-devel - Original Message - From: Raghavendra Gowdappa rgowd...@redhat.com mailto:rgowd...@redhat.com To: Krishnan Parthasarathi kpart...@redhat.com mailto:kpart...@redhat.com Cc: Nithya Balachandran nbala...@redhat.com mailto:nbala...@redhat.com, Anoop C S achir...@redhat.com mailto:achir...@redhat.com Sent: Tuesday, 7 July, 2015 11:32:01 AM Subject: on patch #11553 KP, Though the crash because of lack of init while fops are in progress is solved, concerns addressed by [1] are still valid. Basically what we need to guarantee is that when is it safe to wind fops through a particular subvol of protocol/server. So, if some xlators are doing things in events like CHILD_UP (like trash), server_setvolume should wait for CHILD_UP on a particular subvol before accepting a client. So, [1] is necessary but following changes need to be made: 1. protocol/server _can_ have multiple subvol as children. In that case we should track whether the exported subvol has received CHILD_UP and only after a successful CHILD_UP on that subvol connections to that subvol can be accepted. 2. It is valid (though not a common thing on brick process) that some subvols can be up and some might be down. So, child readiness should be localised to that subvol instead of tracking readiness at protocol/server level. So, please revive [1] and send it with corrections and I'll merge it. [1] http://review.gluster.org/11553 regards, Raghavendra. ___ Gluster-devel mailing list Gluster-devel@gluster.org mailto:Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel -- Raghavendra G ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] healing of bad objects (marked by scrubber)
Adding the correct gluster-devel id.

Regards, Raghavendra Bhat

On 07/08/2015 11:38 AM, Raghavendra Bhat wrote:

Hi,

In the bit-rot feature, the scrubber marks corrupted objects (objects whose data has gone bad) as bad objects (via an extended attribute). If the volume is a replicate volume and an object in one of the replicas goes bad, the client is still able to see the data via the good copy present in the other replica. But as of now, self-heal does not heal bad objects. So the method to heal a bad object is to remove the bad object directly from the backend and let self-heal take care of healing it from the good copy.

The above method has a problem. The bit-rot-stub xlator sitting in the brick graph remembers an object as bad in its inode context (either when the object was being marked bad by the scrubber, or during the first lookup of the object if it was already marked bad). Bit-rot-stub uses that info to block any read/write operations on such bad objects. So it also blocks any kind of operation attempted by self-heal to correct the object (the object was deleted directly in the backend, so the in-memory inode will still be present and considered valid).

There are 2 methods that I think can solve the issue.

1) In server_lookup_cbk, if the lookup of an object fails due to ENOENT *AND* the lookup is a revalidate lookup, then forget the inode associated with that object (not just unlinking the dentry, forget the inode as well iff there are no more dentries associated with the inode). At least this way the inode would be forgotten, and later when self-heal wants to correct the object, it has to create a new object (the object was removed directly from the backend), which has to happen with the creation of a new in-memory inode, and read/write operations by the self-heal daemon will not be blocked. I have sent a patch for review for the above method: http://review.gluster.org/#/c/11489/

OR

2) Do not block write operations coming on the bad object if the operation is coming from self-heal, allow it to completely heal the file, and once healing is done, remove the bad-object information from the inode context. The requests coming from the self-heal daemon can be identified by checking their pid (it has a -ve pid). But if the self-heal is happening from the glusterfs client itself, I am not sure whether self-heal happens with a -ve pid for the frame or the same pid as that of the frame of the original fop which triggered the self-heal. Pranith? Can you clarify this?

Please provide feedback.

Regards, Raghavendra Bhat

___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
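A minimal sketch of what option (2) above could look like, assuming (as the mail itself asks Pranith to confirm) that self-heal traffic is identifiable by a negative frame pid; the names here are illustrative, not actual bit-rot-stub code.

#include <sys/types.h>

/* Decide whether a fop may proceed on an object the scrubber has
 * marked bad.  Illustrative only. */
static int
allow_fop_on_object (pid_t frame_pid, int object_is_bad)
{
        if (!object_is_bad)
                return 1;   /* clean object: always allow             */
        if (frame_pid < 0)
                return 1;   /* self-heal daemon: allow, so the object
                               can be rewritten and then un-marked    */
        return 0;           /* regular client on a bad object: deny   */
}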
Re: [Gluster-devel] [release-3.6] compile error: 'GF_REPLACE_OP_START' undeclared
On 07/06/2015 01:39 PM, Niels de Vos wrote: On Mon, Jul 06, 2015 at 12:09:28PM +0530, Raghavendra Bhat wrote: On 07/06/2015 09:52 AM, Kaushal M wrote: I checked on NetBSD-7.0_BETA and FreeBSD-10.1. I couldn't reproduce this. I'll try on NetBSD-6 next. ~kaushal I think it has to be included before 3.6.4 is made G.A. I can wait till the fix for this issue is merged before making 3.6.4. Does it sound ok? Or should I go ahead with 3.6.4 and make a quick 3.6.5 with this fix? I only care about getting http://review.gluster.org/11335 merged :-) This is a patch I promised to take into release-3.5. It would be nicer to have this change included in the release-3.6 branch before I merge the 3.5 backport. At the moment, 3.5.5 is waiting on this patch. But I do not think you really need to delay 3.6.4 off for that one. It should be fine if it lands in 3.6.5. (The compile error looks more like a 3.6.4 blocker.) Niels Niels, The patch you mentioned has received the acks and also has passed the linux regression tests. But it seem to have failed netbsd regression tests. Regards, Raghavendra Bhat Regards, Raghavendra Bhat On Mon, Jul 6, 2015 at 8:38 AM, Kaushal M kshlms...@gmail.com wrote: Krutika hit this last week, and let us (GlusterD maintiners) know of it. I volunteered to look into this, but couldn't find time. I'll do it now. ~kaushal On Sun, Jul 5, 2015 at 10:43 PM, Atin Mukherjee atin.mukherje...@gmail.com wrote: I remember Krutika reporting it few days back. So it seems like its not fixed yet. If there is no taker I will send a patch tomorrow. -Atin Sent from one plus one On Jul 5, 2015 9:58 PM, Niels de Vos nde...@redhat.com wrote: Hi, it seems that the current release-3.6 branch does not compile on FreedBSD and NetBSD (not sure why it compiles on CentOS-6). These errors are thrown: --- glusterd_la-glusterd-op-sm.lo --- CC glusterd_la-glusterd-op-sm.lo /home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c: In function 'glusterd_op_start_rb_timer': /home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c:3685:19: error: 'GF_REPLACE_OP_START' undeclared (first use in this function) /home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c:3685:19: note: each undeclared identifier is reported only once for each function it appears in /home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c: In function 'glusterd_bricks_select_status_volume': /home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c:5800:34: warning: unused variable 'snapd' *** [glusterd_la-glusterd-op-sm.lo] Error code 1 Could someone send a (pointer to the) backport that addresses this? Thanks, Niels On Sun, Jul 05, 2015 at 08:59:32AM -0700, Gluster Build System (Code Review) wrote: Gluster Build System has posted comments on this change. Change subject: nfs: make it possible to disable nfs.mount-rmtab .. 
Patch Set 1: -Verified Build Failed http://build.gluster.org/job/compare-bug-version-and-git-branch/9953/ : SUCCESS http://build.gluster.org/job/freebsd-smoke/8551/ : FAILURE http://build.gluster.org/job/smoke/19820/ : SUCCESS http://build.gluster.org/job/netbsd6-smoke/7808/ : FAILURE -- To view, visit http://review.gluster.org/11335 To unsubscribe, visit http://review.gluster.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I40c4d8d754932f86fb2b1b2588843390464c773d Gerrit-PatchSet: 1 Gerrit-Project: glusterfs Gerrit-Branch: release-3.6 Gerrit-Owner: Niels de Vos nde...@redhat.com Gerrit-Reviewer: Gluster Build System jenk...@build.gluster.com Gerrit-Reviewer: Kaleb KEITHLEY kkeit...@redhat.com Gerrit-Reviewer: NetBSD Build System jenk...@build.gluster.org Gerrit-Reviewer: Niels de Vos nde...@redhat.com Gerrit-Reviewer: Raghavendra Bhat rab...@redhat.com Gerrit-Reviewer: jiffin tony Thottan jthot...@redhat.com Gerrit-HasComments: No ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [release-3.6] compile error: 'GF_REPLACE_OP_START' undeclared
On 07/06/2015 09:52 AM, Kaushal M wrote: I checked on NetBSD-7.0_BETA and FreeBSD-10.1. I couldn't reproduce this. I'll try on NetBSD-6 next. ~kaushal I think it has to be included before 3.6.4 is made G.A. I can wait till the fix for this issue is merged before making 3.6.4. Does it sound ok? Or should I go ahead with 3.6.4 and make a quick 3.6.5 with this fix? Regards, Raghavendra Bhat On Mon, Jul 6, 2015 at 8:38 AM, Kaushal M kshlms...@gmail.com wrote: Krutika hit this last week, and let us (GlusterD maintiners) know of it. I volunteered to look into this, but couldn't find time. I'll do it now. ~kaushal On Sun, Jul 5, 2015 at 10:43 PM, Atin Mukherjee atin.mukherje...@gmail.com wrote: I remember Krutika reporting it few days back. So it seems like its not fixed yet. If there is no taker I will send a patch tomorrow. -Atin Sent from one plus one On Jul 5, 2015 9:58 PM, Niels de Vos nde...@redhat.com wrote: Hi, it seems that the current release-3.6 branch does not compile on FreedBSD and NetBSD (not sure why it compiles on CentOS-6). These errors are thrown: --- glusterd_la-glusterd-op-sm.lo --- CC glusterd_la-glusterd-op-sm.lo /home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c: In function 'glusterd_op_start_rb_timer': /home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c:3685:19: error: 'GF_REPLACE_OP_START' undeclared (first use in this function) /home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c:3685:19: note: each undeclared identifier is reported only once for each function it appears in /home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c: In function 'glusterd_bricks_select_status_volume': /home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c:5800:34: warning: unused variable 'snapd' *** [glusterd_la-glusterd-op-sm.lo] Error code 1 Could someone send a (pointer to the) backport that addresses this? Thanks, Niels On Sun, Jul 05, 2015 at 08:59:32AM -0700, Gluster Build System (Code Review) wrote: Gluster Build System has posted comments on this change. Change subject: nfs: make it possible to disable nfs.mount-rmtab .. 
Patch Set 1: -Verified Build Failed http://build.gluster.org/job/compare-bug-version-and-git-branch/9953/ : SUCCESS http://build.gluster.org/job/freebsd-smoke/8551/ : FAILURE http://build.gluster.org/job/smoke/19820/ : SUCCESS http://build.gluster.org/job/netbsd6-smoke/7808/ : FAILURE -- To view, visit http://review.gluster.org/11335 To unsubscribe, visit http://review.gluster.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I40c4d8d754932f86fb2b1b2588843390464c773d Gerrit-PatchSet: 1 Gerrit-Project: glusterfs Gerrit-Branch: release-3.6 Gerrit-Owner: Niels de Vos nde...@redhat.com Gerrit-Reviewer: Gluster Build System jenk...@build.gluster.com Gerrit-Reviewer: Kaleb KEITHLEY kkeit...@redhat.com Gerrit-Reviewer: NetBSD Build System jenk...@build.gluster.org Gerrit-Reviewer: Niels de Vos nde...@redhat.com Gerrit-Reviewer: Raghavendra Bhat rab...@redhat.com Gerrit-Reviewer: jiffin tony Thottan jthot...@redhat.com Gerrit-HasComments: No ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] tests/bugs/snapshot/bug-1109889.t - snapd crash
On 07/03/2015 03:37 PM, Atin Mukherjee wrote: http://build.gluster.org/job/rackspace-regression-2GB-triggered/11898/consoleFull has caused a crash in snapd with the following bt: This seem to have crashed in server_setvolume (i.e. before the graph could be properly made available for i/o. snapview-server xlator is yet to come into the picture). But still I will try to reproduce it on my local setup and see what might be causing this. Regards, Raghavendra Bhat #0 0x7f11e2ed3ded in gf_client_put (client=0x0, detached=0x0) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/client_t.c:294 #1 0x7f11d4eeac96 in server_setvolume (req=0x7f11c000195c) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/protocol/server/src/server-handshake.c:710 #2 0x7f11e2c1e05c in rpcsvc_handle_rpc_call (svc=0x7f11d001b160, trans=0x7f11cac0, msg=0x7f11c0001810) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpcsvc.c:698 #3 0x7f11e2c1e3cf in rpcsvc_notify (trans=0x7f11cac0, mydata=0x7f11d001b160, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7f11c0001810) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpcsvc.c:792 #4 0x7f11e2c23ad7 in rpc_transport_notify (this=0x7f11cac0, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7f11c0001810) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-transport.c:538 #5 0x7f11d841787b in socket_event_poll_in (this=0x7f11cac0) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:2285 #6 0x7f11d8417dd1 in socket_event_handler (fd=13, idx=3, data=0x7f11cac0, poll_in=1, poll_out=0, poll_err=0) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:2398 #7 0x7f11e2ed79ec in event_dispatch_epoll_handler (event_pool=0x13bb040, event=0x7f11d4eb9e70) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:570 #8 0x7f11e2ed7dda in event_dispatch_epoll_worker (data=0x7f11d000dc10) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:673 #9 0x7f11e213e9d1 in start_thread () from ./lib64/libpthread.so.0 #10 0x7f11e1aa88fd in clone () from ./lib64/libc.so.6 ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] glusterfs-3.6.4beta2 released
Hi, glusterfs-3.6.4beta2 has been released and the packages for RHEL/Fedora/CentOS can be found here. http://download.gluster.org/pub/gluster/glusterfs/qa-releases/3.6.4beta2/ Requesting people running 3.6.x to please try it out and let us know if there are any issues. This release supposedly fixes the bugs listed below since 3.6.4beta1 was made available. Thanks to all who submitted patches and reviewed the changes.

1230242 - `ls' on a directory which has files with mismatching gfid's does not list anything
1230259 - Honour afr self-heal volume set options from clients
1122290 - Issues reported by Cppcheck static analysis tool
1227670 - wait for sometime before accessing the activated snapshot
1225745 - [AFR-V2] - afr_final_errno() should treat op_ret 0 also as success
1223891 - readdirp return 64bits inodes even if enable-ino32 is set
1206429 - Maintaining local transaction peer list in op-sm framework
1217419 - DHT:Quota:- brick process crashed after deleting .glusterfs from backend
1225072 - OpenSSL multi-threading changes break build in RHEL5 (3.6.4beta1)
1215419 - Autogenerated files delivered in tarball
1224624 - cli: Excessive logging
1217423 - glusterfsd crashed after directory was removed from the mount point, while self-heal and rebalance were running on the volume

Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] xattr creation failure in posix_lookup
Hi, In posix_lookup, a dict is allocated for storing the values of the extended attributes and other hint keys set into the xdata of the call path (i.e. the wind path) by higher xlators (such as quick-read, bit-rot-stub etc). But if the creation of the new dict fails, then a NULL dict is returned in the callback path. There might be many xlators for which the key-value information present in the dict is very important for making certain decisions (Ex: bit-rot-stub tries to fetch an extended attribute which tells whether the object is bad or not. If the key is present in the dict, it means the object is bad, and the xlator updates the same in the inode context. Later, when there is any read/modify operation on that object, the fop is failed instead of being allowed to continue). Now suppose the dict creation fails in posix_lookup; posix simply proceeds with the lookup operation, and if the other stat operations succeed, then lookup will return success with a NULL dict.

if (xdata && (op_ret == 0)) { xattr = posix_xattr_fill (this, real_path, loc, NULL, -1, xdata, &buf); }

The above piece of code in posix_lookup creates a new dict called @xattr. The return value of posix_xattr_fill is not checked. So in this case, as per the bit-rot-stub example mentioned above, there is a possibility that the object being looked up is a bad object (marked by the scrubber). And since the lookup succeeded, but the bad-object xattr is not obtained in the callback (the dict itself being NULL), the bit-rot-stub xlator does not mark that object as bad and might allow further read/write requests, thus allowing bad data to be served. There might be other xlators as well that depend upon the xattrs being returned in lookup. Should we fail lookup if the dict creation fails? Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
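To make the proposal concrete, a minimal sketch of failing the lookup when the xattr dict cannot be created could look like the following. This is illustrative only, not the actual posix code; the "goto out" error path and the choice of ENOMEM are assumptions about the surrounding posix_lookup error handling.

if (xdata && (op_ret == 0)) {
        xattr = posix_xattr_fill (this, real_path, loc, NULL, -1, xdata, &buf);
        if (!xattr) {
                /* Higher xlators (bit-rot-stub, quick-read, ...) take
                 * decisions based on keys in this dict; a NULL dict can
                 * silently disable those checks, so fail the lookup
                 * instead of returning success without the xattrs. */
                op_ret = -1;
                op_errno = ENOMEM;
                goto out;
        }
}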
Re: [Gluster-devel] bad file access (bit-rot + AFR)
On 06/27/2015 03:28 PM, Venky Shankar wrote: On 06/27/2015 02:32 PM, Raghavendra Bhat wrote: Hi, There is a patch that is submitted for review to deny access to objects which are marked as bad by the scrubber (i.e. the data of the object might have been corrupted in the backend). http://review.gluster.org/#/c/11126/10 http://review.gluster.org/#/c/11389/4 The above 2 patch sets solve the problem of denying access to the bad objects (they have passed regression and received a +1 from Venky). But in our testing we found that there is a race window (depending upon the scrubber frequency the race window can be larger) where there is a possibility of the self-heal daemon healing the contents of the bad file before the scrubber can mark it as bad. I am not sure whether there is a chance of hitting this issue when the data truly gets corrupted in the backend; in our testing, to simulate backend corruption, we modify the contents of the file directly in the backend. Now in this case, before the scrubber can mark the object as bad, the self-heal daemon kicks in and heals the contents of the bad file to the good copy. Or, before the scrubber marks the file as bad, if the client accesses it, AFR finds that there is a mismatch in metadata (since we modified the contents of the file in the backend) and does data and metadata self-healing, thus copying the contents of the bad copy to the good copy. And from then onwards the clients accessing that object always get bad data. I understand from Ravi (ranaraya@) that AFR-v2 would choose the biggest file as the source, provided that the afr xattrs are clean (AFR-v1 would give back EIO). If a file is modified directly from the brick but the size is left unchanged, contents can be served from either copy. For self-heal to detect anomalies, there needs to be verification (checksum/signature) at each stage of its operation. But this might be too heavy on the I/O side. We could still cache mtime [but update on client I/O] after pre-check, but this still would not catch bit flips (unless a filesystem scrub is done). Thoughts? Yes. Even if one wants to verify just before healing the file, the time taken to verify the checksum might be large if the file size is large. It might affect the self-heal performance. Regards, Raghavendra Bhat Pranith, do you have any solution for this? Venky and I are trying to come up with a solution for this. But does this issue block the above patches in any way? (Those 2 patches are still needed to deny access to objects once they are marked as bad by the scrubber). Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] bad file access (bit-rot + AFR)
Hi, There is a patch that is submitted for review to deny access to objects which are marked as bad by the scrubber (i.e. the data of the object might have been corrupted in the backend). http://review.gluster.org/#/c/11126/10 http://review.gluster.org/#/c/11389/4 The above 2 patch sets solve the problem of denying access to the bad objects (they have passed regression and received a +1 from Venky). But in our testing we found that there is a race window (depending upon the scrubber frequency the race window can be larger) where there is a possibility of the self-heal daemon healing the contents of the bad file before the scrubber can mark it as bad. I am not sure whether there is a chance of hitting this issue when the data truly gets corrupted in the backend; in our testing, to simulate backend corruption, we modify the contents of the file directly in the backend. Now in this case, before the scrubber can mark the object as bad, the self-heal daemon kicks in and heals the contents of the bad file to the good copy. Or, before the scrubber marks the file as bad, if the client accesses it, AFR finds that there is a mismatch in metadata (since we modified the contents of the file in the backend) and does data and metadata self-healing, thus copying the contents of the bad copy to the good copy. And from then onwards the clients accessing that object always get bad data. Pranith, do you have any solution for this? Venky and I are trying to come up with a solution for this. But does this issue block the above patches in any way? (Those 2 patches are still needed to deny access to objects once they are marked as bad by the scrubber). Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] spurious failure with test-case ./tests/basic/tier/tier.t
On 06/26/2015 04:00 PM, Ravishankar N wrote: On 06/26/2015 03:57 PM, Vijaikumar M wrote: Hi, Upstream regression failure with test-case ./tests/basic/tier/tier.t. My patch# 11315 regression failed twice with test-case ./tests/basic/tier/tier.t. Is anyone seeing this issue with other patches? Yes, one of my patches failed today too: http://build.gluster.org/job/rackspace-regression-2GB-triggered/11461/consoleFull -Ravi I have also faced failures in tier.t a couple of times. Regards, Raghavendra Bhat http://build.gluster.org/job/rackspace-regression-2GB-triggered/11396/consoleFull http://build.gluster.org/job/rackspace-regression-2GB-triggered/11456/consoleFull Thanks, Vijay ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Valgrind + glusterfs
On 06/25/2015 09:57 AM, Pranith Kumar Karampuri wrote: hi, Does anyone know why glusterfs hangs with valgrind? Pranith Yes, I have faced it too. It used to work before, but recently it is not working; glusterfs hangs when run with valgrind. I am not sure why it is hanging. Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Bad file access in bit-rot-detection
Hi, As part of the bit-rot detection feature, a file that has its data changed due to some backend errors is marked as a bad file by the scrubber (it sets an extended attribute indicating it is a bad file). Now, access to the bad file has to be denied (to prevent wrong data being served). In the bit-rot-stub xlator (the xlator which does object versioning and sends notifications to BitD upon object modification), the check for whether the file is bad or not can be done in lookup: if the xattr is set, then the object can be marked as bad within its inode context as well. But the problem is, what if the object was not marked as bad at the time of lookup and was later marked bad? Now when a fop such as open, readv or writev comes, the fop should not be allowed. If it is the fuse client from which the file is being accessed, then it is probably ok to rely only on lookups (to check if it is bad or not), as fuse sends lookups before sending fops. But for NFS, once the lookup is done and the filehandle is available, further lookups are not sent. In that case, relying only on lookup to check whether it is a bad file or not is not sufficient. The below 3 solutions in the bit-rot-stub xlator seem to address the above issue.

1) Whenever a fop such as open, readv or writev comes, check in the inode context whether it is a bad file or not. If not, then send a getxattr of the bad-file xattr on that file. If it is present, then set the bad-file attribute in the inode context and fail the fop. But for the above operation, a getxattr call has to be sent downwards for almost every open, readv or writev. If the file is identified as bad, then the getxattr might not be necessary, but for good files the extra getxattr might affect performance.

OR

2) Set a key in xdata whenever open, readv or writev comes (in the bit-rot-stub xlator) and send it downwards. The posix xlator can look into the xdata, and if the key for bad-file identification is present, it can do the getxattr as part of open, readv or writev itself and send the response back in xdata. Not sure whether the above method is ok or not, as it overloads open, readv and writev. Apart from that, the getxattr disk operation is still done.

OR

3) Once the file is identified as bad, the scrubber marks it as bad (via setxattr) by sending a call to the bit-rot-stub xlator. The bit-rot-stub xlator marks the file as bad in the inode context once it receives the notification from the scrubber that a file is bad. This saves those getxattr calls being made from other fops (either in the bit-rot-stub xlator or the posix xlator). The tricky part is what happens if the inode gets forgotten or the brick restarts. But I think in that case checking in the lookup call is sufficient (as in both inode forgets and brick restarts, a lookup will definitely come if there is an access to that file).

Please provide feedback on the above 3 methods. If there are any other solutions which might solve this issue, they are welcome. Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
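A rough sketch of option 3 follows. This is illustrative only and not the actual bit-rot-stub code: the helpers bad_object_ctx_set()/bad_object_ctx_get() are assumed wrappers around inode_ctx_set()/inode_ctx_get(), and the xattr key name is an assumption.

/* Illustrative sketch of option 3 (not the actual bit-rot-stub code). */

int32_t
br_stub_setxattr (call_frame_t *frame, xlator_t *this, loc_t *loc,
                  dict_t *dict, int32_t flags, dict_t *xdata)
{
        /* Scrubber marks the object bad via a well-known key
         * (key name assumed here for illustration). */
        if (dict_get (dict, "trusted.bit-rot.bad-file"))
                bad_object_ctx_set (this, loc->inode);

        STACK_WIND (frame, default_setxattr_cbk, FIRST_CHILD (this),
                    FIRST_CHILD (this)->fops->setxattr,
                    loc, dict, flags, xdata);
        return 0;
}

int32_t
br_stub_open (call_frame_t *frame, xlator_t *this, loc_t *loc,
              int32_t flags, fd_t *fd, dict_t *xdata)
{
        /* No on-disk getxattr needed here: the verdict is cached in the
         * inode context, set either during lookup or on the scrubber's
         * notification above. */
        if (bad_object_ctx_get (this, loc->inode)) {
                STACK_UNWIND_STRICT (open, frame, -1, EIO, fd, NULL);
                return 0;
        }

        STACK_WIND (frame, default_open_cbk, FIRST_CHILD (this),
                    FIRST_CHILD (this)->fops->open, loc, flags, fd, xdata);
        return 0;
}

The lookup path would still check the on-disk xattr once (covering inode forgets and brick restarts), as described in the mail.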
Re: [Gluster-devel] spurious regression status
On Wednesday 06 May 2015 10:53 PM, Vijay Bellur wrote: On 05/06/2015 06:52 AM, Pranith Kumar Karampuri wrote: hi, Please backport the patches that fix spurious regressions to 3.7 as well. This is the status of regressions now:

* ./tests/bugs/quota/bug-1035576.t (Wstat: 0 Tests: 24 Failed: 2)
* Failed tests: 20-21
* http://build.gluster.org/job/rackspace-regression-2GB-triggered/8329/consoleFull

* ./tests/bugs/snapshot/bug-1112559.t: 1 new core files
* http://build.gluster.org/job/rackspace-regression-2GB-triggered/8308/consoleFull
* One more occurrence -
* Failed tests: 9, 11
* http://build.gluster.org/job/rackspace-regression-2GB-triggered/8430/consoleFull

Rafi - this seems to be a test unit contributed by you. Can you please look into this one?

* ./tests/geo-rep/georep-rsync-changelog.t (Wstat: 256 Tests: 3 Failed: 0)
* Non-zero exit status: 1
* http://build.gluster.org/job/rackspace-regression-2GB-triggered/8168/console

Aravinda/Kotresh - any update on this? If we do not intend enabling geo-replication tests in regression runs for now, this should go off the list.

* ./tests/basic/quota-anon-fd-nfs.t (failed-test: 21)
* Happens in: master (http://build.gluster.org/job/rackspace-regression-2GB-triggered/8147/consoleFull)
* Being investigated by: ?

Sachin - does this happen anymore or should we move it off the list?

* tests/features/glupy.t
* nuked tests 7153, 7167, 7169, 7173, 7212

Emmanuel's investigation should help us here. Thanks!

* tests/basic/volume-snapshot-clone.t
* http://review.gluster.org/#/c/10053/
* Came back on April 9
* http://build.gluster.org/job/rackspace-regression-2GB-triggered/6658/

Rafi - does this happen anymore? If fixed due to subsequent commits, we should look at dropping this test from is_bad_test() in run-tests.sh.

* tests/basic/uss.t
* https://bugzilla.redhat.com/show_bug.cgi?id=1209286
* http://review.gluster.org/#/c/10143/
* Came back on April 9
* http://build.gluster.org/job/rackspace-regression-2GB-triggered/6660/

* ./tests/bugs/glusterfs/bug-867253.t (Wstat: 0 Tests: 9 Failed: 1)
* Failed test: 8

Raghu - does this happen anymore? If fixed due to subsequent commits, we should look at dropping this test from is_bad_test() in run-tests.sh. -Vijay

I tried to reproduce the issue and it did not happen in my setup. So I am planning to get a slave machine and test it there. Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] spurious regression status
On Thursday 07 May 2015 10:50 AM, Sachin Pandit wrote: - Original Message - From: Vijay Bellur vbel...@redhat.com To: Pranith Kumar Karampuri pkara...@redhat.com, Gluster Devel gluster-devel@gluster.org, Rafi Kavungal Chundattu Parambil rkavu...@redhat.com, Aravinda avish...@redhat.com, Sachin Pandit span...@redhat.com, Raghavendra Bhat rab...@redhat.com, Kotresh Hiremath Ravishankar khire...@redhat.com Sent: Wednesday, May 6, 2015 10:53:01 PM Subject: Re: [Gluster-devel] spurious regression status On 05/06/2015 06:52 AM, Pranith Kumar Karampuri wrote: hi, Please backport the patches that fix spurious regressions to 3.7 as well. This is the status of regressions now:

* ./tests/bugs/quota/bug-1035576.t (Wstat: 0 Tests: 24 Failed: 2)
* Failed tests: 20-21
* http://build.gluster.org/job/rackspace-regression-2GB-triggered/8329/consoleFull

* ./tests/bugs/snapshot/bug-1112559.t: 1 new core files
* http://build.gluster.org/job/rackspace-regression-2GB-triggered/8308/consoleFull
* One more occurrence -
* Failed tests: 9, 11
* http://build.gluster.org/job/rackspace-regression-2GB-triggered/8430/consoleFull

Rafi - this seems to be a test unit contributed by you. Can you please look into this one?

* ./tests/geo-rep/georep-rsync-changelog.t (Wstat: 256 Tests: 3 Failed: 0)
* Non-zero exit status: 1
* http://build.gluster.org/job/rackspace-regression-2GB-triggered/8168/console

Aravinda/Kotresh - any update on this? If we do not intend enabling geo-replication tests in regression runs for now, this should go off the list.

* ./tests/basic/quota-anon-fd-nfs.t (failed-test: 21)
* Happens in: master (http://build.gluster.org/job/rackspace-regression-2GB-triggered/8147/consoleFull)
* Being investigated by: ?

Sachin - does this happen anymore or should we move it off the list?

quota-anon-fd.t failure is consistent in NetBSD, whereas in linux, apart from the test failure mentioned in the etherpad, I did not see this failure again in the regression runs. However, I remember Pranith talking about hitting this issue again.

* tests/features/glupy.t
* nuked tests 7153, 7167, 7169, 7173, 7212

Emmanuel's investigation should help us here. Thanks!

* tests/basic/volume-snapshot-clone.t
* http://review.gluster.org/#/c/10053/
* Came back on April 9
* http://build.gluster.org/job/rackspace-regression-2GB-triggered/6658/

Rafi - does this happen anymore? If fixed due to subsequent commits, we should look at dropping this test from is_bad_test() in run-tests.sh.

* tests/basic/uss.t
* https://bugzilla.redhat.com/show_bug.cgi?id=1209286
* http://review.gluster.org/#/c/10143/
* Came back on April 9
* http://build.gluster.org/job/rackspace-regression-2GB-triggered/6660/

* ./tests/bugs/glusterfs/bug-867253.t (Wstat: 0 Tests: 9 Failed: 1)
* Failed test: 8

Raghu - does this happen anymore? If fixed due to subsequent commits, we should look at dropping this test from is_bad_test() in run-tests.sh. -Vijay

As per the Jenkins output, uss.t is failing in this test case: TEST stat $M0/.history/snap6/aaa. And it is failing with the below error: stat: cannot stat `/mnt/glusterfs/0/.history/snap6/aaa': No such file or directory. It is a bit strange, as before doing this check the file is created in the mount point and then the snapshot is taken. I am not sure whether it is not able to reach the file itself or its parent directory (which represents the snapshot of the volume, i.e. in this case /mnt/glusterfs/0/.history/snap6). 
So I have sent a patch to check for the parent directory (i.e. stat on it). It will help us get more information. http://review.gluster.org/10671 Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] crypt xlator bug
On Thursday 02 April 2015 01:00 PM, Pranith Kumar Karampuri wrote: On 04/02/2015 12:27 AM, Raghavendra Talur wrote: On Wed, Apr 1, 2015 at 10:34 PM, Justin Clift jus...@gluster.org wrote: On 1 Apr 2015, at 10:57, Emmanuel Dreyfus m...@netbsd.org wrote: Hi crypt.t was recently broken in NetBSD regression. The glusterfs returns a node with file type invalid to FUSE, and that breaks the test. After running a git bisect, I found the offending commit after which this behavior appeared: 8a2e2b88fc21dc7879f838d18cd0413dd88023b7 mem-pool: invalidate memory on GF_FREE to aid debugging This means the bug has always been there, but this debugging aid caused it to be reliable. Sounds like that commit is a good win then. :) Harsha/Pranith/Lala, your names are on the git blame for crypt.c... any ideas? :) I found one issue: local is not allocated using GF_CALLOC and with a mem-type. This is a patch which *might* fix it.

diff --git a/xlators/encryption/crypt/src/crypt-mem-types.h b/xlators/encryption/crypt/src/crypt-mem-types.h
index 2eab921..c417b67 100644
--- a/xlators/encryption/crypt/src/crypt-mem-types.h
+++ b/xlators/encryption/crypt/src/crypt-mem-types.h
@@ -24,6 +24,7 @@ enum gf_crypt_mem_types_ {
     gf_crypt_mt_key,
     gf_crypt_mt_iovec,
     gf_crypt_mt_char,
+    gf_crypt_mt_local,
     gf_crypt_mt_end,
 };

diff --git a/xlators/encryption/crypt/src/crypt.c b/xlators/encryption/crypt/src/crypt.c
index ae8cdb2..63c0977 100644
--- a/xlators/encryption/crypt/src/crypt.c
+++ b/xlators/encryption/crypt/src/crypt.c
@@ -48,7 +48,7 @@ static crypt_local_t *crypt_alloc_local(call_frame_t *frame, xlator_t *this,
 {
     crypt_local_t *local = NULL;
-    local = mem_get0(this->local_pool);
+    local = GF_CALLOC (sizeof (*local), 1, gf_crypt_mt_local);

local is using the memory from the pool earlier (i.e. with mem_get0()), which seems ok to me. Changing it this way will include memory allocation in the fop I/O path, which is why xlators generally use the mem-pool approach. Pranith I think the crypt xlator should do a mem_put of local after doing STACK_UNWIND, like other xlators which also use mem_get for local (such as AFR). I am suspecting crypt not doing mem_put might be the reason for the bug mentioned. Regards, Raghavendra Bhat

     if (!local) {
         gf_log(this->name, GF_LOG_ERROR, "out of memory");
         return NULL;

Niels should be able to recognize if this is a sufficient fix or not. Thanks, Raghavendra Talur + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel -- *Raghavendra Talur * ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] crypt xlator bug
On Thursday 02 April 2015 05:50 PM, Jeff Darcy wrote: I think the crypt xlator should do a mem_put of local after doing STACK_UNWIND, like other xlators which also use mem_get for local (such as AFR). I am suspecting crypt not doing mem_put might be the reason for the bug mentioned. My understanding was that mem_put should be called automatically from FRAME_DESTROY, which is itself called from STACK_DESTROY when the fop completes (e.g. at FUSE or GFAPI). On the other hand, I see that AFR and others call mem_put themselves, without zeroing the local pointer. In my (possibly no longer relevant) experience, freeing local myself without zeroing the pointer would lead to a double free, and I don't see why that's not the case here. What am I missing? As per my understanding, the xlators which get local via mem_get should be doing the below things in the callback function just before unwinding: 1) save the frame->local pointer (i.e. local = frame->local); 2) STACK_UNWIND; 3) mem_put (local). After STACK_UNWIND and before mem_put, any reference to an fd or inode or dict that might be present in the local should be unrefed (also any allocated resources present in the local should be freed). So mem_put is done last. To avoid a double free in FRAME_DESTROY, frame->local is set to NULL before doing STACK_UNWIND. I suspect not doing one of the above three operations (maybe either the 1st or the 3rd) in the crypt xlator might be the reason for the bug. Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
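A minimal sketch of that pattern, using a writev callback as an example. This is illustrative only: the crypt_local_t fields and the choice of fop are assumptions, not the actual crypt code.

/* Illustrative sketch of the save/unwind/mem_put pattern described above. */
int32_t
crypt_writev_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
                  int32_t op_ret, int32_t op_errno,
                  struct iatt *prebuf, struct iatt *postbuf, dict_t *xdata)
{
        crypt_local_t *local = NULL;

        /* 1) save the pointer and detach it from the frame so that
         *    FRAME_DESTROY does not try to free it again */
        local = frame->local;
        frame->local = NULL;

        /* 2) unwind first */
        STACK_UNWIND_STRICT (writev, frame, op_ret, op_errno,
                             prebuf, postbuf, xdata);

        /* 3) drop refs held in local, then return it to the pool
         *    (the fd field is only an example of such a reference) */
        if (local) {
                if (local->fd)
                        fd_unref (local->fd);
                mem_put (local);
        }
        return 0;
}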
[Gluster-devel] glusterfs-3.6.3beta2 released
Hi, glusterfs-3.6.3beta2 has been released and can be found here. http://download.gluster.org/pub/gluster/glusterfs/qa-releases/3.6.3beta2/ This beta release supposedly fixes the bugs listed below since 3.6.3beta1 was made available. Thanks to all who submitted the patches, reviewed the changes.

1187526 - Disperse volume mounted through NFS doesn't list any files/directories
1188471 - When the volume is in stopped state/all the bricks are down mount of the volume hangs
1201484 - glusterfs-3.6.2 fails to build on Ubuntu Precise: 'RDMA_OPTION_ID_REUSEADDR' undeclared
1202212 - Performance enhancement for RDMA
1189023 - Directories not visible anymore after add-brick, new brick dirs not part of old bricks
1202673 - Perf: readdirp in replicated volumes causes performance degrade
1203081 - Entries in indices/xattrop directory not removed appropriately
1203648 - Quota: Build ancestry in the lookup
1199936 - readv on /var/run/6b8f1f2526c6af8a87f1bb611ae5a86f.socket failed when NFS is disabled
1200297 - cli crashes when listing quota limits with xml output
1201622 - Convert quota size from n-to-h order before using it
1194141 - AFR : failure in self-heald.t
1201624 - Spurious failure of tests/bugs/quota/bug-1038598.t
1194306 - Do not count files which did not need index heal in the first place as successfully healed
1200258 - Quota: features.quota-deem-statfs is on even after disabling quota.
1165938 - Fix regression test spurious failures
1197598 - NFS logs are filled with system.posix_acl_access messages
1199577 - mount.glusterfs uses /dev/stderr and fails if the device does not exist
1197598 - NFS logs are filled with system.posix_acl_access messages
1188066 - logging improvements in marker translator
1191537 - With afrv2 + ext4, lookups on directories with large offsets could result in duplicate/missing entries
1165129 - libgfapi: use versioned symbols in libgfapi.so for compatibility
1179136 - glusterd: Gluster rebalance status returns failure
1176756 - glusterd: remote locking failure when multiple synctask transactions are run
1188064 - log files get flooded when removexattr() can't find a specified key or value
1165938 - Fix regression test spurious failures
1192522 - index heal doesn't continue crawl on self-heal failure
1193970 - Fix spurious ssl-authz.t regression failure (backport)

Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [PATCH ANNOUNCE] BitRot : Object signing
Hi, These are the patches. http://review.gluster.org/#/c/9705/ http://review.gluster.org/#/c/9706/ http://review.gluster.org/#/c/9707/ http://review.gluster.org/#/c/9708/ http://review.gluster.org/#/c/9709/ http://review.gluster.org/#/c/9710/ http://review.gluster.org/#/c/9711/ http://review.gluster.org/#/c/9712/ Regards, Raghavendra Bhat On Thursday 19 February 2015 07:34 PM, Venky Shankar wrote: Hi folks, Listed below is the initial patchset for the upcoming bitrot detection feature targeted for GlusterFS 3.7. As of now, this set of patches implements object signing. Raghavendra (rabhat@) and I are working on the pending items (scrubber, etc..) and will be sending those patches shortly. Since this is the initial patch set, it might be prone to bugs (as we speak rabhat@ is chasing a memory leak :-)). There is an upcoming event on Google+ Hangout regarding bitrot on Tuesday, 24th March. The hangout session would cover implementation details (algorithm, flow, etc..) and would be beneficial for anyone from code reviewers to users or generally interested parties. Please plan to attend if possible: http://goo.gl/dap9rF As usual, comments/suggestions are more than welcome. Thanks, Venky (overclk on #freenode) ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] glusterfs-3.6.3beta1 released
Hi, glusterfs-3.6.3beta1 has been released and can be found here. http://download.gluster.org/pub/gluster/glusterfs/qa-releases/3.6.3beta1/ This beta release supposedly fixes the bugs listed below since 3.6.2 was made available. Thanks to all who submitted the patches, reviewed the changes.

1138897 - NetBSD port
1184527 - Some newly created folders have root ownership although created by unprivileged user
1181977 - gluster vol clear
1159471 - rename operation leads to core dump
1173528 - Change in volume heal info command output
1186119 - tar on a gluster directory gives message file changed as we read it even though no updates to file in progress
1183716 - Force replace
1178590 - Enable quota(default) leads to heal directory's xattr failed.
1182490 - Internal ec xattrs are allowed to be modified
1187547 - self-heal-algorithm with option full doesn't heal sparse files correctly
1174170 - Glusterfs outputs a lot of warnings and errors when quota is enabled
1186119 - tar on a gluster directory gives message file changed as we read it even though no updates to file in progress

Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] glusterfs-3.6.2beta2
Hi, glusterfs-3.6.2beta2 has been released and can be found here. http://download.gluster.org/pub/gluster/glusterfs/qa-releases/3.6.2beta2/ This beta release supposedly fixes the bugs listed below since 3.6.2beta1 was made available. Thanks to all who submitted the patches, reviewed the changes.

1180404 - nfs server restarts when a snapshot is deactivated
1180411 - CIFS:[USS]: glusterfsd OOM killed when 255 snapshots were browsed at CIFS mount and Control+C is issued
1180070 - [AFR] getfattr on fuse mount gives error : Software caused connection abort
1175753 - [readdir-ahead]: indicate EOF for readdirp
1175752 - [USS]: On a successful lookup, snapd logs are filled with Warnings dict OR key (entry-point) is NULL
1175749 - glusterfs client crashed while migrating the fds
1179658 - Add brick fails if parent dir of new brick and existing brick is same and volume was accessed using libgfapi and smb.
1146524 - glusterfs.spec.in - synch minor diffs with fedora dist-git glusterfs.spec
1175744 - [USS]: Unable to access .snaps after snapshot restore after directories were deleted and recreated
1175742 - [USS]: browsing .snaps directory with CIFS fails with Invalid argument
1175739 - [USS]: Non root user who has no access to a directory, from NFS mount, is able to access the files under .snaps under that directory
1175758 - [USS] : Rebalance process tries to connect to snapd and in case when snapd crashes it might affect rebalance process
1175765 - [USS]: When snapd is crashed gluster volume stop/delete operation fails making the cluster in inconsistent state
1173528 - Change in volume heal info command output
1166515 - [Tracker] RDMA support in glusterfs
1166505 - mount fails for nfs protocol in rdma volumes
1138385 - [DHT:REBALANCE]: Rebalance failures are seen with error message remote operation failed: File exists
1177418 - entry self-heal in 3.5 and 3.6 are not compatible
1170954 - Fix mutex problems reported by coverity scan
1177899 - nfs: ls shows Permission denied with root-squash
1175738 - [USS]: data unavailability for a period of time when USS is enabled/disabled
1175736 - [USS]:After deactivating a snapshot trying to access the remaining activated snapshots from NFS mount gives 'Invalid argument' error
1175735 - [USS]: snapd process is not killed once the glusterd comes back
1175733 - [USS]: If the snap name is same as snap-directory than cd to virtual snap directory fails
1175756 - [USS] : Snapd crashed while trying to access the snapshots under .snaps directory
1175755 - SNAPSHOT[USS]:gluster volume set for uss doesnot check any boundaries
1175732 - [SNAPSHOT]: nouuid is appended for every snapshoted brick which causes duplication if the original brick has already nouuid
1175730 - [USS]: creating file/directories under .snaps shows wrong error message
1175754 - [SNAPSHOT]: before the snap is marked to be deleted if the node goes down than the snaps are propagated on other nodes and glusterd hungs
1159484 - ls -alR can not heal the disperse volume
1138897 - NetBSD port
1175728 - [USS]: All uss related logs are reported under /var/log/glusterfs, it makes sense to move it into subfolder
1170548 - [USS] : don't display the snapshots which are not activated
1170921 - [SNAPSHOT]: snapshot should be deactivated by default when created
1175694 - [SNAPSHOT]: snapshoted volume is read only but it shows rw attributes in mount
1161885 - Possible file corruption on dispersed volumes
1170959 - EC_MAX_NODES is defined incorrectly
1175645 - [USS]: Typo error in the description for USS under gluster volume set help
1171259 - mount.glusterfs does not understand -n option

Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] handling statfs call in USS
On Monday 29 December 2014 01:19 PM, RAGHAVENDRA TALUR wrote: On Sun, Dec 28, 2014 at 5:03 PM, Vijay Bellur vbel...@redhat.com wrote: On 12/24/2014 02:30 PM, Raghavendra Bhat wrote: Hi, I have a doubt. In user serviceable snapshots, as of now, the statfs call is not implemented. There are 2 ways in which statfs can be handled. 1) Whenever the snapview-client xlator gets a statfs call on a path that belongs to the snapshot world, it can send the statfs call to the main volume itself, with the path and the inode being set to the root of the main volume. OR 2) It can redirect the call to the snapshot world (the snapshot daemon which talks to all the snapshots of that particular volume) and send back the reply that it has obtained. Each entry in .snaps can be thought of as a specially mounted read-only filesystem, and doing a statfs in such a filesystem should generate statistics associated with that. So approach 2 seems more appropriate. I agree with Vijay here. Treating each entry in .snaps as a specially mounted read-only filesystem will be required to send proper error codes to Samba. Yeah, makes sense. But one challenge is: if someone does statfs on the .snaps directory itself, what should be done? Because .snaps is a virtual directory. I can think of 2 ways: 1) Make the snapview-server xlator return 0s when it receives statfs on .snaps, so that the o/p is similar to the one obtained when statfs is done on /proc. OR, if the above o/p is not right, 2) If statfs comes on .snaps, then wind the call to the regular volume itself. Anything beyond .snaps will be sent to the snapshot world. Regards, Raghavendra Bhat -Vijay ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
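A rough sketch of option 2) in the snapview-client xlator follows. This is illustrative only: svc_inode_ctx_get() and the VIRTUAL_INODE marker are assumed names for the inode-context lookup that snapview-client uses to decide between the normal graph and the snapshot world.

/* Illustrative sketch only; not the actual snapview-client code. */
int32_t
svc_statfs (call_frame_t *frame, xlator_t *this, loc_t *loc, dict_t *xdata)
{
        int       ret        = -1;
        int       inode_type = -1;
        xlator_t *subvolume  = FIRST_CHILD (this);        /* normal graph */

        /* statfs on anything inside the snapshot world goes to snapd;
         * statfs on .snaps itself (or the regular namespace) stays on
         * the regular volume */
        ret = svc_inode_ctx_get (this, loc->inode, &inode_type);
        if ((ret == 0) && (inode_type == VIRTUAL_INODE))
                subvolume = this->children->next->xlator;  /* snap daemon */

        STACK_WIND (frame, default_statfs_cbk, subvolume,
                    subvolume->fops->statfs, loc, xdata);
        return 0;
}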
[Gluster-devel] 3.6.2beta1
Hi, glusterfs-3.6.2beta1 has been released and the rpms can be found here. Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] 3.6.2beta1
On Friday 26 December 2014 12:22 PM, Raghavendra Bhat wrote: Hi, glusterfs-3.6.2beta1 has been released and the rpms can be found here. Regards, Raghavendra Bhat ___ Gluster-users mailing list gluster-us...@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users Oops. Sorry. Missed the link http://download.gluster.org/pub/gluster/glusterfs/qa-releases/3.6.2beta1/ Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] handling statfs call in USS
Hi, I have a doubt. In user serviceable snapshots as of now statfs call is not implemented. There are 2 ways how statfs can be handled. 1) Whenever snapview-client xlator gets statfs call on a path that belongs to snapshot world, it can send the statfs call to the main volume itself, with the path and the inode being set to the root of the main volume. OR 2) It can redirect the call to the snapshot world (the snapshot demon which talks to all the snapshots of that particular volume) and send back the reply that it has obtained. Please provide feedback. Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] explicit lookup of inodes linked via readdirp
On Thursday 18 December 2014 12:58 PM, Raghavendra Gowdappa wrote: - Original Message - From: Raghavendra Bhat rab...@redhat.com To: Gluster Devel gluster-devel@gluster.org Cc: Anand Avati aav...@redhat.com Sent: Thursday, December 18, 2014 12:31:41 PM Subject: [Gluster-devel] explicit lookup of inodes linked via readdirp Hi, In fuse I saw that, as part of resolving an inode, an explicit lookup is done on it if the inode is found to be linked via readdirp (at the time of linking in readdirp, fuse sets a flag in the inode context). It is done because many xlators such as afr depend upon the lookup call for many things such as healing. Yes. But the lookup is a nameless lookup and hence is not sufficient. Some of the functionalities that get affected AFAIK are: 1. dht cannot create/heal directories and their layouts. 2. afr cannot identify gfid mismatch of a file across its subvolumes, since to identify a gfid mismatch we need a name. From what I heard, afr relies on crawls done by the self-heal daemon for named lookups. But dht is worst hit in terms of maintaining the directory structure on newly added bricks (this problem is slightly different, since we don't hit it because of a nameless lookup after readdirp; instead it is because of a lack of a named lookup on the file after a graph switch. Nevertheless I am clubbing both because a named lookup would've solved the issue). I've a feeling that different components have built their own way of handling what is essentially the same issue. It's better we devise a single comprehensive solution. But that logic is not there in gfapi. I am thinking of introducing that mechanism in gfapi as well, where as part of resolve it checks if the inode is linked from readdirp. And if so, it will do an explicit lookup on that inode. As you've mentioned, a lookup gives afr a chance to heal the file. So, it's needed in gfapi too. However you have to speak to afr folks to discuss whether a nameless lookup is sufficient. As per my understanding, this change in gfapi creates the same chances as that of fuse. When I tried with fuse, where I had a file that needed to be healed, doing ls and cat on the file actually triggered a self-heal on it. So even with gfapi, the change creates the same chances of healing as fuse. Regards, Raghavendra Bhat NOTE: It can be done in the NFS server as well. Dht in an NFS setup is also hit because of the lack of named lookups, resulting in non-healing of directories on a newly added brick. Please provide feedback. Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
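A rough sketch of the gfapi-side check being proposed (illustrative only; the GLFS_INODE_NEEDS_LOOKUP value, the helper name and the use of uuid_is_null are assumptions, mirroring the flag fuse keeps in the inode context):

/* Illustrative sketch, not an actual gfapi patch. Assumes the xlator
 * that links dentries in readdirp_cbk has stored a "linked via
 * readdirp" marker in the inode context, the way fuse does. */
static int
glfs_inode_needs_fresh_lookup (inode_t *inode, xlator_t *subvol)
{
        uint64_t ctx_value = 0;

        if (!inode)
                return 1;

        /* a freshly created inode which has never been looked up */
        if (uuid_is_null (inode->gfid))
                return 1;

        /* linked from a readdirp reply, but never explicitly looked up */
        if ((inode_ctx_get (inode, subvol, &ctx_value) == 0) &&
            (ctx_value == GLFS_INODE_NEEDS_LOOKUP))   /* assumed marker */
                return 1;

        return 0;
}

If this returns 1, the gfapi resolver would issue an explicit (nameless) lookup on that inode before continuing, giving xlators such as afr a chance to heal, just as the fuse resolve path does.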
Re: [Gluster-devel] patches for 3.6.2
On Tuesday 23 December 2014 11:09 AM, Atin Mukherjee wrote: Can you please take in http://review.gluster.org/#/c/9328/ for 3.6.2? ~Atin On 12/19/2014 02:05 PM, Raghavendra Bhat wrote: Hi, glusterfs-3.6.2beta1 has been released. I am planning to make 3.6.2 before end of this year. If there are some patches that has to go in for 3.6.2, please send them by EOD 23-12-2014 (i.e. coming Tuesday) so that I can make a 3.6.2 release sooner. As of now, these are the bugs in new or assigned state. https://bugzilla.redhat.com/buglist.cgi?bug_status=NEWbug_status=ASSIGNEDclassification=Communityf1=blockedlist_id=3106878o1=substringproduct=GlusterFSquery_format=advancedv1=1163723 Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel Sure. Will do it. Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] patches for 3.6.2
Hi, glusterfs-3.6.2beta1 has been released. I am planning to make 3.6.2 before the end of this year. If there are any patches that have to go in for 3.6.2, please send them by EOD 23-12-2014 (i.e. the coming Tuesday) so that I can make the 3.6.2 release sooner. As of now, these are the bugs in new or assigned state. https://bugzilla.redhat.com/buglist.cgi?bug_status=NEWbug_status=ASSIGNEDclassification=Communityf1=blockedlist_id=3106878o1=substringproduct=GlusterFSquery_format=advancedv1=1163723 Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] 3.6.1 issue
On Tuesday 16 December 2014 10:59 PM, David F. Robinson wrote: Gluster 3.6.1 seems to be having an issue creating symbolic links. To reproduce this issue, I downloaded the file dakota-6.1-public.src_.tar.gz from https://dakota.sandia.gov/download.html

# gunzip dakota-6.1-public.src_.tar.gz
# tar -xf dakota-6.1-public.src_.tar
# cd dakota-6.1.0.src/examples/script_interfaces/TankExamples/DakotaList
# ls -al

### Results from my old storage system (non gluster)
corvidpost5:TankExamples/DakotaList ls -al
total 12
drwxr-x--- 2 dfrobins users 112 Dec 16 12:12 ./
drwxr-x--- 6 dfrobins users 117 Dec 16 12:12 ../
lrwxrwxrwx 1 dfrobins users 25 Dec 16 12:12 EvalTank.py -> ../tank_model/EvalTank.py
lrwxrwxrwx 1 dfrobins users 24 Dec 16 12:12 FEMTank.py -> ../tank_model/FEMTank.py
-rwx--x--- 1 dfrobins users 734 Nov 7 11:05 RunTank.sh*
-rw--- 1 dfrobins users 1432 Nov 7 11:05 dakota_PandL_list.in
-rw--- 1 dfrobins users 1860 Nov 7 11:05 dakota_Ponly_list.in

### Results from gluster (broken links that have no permissions)
corvidpost5:TankExamples/DakotaList ls -al
total 5
drwxr-x--- 2 dfrobins users 166 Dec 12 08:43 ./
drwxr-x--- 6 dfrobins users 445 Dec 12 08:43 ../
-- 1 dfrobins users 0 Dec 12 08:43 EvalTank.py
-- 1 dfrobins users 0 Dec 12 08:43 FEMTank.py
-rwx--x--- 1 dfrobins users 734 Nov 7 11:05 RunTank.sh*
-rw--- 1 dfrobins users 1432 Nov 7 11:05 dakota_PandL_list.in
-rw--- 1 dfrobins users 1860 Nov 7 11:05 dakota_Ponly_list.in

=== David F. Robinson, Ph.D. President - Corvid Technologies 704.799.6944 x101 [office] 704.252.1310 [cell] 704.799.7974 [fax] david.robin...@corvidtec.com http://www.corvidtechnologies.com ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel Hi David, Can you please provide the log files? You can find them in /var/log/glusterfs. Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] telldir/seekdir portability fixes
On Wednesday 17 December 2014 02:21 PM, Emmanuel Dreyfus wrote: Hello Any chance http://review.gluster.org/9071 gets merged (and http://review.gluster.org/9084 for release-3.6)? It has been waiting for review for more than a month now. I tried to push the above patch, but it failed with a merge conflict. Can you please rebase and resend it? Regards, Raghavendra Bhat This is the remainder of a fix that was partially done in http://review.gluster.org/8933, and that one has been operating without a hitch for a while. Without the fix, self-heal breaks on NetBSD if it needs to iterate on a directory (that is: content is more than 128k). That is a big roadblock. ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] explicit lookup of inodes linked via readdirp
Hi, In fuse I saw that, as part of resolving an inode, an explicit lookup is done on it if the inode is found to be linked via readdirp (at the time of linking in readdirp, fuse sets a flag in the inode context). It is done because many xlators such as afr depend upon the lookup call for many things such as healing. But that logic is not there in gfapi. I am thinking of introducing that mechanism in gfapi as well, where as part of resolve it checks if the inode is linked from readdirp. And if so, it will do an explicit lookup on that inode. NOTE: It can be done in the NFS server as well. Please provide feedback. Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] snapshot restore and USS
On Monday 01 December 2014 04:51 PM, Raghavendra G wrote: On Fri, Nov 28, 2014 at 6:48 PM, RAGHAVENDRA TALUR raghavendra.ta...@gmail.com wrote: On Thu, Nov 27, 2014 at 2:59 PM, Raghavendra Bhat rab...@redhat.com wrote: Hi, With USS to access snapshots, we depend on the last snapshot of the volume (or the latest snapshot) to resolve some issues. Ex: Say there is a directory called dir within the root of the volume and USS is enabled. Now when .snaps is accessed from dir (i.e. /dir/.snaps), first a lookup is sent on /dir, which the snapview-client xlator passes onto the normal graph till the posix xlator of the brick. Next the lookup comes on /dir/.snaps. The snapview-client xlator now redirects this call to the snap daemon (since .snaps is a virtual directory to access the snapshots). The lookup comes to the snap daemon with the parent gfid set to the gfid of /dir and the basename set to .snaps. The snap daemon will first try to resolve the parent gfid by trying to find the inode for that gfid. But since that gfid was not looked up before in the snap daemon, it will not be able to find the inode. So now, to resolve it, the snap daemon depends upon the latest snapshot: i.e. it tries to look up the gfid of /dir in the latest snapshot, and if it can get the gfid, then the lookup on /dir/.snaps is also successful. From the user point of view, I would like to be able to enter .snaps anywhere. To be able to do that, we can turn the dependency upside down: instead of listing all snaps in the .snaps dir, let's just show whichever snapshots had that dir. Currently readdir in snapview-server lists _all_ the snapshots. However, if you try to do ls on a snapshot which doesn't contain this directory (say dir/.snaps/snap3), I think it returns ESTALE/ENOENT. So, to get what you've explained above, readdir(p) should filter out those snapshots which don't contain this directory (to do that, it has to look up dir on each of the snapshots). Raghavendra Bhat explained the problem and also a possible solution to me in person. There are some pieces missing in the problem description as explained in the mail (but not in the discussion we had). The problem explained here occurs when you restore a snapshot (say snap3) where the directory got created, but was deleted before the next snapshot. So, the directory doesn't exist in snap2 and snap4, but exists only in snap3. Now, when you restore snap3, ls on dir/.snaps should show nothing. Now, what should the result of lookup (gfid-of-dir, .snaps) be? 1. We can blindly return a virtual inode, assuming there is at least one snapshot that contains dir. If fops come on specific snapshots (eg., dir/.snaps/snap4), they'll anyway fail with ENOENT (since dir is not present on any snaps). 2. We can choose to return ENOENT if we figure out that dir is not present on any snaps. The problem we are trying to solve here is how to achieve 2. One simple solution is to look up gfid-of-dir on all the snapshots, and if every lookup fails with ENOENT, we can return ENOENT. The other solution is to just look up in the snapshots before and after (if both are present, otherwise just in the latest snapshot). If both fail, then we can be sure that no snapshots contain that directory. Rabhat, correct me if I've missed out anything :). If a readdir on .snaps entered from a non-root directory has to show the list of only those snapshots where the directory (or rather the gfid of the directory) is present, then the way to achieve it will be a bit costly. 
When readdir comes on .snaps entered from a non-root directory (say ls /dir/.snaps), the following operations have to be performed:
1) In an array we have the names of all the snapshots. So, do a nameless lookup on the gfid of /dir on all the snapshots.
2) Based on which snapshots have sent success to the above lookup, build a new array or list of snapshots.
3) Then send the above new list as the readdir entries.
But the above operation is costlier. Because, just to serve one readdir request we have to make a lookup on each snapshot (if there are 256 snapshots, then we have to make 256 lookup calls over the network). One more thing is resource usage. As of now any snapshot is initialized (i.e. via gfapi a connection is established with the corresponding snapshot volume, which is equivalent to a mounted volume) only when that snapshot is accessed (from the fops point of view, a lookup comes on the snapshot entry, say ls /dir/.snaps/snap1). Now, to serve readdir, all the snapshots will be accessed and all the snapshots are initialized. This means there can be 256 instances of gfapi connections, with each instance having its own inode table and other resources. After readdir if a snapshot is not accessed, so many resources
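A rough sketch of the filtering readdir being discussed, mainly to show where the per-snapshot cost comes from. This is illustrative only; svs_lookup_gfid_in_snap() is an assumed helper that does a nameless lookup of the gfid through that snapshot's gfapi handle.

/* Illustrative sketch only: filter the snapshot list so that readdir
 * on dir/.snaps lists only snapshots that actually contain dir. */
static int
svs_filter_snaps (xlator_t *this, uuid_t dir_gfid,
                  char *snaps[], int snap_count,
                  char *matching[], int *match_count)
{
        int i = 0;

        *match_count = 0;
        for (i = 0; i < snap_count; i++) {
                /* one network lookup per snapshot, per readdir request:
                 * this is the cost (and the forced initialization of
                 * every snapshot) being discussed above */
                if (svs_lookup_gfid_in_snap (this, snaps[i], dir_gfid) == 0)
                        matching[(*match_count)++] = snaps[i];
        }
        return 0;
}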
[Gluster-devel] snapshot restore and USS
Hi, With USS to access snapshots, we depend on the last snapshot of the volume (or the latest snapshot) to resolve some issues. Ex: Say there is a directory called dir within the root of the volume and USS is enabled. Now when .snaps is accessed from dir (i.e. /dir/.snaps), first a lookup is sent on /dir, which the snapview-client xlator passes onto the normal graph till the posix xlator of the brick. Next the lookup comes on /dir/.snaps. The snapview-client xlator now redirects this call to the snap daemon (since .snaps is a virtual directory to access the snapshots). The lookup comes to the snap daemon with the parent gfid set to the gfid of /dir and the basename set to .snaps. The snap daemon will first try to resolve the parent gfid by trying to find the inode for that gfid. But since that gfid was not looked up before in the snap daemon, it will not be able to find the inode. So now, to resolve it, the snap daemon depends upon the latest snapshot: i.e. it tries to look up the gfid of /dir in the latest snapshot, and if it can get the gfid, then the lookup on /dir/.snaps is also successful. But there can be some confusion in the case of snapshot restore. Say there are 5 snapshots (snap1, snap2, snap3, snap4, snap5) for a volume vol. Now say the volume is restored to snap3. If there was a directory called /a at the time of taking snap3 which was later removed, then after the snapshot restore, accessing .snaps from that directory (in fact, from any directory which was present while taking snap3) might cause problems. Because now the original volume is nothing but snap3, and when the snap daemon gets the lookup on /a/.snaps, it tries to find the gfid of /a in the latest snapshot (which is snap5). If /a was removed after taking snap3, then the lookup of /a in snap5 fails, and thus the lookup of /a/.snaps also fails. Possible Solution: One possible solution that can be helpful in this case is: whenever glusterd sends the list of snapshots to the snap daemon after a snapshot restore, send the list in such a way that the snapshot which is previous to the restored snapshot is sent as the latest snapshot (in the example above, since snap3 is restored, glusterd should send snap2 as the latest snapshot to the snap daemon). But in the above solution also there is a problem. If there are only 2 snapshots (snap1, snap2) and the volume is restored to the first snapshot (snap1), there is no previous snapshot to look at, and glusterd will send only one name in the list, which is snap2, but that snapshot is in a state newer than the volume. A patch has been submitted for review to handle this (http://review.gluster.org/#/c/9094/). And in the patch, because of the above confusions, snapd tries to consult the adjacent snapshots of the restored snapshot to resolve the gfids. As per the 5-snapshots example, it tries to look at snap2 and snap4 (i.e. look into snap2 first; if that fails, then look into snap4). If there is no previous snapshot, then look at the next snapshot (the 2-snapshots example). If there is no next snapshot, then look at the previous snapshot. Please provide feedback about how this issue can be handled. Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
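A rough sketch of the adjacent-snapshot selection described above. This is illustrative only and not the code in http://review.gluster.org/#/c/9094/; it just expresses the previous-then-next fallback as a helper, where restored_idx is taken to be the position the restored snapshot occupied in the time-ordered snapshot list.

/* Illustrative sketch only: pick which snapshots to consult, in order,
 * when resolving a gfid after a snapshot restore. */
static int
svs_pick_adjacent_snaps (int restored_idx, int snap_count,
                         int *first_try, int *second_try)
{
        *first_try  = -1;
        *second_try = -1;

        if (snap_count <= 0)
                return -1;

        if (restored_idx > 0)
                *first_try = restored_idx - 1;       /* previous snapshot */
        if (restored_idx + 1 < snap_count)
                *second_try = restored_idx + 1;      /* next snapshot */

        /* only a next snapshot exists (the oldest snapshot was restored) */
        if (*first_try == -1) {
                *first_try  = *second_try;
                *second_try = -1;
        }

        return (*first_try == -1) ? -1 : 0;
}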
[Gluster-devel] documentation on inode and dentry management
Hi, I have sent a patch to add the info on how glusterfs manages inodes and dentries. http://review.gluster.org/#/c/8815/ Please review it and provide feedback to improve it. Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] inode linking in GlusterFS NFS server
Hi, As per my understanding, the nfs server is not doing inode linking in the readdirp callback. Because of this there might be some errors while dealing with virtual inodes (or gfids). As of now the meta, gfid-access and snapview-server (used for user serviceable snapshots) xlators make use of virtual inodes with random gfids. The situation is this: say the user serviceable snapshots feature has been enabled and there are 2 snapshots (snap1 and snap2). Let /mnt/nfs be the nfs mount. Now the snapshots can be accessed by entering the .snaps directory. Now if the snap1 directory is entered and *ls -l* is done (i.e. cd /mnt/nfs/.snaps/snap1 and then ls -l), the readdirp fop is sent to the snapview-server xlator (which is part of a daemon running for the volume), which talks to the corresponding snapshot volume and gets the dentry list. Before unwinding it would have generated random gfids for those dentries. Now the nfs server, upon getting the readdirp reply, will associate the gfid with the filehandle created for the entry. But without linking the inode, it would send the readdirp reply back to the nfs client. Now, the next time the nfs client performs some operation on one of those filehandles, the nfs server tries to resolve it by finding the inode for the gfid present in the filehandle. But since the inode was not linked in readdirp, the inode_find operation fails and it tries to do a hard resolution by sending the lookup operation on that gfid to the normal main graph. (The information on whether the call should be sent to the main graph or to snapview-server would be present in the inode context. But here the lookup has come on a gfid with a newly created inode where the context is not there yet. So the call would be sent to the main graph itself.) But since the gfid is a randomly generated virtual gfid (not present on disk), the lookup operation fails with an error. As per my understanding this can happen with any xlator that deals with virtual inodes (by generating random gfids). I can think of these 2 methods to handle this: 1) do inode linking for readdirp also in the nfs server 2) if a lookup operation fails, the snapview-client xlator (which actually redirects the fops on the snapshot world to snapview-server by looking into the inode context) should check if the failed lookup is a nameless lookup. If so, AND the gfid of the inode is NULL AND the lookup has come from the main graph, then instead of unwinding the lookup with failure, send it to snapview-server, which might be able to find the inode for the gfid (as the gfid was generated by itself, it should be able to find the inode for that gfid unless it has been purged from the inode table). Please let me know if I have missed anything. Please provide feedback. Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] inode linking in GlusterFS NFS server
On Tuesday 08 July 2014 01:21 AM, Anand Avati wrote: On Mon, Jul 7, 2014 at 12:48 PM, Raghavendra Bhat rab...@redhat.com mailto:rab...@redhat.com wrote: Hi, As per my understanding nfs server is not doing inode linking in readdirp callback. Because of this there might be some errors while dealing with virtual inodes (or gfids). As of now meta, gfid-access and snapview-server (used for user serviceable snapshots) xlators makes use of virtual inodes with random gfids. The situation is this: Say User serviceable snapshot feature has been enabled and there are 2 snapshots (snap1 and snap2). Let /mnt/nfs be the nfs mount. Now the snapshots can be accessed by entering .snaps directory. Now if snap1 directory is entered and *ls -l* is done (i.e. cd /mnt/nfs/.snaps/snap1 and then ls -l), the readdirp fop is sent to the snapview-server xlator (which is part of a daemon running for the volume), which talks to the corresponding snapshot volume and gets the dentry list. Before unwinding it would have generated random gfids for those dentries. Now nfs server upon getting readdirp reply, will associate the gfid with the filehandle created for the entry. But without linking the inode, it would send the readdirp reply back to nfs client. Now next time when nfs client makes some operation on one of those filehandles, nfs server tries to resolve it by finding the inode for the gfid present in the filehandle. But since the inode was not linked in readdirp, inode_find operation fails and it tries to do a hard resolution by sending the lookup operation on that gfid to the normal main graph. (The information on whether the call should be sent to main graph or snapview-server would be present in the inode context. But here the lookup has come on a gfid with a newly created inode where the context is not there yet. So the call would be sent to the main graph itself). But since the gfid is a randomly generated virtual gfid (not present on disk), the lookup operation fails giving error. As per my understanding this can happen with any xlator that deals with virtual inodes (by generating random gfids). I can think of these 2 methods to handle this: 1) do inode linking for readdirp also in nfs server 2) If lookup operation fails, snapview-client xlator (which actually redirects the fops on snapshot world to snapview-server by looking into the inode context) should check if the failed lookup is a nameless lookup. If so, AND the gfid of the inode is NULL AND lookup has come from main graph, then instead of unwinding the lookup with failure, send it to snapview-server which might be able to find the inode for the gfid (as the gfid was generated by itself, it should be able to find the inode for that gfid unless and until it has been purged from the inode table). Please let me know if I have missed anything. Please provide feedback. That's right. NFS server should be linking readdirp_cbk inodes just like FUSE or protocol/server. It has been OK without virtual gfids thus far. I did the changes to link inodes in readdirp_cbk in nfs server. It seems to work fine. Should we need the second change also? (i.e chage in the snapview-client to redirect the fresh nameless lookups to snapview-server). With nfs server linking the inodes in readdirp, I think second change might not be needed. Regards, Raghavendra Bhat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
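A rough sketch of what linking the readdirp entries in the nfs server's readdirp callback could look like, following the same pattern protocol/server and fuse use. This is illustrative only and not the actual change.

/* Illustrative sketch only: link each readdirp entry into the inode
 * table so that a later inode_find() on the gfid embedded in the NFS
 * filehandle succeeds instead of falling back to a hard resolution. */
int
nfs_readdirp_link_entries (xlator_t *this, inode_t *parent,
                           gf_dirent_t *entries)
{
        gf_dirent_t *entry      = NULL;
        inode_t     *link_inode = NULL;

        list_for_each_entry (entry, &entries->list, list) {
                if (!entry->inode)
                        continue;

                link_inode = inode_link (entry->inode, parent,
                                         entry->d_name, &entry->d_stat);
                if (!link_inode)
                        continue;

                inode_lookup (link_inode);
                inode_unref (link_inode);
        }
        return 0;
}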
[Gluster-devel] spurious failure (bug-1112559.t)
Hi,

I think the regression test bug-1112559.t is causing some spurious failures; I see some regression jobs failing because of it.

Regards,
Raghavendra Bhat

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Need clarification regarding the force option for snapshot delete.
On Friday 27 June 2014 10:47 AM, Raghavendra Talur wrote:
Inline.

- Original Message -
From: Atin Mukherjee amukh...@redhat.com
To: Sachin Pandit span...@redhat.com, Gluster Devel gluster-devel@gluster.org, gluster-us...@gluster.org
Sent: Thursday, June 26, 2014 3:30:31 PM
Subject: Re: [Gluster-devel] Need clarification regarding the force option for snapshot delete.

On 06/26/2014 01:58 PM, Sachin Pandit wrote:
Hi all,

We had some concerns regarding the snapshot delete force option, which is why we thought of getting advice from everyone out here. Currently when we give "gluster snapshot delete snapname", it gives a notification saying that the mentioned snapshot will be deleted: "Do you still want to continue (y/n)?". As soon as the user presses y it deletes the snapshot. Our new proposal is: when a user issues the snapshot delete command without force, the user should be given a notification asking to use the force option to delete the snap.

In that case "gluster snapshot delete snapname" becomes useless apart from throwing a notification. If we can ensure that "snapshot delete all" works only with the force option, then "gluster snapshot delete volname" can keep working as it does now.
~Atin

Agree with Atin here; asking the user to execute the same command with force appended is not right.

When the snapshot delete command is issued with the force option, the user should be given a notification saying "Mentioned snapshot will be deleted. Do you still want to continue (y/n)". The reason we thought of bringing this up is that we plan to introduce a command "gluster snapshot delete all", which deletes all the snapshots in a system, and "gluster snapshot delete volume volname", which deletes all the snapshots in the mentioned volume. If a user accidentally issues one of the above commands and presses y, they might lose a few or more snapshots present in the volume/system (thinking it will ask for confirmation for each delete).

It would be good to have this feature, asking for y for every delete; when force is used we don't ask confirmation for each, similar to rm -f. If that is not feasible as of now, is something like this better?

Case 1: Single snap
[root@snapshot-24 glusterfs]# gluster snapshot delete snap1
Deleting snap will erase all the information about the snap. Do you still want to continue? (y/n) y
[root@snapshot-24 glusterfs]#

Case 2: Delete all system snaps
[root@snapshot-24 glusterfs]# gluster snapshot delete all
Deleting N snaps stored on the system. Do you still want to continue? (y/n) y
[root@snapshot-24 glusterfs]#

Case 3: Delete all volume snaps
[root@snapshot-24 glusterfs]# gluster snapshot delete volume volname
Deleting N snaps for the volume volname. Do you still want to continue? (y/n) y
[root@snapshot-24 glusterfs]#

The idea here is that if the warnings for the different commands are different, users may pause for a moment to read and check the message. We can even list the snaps to be deleted even if we don't ask for confirmation for each.
Raghavendra Talur

Agree with Raghavendra Talur. It would be better to ask the user without the force option. The above method suggested by Talur seems neat.
Regards,
Raghavendra Bhat

Do you think a notification would be more than enough, or do we need to introduce a force option?

Current procedure:
[root@snapshot-24 glusterfs]# gluster snapshot delete snap1
Deleting snap will erase all the information about the snap. Do you still want to continue? (y/n)

Proposed procedure:
[root@snapshot-24 glusterfs]# gluster snapshot delete snap1
Please use the force option to delete the snap.
[root@snapshot-24 glusterfs]# gluster snapshot delete snap1 force
Deleting snap will erase all the information about the snap. Do you still want to continue? (y/n)

We are looking forward to feedback on this.
Thanks,
Sachin Pandit.

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel
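To make the scope-specific prompts proposed above concrete, here is a small, self-contained and purely illustrative C sketch of that confirmation logic. The function name, prompt strings and scope encoding are assumptions for illustration, not the actual gluster CLI code.

    #include <stdio.h>

    /* scope of the delete: 0 = one snap, 1 = all snaps of a volume,
     * 2 = all snaps on the system */
    static int
    confirm_delete (int scope, const char *name, int count)
    {
            char answer = 'n';

            if (scope == 0)
                    printf ("Deleting snap %s will erase all the information "
                            "about the snap. Do you still want to continue? (y/n) ",
                            name);
            else if (scope == 1)
                    printf ("Deleting %d snaps for the volume %s. "
                            "Do you still want to continue? (y/n) ", count, name);
            else
                    printf ("Deleting %d snaps stored on the system. "
                            "Do you still want to continue? (y/n) ", count);

            if (scanf (" %c", &answer) != 1)
                    return 0;

            return (answer == 'y' || answer == 'Y');
    }

    int
    main (void)
    {
            if (confirm_delete (0, "snap1", 1))
                    printf ("snap1 deleted\n");
            return 0;
    }

The point of the sketch is simply that the warning text differs with the scope of the delete, so a user is more likely to notice when a command is about to remove many snapshots rather than one.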
Re: [Gluster-devel] regarding inode-unref on root inode
On Tuesday 24 June 2014 08:17 PM, Pranith Kumar Karampuri wrote:
> Does anyone know why inode_unref is a no-op for the root inode? I see the following code in inode.c:
>
> static inode_t *
> __inode_unref (inode_t *inode)
> {
>         if (!inode)
>                 return NULL;
>
>         if (__is_root_gfid (inode->gfid))
>                 return inode;
>         ...
> }

I think it is done with the intention that the root inode should never get removed from the active inodes list, not even accidentally. So an unref on the root inode is a no-op. I don't know whether there are any other reasons.

Regards,
Raghavendra Bhat

> Pranith

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel
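For context, the root inode is identified by the fixed gfid 00000000-0000-0000-0000-000000000001, which is what the guard above is checking for. A standalone sketch of that check is below; the real helper lives in libglusterfs, so this version is for illustration only.

    #include <string.h>
    #include <stdbool.h>

    /* Illustrative stand-in for __is_root_gfid(): true only for the
     * all-zero gfid ending in 1, i.e. the root of the volume. */
    static bool
    is_root_gfid (const unsigned char gfid[16])
    {
            static const unsigned char root_gfid[16] = { [15] = 1 };

            return memcmp (gfid, root_gfid, 16) == 0;
    }

Since every lookup and resolution ultimately hangs off the root inode, making unref a no-op for it guarantees it can never be pruned out of the inode table by a stray extra unref.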
[Gluster-devel] glupy test failing
Hi,

I am seeing the glupy.t test failing in some test runs. It is failing on my local machine as well (with latest master). Is it a genuine failure or a spurious one?

/tests/features/glupy.t (Wstat: 0 Tests: 6 Failed: 2)
Failed tests: 2, 6

As per the log file of the fuse mount done in the test case, this is the error:

[2014-06-20 14:15:53.038826] I [MSGID: 100030] [glusterfsd.c:1998:main] 0-glusterfs: Started running glusterfs version 3.5qa2 (args: glusterfs -f /d/backends/glupytest.vol /mnt/glusterfs/0)
[2014-06-20 14:15:53.059484] E [glupy.c:2382:init] 0-vol-glupy: Python import failed
[2014-06-20 14:15:53.059575] E [xlator.c:425:xlator_init] 0-vol-glupy: Initialization of volume 'vol-glupy' failed, review your volfile again
[2014-06-20 14:15:53.059587] E [graph.c:322:glusterfs_graph_init] 0-vol-glupy: initializing translator failed
[2014-06-20 14:15:53.059595] E [graph.c:525:glusterfs_graph_activate] 0-graph: init failed
[2014-06-20 14:15:53.060045] W [glusterfsd.c:1182:cleanup_and_exit] (-- 0-: received signum (0), shutting down
[2014-06-20 14:15:53.060090] I [fuse-bridge.c:5561:fini] 0-fuse: Unmounting '/mnt/glusterfs/0'.
[2014-06-20 14:19:01.867378] I [MSGID: 100030] [glusterfsd.c:1998:main] 0-glusterfs: Started running glusterfs version 3.5qa2 (args: glusterfs -f /d/backends/glupytest.vol /mnt/glusterfs/0)
[2014-06-20 14:19:01.897158] E [glupy.c:2382:init] 0-vol-glupy: Python import failed
[2014-06-20 14:19:01.897241] E [xlator.c:425:xlator_init] 0-vol-glupy: Initialization of volume 'vol-glupy' failed, review your volfile again
[2014-06-20 14:19:01.897252] E [graph.c:322:glusterfs_graph_init] 0-vol-glupy: initializing translator failed
[2014-06-20 14:19:01.897260] E [graph.c:525:glusterfs_graph_activate] 0-graph: init failed
[2014-06-20 14:19:01.897635] W [glusterfsd.c:1182:cleanup_and_exit] (-- 0-: received signum (0), shutting down
[2014-06-20 14:19:01.897677] I [fuse-bridge.c:5561:fini] 0-fuse: Unmounting '/mnt/glusterfs/0'.

Regards,
Raghavendra Bhat

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] autodelete in snapshots
On Wednesday 04 June 2014 11:23 AM, Rajesh Joseph wrote:

- Original Message -
From: M S Vishwanath Bhat msvb...@gmail.com
To: Rajesh Joseph rjos...@redhat.com
Cc: Vijay Bellur vbel...@redhat.com, Seema Naik sen...@redhat.com, Gluster Devel gluster-devel@gluster.org
Sent: Tuesday, June 3, 2014 5:55:27 PM
Subject: Re: [Gluster-devel] autodelete in snapshots

On 3 June 2014 15:21, Rajesh Joseph rjos...@redhat.com wrote:

- Original Message -
From: M S Vishwanath Bhat msvb...@gmail.com
To: Vijay Bellur vbel...@redhat.com
Cc: Seema Naik sen...@redhat.com, Gluster Devel gluster-devel@gluster.org
Sent: Tuesday, June 3, 2014 1:02:08 AM
Subject: Re: [Gluster-devel] autodelete in snapshots

On 2 June 2014 20:22, Vijay Bellur vbel...@redhat.com wrote:
On 04/23/2014 05:50 AM, Vijay Bellur wrote:
On 04/20/2014 11:42 PM, Lalatendu Mohanty wrote:
On 04/16/2014 11:39 AM, Avra Sengupta wrote:

The whole purpose of introducing the soft-limit is that at any point of time the number of snaps should not exceed the hard limit. If we trigger auto-delete on hitting the hard-limit, then the purpose itself is lost, because at that point we would be taking a snap, making the count hard-limit + 1, and then triggering auto-delete, which violates the sanctity of the hard-limit. Also, what happens when we are at hard-limit + 1 and another snap is issued while auto-delete is yet to process the first delete? At that point we end up at hard-limit + 1. And what happens if the auto-delete fails for a particular snap?

We should see the hard-limit as something set by the admin keeping in mind the resource consumption, and at no point should we cross this limit, come what may. If we hit this limit, the create command should fail, asking the user to delete snaps using the snapshot delete command. The two options Raghavendra mentioned are applicable to the soft-limit only; on hitting the soft-limit we either 1. trigger auto-delete, or 2. log a warning message for the user saying the number of snaps is exceeding the snap-limit, and display the number of available snaps.

Which of these should happen also depends on the user, because the auto-delete option is configurable. So if the auto-delete option is set to true, auto-delete should be triggered and the above message should also be logged. If the option is set to false, only the message should be logged. This is the behaviour as designed. Adding Rahul and Seema in the mail, to reflect upon the behaviour as well.
Regards,
Avra

This sounds correct. However we need to make sure that the usage or documentation around this is good enough, so that users understand each of the limits correctly.

It might be better to avoid the usage of the term soft-limit. soft-limit as used in quota and other places generally has an alerting connotation. Something like auto-deletion-limit might be better.

I still see references to soft-limit, and auto deletion seems to get triggered upon reaching the soft-limit. Why is the ability to auto delete not configurable? It does seem pretty nasty to go about deleting snapshots without obtaining explicit consent from the user.

I agree with Vijay here. It's not good to delete a snap (even though it is the oldest) without explicit consent from the user. FYI, it took me more than 2 weeks to figure out that my snaps were getting auto-deleted after reaching the soft-limit. For all I knew I had not done anything, and my snap restores were failing.

I propose to remove the terms soft and hard limit. I believe there should be a limit (just "limit") after which all snapshot creates should fail with proper error messages, and there can be a water-mark after which the user should get warning messages. So below is my proposal.

auto-delete + snap-limit: If the snap-limit is set to n, the next snap create (the (n+1)th) will succeed only if auto-delete is set to on/true/1, and the oldest snap will get deleted automatically. If auto-delete is set to off/false/0, the (n+1)th snap create will fail with a proper error message from the gluster CLI. But again, by default auto-delete should be off.

snap-water-mark: This should come into the picture only if auto-delete is turned off; it should not have any meaning if auto-delete is turned on. Basically its usage is to warn the user that the limit is almost being reached and it is time for the admin to decide which snaps should be deleted (or which should be kept).

*my two cents*
-MS

The reason for having a hard-limit is to stop snapshot creation once we have reached this limit. This helps to keep control over resource consumption. Therefore if we have only this one limit (as snap-limit) then there is no question of auto-delete: auto-delete can only be triggered once the count crosses the limit. Therefore we introduced the concept of a soft-limit and a hard-limit. As the name suggests, once the hard-limit is reached no more snaps will be created.

Perhaps I could have been clearer. auto-delete value does come into
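The soft-limit/hard-limit semantics argued for in this thread can be summarised in a small, self-contained sketch. Everything here (the limit values, variable and function names) is illustrative only and not taken from the actual glusterd implementation.

    #include <stdio.h>
    #include <stdbool.h>

    #define SNAP_HARD_LIMIT 256     /* assumed example values, not defaults */
    #define SNAP_SOFT_LIMIT 230

    static int  snap_count  = 230;
    static bool auto_delete = false;

    static void
    delete_oldest_snap (void)
    {
            snap_count--;   /* stand-in for deleting the oldest snapshot */
    }

    static int
    snap_create (void)
    {
            if (snap_count >= SNAP_HARD_LIMIT) {
                    /* never cross the hard limit; the admin must delete snaps */
                    printf ("snapshot create failed: hard limit (%d) reached\n",
                            SNAP_HARD_LIMIT);
                    return -1;
            }

            if (snap_count >= SNAP_SOFT_LIMIT) {
                    if (auto_delete)
                            delete_oldest_snap ();   /* triggered only at the soft limit */
                    else
                            printf ("warning: snapshot count %d has reached the "
                                    "soft limit %d\n", snap_count, SNAP_SOFT_LIMIT);
            }

            snap_count++;   /* stand-in for actually taking the snapshot */
            return 0;
    }

    int
    main (void)
    {
            return snap_create () == 0 ? 0 : 1;
    }

The key property both sides of the thread agree on is visible in the sketch: creation is refused outright at the hard limit, while auto-deletion (or merely a warning, when it is disabled) happens only at the lower threshold, so the hard limit is never exceeded.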
[Gluster-devel] inode lru limit
Hi,

Currently the lru-limit of the inode table in brick processes is 16384, and there is an option to configure it to some other value. The protocol/server xlator uses the inode_lru_limit variable present in its private structure while creating the inode table (whose default value is 16384). When the option is reconfigured via volume set, the protocol/server's inode_lru_limit variable in its private structure is changed, but the actual limit of the inode table still remains the old one; only when the brick is restarted does the newly set value come into effect.

Is that ok? Should we change the inode table's lru_limit variable also as part of reconfigure? If so, then we would probably have to remove the extra inodes present in the lru list by calling inode_table_prune.

Please provide feedback.

Regards,
Raghavendra Bhat

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel
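A rough sketch of the change being proposed is below. It assumes the code would live in protocol/server's reconfigure path; the names conf and itable, and the callability of inode_table_prune() from outside inode.c, are assumptions for illustration rather than the actual patch.

    /* Rough sketch (illustrative only): make the reconfigured value take
     * effect immediately instead of waiting for a brick restart.  "conf"
     * stands for protocol/server's private structure, "itable" for the
     * inode table already created for the brick, and "new_limit" for the
     * freshly parsed option value. */
    conf->inode_lru_limit = new_limit;          /* what happens today */

    if (itable) {
            itable->lru_limit = new_limit;      /* resize the live table too */
            /* evict inodes beyond the new limit from the lru list;
             * inode_table_prune() may need to be exported for this */
            inode_table_prune (itable);
    }

The design question in the mail is exactly the second half of the sketch: whether the live table's lru_limit should be updated and the lru list pruned at reconfigure time, rather than leaving the old size in place until a restart.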