Re: [Gluster-devel] Multi-network support proposal
On 13 February 2015 18:27:54 CET, Jeff Darcy jda...@redhat.com wrote:

This is a proposal for a lightweight version of multi-network support, somewhat limited in functionality but implementable quickly, because it seems unlikely that anyone will be able to spend much time on a full version. Here's the corresponding feature page:

http://www.gluster.org/community/documentation/index.php/Features/SplitNetwork

There are three things a user must be able to do:

* Create new named networks (a default network always exists)
* Associate specific host interface names/addresses with networks
* Associate volume roles (e.g. I/O, rebalance) with networks

The crux of this proposal is to do what we can by rewriting volfiles between when they're fetched and when they're used to construct a translator graph. Each protocol/client translator within a volfile originally uses a server's *canonical* name - the one on the default network, used for peer probe and other glusterd-to-glusterd operations. For each such translator, the volfile consumer must do the following:

(1) Determine its own role (e.g. client, self-heal daemon)
(2) Use the {volume, role} tuple to find the right *network* to use
(3) Use the {network, canonical_name} tuple to find the right *interface* to use
(4) Replace the server's canonical name (remote-host option) with the interface name

For example, consider the following (don't worry about the syntax yet).

    gluster volume create fubar server1:/brick server2:/brick
    gluster network create client-net
    gluster network associate client-net server1 server1-public
    gluster network associate client-net server2 server2-public
    gluster volume set-role fubar mounts client-net

According to step (2) above, any native mount of fubar will know to use client-net. Therefore, according to step (3), it should make the following substitutions:

    server1 -> server1-public
    server2 -> server2-public

By contrast, glusterd would still use the plain old server1 and server2 on the default network.
Three kinds of enhancements are being left to the future:

(a) Determine the network to use based on *client* identity as well as {volume, role}
(b) Allow a volume to be used only through certain networks
(c) Modify other parameters (e.g. SSL usage/credentials) based on network or interface

Any other thoughts?

Perhaps bandwidth allocation?

/Anders
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] GlusterFS 3.7 updates needed
On 2015-01-27 11:49, Vijay Bellur wrote:

Hi All,

As we approach feature freeze for 3.7 (28th Feb), it would be great to have a status update for features being worked on for the release.

Not exactly a new feature, but something I would like to see happen:

[Gluster-devel] Glusterd daemon management code refactoring
http://www.gluster.org/pipermail/gluster-devel/2014-December/043180.html

/Anders
--
Anders Blomdell
Department of Automatic Control, Lund University
P.O. Box 118, SE-221 00 Lund, Sweden
Email: anders.blomd...@control.lth.se
Phone: +46 46 222 4625, Fax: +46 46 138118
Re: [Gluster-devel] Glusterd daemon management code refactoring
On 2014-12-09 16:23, Anders Blomdell wrote:
On 2014-12-09 13:18, Krishnan Parthasarathi wrote:

All,

I would like to propose refactoring of the code managing various daemons in glusterd. Unlike other high(er) level proposals about feature design, this one is at the implementation level. Please go through the details of the proposal below and share your thoughts/suggestions on the approach.

### Introduction

Glusterd manages GlusterFS daemons providing services like NFS, proactive self-heal, quota, user-serviceable snapshots etc. The following are some of the aspects that come under daemon management:

- Connection management
  - Unix-domain-socket based channel for internal communication
  - Methods - connect, disconnect, notify
- Process management
  - pidfile to detect if the daemon is running
  - Environment: run-dir, svc-dir, log-dir etc.
  - Methods - start, stop, status, kill
- Daemon-specific management

Currently, the daemon management code is fragmented and doesn't exhibit the structure described above. This results in further fragmentation, since new developers may not identify common patterns; worse still, they won't be able to do anything about it. This proposal aims to do the following:

- Provide an abstract data type that encapsulates what is common among daemons that are managed by glusterd.
- 'Port' existing code to make use of the abstract type. This would help in making this change self-documented to an extent.
- Prescribe a way to keep per-feature daemon code separate from glusterd's common code.
### Abstract data types

    struct conn_mgmt {
            struct rpc_clnt *rpc;

            int (*connect) (struct conn_mgmt *self);
            int (*disconnect) (struct conn_mgmt *self);
            int (*notify) (struct conn_mgmt *self, rpc_clnt_event_t *rpc_event);
    };

Great, one place to fix IPv6/IPv4 coexistence :-) Just for cross referencing: https://bugzilla.redhat.com/show_bug.cgi?id=1117886

    struct proc_mgmt {
            char svcdir[PATH_MAX];
            char rundir[PATH_MAX];
            char logdir[PATH_MAX];
            char pidfile[PATH_MAX];
            char logfile[PATH_MAX];
            char volid[UUID_CANONICAL_FORM_LEN];

            int (*start) (struct proc_mgmt *self, int flags);
            int (*stop) (struct proc_mgmt *self, int flags);
            int (*is_running) (struct proc_mgmt *self);
            int (*kill) (struct proc_mgmt *self, int flags);
    };

Feature authors can define a data type representing their service by implementing the above 'abstract' class. For e.g,

    struct my_service {
            char name[PATH_MAX];

            /* my_service specific data members and methods */

            /* The methods in the following structures should be
               implemented by the respective feature authors */
            struct conn_mgmt conn;
            struct proc_mgmt proc;
    };

### Code structure guidelines

Each feature that introduces a daemon would implement the abstract data type. The implementations should be in separate files, named appropriately. The intent is to avoid feature-specific code leaking into the common glusterd codebase; glusterd-utils.c is a testament to such practices in the past. For e.g,

    [kp@trantor glusterd]$ tree
    .
    └── src
        ├── glusterd-conn-mgmt.c
        ├── glusterd-conn-mgmt.h
        ├── glusterd-proc-mgmt.c
        ├── glusterd-proc-mgmt.h
        ├── my-feature-service.c
        └── my-feature-service.h

    [kp@trantor glusterd]$ cat src/my-feature-service.h
    #include "glusterd-conn-mgmt.h"
    #include "glusterd-proc-mgmt.h"
    ...
[rest of the code elided]

### Bibliography

- Object-oriented design patterns in the kernel, part 1 - http://lwn.net/Articles/444910/
- Object-oriented design patterns in the kernel, part 2 - http://lwn.net/Articles/446317/

thanks,
kp

/Anders
Re: [Gluster-devel] Glusterd daemon management code refactoring
On 2014-12-12 10:26, Krishnan Parthasarathi wrote:

> Anders,
>
>>> ### Abstract data types
>>>
>>> struct conn_mgmt {
>>>         struct rpc_clnt *rpc;
>>>
>>>         int (*connect) (struct conn_mgmt *self);
>>>         int (*disconnect) (struct conn_mgmt *self);
>>>         int (*notify) (struct conn_mgmt *self, rpc_clnt_event_t *rpc_event);
>>> };
>>
>> Great, one place to fix IPv6/IPv4 coexistence :-) Just for cross
>> referencing: https://bugzilla.redhat.com/show_bug.cgi?id=1117886
>
> I am glad that this refactoring has positive side-effects that we didn't imagine :-)

One thing to look out for is 'gluster poll', which is quite special: it should probably try all addresses returned by getaddrinfo. For small installations (< 20 machines, say) one could give all machines the same alias name, and mounting with that alias name would work automagically even with some machines down, since mounting would try until it found a responding machine; but 'gluster poll' on that alias name would probably break things if not anticipated beforehand.

Just my 5 ¢.

/Anders
Re: [Gluster-devel] memory leaks
On 2014-11-04 10:38, Emmanuel Dreyfus wrote:

> Hi
>
> FWIW, there are still memory leaks in glusterfs 3.6.0. My favourite test is building NetBSD on a replicated volume, and it fails because the machine runs out of swap. After building for 14 hours and now idle, client glusterfsd grew to 2024 MB, one server glusterfsd grew to 1164 MB, the other one to 1290 MB

Don't you mean glusterfs (the FUSE client)? At least I have the same issues with glusterfs on Fedora 20; it looks like the client gains some fat as files and/or directories are added.

Any info on how to debug memory issues on the client side (valgrind goes belly-up with tests of reasonable size)? For instance, is there any way to turn off internal memory pools (if there are any), in order to closely follow memory allocations with a suitable LD_PRELOAD malloc/free/... wrapper?

/Anders
Re: [Gluster-devel] [Gluster-users] Dependency issue while installing glusterfs-3.6beta with vdsm
On 2014-10-31 13:18, Kaleb S. KEITHLEY wrote:
On 10/31/2014 08:15 AM, Humble Chirammal wrote:

I can create a gluster pool, but when trying to create an image I get the error "Libvirt version does not support storage cloning". Will continue tomorrow. Qemu's that do not touch gluster work OK, so the installation is OK right now :-)

I read it as, 'compat package works'!

It works, but I think most of us think it's a hack.

Yes, but workable in the interim :-)

I'm going to cobble up a libgfapi with versioned symbols, without the SO_NAME bump. Since we're not going to package 3.6.0 anyway, we have a bit of breathing room.

Could we get the compat package into master/release-3.6 in the interim to ease tracking/testing?

Thanks for all the work so far.

/Anders
Re: [Gluster-devel] [Gluster-users] Dependency issue while installing glusterfs-3.6beta with vdsm
On 2014-10-28 20:33, Niels de Vos wrote:
On Tue, Oct 28, 2014 at 05:52:38PM +0100, Anders Blomdell wrote:
On 2014-10-28 17:30, Niels de Vos wrote:
On Tue, Oct 28, 2014 at 08:42:00AM -0400, Kaleb S. KEITHLEY wrote:
On 10/28/2014 07:48 AM, Darshan Narayana Murthy wrote:

Hi,

Installation of glusterfs-3.6beta with vdsm (vdsm-4.14.8.1-0.fc19.x86_64) fails on f19 and f20 because of dependency issues with qemu packages.

I installed vdsm-4.14.8.1-0.fc19.x86_64, which installs glusterfs-3.5.2-1.fc19.x86_64 as a dependency. Now when I try to update glusterfs by downloading rpms from
http://download.gluster.org/pub/gluster/glusterfs/qa-releases/3.6.0beta3/Fedora/fedora-19/
it fails with the following error:

    Error: Package: 2:qemu-system-lm32-1.4.2-15.fc19.x86_64 (@updates)
               Requires: libgfapi.so.0()(64bit)
               Removing: glusterfs-api-3.5.2-1.fc19.x86_64 (@updates)
                   libgfapi.so.0()(64bit)
               Updated By: glusterfs-api-3.6.0-0.5.beta3.fc19.x86_64 (/glusterfs-api-3.6.0-0.5.beta3.fc19.x86_64)
                   ~libgfapi.so.7()(64bit)
               Available: glusterfs-api-3.4.0-0.5.beta2.fc19.x86_64 (fedora)
                   libgfapi.so.0()(64bit)

Full output at: http://ur1.ca/ikvk8

For having snapshot and geo-rep management through ovirt, we need glusterfs-3.6 to be installed with vdsm, which is currently failing. Can you please provide your suggestions to resolve this issue?

Hi,

Starting in 3.6 we have bumped the SO_VERSION of libgfapi. You need to install glusterfs-api-devel-3.6.0... first and build vdsm. But we are (or were) not planning to release glusterfs-3.6.0 on f19 and f20...

Off hand I don't believe there's anything in glusterfs-api-3.6.0 that vdsm needs. vdsm with glusterfs-3.5.x on f19 and f20 should be okay. Is there something new in vdsm-4.14 that really needs glusterfs-3.6? If so we can revisit whether we release 3.6 to Fedora 19 and 20.

The chain of dependencies is like this: vdsm -> qemu -> libgfapi.so.0

I think a rebuild of QEMU should be sufficient.
I'm planning to put glusterfs-3.6 and rebuilds of related packages in a Fedora COPR. This would make it possible for Fedora users to move to 3.6 before they switch to Fedora 22.

AFAICT the only difference between libgfapi.so.0 and libgfapi.so.7 are two added symbols (glfs_get_volfile, glfs_h_access) and __THROW on functions. Wouldn't it be possible to provide a compatibility libgfapi.so.0 to ease migration?

That is possible, sure. I think that rebuilding related packages is just easier; there are only a few needed. Users that would like to run 3.6 before it is made available with Fedora 22 need to add a repository for the glusterfs-3.6 packages anyway; using the same repository to provide related packages is simple enough.

Except that you have to manually bump the version in those packages if yum should automatically pick up the new version (just realized that tonight's rebuild of qemu was useless, since the version is the same :-(, sigh).

I think a compat package would make the coupling between server and client looser (i.e. one could run old clients on the same machine as a new server). Due to limited time and a dependency on qemu on some of my testing machines, I still have not been able to test 3.6.0beta3. A -compat package would have helped me a lot (but maybe given you more bugs to fix :-)).

But, if there is a strong interest in having a -compat package, we can discuss that during tomorrow's (Wednesday's) meeting.

Sorry that I missed the meeting (due to the DST change and not doing date -d '12:00 UTC' [should be in the etherpad])

/Anders
Re: [Gluster-devel] [Gluster-users] Dependency issue while installing glusterfs-3.6beta with vdsm
On 2014-10-30 14:52, Kaleb KEITHLEY wrote:
On 10/30/2014 04:36 AM, Anders Blomdell wrote:

> I think a compat package would make the coupling between server and client looser (i.e. one could run old clients on the same machine as a new server). Due to limited time and a dependency on qemu on some of my testing machines, I still have not been able to test 3.6.0beta3. A -compat package would have helped me a lot (but maybe given you more bugs to fix :-)).

Hi,

Here's an experimental respin of 3.6.0beta3 with a -compat RPM:
http://koji.fedoraproject.org/koji/taskinfo?taskID=7981431

Please let us know how it works. The 3.6 release is coming very soon, and if this works we'd like to include it in our Fedora and EPEL packages.

Nope, does not work, since running /usr/lib/rpm/find-provides (or /usr/lib/rpm/redhat/find-provides) on the symlink does not yield the proper provides [which for my system should be libgfapi.so.0()(64bit)]. So no cigar :-(

Have you checked my more heavy-handed http://review.gluster.org/9014 ?

/Anders
Re: [Gluster-devel] [Gluster-users] Dependency issue while installing glusterfs-3.6beta with vdsm
On 2014-10-30 20:55, Kaleb KEITHLEY wrote:
On 10/30/2014 01:50 PM, Anders Blomdell wrote:

[earlier quoted discussion elided]

> Nope, does not work, since running /usr/lib/rpm/find-provides (or /usr/lib/rpm/redhat/find-provides) on the symlink does not yield the proper provides [which for my system should be libgfapi.so.0()(64bit)]. So no cigar :-(

Hi,

1) I erred on the symlink in the -compat RPM. It should have been /usr/lib64/libgfapi.so.0 -> libgfapi.so.7(.0.0).

Noticed that; not the main problem though :-)

2) find-provides is just a wrapper that greps the SO_NAME from the shared lib. And if you pass symlinks such as /usr/lib64/libgfapi.so.7 or /usr/lib64/libgfapi.so.0 to it, they both return the same result, i.e. the null string. The DSO run-time does not check that the SO_NAME matches.

No, but yum checks for libgfapi.so.0()(64bit) / libgfapi.so.0, so I think something like this is needed for yum to cope with upgrades:

    %ifarch x86_64
    Provides: libgfapi.so.0()(64bit)
    %else
    Provides: libgfapi.so.0
    %endif

I have a revised set of rpms with a correct symlink available: http://koji.fedoraproject.org/koji/taskinfo?taskID=7984220. The main test (that I'm interested in) is whether qemu built against 3.5.x works with it or not.

First thing is to get a yum upgrade to succeed.
>> Have you checked my more heavy-handed http://review.gluster.org/9014 ?
>
> I have. A) it's, well, heavy-handed ;-) mainly due to, B) there's a lot of duplicated code for no real purpose, and

Agreed; a quick fix to avoid soname hell (and me being unsure of what problems __THROW could give rise to).

> C) for whatever reason it's not making it through our smoke and regression tests (although I can't imagine how a new and otherwise unused library would break those.)

Me neither, but I'm good at getting smoke :-)

> If it comes to it, I personally would rather take a different route and use versioned symbols in the library and not bump the SO_NAME. Because the old APIs are unchanged and all we've done is add new APIs.

I guess we have passed that point since 3.6.0 is out in the wild (RHEL), and there is no way to bump down the version number.

> But we're running out of time for the 3.6 release (which is already months overdue.) I don't know if anyone here looked at doing versioned symbols before we made the decision to bump the SO_NAME.

I've looked at Ulrich Drepper's write-up, https://software.intel.com/sites/default/files/m/a/1/e/dsohowto.pdf, and it's not very hard. Too late now, though?
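The versioned-symbols route Drepper describes could look roughly like this: a GNU ld version script (hypothetical names; not an actual libgfapi patch) that gives the pre-3.6 entry points one version and the two new symbols another, so the SO_NAME can stay libgfapi.so.0:

```
# gfapi.map -- hypothetical version script, linked with e.g.
#   gcc -shared -fPIC -Wl,--version-script=gfapi.map \
#       -Wl,-soname,libgfapi.so.0 -o libgfapi.so.0.0.0 ...
GFAPI_3.4 {
    global:
        glfs_new;
        glfs_init;
        # ... all pre-3.6 entry points ...
    local:
        *;
};

GFAPI_3.6 {
    global:
        glfs_get_volfile;
        glfs_h_access;
} GFAPI_3.4;
```

Binaries built against the old, unversioned library carry unversioned references, which the dynamic linker resolves against the versioned definitions, so existing qemu/vdsm builds would keep working while new builds record GFAPI_3.6 for the two new calls.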
Re: [Gluster-devel] [Gluster-users] Dependency issue while installing glusterfs-3.6beta with vdsm
On 2014-10-30 22:06, Kaleb KEITHLEY wrote:
On 10/30/2014 04:34 PM, Anders Blomdell wrote:

[earlier quoted discussion elided]

> No, but yum checks for libgfapi.so.0()(64bit) / libgfapi.so.0, so I think something like this is needed for yum to cope with upgrades:
>
>     %ifarch x86_64
>     Provides: libgfapi.so.0()(64bit)
>     %else
>     Provides: libgfapi.so.0
>     %endif

That's already in the glusterfs-api-compat RPM that I sent you. The 64-bit part anyway.

Yes, a complete fix would include the 32-bit too.
Ah, looked at/tried the wrong RPM.

I have a revised set of rpms with a correct symlink available: http://koji.fedoraproject.org/koji/taskinfo?taskID=7984220. The main test (that I'm interested in) is whether qemu built against 3.5.x works with it or not.

First thing is to get a yum upgrade to succeed.

What was the error?

Me :-( (putting files in the wrong location)

Unfortunately hard to test: my libvirtd (1.1.3.6) seems to lack gluster support (even though qemu is linked against libgfapi). Any recommended version of libvirtd to compile?

Have you checked my more heavy-handed http://review.gluster.org/9014 ?

I have. A) it's, well, heavy-handed ;-) mainly due to, B) there's a lot of duplicated code for no real purpose, and

Agreed; a quick fix to avoid soname hell (and me being unsure of what problems __THROW could give rise to).

In C, that's a no-op. In C++, it tells the compiler that the function does not throw exceptions, and it can optimize accordingly.

OK, no problems there then.

C) for whatever reason it's not making it through our smoke and regression tests (although I can't imagine how a new and otherwise unused library would break those.)

Me neither, but I'm good at getting smoke :-)

If it comes to it, I personally would rather take a different route and use versioned symbols in the library and not bump the SO_NAME. Because the old APIs are unchanged and all we've done is add new APIs.

I guess we have passed that point since 3.6.0 is out in the wild (RHEL), and no way to bump down the version number.

That's RHS-Gluster, not community gluster. There's been some discussion of not packaging 3.6.0 and releasing and packaging 3.6.1 in short order. We might have a small window of opportunity. (Because there's never time to do it right the first time, but there's always time to do it over. ;-)

But we're running out of time for the 3.6 release (which is already months overdue.)
So is my testing; hope I'm not the bottleneck here :-)

I don't know if anyone here looked at doing versioned symbols before we made the decision to bump the SO_NAME. I've looked at Ulrich Drepper's write-up, https://software.intel.com/sites/default/files/m/a/1/e/dsohowto.pdf, and it's not very hard. Too late now, though?

Perhaps not.

+2

/Anders
Re: [Gluster-devel] [Gluster-users] Dependency issue while installing glusterfs-3.6beta with vdsm
On 2014-10-30 22:44, Anders Blomdell wrote:
On 2014-10-30 22:06, Kaleb KEITHLEY wrote:

[earlier quoted discussion elided]
> Unfortunately hard to test: my libvirtd (1.1.3.6) seems to lack gluster support (even though qemu is linked against libgfapi). Any recommended version of libvirtd to compile?

With (srpms from fc21):

    libvirt-client-1.2.9-3.fc20.x86_64
    libvirt-daemon-1.2.9-3.fc20.x86_64
    libvirt-daemon-driver-interface-1.2.9-3.fc20.x86_64
    libvirt-daemon-driver-network-1.2.9-3.fc20.x86_64
    libvirt-daemon-driver-nodedev-1.2.9-3.fc20.x86_64
    libvirt-daemon-driver-nwfilter-1.2.9-3.fc20.x86_64
    libvirt-daemon-driver-qemu-1.2.9-3.fc20.x86_64
    libvirt-daemon-driver-secret-1.2.9-3.fc20.x86_64
    libvirt-daemon-driver-storage-1.2.9-3.fc20.x86_64
    libvirt-daemon-kvm-1.2.9-3.fc20.x86_64
    libvirt-daemon-qemu-1.2.9-3.fc20.x86_64
    libvirt-devel-1.2.9-3.fc20.x86_64
    libvirt-docs-1.2.9-3.fc20.x86_64
    libvirt-gconfig-0.1.7-2.fc20.x86_64
    libvirt-glib-0.1.7-2.fc20.x86_64
    libvirt-gobject-0.1.7-2.fc20.x86_64
    libvirt-python-1.2.7-2.fc20.x86_64

I can create a gluster pool, but when trying to create an image I get the error "Libvirt version does not support storage cloning". Will continue tomorrow.

Qemu's that do not touch gluster work OK, so the installation is OK right now :-)

[rest of the quoted discussion elided]
Re: [Gluster-devel] [Gluster-users] Dependency issue while installing glusterfs-3.6beta with vdsm
On 2014-10-28 17:30, Niels de Vos wrote:

[quoted discussion elided]
AFAICT the only difference between libgfapi.so.0 and libgfapi.so.7 are two added symbols (glfs_get_volfile, glfs_h_access) and __THROW on functions. Wouldn't it be possible to provide a compatibility libgfapi.so.0 to ease migration?

When more details become available, I'll let this list know.

Cheers, Niels

/Anders
[Gluster-devel] warning: bogus date in %changelog
Does the following warning: warning: bogus date in %changelog: Thu Jun 29 2014 Humble Chirammal hchir...@redhat.com warrant a bug report/fix as below (based on date of commit 8d8abc19)? Or should I just leave it as is? diff --git a/glusterfs.spec.in b/glusterfs.spec.in index f373b45..5ad4956 100644 --- a/glusterfs.spec.in +++ b/glusterfs.spec.in @@ -1042,7 +1042,7 @@ fi * Wed Sep 24 2014 Balamurugan Arumugam barum...@redhat.com - remove /sbin/ldconfig as interpreter (#1145992) -* Thu Jun 29 2014 Humble Chirammal hchir...@redhat.com +* Thu Jun 19 2014 Humble Chirammal hchir...@redhat.com - Added dynamic loading of fuse module with glusterfs-fuse package installation in el5. * Fri Jun 27 2014 Kaleb S. KEITHLEY kkeit...@redhat.com /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Reintroduce IPv6 support
On 2014-10-05 21:30, Justin Clift wrote: On 02/09/2014, at 4:22 PM, Anders Blomdell wrote: Back from vacation, and would like to make a new version of http://review.gluster.org/#/c/8292/. snip Any thoughts? Any luck following this up? :) Nope, have been very quiet on the list regarding this, and I have got snagged up in other stuff :-(. Since a correct solution is more complex than I had previously assumed, I would like some input from the list before I start. I'll try to get some time in the not too distant future. /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] GlusterFest Test Week
On 2014-09-22 11:57, Justin Clift wrote: On 21/09/2014, at 7:47 PM, Vijay Bellur wrote: On 09/18/2014 05:42 PM, Humble Devassy Chirammal wrote: Greetings, As decided in our last GlusterFS meeting and the 3.6 planning schedule, we shall conduct GlusterFS 3.6 test days starting from next week. This time we intend testing one component and functionality per day. GlusterFS 3.6.0beta1 is now available to kick start the test week [1]. As we find issues and more patches do get merged in over the test week, I will be triggering further beta releases. We're going to need RPMs... and probably .deb's too. Tried to start testing this, but stumbled on qemu not building due to recent changes in systemtap, see https://bugzilla.redhat.com/show_bug.cgi?id=1145993 I added this bug as a blocker. Regards Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] GlusterFest Test Week
On 2014-09-24 11:22, Justin Clift wrote: On 24/09/2014, at 10:17 AM, Anders Blomdell wrote: On 2014-09-22 11:57, Justin Clift wrote: On 21/09/2014, at 7:47 PM, Vijay Bellur wrote: On 09/18/2014 05:42 PM, Humble Devassy Chirammal wrote: Greetings, As decided in our last GlusterFS meeting and the 3.6 planning schedule, we shall conduct GlusterFS 3.6 test days starting from next week. This time we intend testing one component and functionality per day. GlusterFS 3.6.0beta1 is now available to kick start the test week [1]. As we find issues and more patches do get merged in over the test week, I will be triggering further beta releases. We're going to need RPMs... and probably .deb's too. Tried to start testing this, but stumbled on qemu not building due to recent changes in systemtap, see https://bugzilla.redhat.com/show_bug.cgi?id=1145993 I added this bug as a blocker. Sounds like a beta2 coming up then. ;) Not really a gluster problem, but a systemtap/qemu issue (unless gluster could provide a compatibility libglusterfs.so.0) /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Gluster Test Framework tests failed on Gluster+Zfs (Zfs on Linux)
On 2014-09-04 13:08, Kaleb KEITHLEY wrote: On 09/04/2014 06:49 AM, Santosh Pradhan wrote: Hi, Currently GlusterFS is tightly coupled with ext(2/3/4) and XFS. Zfs (ZOL) and Btrfs are not supported at the moment, may get supported in future (at least btrfs). Thanks, Santosh On 09/04/2014 03:34 PM, Justin Clift wrote: On 28/08/2014, at 9:30 AM, Kiran Patil wrote: Hi Gluster Devs, I ran the Gluster Test Framework on a Gluster+zfs stack and found issues. I would like to know if I need to submit a bug at Red Hat Bugzilla, since the stack has zfs, which is not supported by Red Hat or Fedora if I am not wrong? Definitely create an issue on the Red Hat Bugzilla, for the GlusterFS product; there: https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS Since it's for the upstream Community, the official Red Hat Supported list isn't super relevant. There is no officially blessed file system for Community GlusterFS. The only requirement is that the file system support extended attributes. And that it does not use 64-bit offsets, as can be seen in http://supercolony.gluster.org/pipermail/gluster-devel/2014-July/041604.html I've had bad experiences with replication and ext4 :-( The design considerations in https://bugzilla.redhat.com/show_bug.cgi?id=838784 are to me quite dubious: However both these filesystems (EXT4 more importantly) are tolerant in terms of the accuracy of the value presented back in seekdir(). i.e., a seekdir(val) actually seeks to the entry which has the closest true offset. This two-prong scheme exploits this behavior - which seems to be the best middle ground amongst various approaches and has all the advantages of the old approach: - Works against XFS and EXT4, the two most common filesystems out there. (which wasn't an advantage of the old approach as it is broken against EXT4) - Probably works against most of the others as well. The ones which would NOT work are those which return HUGE d_offs _and_ are NOT tolerant to seekdir() to the closest true offset.
As a guide to best practice, you may wish to look at what Red Hat officially supports for RHS GlusterFS — that is XFS. As you might expect, that gets the most testing and likely has the fewest bugs related to it. But people are successfully using other file systems, e.g. ffs on NetBSD and FreeBSD, HFS+ on Mac OS X, and btrfs and zfs on Linux. You may certainly file bugs against Community GlusterFS with zfs. I will warn you though that due to various legal and political realities, probably (certainly) none of the Red Hat employees that work on GlusterFS will be able to devote time to it. If you fix the bugs you find, please submit your fix in Gerrit. We will certainly accept fixes for legitimate bugs regardless of which file system you use. Our development workflow is described here http://www.gluster.org/community/documentation/index.php/Development_Work_Flow. /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Reintroduce IPv6 support
Back from vacation, and would like to make a new version of http://review.gluster.org/#/c/8292/. If I understand Emmanuel Dreyfus and some scattered mails on the internet correctly, the reason for IPv6 support being removed was that the code using getaddrinfo is written in such a way that it (often incorrectly) assumes that the first address returned by getaddrinfo is the one we are interested in. As I said in private to Emmanuel: On 2014-07-31 19:55, Anders Blomdell wrote: On 2014-07-31 14:49, Emmanuel Dreyfus wrote: Hi Here is a test case that shows the problem: AF_UNSPEC really means either AF_INET or AF_INET6, you have to choose for a given socket. I wonder what the Linux output is. It actually means all:

#include <stdio.h>
#include <unistd.h>
#include <err.h>
#include <netdb.h>
#include <sysexits.h>
#include <sys/socket.h>
#include <string.h>
#include <arpa/inet.h>

int main(void)
{
    struct addrinfo hints, *res, *res0;
    int error;
    char buf[128];

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    error = getaddrinfo("google.com", "http", &hints, &res0);
    if (error) {
        errx(1, "%s", gai_strerror(error));
        /*NOTREACHED*/
    }
    for (res = res0; res; res = res->ai_next) {
        void *p;
        if (res->ai_family == AF_INET) {
            p = &((struct sockaddr_in *)(res->ai_addr))->sin_addr;
        } else if (res->ai_family == AF_INET6) {
            p = &((struct sockaddr_in6 *)(res->ai_addr))->sin6_addr;
        }
        printf("family = %d, addr = %s\n", res->ai_family,
               inet_ntop(res->ai_family, p, buf, 128));
    }
    return 0;
}

Which returns:

family = 10, addr = 2a00:1450:400f:805::1005
family = 2, addr = 74.125.232.227
family = 2, addr = 74.125.232.228
family = 2, addr = 74.125.232.229
family = 2, addr = 74.125.232.230
family = 2, addr = 74.125.232.231
family = 2, addr = 74.125.232.232
family = 2, addr = 74.125.232.233
family = 2, addr = 74.125.232.238
family = 2, addr = 74.125.232.224
family = 2, addr = 74.125.232.225
family = 2, addr = 74.125.232.226

Which means that the logic has to be something like this (at least on Linux):

* Listen: if wildcard, bind/listen on IPv6; otherwise bind/listen to all returned addresses. It's OK if some of them fail (then we probably don't have that address on any of our interfaces).
* Connect: connect to any of the given addresses (to speed things up we could try them all in parallel and take the first that responds).

AFAICT this means that all code where only the first result from getaddrinfo is used should be refactored along these lines:

int f(char *host,
      int (*cb_start)(void *cb_context),
      int (*cb_each)(struct addrinfo *a, void *cb_context),
      int (*cb_end)(void *cb_context),
      void *cb_context)
{
    struct addrinfo *addr, *p;
    ...
    ret = getaddrinfo (..., &addr);
    if (ret == 0) {
        cb_start(cb_context);
        for (p = addr; p != NULL; p = p->ai_next) {
            cb_each(p, cb_context);
        }
        cb_end(cb_context);
        freeaddrinfo (addr);
    }
    ...
}

Any thoughts? -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
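The connect-side strategy discussed above (walk the whole getaddrinfo result list instead of trusting the first entry) can be sketched as a stand-alone program; this is an illustration, not code from the patch under review. It starts a listener on 127.0.0.1 on an ephemeral port, then connects to "localhost": if the resolver returns ::1 first, that connect fails and the loop falls through to 127.0.0.1, which is exactly the fallback behaviour being argued for.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>

int main(void)
{
    /* Set up a listener on 127.0.0.1, ephemeral port. */
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in a;
    memset(&a, 0, sizeof(a));
    a.sin_family = AF_INET;
    a.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    a.sin_port = 0;
    if (lfd < 0 || bind(lfd, (struct sockaddr *)&a, sizeof(a)) < 0 ||
        listen(lfd, 1) < 0) {
        perror("listen");
        return 1;
    }
    socklen_t alen = sizeof(a);
    getsockname(lfd, (struct sockaddr *)&a, &alen);
    char port[16];
    snprintf(port, sizeof(port), "%d", ntohs(a.sin_port));

    /* Resolve "localhost" and try every returned address in order. */
    struct addrinfo hints, *res, *res0;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    int error = getaddrinfo("localhost", port, &hints, &res0);
    if (error) {
        fprintf(stderr, "%s\n", gai_strerror(error));
        return 1;
    }

    int cfd = -1;
    for (res = res0; res != NULL; res = res->ai_next) {
        cfd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
        if (cfd < 0)
            continue;
        if (connect(cfd, res->ai_addr, res->ai_addrlen) == 0)
            break;        /* this address worked, stop here */
        close(cfd);       /* e.g. ::1 refused: fall through to the next one */
        cfd = -1;
    }
    freeaddrinfo(res0);
    printf(cfd >= 0 ? "connected\n" : "all addresses failed\n");
    return cfd >= 0 ? 0 : 1;
}
```

Trying all addresses in parallel (as suggested above for speed) would replace the sequential loop with non-blocking connects plus poll(), but the sequential version already fixes the "first result only" bug.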
Re: [Gluster-devel] Monotonically increasing memory
On 2014-08-01 02:02, Harshavardhana wrote: On Thu, Jul 31, 2014 at 11:31 AM, Anders Blomdell anders.blomd...@control.lth.se wrote: During rsync of 35 files, memory consumption of glusterfs rose to 12 GB (after approx 14 hours), I take it that this is a bug I should try to track down? Does it ever come down? What happens if you repeatedly rsync the same files again? Does it OOM? Well, it OOM'd my firefox first (that's how good I monitor my experiments :-() No, memory usage does not come down by itself, AFAICT. On 2014-08-01 02:12, Raghavendra Gowdappa wrote: Anders, Mostly it's a case of memory leak. It would be helpful if you can file a bug on this. The following information would be useful to fix the issue: 1. valgrind reports (if possible). a. To start brick and nfs processes with valgrind you can use the following cmdline when starting glusterd: # glusterd --xlator-option *.run-with-valgrind=yes In this case all the valgrind logs can be found in the standard glusterfs log directory. b. For the client you can start glusterfs just like any other process in valgrind. Since glusterfs is daemonized, while running with valgrind we need to prevent that by running it in the foreground. We can use the -N option to do that: # valgrind --leak-check=full --log-file=<path-to-valgrind-log> glusterfs --volfile-id=xyz --volfile-server=abc -N /mnt/glfs 2. Once you observe a considerable leak in memory, please get a statedump of glusterfs: # gluster volume statedump <volname> and attach the reports in the bug. Since it looks like Pranith has a clue, I'll leave it for a few weeks (other pressing duties). On 2014-08-01 03:24, Pranith Kumar Karampuri wrote: Yes, even I saw the following leaks when I tested it a week back. These were the leaks: You should probably take a statedump and see what datatypes are leaking. root@localhost - /usr/local/var/run/gluster 14:10:26 ?
awk -f /home/pk1/mem-leaks.awk glusterdump.22412.dump.1406174043
[mount/fuse.fuse - usage-type gf_common_mt_char memusage]
size=341240 num_allocs=23602 max_size=347987 max_num_allocs=23604 total_allocs=653194
...

I'll revisit this in a few weeks. Harshavardhana, Raghavendra, Pranith (and all others), Gluster is one of the most responsive Open Source projects I have participated in thus far; I'm very happy with all the support, help and encouragement I have got so far. Even though my initial tests weren't fully satisfactory, you are the main reason for my perseverance :-) /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Monotonically increasing memory
On 2014-08-01 08:56, Pranith Kumar Karampuri wrote: On 08/01/2014 12:09 PM, Anders Blomdell wrote: On 2014-08-01 02:02, Harshavardhana wrote: On Thu, Jul 31, 2014 at 11:31 AM, Anders Blomdell anders.blomd...@control.lth.se wrote: During rsync of 35 files, memory consumption of glusterfs rose to 12 GB (after approx 14 hours), I take it that this is a bug I should try to track down? Does it ever come down? What happens if you repeatedly rsync the same files again? Does it OOM? Well, it OOM'd my firefox first (that's how good I monitor my experiments :-() No, memory usage does not come down by itself, AFAICT. On 2014-08-01 02:12, Raghavendra Gowdappa wrote: Anders, Mostly it's a case of memory leak. It would be helpful if you can file a bug on this. The following information would be useful to fix the issue: 1. valgrind reports (if possible). a. To start brick and nfs processes with valgrind you can use the following cmdline when starting glusterd: # glusterd --xlator-option *.run-with-valgrind=yes In this case all the valgrind logs can be found in the standard glusterfs log directory. b. For the client you can start glusterfs just like any other process in valgrind. Since glusterfs is daemonized, while running with valgrind we need to prevent that by running it in the foreground. We can use the -N option to do that: # valgrind --leak-check=full --log-file=<path-to-valgrind-log> glusterfs --volfile-id=xyz --volfile-server=abc -N /mnt/glfs 2. Once you observe a considerable leak in memory, please get a statedump of glusterfs: # gluster volume statedump <volname> and attach the reports in the bug. Since it looks like Pranith has a clue, I'll leave it for a few weeks (other pressing duties). On 2014-08-01 03:24, Pranith Kumar Karampuri wrote: Yes, even I saw the following leaks when I tested it a week back. These were the leaks: You should probably take a statedump and see what datatypes are leaking. root@localhost - /usr/local/var/run/gluster 14:10:26 ?
awk -f /home/pk1/mem-leaks.awk glusterdump.22412.dump.1406174043
[mount/fuse.fuse - usage-type gf_common_mt_char memusage]
size=341240 num_allocs=23602 max_size=347987 max_num_allocs=23604 total_allocs=653194
...

I'll revisit this in a few weeks. Harshavardhana, Raghavendra, Pranith (and all others), Gluster is one of the most responsive Open Source projects I have participated in thus far; I'm very happy with all the support, help and encouragement I have got so far. Even though my initial tests weren't fully satisfactory, you are the main reason for my perseverance :-) Yay! Good :-). Do you have any suggestions where we need to improve as a community that would make it easier for new contributors? http://review.gluster.org/#/c/8181/ (will hopefully come around and review that, real soon now...) Otherwise, no. Will recommend gluster as an eminent crash course in git, gerrit and continuous integration. Keep up the good work. /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Monotonically increasing memory
During rsync of 35 files, memory consumption of glusterfs rose to 12 GB (after approx 14 hours). I take it that this is a bug I should try to track down? Version is 3.7dev as of Tuesday... /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] How should I submit a testcase without a proper solution
Hi, finally got around to looking into 'Symlink mtime changes when rebalancing' (https://bugzilla.redhat.com/show_bug.cgi?id=1122443), and I have submitted a test-case (http://review.gluster.org/#/c/8383/), but that is expected to fail (since I have not managed to write a patch that addresses the problem), and hence it will be voted down by Jenkins. Is there something I should do about this? /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Documentation on 'gluster volume replace-brick' out of date?
While looking into 'Symlink mtime changes when rebalancing' (https://bugzilla.redhat.com/show_bug.cgi?id=1122443), I found that the documentation on 'gluster volume replace-brick' looks out of date (v3.7dev-34-g67a6f40). gluster.8 says:

volume replace-brick VOLNAME (BRICK NEW-BRICK) start|pause|abort|status|commit
Replace the specified brick.

while 'gluster volume replace-brick volname old-brick new-brick start' says:

All replace-brick commands except commit force are deprecated. Do you want to continue? (y/n)

Am I right to assume that the correct way to migrate data to a new brick is:

# gluster volume add-brick volname new-brick
# gluster volume remove-brick volname old-brick start
  ... wait for completion; it would maybe be nice to have a
  ... 'gluster volume volname wait UUID'
# gluster volume remove-brick volname old-brick commit
  Removing brick(s) can result in data loss. Do you want to Continue? (y/n)
  ... It would be nice if the fact that migration is complete was reflected in
  ... the dialog.

AFAICT this also means that there is no way to replace a single brick in a replicated volume in such a way that all old brick replicas are online until the new brick is fully healed/populated, meaning that with a replica count of 2, we only have one active replica until healing is done. /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Status of bit-rot detection
On 2014-07-23 13:42, Joseph Fernandes wrote: Hi Anders, Currently we don't have an implementation/patch for bit-rot. We are working on the design of bit-rot protection (for read-only data), as part of Gluster Compliance. Read-only data is nice for archival (which is why my backups have gone to CDs/DVDs for the last 15 years, with bit-rot detection by md5 sums). Please refer to the Gluster Compliance Proposal http://supercolony.gluster.org/pipermail/gluster-devel/2014-June/041258.html If you have any design proposal/suggestion, please do share, so that we can have a discussion on it. I'm more interested in periodically (or triggered by writes) scanning and checksumming all/parts of the files on gluster volumes, and comparing those checksums between replicas (won't work for open files like databases/VM-images). I guess I'll put my current tools onto each brick, and whip up some scripts to compare those. When something materializes, I'm interested in testing. Regards, Joe - Original Message - From: Anders Blomdell anders.blomd...@control.lth.se To: Gluster Devel gluster-devel@gluster.org Sent: Monday, July 21, 2014 10:42:00 PM Subject: [Gluster-devel] Status of bit-rot detection Since switching to xfs has left me with a seemingly working system :-), what is the current status on bit-rot detection (http://www.gluster.org/community/documentation/index.php/Arch/BitRot_Detection), any patches for me to try? /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] spurious regression failures again! [bug-1112559.t]
:24 PM, Joseph Fernandes wrote: Hi Pranith, Could you please share the link of the console output of the failures. Added them inline. Thanks for reminding :-) Pranith Regards, Joe - Original Message - From: Pranith Kumar Karampuri pkara...@redhat.com To: Gluster Devel gluster-devel@gluster.org, Varun Shastry vshas...@redhat.com Sent: Tuesday, July 15, 2014 8:52:44 PM Subject: [Gluster-devel] spurious regression failures again! hi, We have 4 tests failing once in a while causing problems: 1) tests/bugs/bug-1087198.t - Author: Varun http://build.gluster.org/job/rackspace-regression-2GB-triggered/379/consoleFull 2) tests/basic/mgmt_v3-locks.t - Author: Avra http://build.gluster.org/job/rackspace-regression-2GB-triggered/375/consoleFull 3) tests/basic/fops-sanity.t - Author: Pranith http://build.gluster.org/job/rackspace-regression-2GB-triggered/383/consoleFull Please take a look at them and post updates. Pranith /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] spurious regression failures again! [bug-1112559.t]
On 2014-07-22 16:44, Justin Clift wrote: On 22/07/2014, at 3:28 PM, Joe Julian wrote: On 07/22/2014 07:19 AM, Anders Blomdell wrote: Could this be a time to propose that gluster understands port reservation à la systemd (LISTEN_FDS), and make the test harness make sure that random ports do not collide with the set of expected ports, which will be beneficial when starting from systemd as well. Wouldn't that only work for Fedora and RHEL7? Probably depends how it's done. Maybe make it a conditional thing that's compiled in or not, depending on the platform? Don't think so, the LISTEN_FDS protocol is dead simple: if LISTEN_FDS is set in the environment, fd 3 to fd 3+LISTEN_FDS-1 are sockets opened by the calling process; their function has to be deduced via getsockname(), and new sockets should not be opened by the process. If LISTEN_FDS is not set, proceed to open sockets just like before. The good thing about this is that systemd can reserve the ports used very early during boot, and no other process can steal them away. For testing purposes, this could be used to assure that all ports are available before starting tests (if random port stealing is the true problem here, that is still an unverified shot in the dark). Unless there's a better, cross platform approach of course. :) Regards and best wishes, Justin Clift /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Duplicate entries and other weirdness in a 3*4 volume
On 2014-07-21 13:36, Pranith Kumar Karampuri wrote: On 07/21/2014 05:03 PM, Anders Blomdell wrote: On 2014-07-19 04:43, Pranith Kumar Karampuri wrote: On 07/18/2014 07:57 PM, Anders Blomdell wrote: During testing of a 3*4 gluster (from master as of yesterday), I encountered two major weirdnesses: 1. A 'rm -rf some_dir' needed several invocations to finish, each time reporting a number of lines like these: rm: cannot remove ‘a/b/c/d/e/f’: Directory not empty 2. After having successfully deleted all files from the volume, I have a single directory that is duplicated in gluster-fuse, like this:

# ls -l /mnt/gluster
total 24
drwxr-xr-x 2 root root 12288 18 jul 16.17 work2/
drwxr-xr-x 2 root root 12288 18 jul 16.17 work2/

Any idea on how to debug this issue? What are the steps to recreate? We need to first find what led to this. Then probably which xlator leads to this. Would a pcap network dump + the result from 'tar -c --xattrs /brick/a/gluster' on all the hosts before and after the following commands are run be of any help:

# mount -t glusterfs gluster-host:/test /mnt/gluster
# mkdir /mnt/gluster/work2
# ls /mnt/gluster
work2 work2

Are you using ext4? Yes. Is this on latest upstream? The kernel is 3.14.9-200.fc20.x86_64, if that is latest upstream, I don't know. gluster is from master as of the end of last week. If there are known issues with ext4 I could switch to something else, but during the last 15 years or so I have had very little problems with ext2/3/4; that's the reason for choosing it. /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Duplicate entries and other weirdness in a 3*4 volume
On 2014-07-21 13:49, Pranith Kumar Karampuri wrote: On 07/21/2014 05:17 PM, Anders Blomdell wrote: On 2014-07-21 13:36, Pranith Kumar Karampuri wrote: On 07/21/2014 05:03 PM, Anders Blomdell wrote: On 2014-07-19 04:43, Pranith Kumar Karampuri wrote: On 07/18/2014 07:57 PM, Anders Blomdell wrote: During testing of a 3*4 gluster (from master as of yesterday), I encountered two major weirdnesses: 1. A 'rm -rf some_dir' needed several invocations to finish, each time reporting a number of lines like these: rm: cannot remove ‘a/b/c/d/e/f’: Directory not empty 2. After having successfully deleted all files from the volume, I have a single directory that is duplicated in gluster-fuse, like this:

# ls -l /mnt/gluster
total 24
drwxr-xr-x 2 root root 12288 18 jul 16.17 work2/
drwxr-xr-x 2 root root 12288 18 jul 16.17 work2/

Any idea on how to debug this issue? What are the steps to recreate? We need to first find what led to this. Then probably which xlator leads to this. Would a pcap network dump + the result from 'tar -c --xattrs /brick/a/gluster' on all the hosts before and after the following commands are run be of any help:

# mount -t glusterfs gluster-host:/test /mnt/gluster
# mkdir /mnt/gluster/work2
# ls /mnt/gluster
work2 work2

Are you using ext4? Yes. Is this on latest upstream? The kernel is 3.14.9-200.fc20.x86_64, if that is latest upstream, I don't know. gluster is from master as of the end of last week. If there are known issues with ext4 I could switch to something else, but during the last 15 years or so I have had very little problems with ext2/3/4; that's the reason for choosing it. The problem is afrv2 + dht + ext4 offsets. Soumya and Xavier were working on it last I heard (CCed). Should I switch to xfs or be a guinea pig for testing a fixed version? /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Cmockery2 in GlusterFS
On 2014-07-20 16:01, Niels de Vos wrote: On Fri, Jul 18, 2014 at 02:52:18PM -0400, Luis Pabón wrote: Hi all, A few months ago, the unit test framework based on cmockery2 was in the repo for a little while, then removed while we improved the packaging method. Now support for cmockery2 ( http://review.gluster.org/#/c/7538/ ) has been merged into the repo again. This will most likely require you to install cmockery2 on your development systems by doing the following: * Fedora/EPEL: $ sudo yum -y install cmockery2-devel * All other systems please visit the following page: https://github.com/lpabon/cmockery2/blob/master/doc/usage.md#installation Here is also some information about Cmockery2 and how to use it: * Introduction to Unit Tests in C Presentation: http://slides-lpabon.rhcloud.com/feb24_glusterfs_unittest.html#/ * Cmockery2 Usage Guide: https://github.com/lpabon/cmockery2/blob/master/doc/usage.md * Using Cmockery2 with GlusterFS: https://github.com/gluster/glusterfs/blob/master/doc/hacker-guide/en-US/markdown/unittest.md When starting out writing unit tests, I would suggest writing unit tests for non-xlator interface files when you start. Once you feel more comfortable writing unit tests, then move to writing them for the xlators interface files. Awesome, many thanks! I'd like to add some unittests for the RPC and NFS layer. Several functions (like ip-address/netmask matching for ACLs) look very suitable. Did you have any particular functions in mind that you would like to see unittests for? If so, maybe you can file some bugs for the different tests so that we won't forget about it? Depending on the tests, these bugs may get the EasyFix keyword if there is a clear description and some pointers to examples. 
Looks like parts of cmockery were forgotten in glusterfs.spec.in:

# rpm -q -f `which gluster`
glusterfs-cli-3.7dev-0.9.git5b8de97.fc20.x86_64
# ldd `which gluster`
    linux-vdso.so.1 => (0x74dfe000)
    libglusterfs.so.0 => /lib64/libglusterfs.so.0 (0x7fe034cc4000)
    libreadline.so.6 => /lib64/libreadline.so.6 (0x7fe034a7d000)
    libncurses.so.5 => /lib64/libncurses.so.5 (0x7fe034856000)
    libtinfo.so.5 => /lib64/libtinfo.so.5 (0x7fe03462c000)
    libgfxdr.so.0 => /lib64/libgfxdr.so.0 (0x7fe034414000)
    libgfrpc.so.0 => /lib64/libgfrpc.so.0 (0x7fe0341f8000)
    libxml2.so.2 => /lib64/libxml2.so.2 (0x7fe033e8f000)
    libz.so.1 => /lib64/libz.so.1 (0x7fe033c79000)
    libm.so.6 => /lib64/libm.so.6 (0x7fe033971000)
    libdl.so.2 => /lib64/libdl.so.2 (0x7fe03376d000)
    libcmockery.so.0 => not found
    libpthread.so.0 => /lib64/libpthread.so.0 (0x7fe03354f000)
    libcrypto.so.10 => /lib64/libcrypto.so.10 (0x7fe033168000)
    libc.so.6 => /lib64/libc.so.6 (0x7fe032da9000)
    libcmockery.so.0 => not found
    libcmockery.so.0 => not found
    libcmockery.so.0 => not found
    liblzma.so.5 => /lib64/liblzma.so.5 (0x7fe032b82000)
    /lib64/ld-linux-x86-64.so.2 (0x7fe0351f1000)

Should I file a bug report or could someone on the fast lane fix this? /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Duplicate entries and other weirdness in a 3*4 volume
On 2014-07-21 19:14, Jeff Darcy wrote: But this offset gap widens as and when more translators (which need to store subvol-id) get added to the gluster stack, which may eventually result in a similar issue to the one you are facing now. Perhaps it's time to revisit the idea of making assumptions about d_off values +1 :-) and twiddling them back and forth, vs. maintaining a precise mapping between our values and local-FS values. http://review.gluster.org/#/c/4675/ That patch is old and probably incomplete, but at the time it worked just as well as the one that led us into the current situation. Seems a lot sounder than: However both these filesystems (EXT4 more importantly) are tolerant in terms of the accuracy of the value presented back in seekdir(). i.e. a seekdir(val) actually seeks to the entry which has the closest true offset. Let me know if you revisit this one. Thanks, Anders
Re: [Gluster-devel] release-3.6 branch created
On 2014-07-17 20:22, Vijay Bellur wrote: Hi All, A new branch, 'release-3.6', has been branched from this commit in master: commit 950f9d8abe714708ca62b86f304e7417127e1132 Author: Jeff Darcy jda...@redhat.com Date: Tue Jul 8 21:56:04 2014 -0400 dht: fix rename race You can checkout this branch through: $git checkout -b release-3.6 origin/release-3.6 rfc.sh is being updated to send patches to the appropriate branch. The plan is to have all 3.6.x releases happen off this branch. If you need any fix to be part of a 3.4.x release, please send out a backport of the same from master to release-3.4 after it has been accepted in master. More notes on backporting are available at [1]. Shouldn't the root of this branch get a tag to avoid this weirdness: # git checkout -b release-3.6 origin/release-3.6 Branch release-3.6 set up to track remote branch release-3.6 from origin. Switched to a new branch 'release-3.6' # git describe v3.5qa2-762-g950f9d8 or have I missed some git magic? /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Unnecessary bugs entered in BZ, ...
...and I'm sorry about that; the following documents are somewhat contradictory: http://gluster.org/community/documentation/index.php/Simplified_dev_workflow "The script will ask you to enter a bugzilla bug id. Every change submitted to GlusterFS needs a bugzilla entry to be accepted. If you do not already have a bug id, file a new bug at Red Hat Bugzilla. If the patch is submitted for review, the rfc.sh script will return the gerrit url for the review request." www.gluster.org/community/documentation/index.php/Development_Work_Flow "Prompt for a Bug Id for each commit (if it was not already provided) and include it as a BUG: tag in the commit log. You can just hit enter at this prompt if your submission is purely for review purposes." /Anders
Re: [Gluster-devel] Unnecessary bugs entered in BZ, ...
On 2014-07-17 15:44, Kaushal M wrote: I don't understand your confusion, but maybe these docs need to be reworded. Not confused, just observed that some of my patches regarding development workflow do not need to show up as bugs for gluster (at least not until somebody has verified that they could be useful to somebody else but me). What both of these want to say is that for a patch to be merged into glusterfs, it needs to be associated with a bug-id. This association is done by adding a 'BUG: id' line in the commit message. If you haven't manually added a bug-id in the commit message, the rfc.sh script will prompt you to enter one and add it to the commit-message. But it is possible to ignore this prompt and submit a patch for review. A patch submitted for review in this manner will only be reviewed. It will not be merged. The simplified workflow document doesn't mention this, as it was targeted at new developers, and I felt these details were TMI for them. But now when I rethink it, it's the seasoned developers who are beginning to contribute to gluster who are more likely to use that s/seasoned developers/ignorant old fools/g :-) doc. ~kaushal On Thu, Jul 17, 2014 at 6:19 PM, Anders Blomdell anders.blomd...@control.lth.se wrote: ...and I'm sorry about that; the following documents are somewhat contradictory: http://gluster.org/community/documentation/index.php/Simplified_dev_workflow "The script will ask you to enter a bugzilla bug id. Every change submitted to GlusterFS needs a bugzilla entry to be accepted. If you do not already have a bug id, file a new bug at Red Hat Bugzilla. If the patch is submitted for review, the rfc.sh script will return the gerrit url for the review request." www.gluster.org/community/documentation/index.php/Development_Work_Flow "Prompt for a Bug Id for each commit (if it was not already provided) and include it as a BUG: tag in the commit log. You can just hit enter at this prompt if your submission is purely for review purposes."
/Anders
[Gluster-devel] Is gerrit/jenkins integration sick?
I'm getting http://review.gluster.org/#/c/8291/: Patch Set 5: Build Successful http://rhs-client34.lab.eng.blr.redhat.com:8080/job/libgfapi-qemu/322/ : SUCCESS (skipped) http://review.gluster.org/#/c/8299/: Patch Set 6: Verified-1 Build Failed http://rhs-client34.lab.eng.blr.redhat.com:8080/job/libgfapi-qemu/323/ : FAILURE How do I debug these? /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Verification failure on GlusterBuildSystem2
What can I do about this one http://review.gluster.org/#/c/8299/? -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Patches to be merged before 3.6 branching
On 2014-07-15 11:30, Niels de Vos wrote: On Mon, Jul 14, 2014 at 06:05:25PM +0100, Justin Clift wrote: On 14/07/2014, at 4:20 PM, Anders Blomdell wrote: On 2014-07-14 16:03, Vijay Bellur wrote: Hi All, I intend creating the 3.6 branch tomorrow. After that, the branch will be restricted to bug fixes only. If you have any major patches to be reviewed and merged for release-3.6, please update this thread. Does this mean that https://bugzilla.redhat.com/show_bug.cgi?id=1113050 has no chance to get in (no patch for that yet, still trying to figure out how things work inside gluster)? Sounds like that would be a bug fix, so it'd get applied to everything that it makes sense to, e.g. likely 3.6, 3.5, 3.7dev, etc. Indeed, if there is no patch available when release-3.6 gets branched, bug fixes and (serious) usability improvements can get backported to the 3.6 version. The process to do so has been documented here: - http://www.gluster.org/community/documentation/index.php/Backport_Guidelines Once the branching has happened, the 3.6 release will be closed for new feature enhancements. The goal is to stabilize the branch and not introduce any complex/major changes anymore. Vijay will likely mention this in his announcement of the release-3.6 branch too. There can be a thin line between 'usability bug' and 'feature'. If an acceptable case is made for an improvement, I doubt it will have difficulties getting included in any of the stable branches. snip And I would really like to get http://review.gluster.org/8292, IPv6 support. Niels, have you had a chance to look at this? Should we get this merged, and fix any bugs that turn up later? No, and unfortunately it is unlikely that I will be able to look into that within the next few weeks. I'd really like to see good IPv6 support in Gluster, but for me it really is a long-term goal.
OK, will have to keep my own version here then (some of my hosts are IPv6 only) since the current state of affairs, where you have to edit a lot of volfiles, etc in order to get it working is far too cumbersome. Maybe the 'better peer identification' feature improves IPv6 support as well. Some details and further pointers to discussions about it can be found here: - http://www.gluster.org/community/documentation/index.php/Features/Better_peer_identification Very little IPv6 there, but very useful for multi-homed machines (which I also want, but not as badly). /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Patches to be merged before 3.6 branching
On 2014-07-14 16:03, Vijay Bellur wrote: Hi All, I intend creating the 3.6 branch tomorrow. After that, the branch will be restricted to bug fixes only. If you have any major patches to be reviewed and merged for release-3.6, please update this thread. Does this mean that https://bugzilla.redhat.com/show_bug.cgi?id=1113050 has no chance to get in (no patch for that yet, still trying to figure out how things work inside gluster)? I think http://review.gluster.org/8299 is important, since gluster will break with some locales. And I would really like to get http://review.gluster.org/8292, http://review.gluster.org/8208 and http://review.gluster.org/8203. All the things above should already be on https://bugzilla.redhat.com/show_bug.cgi?id=1117822, so maybe this mail is just noise. /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Is it OK to pick Code-Reviewer(s)
When submitting patches where there is an/some obvious person(s) to blame, is it OK/desirable to request them as Code-Reviewers in gerrit? /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Is this a transient failure?
In http://build.gluster.org/job/rackspace-regression-2GB-triggered/297/consoleFull, I have one failure: No volumes present read failed: No data available read returning junk fd based file operation 1 failed read failed: No data available read returning junk fstat failed : No data available fd based file operation 2 failed read failed: No data available read returning junk dup fd based file operation failed [18:51:01] ./tests/basic/fops-sanity.t ... What should I do about it? /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] gerrit accepts patches that are not signed off
http://www.gluster.org/community/documentation/index.php/Simplified_dev_workflow says: It is essential that you commit with the '-s' option, which will sign-off the commit with your configured email, as gerrit is configured to reject patches which are not signed-off. this does not seem to be true (and, yes, I should learn to read instructions). /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] v3.5qa2 tag name on master is annoying
On 2014-07-09 22:39, Harshavardhana wrote: I thought pkg-version in build-aux should have fixed this properly? Well, it does generate correct ascending names, but since 'git describe' picks up the v3.5qa2 tag, new package names are based on that (and hence yum considers 3.5.1 newer than master). On Wed, Jul 9, 2014 at 1:33 PM, Justin Clift jus...@gluster.org wrote: That v3.5qa2 tag name on master is annoying, due to the RPM naming it causes when building on master. Did we figure out a solution? Maybe we should do a v3.6something tag at feature freeze time or something? + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Question regarding bug 1113050 (Transient failures immediately after add-brick to a mounted volume)
Is the current behavior in master, where file-system operations on the fuse-fs fail with 'Transport endpoint is not connected' for approximately 3.2 seconds after an add-brick is done, intended? Or would it make sense to add something like NFS's {soft/hard}/retrans/timeo logic, and if so, could somebody give some hints on a good way to do it (client.c?)

/Anders
Re: [Gluster-devel] [Gluster-users] RPM's for latest glusterfs does not install on Fedora-20
On 2014-07-09 15:02, Niels de Vos wrote: On Tue, Jul 08, 2014 at 11:40:35AM +0100, Justin Clift wrote: On 08/07/2014, at 9:53 AM, Anders Blomdell wrote: snip 2. What are the rules for marking a bug as blocking the trackers (can't find one for 3.6.0 yet, so currently a moot point). We should probably create an alias thing for 3.6.0, like we have for 3.5.2: https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.5.2 Not sure who does that, maybe Vijay (CC'd)? Anyone can do that, I've now created one for 3.6.0: - https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.6.0 Thanks :-) When there are urgent bugs that need to get fixed in 3.6.0, open the bug (not the tracker) and add 'glusterfs-3.6.0' (or 1117822) in the 'blocks' field. If that is not working due to some permission issues, send an email to this list or ask on Freenode/IRC in #gluster-dev if someone can take care of adding it to the tracker. Have added my own favorite itches to it. I still don't feel confident enough with git to add my patches to gerrit and get jenkins to do the right thing is probably even worse, so where I have code that works for me™, bug reports contains patches, please bear with me. Removal of erroneous blockers should preferably be done without flame-throwers :-) BTW: doc/hacker-guide/en-US/markdown/unittest.md seems to lack info on how to write tests/bugs/bug-*.t files. /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] RPM's for latest glusterfs does not install on Fedora-20
On 2014-07-07 21:13, Niels de Vos wrote: On Mon, Jul 07, 2014 at 07:29:15PM +0200, Anders Blomdell wrote: On 2014-07-07 18:17, Niels de Vos wrote: On Mon, Jul 07, 2014 at 04:48:18PM +0200, Anders Blomdell wrote: On 2014-07-07 15:08, Lalatendu Mohanty wrote: # rpm -U ./glusterfs-server-3.5qa2-0.722.git2c5eb5c.fc20.x86_64.rpm ./glusterfs-fuse-3.5qa2-0.722.git2c5eb5c.fc20.x86_64.rpm ./glusterfs-cli-3.5qa2-0.722.git2c5eb5c.fc20.x86_64.rpm ./glusterfs-libs-3.5qa2-0.722.git2c5eb5c.fc20.x86_64.rpm ./glusterfs-3.5qa2-0.722.git2c5eb5c.fc20.x86_64.rpm ./glusterfs-api-3.5qa2-0.722.git2c5eb5c.fc20.x86_64.rpm error: Failed dependencies: libgfapi.so.0()(64bit) is needed by (installed) qemu-common-2:1.6.2-6.fc20.x86_64 libgfapi.so.0()(64bit) is needed by (installed) qemu-system-x86-2:1.6.2-6.fc20.x86_64 libgfapi.so.0()(64bit) is needed by (installed) qemu-img-2:1.6.2-6.fc20.x86_64 Feature or bug? /Anders Hey Anders, You should not see this issue as glusterfs-api provides /usr/lib64/libgfapi.so.0 . $rpm -ql glusterfs-api /usr/lib/python2.7/site-packages/gluster/__init__.py /usr/lib/python2.7/site-packages/gluster/__init__.pyc /usr/lib/python2.7/site-packages/gluster/__init__.pyo /usr/lib/python2.7/site-packages/gluster/gfapi.py /usr/lib/python2.7/site-packages/gluster/gfapi.pyc /usr/lib/python2.7/site-packages/gluster/gfapi.pyo /usr/lib/python2.7/site-packages/glusterfs_api-3.5.0-py2.7.egg-info /usr/lib64/glusterfs/3.5.0/xlator/mount/api.so /usr/lib64/libgfapi.so.0 /usr/lib64/libgfapi.so.0.0.0 True for branch release-3.5 from git://git.gluster.org/glusterfs.git, but not for master. Indeed, the master branch uses SONAME-versioning. Any applications (like qemu) using libgfapi.so need to get re-compiled to use the update. Can do without these for my current testing :-) Of course, removing packages you don't need is a nice workaround :) Is there any specific reason you want to use glusterfs-server-3.5qa2-0.722.git2c5eb5c.fc20.x86_64.rpm? 
because 3.5.1 is available in the fedora 20 updates repo. So if you do a yum update, i.e. yum update glusterfs-server, you should get the glusterfs 3.5.1 GA version, which is likely to be more stable than glusterfs-server-3.5qa2-0.722. I would like to track progress of the fixing of my bugs and at the same time experiment with reverting 3dc56cbd16b1074d7ca1a4fe4c5bf44400eb63ff to re-add IPv6 support, and I figured that master is what will eventually become 3.6.0. Yes, you are correct. The master branch will become glusterfs-3.6. You should be able to recompile qemu (and other?) packages on a system that has glusterfs-devel-3.6 installed. I can also recommend the use of 'mockchain' (from the 'mock' package) for rebuilding glusterfs and other dependent packages. I'll keep tracking master (exposing as many bugs as I can); I'll redirect followups to gluster-devel@gluster.org and drop gluster-us...@gluster.org after this message. Great, thanks for the testing! Will hopefully get me a better product in the end :-) A few questions: 1. Should bugs/patches be posted on the list or should I file a BugZilla for each? 2. What are the rules for marking a bug as blocking the trackers (can't find one for 3.6.0 yet, so currently a moot point). 3. Given that my understanding of gluster innards is very limited, my guess is that the 'GlusterFS Development Workflow' is overkill, but please correct me if I'm wrong. /Anders
[Gluster-devel] Building rpms for master
It seems like http://www.gluster.org/community/documentation/index.php/CompilingRPMS is missing one crucial step:

  $ ./autogen.sh
  $ ./configure --enable-fusermount
  $ make dist
  ...
  $ cd extras/LinuxRPM
  $ make glusterrpms

should be:

  $ rm -f ./autom4te.cache/*   # make sure to use the current
                               # './build-aux/pkg-version --release'
  $ ./autogen.sh
  $ ./configure --enable-fusermount
  $ make dist
  ...
  $ cd extras/LinuxRPM
  $ make glusterrpms

Otherwise the rpms won't pick up the correct release version, i.e. ./autogen.sh will pick up the result of './build-aux/pkg-version --release' from the cache instead of figuring out where HEAD is currently located. Even nicer would be if '(cd extras/LinuxRPM ; make glusterrpms)' did the right thing.

/Anders
[Gluster-devel] Locale problem in master
Due to the line (commit 040319d8bced2f25bf25d8f6b937901c3a40e34b):

  ./libglusterfs/src/logging.c:503:        setlocale(LC_ALL, "");

the command

  env -i LC_NUMERIC=sv_SE.utf8 /usr/sbin/glusterfs ...

will fail, due to the fact that the Swedish decimal separator is not '.' but ',', i.e. _gf_string2double will fail because strtod("1.0", &tail) will give the tail ".0".

/Anders
Re: [Gluster-devel] Locale problem in master
On 2014-07-07 18:57, Pranith Kumar Karampuri wrote: Including Bala, who is the author of the commit. Pranith

On 07/07/2014 10:18 PM, Anders Blomdell wrote: Due to the line (commit 040319d8bced2f25bf25d8f6b937901c3a40e34b): ./libglusterfs/src/logging.c:503: setlocale(LC_ALL, ""); the command env -i LC_NUMERIC=sv_SE.utf8 /usr/sbin/glusterfs ... will fail, due to the fact that the Swedish decimal separator is not '.' but ',', i.e. _gf_string2double will fail because strtod("1.0", &tail) will give the tail ".0". /Anders

Simple fix:

  --- a/libglusterfs/src/logging.c
  +++ b/libglusterfs/src/logging.c
  @@ -501,6 +501,7 @@ gf_openlog (const char *ident, int option, int facility)
           /* TODO: Should check for errors here and return appropriately */
           setlocale(LC_ALL, "");
  +        setlocale(LC_NUMERIC, "C");
           /* close the previous syslog if open as we are changing settings */
           closelog ();
           openlog(ident, _option, _facility);
Re: [Gluster-devel] [Gluster-users] RPM's for latest glusterfs does not install on Fedora-20
On 2014-07-07 18:17, Niels de Vos wrote: On Mon, Jul 07, 2014 at 04:48:18PM +0200, Anders Blomdell wrote: On 2014-07-07 15:08, Lalatendu Mohanty wrote: # rpm -U ./glusterfs-server-3.5qa2-0.722.git2c5eb5c.fc20.x86_64.rpm ./glusterfs-fuse-3.5qa2-0.722.git2c5eb5c.fc20.x86_64.rpm ./glusterfs-cli-3.5qa2-0.722.git2c5eb5c.fc20.x86_64.rpm ./glusterfs-libs-3.5qa2-0.722.git2c5eb5c.fc20.x86_64.rpm ./glusterfs-3.5qa2-0.722.git2c5eb5c.fc20.x86_64.rpm ./glusterfs-api-3.5qa2-0.722.git2c5eb5c.fc20.x86_64.rpm error: Failed dependencies: libgfapi.so.0()(64bit) is needed by (installed) qemu-common-2:1.6.2-6.fc20.x86_64 libgfapi.so.0()(64bit) is needed by (installed) qemu-system-x86-2:1.6.2-6.fc20.x86_64 libgfapi.so.0()(64bit) is needed by (installed) qemu-img-2:1.6.2-6.fc20.x86_64 Feature or bug? /Anders Hey Anders, You should not see this issue as glusterfs-api provides /usr/lib64/libgfapi.so.0 . $rpm -ql glusterfs-api /usr/lib/python2.7/site-packages/gluster/__init__.py /usr/lib/python2.7/site-packages/gluster/__init__.pyc /usr/lib/python2.7/site-packages/gluster/__init__.pyo /usr/lib/python2.7/site-packages/gluster/gfapi.py /usr/lib/python2.7/site-packages/gluster/gfapi.pyc /usr/lib/python2.7/site-packages/gluster/gfapi.pyo /usr/lib/python2.7/site-packages/glusterfs_api-3.5.0-py2.7.egg-info /usr/lib64/glusterfs/3.5.0/xlator/mount/api.so /usr/lib64/libgfapi.so.0 /usr/lib64/libgfapi.so.0.0.0 True for branch release-3.5 from git://git.gluster.org/glusterfs.git, but not for master. Indeed, the master branch uses SONAME-versioning. Any applications (like qemu) using libgfapi.so need to get re-compiled to use the update. Can do without these for my current testing :-) Is there any specific reason you want to use glusterfs-server-3.5qa2-0.722.git2c5eb5c.fc20.x86_64.rpm? because 3.5.1 is available in fedora 20 in update repo . So if you do yum update it i.e. yum update glusterfs-server you should get glusterfs 3.5.1 GA version which like to be stable than glusterfs-server-3.5qa2-0.722. 
I would like to track progress of the fixing of my bugs and at the same time experiment with reverting 3dc56cbd16b1074d7ca1a4fe4c5bf44400eb63ff to re-add IPv6 support, and I figured that the master is what will eventually become 3.6.0 Yes, you are correct. The master branch will become glusterfs-3.6. You should be able to recompile qemu (and other?) packages on a system that has glusterfs-devel-3.6 installed. I can also recommend the use of 'mockchain' (from the 'mock' package) for rebuilding glusterfs and other dependent packages. I'll keep tracking master (exposing as many bugs as I can), I'll redirect followups to gluster-devel@gluster.org and drop gluster-us...@gluster.org after this message. Thanks! /Anders -- Anders Blomdell Email: anders.blomd...@control.lth.se Department of Automatic Control Lund University Phone:+46 46 222 4625 P.O. Box 118 Fax: +46 46 138118 SE-221 00 Lund, Sweden ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] 3.6 Feature Freeze - move to mid next week?
On 2014-07-04 13:30, Vijay Bellur wrote: Hi All, Given the holiday weekend in US, I feel that it would be appropriate to move the 3.6 feature freeze date to mid next week so that we can have more reviews done and address review comments too. We can still continue to track other milestones as per our release schedule [1]. What do you folks think? Is 'git://git.gluster.org/glusterfs master' the right thing to clone if I'm interested in testing the IPv6 support (and the possible fixes for my outstanding bugs), or should I cherry-pick appropriate bits and apply them to 3.5.1? (I'm not planning to go live until some time in September [or later ;-)]) /Anders
Re: [Gluster-devel] 3.6 Feature Freeze - move to mid next week?
On 2014-07-04 17:14, Jeff Darcy wrote: Given the holiday weekend in US, I feel that it would be appropriate to move the 3.6 feature freeze date to mid next week so that we can have more reviews done and address review comments too. We can still continue to track other milestones as per our release schedule [1]. What do you folks think? I think the answer depends on what we can expect to change between now and then. Since the gluster.org feature page never got updated to reflect the real feature set for 3.6, I took the list from email sent after the planning meeting.

* Better SSL: two out of three patches merged, one still in review.
* Data Classification: design barely begun.
* Heterogeneous Bricks: patch has CR+1 V+1 but still stalled in review.
* Trash: ancient one is still there, probably doesn't even work.
* Disperse: patches still in very active review.
* Persistent AFR Changelog Xattributes: patches merged.
* Better Peer Identification: patch still in review (fails verification).
* Gluster Volume Snapshot: tons of patches merged, tons more still to come.
* AFRv2: jammed in long ago.
* Policy Based Split-Brain Resolver (PBSBR): no patches, feature page still says in design.
* RDMA Improvements: no patches, feature page says work in progress.
* Server-side Barrier Feature: patches merged.

That leaves us with a very short list of items that are likely to change state:

* Better SSL
* Heterogeneous Bricks
* Disperse
* Better Peer Identification

Of those, I think only disperse is likely to benefit from an extension. The others just need people to step up and finish reviewing them, which could happen today if there were sufficient will. The real question is what to do about disperse. Some might argue that it's already complete enough to go in, so long as its limitations are documented appropriately. Others might argue that it's still months away from being usable (especially wrt performance).
In a way it doesn't matter, because either way a few days won't make a difference. We just need to make a collective decision based on its current state (or close to it). If we need to wait a few days before people can come together for that, so be it. OK, this probably answered my earlier question, since there is no IPv6 on this list (stated somewhere to depend on 'Better Peer Identification'), i.e. I should stick to 3.5.1 and only apply patches to address my needs and then check what needs to be done when 3.6.0 is out. /Anders
Re: [Gluster-devel] 3.5.1-beta2 Problems with suid and sgid bits on directories
On 2014-06-24 22:26, Shyamsundar Ranganathan wrote: - Original Message - From: Anders Blomdell anders.blomd...@control.lth.se To: Niels de Vos nde...@redhat.com Cc: Shyamsundar Ranganathan srang...@redhat.com, Gluster Devel gluster-devel@gluster.org, Susant Palai spa...@redhat.com Sent: Tuesday, June 24, 2014 4:09:52 AM Subject: Re: [Gluster-devel] 3.5.1-beta2 Problems with suid and sgid bits on directories On 2014-06-23 12:03, Niels de Vos wrote: On Tue, Jun 17, 2014 at 11:49:26AM -0400, Shyamsundar Ranganathan wrote: You may be looking at the problem being fixed here, [1]. On a lookup, an attribute mismatch was not being healed across directories, and this patch attempts to address the same. Currently the version of the patch does not heal the S_ISUID and S_ISGID bits, which is work in progress (but easy enough to incorporate and test based on the patch at [1]). On a separate note, add-brick just adds a brick to the cluster; the lookup is where the heal (or creation of the directory across all sub volumes in the DHT xlator) is being done. I assume that this is not a regression between 3.5.0 and 3.5.1? If that is the case, we can pull the fix in 3.5.2, because 3.5.1 really should not get delayed much longer. No, it does not work in 3.5.0 either :-( I ran these tests using your scripts and observed similar behavior, and need to dig into this a little further to understand how to make this work reliably.
This might be a root cause and should probably be resolved first: https://bugzilla.redhat.com/show_bug.cgi?id=1113050

The proposed patch does not work as intended. With the following hierarchy:

755  0:0        /mnt/gluster
2777 0:1000     /mnt/gluster/test
2755 1000:1000  /mnt/gluster/test/dir1
2755 1000:1000  /mnt/gluster/test/dir1/dir2

in the (approx. 25% of) cases where my test script does trigger a self-heal on disk2, 10% end up with one of the following (giving an access error on the client):

0    0:0        /data/disk2/gluster/test
755  1000:1000  /data/disk2/gluster/test/dir1
755  1000:1000  /data/disk2/gluster/test/dir1/dir2

or

2777 0:1000     /data/disk2/gluster/test
0    0:0        /data/disk2/gluster/test/dir1
755  1000:1000  /data/disk2/gluster/test/dir1/dir2

or

2777 0:1000     /data/disk2/gluster/test
2755 1000:1000  /data/disk2/gluster/test/dir1
0    0:0        /data/disk2/gluster/test/dir1/dir2

and 73% end up with either partially healed directories (/data/disk2/gluster/test/dir1/dir2 or /data/disk2/gluster/test/dir1 missing) or the sgid bit [randomly] set on some of the directories. Since I don't even understand how to reliably trigger a self-heal of the directories, I'm currently clueless about the reason for this behaviour.

So, I think that the comment from susant in http://review.gluster.org/#/c/6983/3/xlators/cluster/dht/src/dht-common.c:

susant palai, Jun 13 9:04 AM: I think we dont have to worry about that. Rebalance does not interfere with directory SUID/GID/STICKY bits.

unfortunately is wrong :-(, and I'm in too deep to understand how to fix this at the moment.

Currently, rebalance is not run in the test case, so the above comment in relation to rebalance is sort of different than what is observed. Just a note.

I stand corrected :-) So far only self-heal has interfered.
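Listings in the "mode uid:gid path" form used above can be produced on a brick (or any directory tree) with stat. A small sketch, not from the thread — it assumes GNU coreutils stat and recreates the hierarchy locally with plain directories:

```shell
#!/bin/sh
# Sketch: recreate the test hierarchy locally and print "mode uid:gid path"
# the way the listings above read. Uses GNU stat's -c format
# (on BSD/macOS the equivalent is: stat -f '%Mp%Lp %u:%g %N').
set -e
root=$(mktemp -d)

mkdir -p "$root/test/dir1/dir2"
chmod 2777 "$root/test"            # sgid + rwxrwxrwx
chmod 2755 "$root/test/dir1"       # sgid + rwxr-xr-x
chmod 2755 "$root/test/dir1/dir2"

# %a = octal mode (including suid/sgid/sticky), %u:%g = owner, %n = name
find "$root/test" -type d -exec stat -c '%a %u:%g %n' {} +

rm -rf "$root"
```

Running the same command on /data/disk1 and /data/disk2 after each test run is enough to spot the 0-mode directories described above.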
N.B.: with 00777 flags on the /mnt/gluster/test directory I have not been able to trigger any unreadable directories.

/Anders

Thanks, Niels

Shyam

[1] http://review.gluster.org/#/c/6983/

- Original Message -
From: Anders Blomdell anders.blomd...@control.lth.se
To: Gluster Devel gluster-devel@gluster.org
Sent: Tuesday, June 17, 2014 10:53:52 AM
Subject: [Gluster-devel] 3.5.1-beta2 Problems with suid and sgid bits on directories

With glusterfs-3.5.1-0.3.beta2.fc20.x86_64 with a reverted 3dc56cbd16b1074d7ca1a4fe4c5bf44400eb63ff (due to a local lack of IPv4 addresses), I get weird behavior if I:
1. Create a directory with suid/sgid/sticky bits set (/mnt/gluster/test)
2. Make a subdirectory of #1 (/mnt/gluster/test/dir1)
3. Do an add-brick

Before add-brick
755  /mnt/gluster
7775 /mnt/gluster/test
2755 /mnt/gluster/test/dir1

After add-brick
755  /mnt/gluster
1775 /mnt/gluster/test
755  /mnt/gluster/test/dir1

On the server it looks like this:
7775 /data/disk1/gluster/test
2755 /data/disk1/gluster/test/dir1
1775 /data/disk2/gluster/test
755  /data/disk2/gluster/test/dir1

Filed as bug: https://bugzilla.redhat.com/show_bug.cgi?id=1110262

If somebody can point me to where the logic of add-brick is placed, I can give it a shot (a find/grep on mkdir didn't immediately point me to the right place).

/Anders
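In the modes quoted above, the leading octal digit encodes the special bits (4 = suid, 2 = sgid, 1 = sticky), so 7775 before add-brick becoming 1775 afterwards means suid and sgid were lost while the sticky bit survived. A small shell sketch (not from the thread) decoding that digit:

```shell
#!/bin/sh
# Sketch: decode the special-bit digit of an octal mode (4=suid, 2=sgid, 1=sticky).
decode_special() {
    mode=$1
    special=$(( 0$mode >> 9 ))          # top octal digit of a 4-digit mode
    [ $(( special & 4 )) -ne 0 ] && printf 'suid '
    [ $(( special & 2 )) -ne 0 ] && printf 'sgid '
    [ $(( special & 1 )) -ne 0 ] && printf 'sticky '
    printf '\n'
}

decode_special 7775   # suid sgid sticky
decode_special 1775   # sticky
decode_special 2755   # sgid
```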
Re: [Gluster-devel] 3.5.1-beta2 Problems with suid and sgid bits on directories
On 2014-06-19 13:48, Susant Palai wrote:
Adding Susant

Unfortunately things don't go so well here. With --brick-log-level=DEBUG I get very weird results (probably because the first brick is slower to respond while it's printing debug info); I suspect I trigger some timing-related bug. I attach my test script and a log of 20 runs (with 02777 flags). The really worrisome thing here is:

backing: 0 0:0 /data/disk2/gluster/test/dir1

which means that the backing store has an unreadable dir, which gets propagated to clients...

/Anders

- Original Message -
From: Anders Blomdell anders.blomd...@control.lth.se
To: Shyamsundar Ranganathan srang...@redhat.com
Cc: Gluster Devel gluster-devel@gluster.org
Sent: Wednesday, 18 June, 2014 9:33:04 PM
Subject: Re: [Gluster-devel] 3.5.1-beta2 Problems with suid and sgid bits on directories

On 2014-06-17 18:47, Anders Blomdell wrote:
On 2014-06-17 17:49, Shyamsundar Ranganathan wrote:

You may be looking at the problem being fixed here, [1]. On a lookup, an attribute mismatch was not being healed across directories, and this patch attempts to address the same. Currently the version of the patch does not heal the S_ISUID and S_ISGID bits, which is work in progress (but easy enough to incorporate and test based on the patch at [1]).

Thanks, will look into it tomorrow.

On a separate note, add-brick just adds a brick to the cluster; the lookup is where the heal (or creation of the directory across all subvolumes in the DHT xlator) is done.

Thanks for the clarification (I guess that a rebalance would trigger it as well?)

Attached is a slightly modified version of patch [1]. It seems to work correctly after a rebalance that is allowed to run to completion on its own; if directories are traversed during rebalance, some dirs show spurious 01777 or 0 modes, and some end up with the wrong permission. Continuing debugging tomorrow...
Shyam

[1] http://review.gluster.org/#/c/6983/

- Original Message -
From: Anders Blomdell anders.blomd...@control.lth.se
To: Gluster Devel gluster-devel@gluster.org
Sent: Tuesday, June 17, 2014 10:53:52 AM
Subject: [Gluster-devel] 3.5.1-beta2 Problems with suid and sgid bits on directories

With glusterfs-3.5.1-0.3.beta2.fc20.x86_64 with a reverted 3dc56cbd16b1074d7ca1a4fe4c5bf44400eb63ff (due to a local lack of IPv4 addresses), I get weird behavior if I:
1. Create a directory with suid/sgid/sticky bits set (/mnt/gluster/test)
2. Make a subdirectory of #1 (/mnt/gluster/test/dir1)
3. Do an add-brick

Before add-brick
755  /mnt/gluster
7775 /mnt/gluster/test
2755 /mnt/gluster/test/dir1

After add-brick
755  /mnt/gluster
1775 /mnt/gluster/test
755  /mnt/gluster/test/dir1

On the server it looks like this:
7775 /data/disk1/gluster/test
2755 /data/disk1/gluster/test/dir1
1775 /data/disk2/gluster/test
755  /data/disk2/gluster/test/dir1

Filed as bug: https://bugzilla.redhat.com/show_bug.cgi?id=1110262

If somebody can point me to where the logic of add-brick is placed, I can give it a shot (a find/grep on mkdir didn't immediately point me to the right place).

/Anders
Attachment: bug-add-brick.sh (application/shellscript)

volume create: testvol: success: please start the volume to access data
volume start: testvol: success
mounted: 755 0:0 /mnt/gluster
mounted: 2777 0:1600 /mnt/gluster/test
mounted: 2755 247:1600 /mnt/gluster/test/dir1
Before add-brick
755 /mnt/gluster
2777 /mnt/gluster/test
2755 /mnt/gluster/test/dir1
volume add-brick: success
volume set: success
Files /tmp/tmp.3lK6STezID and /tmp/tmp.Z2Pr46kVu1 differ
## Differ tor jun 19 15:30:01 CEST 2014
-mounted: 755 0:0 /mnt/gluster
-mounted: 2777 0:1600 /mnt/gluster/test
-mounted: 2755 247:1600 /mnt/gluster/test/dir1
-mounted: 2755 247:1600 /mnt/gluster/test/dir1/dir2
+755 0:0 /mnt/gluster
+2777 0:1600 /mnt/gluster/test
+2755 247:1600 /mnt/gluster/test/dir1
+2755 247:1600 /mnt/gluster/test/dir1/dir2
## TIMEOUT tor jun 19 15:30:06 CEST 2014
mounted: 755 0:0 /mnt/gluster
mounted: 2777 0:1600 /mnt/gluster/test
mounted: 2755 247:1600 /mnt/gluster/test/dir1
mounted: 2755 247:1600 /mnt/gluster/test/dir1/dir2
backing: 2777 0:1600 /data/disk1/gluster/test
backing: 2755 247:1600 /data/disk1/gluster/test/dir1
backing: 2755 247:1600 /data/disk1/gluster/test/dir1/dir2
volume create: testvol: success: please start the volume to access data
volume start: testvol: success
mounted: 755 0:0 /mnt/gluster
mounted: 2777 0:1600 /mnt/gluster/test
mounted: 2755 247:1600 /mnt/gluster/test/dir1
Before add-brick
755 /mnt/gluster
2777 /mnt/gluster/test
2755 /mnt/gluster/test/dir1
volume add-brick: success
volume set: success
Files /tmp/tmp.5DWFQY6fus and /tmp/tmp.p7BxWShXLg differ
Re: [Gluster-devel] 3.5.1-beta2 Problems with suid and sgid bits on directories
On 06/19/2014 03:39 PM, Anders Blomdell wrote:
On 2014-06-19 13:48, Susant Palai wrote:
Adding Susant

Unfortunately things don't go so well here. With --brick-log-level=DEBUG I get very weird results (probably because the first brick is slower to respond while it's printing debug info); I suspect I trigger some timing-related bug. I attach my test script and a log of 20 runs (with 02777 flags). The really worrisome thing here is:

backing: 0 0:0 /data/disk2/gluster/test/dir1

which means that the backing store has an unreadable dir, which gets propagated to clients...

I have an embryo of a theory of what happens:
1. Directories are created on the first brick.
2. Fuse starts to read directories from the first brick.
3. getdents64 or fstatat64 to the first brick takes too long, and is redirected to the second brick.
4. Self-heal is initiated on the second brick.

On Monday, I will see if I can come up with some clever firewall tricks to trigger this behaviour in a reliable way.

/Anders
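One way to make step 3 above happen reliably is to add artificial latency to the first brick's replies with tc/netem rather than a firewall. This is only a sketch, not something from the thread: the device name, the brick port (glusterfs 3.4+ bricks typically listen on 49152 and up), and the 200 ms delay are all assumptions to adjust for the actual setup, and it needs root and the netem kernel module:

```shell
#!/bin/sh
# Sketch (assumptions: brick traffic on eth0, brick listening on TCP 49152).
# Delays packets sourced from the first brick's port so the client's
# readdir/stat times out and falls back to the second brick.
DEV=eth0
BRICK_PORT=49152

# prio qdisc gives three bands; hang a netem delay off band 3
tc qdisc add dev $DEV root handle 1: prio
tc qdisc add dev $DEV parent 1:3 handle 30: netem delay 200ms

# steer packets with source port $BRICK_PORT into the delayed band
tc filter add dev $DEV parent 1: protocol ip prio 1 u32 \
    match ip sport $BRICK_PORT 0xffff flowid 1:3

# ... run the test script here ...

# undo everything
tc qdisc del dev $DEV root
```

Since this is privileged traffic-control configuration against a live interface, it is shown for illustration only.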
Re: [Gluster-devel] 3.5.1-beta2 Problems with suid and sgid bits on directories
On 2014-06-17 18:47, Anders Blomdell wrote:
On 2014-06-17 17:49, Shyamsundar Ranganathan wrote:

You may be looking at the problem being fixed here, [1]. On a lookup, an attribute mismatch was not being healed across directories, and this patch attempts to address the same. Currently the version of the patch does not heal the S_ISUID and S_ISGID bits, which is work in progress (but easy enough to incorporate and test based on the patch at [1]).

Thanks, will look into it tomorrow.

On a separate note, add-brick just adds a brick to the cluster; the lookup is where the heal (or creation of the directory across all subvolumes in the DHT xlator) is done.

Thanks for the clarification (I guess that a rebalance would trigger it as well?)

Attached is a slightly modified version of patch [1]. It seems to work correctly after a rebalance that is allowed to run to completion on its own; if directories are traversed during rebalance, some dirs show spurious 01777 or 0 modes, and some end up with the wrong permission. Continuing debugging tomorrow...

Shyam

[1] http://review.gluster.org/#/c/6983/

- Original Message -
From: Anders Blomdell anders.blomd...@control.lth.se
To: Gluster Devel gluster-devel@gluster.org
Sent: Tuesday, June 17, 2014 10:53:52 AM
Subject: [Gluster-devel] 3.5.1-beta2 Problems with suid and sgid bits on directories

With glusterfs-3.5.1-0.3.beta2.fc20.x86_64 with a reverted 3dc56cbd16b1074d7ca1a4fe4c5bf44400eb63ff (due to a local lack of IPv4 addresses), I get weird behavior if I:
1. Create a directory with suid/sgid/sticky bits set (/mnt/gluster/test)
2. Make a subdirectory of #1 (/mnt/gluster/test/dir1)
3. Do an add-brick

Before add-brick
755  /mnt/gluster
7775 /mnt/gluster/test
2755 /mnt/gluster/test/dir1

After add-brick
755  /mnt/gluster
1775 /mnt/gluster/test
755  /mnt/gluster/test/dir1

On the server it looks like this:
7775 /data/disk1/gluster/test
2755 /data/disk1/gluster/test/dir1
1775 /data/disk2/gluster/test
755  /data/disk2/gluster/test/dir1

Filed as bug: https://bugzilla.redhat.com/show_bug.cgi?id=1110262

If somebody can point me to where the logic of add-brick is placed, I can give it a shot (a find/grep on mkdir didn't immediately point me to the right place).

/Anders

diff -urb glusterfs-3.5.1beta2/xlators/cluster/dht/src/dht-common.c glusterfs-3.5.1.orig/xlators/cluster/dht/src/dht-common.c
--- glusterfs-3.5.1beta2/xlators/cluster/dht/src/dht-common.c 2014-06-10 18:55:22.0 +0200
+++ glusterfs-3.5.1.orig/xlators/cluster/dht/src/dht-common.c 2014-06-17 22:46:28.710636632 +0200
@@ -523,6 +523,28 @@
 }

 int
+permission_changed (ia_prot_t *local, ia_prot_t *stbuf)
+{
+        if ((local->owner.read != stbuf->owner.read) ||
+            (local->owner.write != stbuf->owner.write) ||
+            (local->owner.exec != stbuf->owner.exec) ||
+            (local->group.read != stbuf->group.read) ||
+            (local->group.write != stbuf->group.write) ||
+            (local->group.exec != stbuf->group.exec) ||
+            (local->other.read != stbuf->other.read) ||
+            (local->other.write != stbuf->other.write) ||
+            (local->other.exec != stbuf->other.exec) ||
+            (local->suid != stbuf->suid) ||
+            (local->sgid != stbuf->sgid) ||
+            (local->sticky != stbuf->sticky)) {
+                return 1;
+        } else {
+                return 0;
+        }
+}
+
+int
 dht_revalidate_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
                     int op_ret, int op_errno, inode_t *inode,
                     struct iatt *stbuf, dict_t *xattr,
@@ -617,12 +639,16 @@
                     stbuf->ia_ctime_nsec)) {
                        local->prebuf.ia_gid = stbuf->ia_gid;
                        local->prebuf.ia_uid = stbuf->ia_uid;
+                       local->prebuf.ia_prot = stbuf->ia_prot;
                }
        }
        if (local->stbuf.ia_type != IA_INVAL) {
                if ((local->stbuf.ia_gid != stbuf->ia_gid) ||
-                   (local->stbuf.ia_uid != stbuf->ia_uid)) {
+                   (local->stbuf.ia_uid != stbuf->ia_uid) ||
+                   (permission_changed (&(local->stbuf.ia_prot),
+                                        &(stbuf->ia_prot)))) {
                        local->need_selfheal = 1;
                }
        }
@@ -669,6 +695,8 @@
        uuid_copy (local->gfid, local->stbuf.ia_gfid);
        local->stbuf.ia_gid = local
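For intuition: the field-by-field comparison in permission_changed() above is equivalent to comparing two modes masked with 07777 (rwx bits for user/group/other plus suid/sgid/sticky). Gluster's ia_prot_t is a bit-field struct, which is why the patch compares member by member, but the idea can be sketched on plain numeric modes in shell (not gluster code, just an illustration):

```shell
#!/bin/sh
# Sketch: a mode change that self-heal should notice is any difference in the
# low 12 bits (07777) - permission bits plus suid/sgid/sticky.
permission_changed() {
    a=$(( 0$1 & 07777 ))
    b=$(( 0$2 & 07777 ))
    [ "$a" -ne "$b" ]
}

# 7775 -> 1775 is exactly the add-brick symptom from this thread:
# suid and sgid lost, only the sticky bit survives.
permission_changed 7775 1775 && echo "changed"
permission_changed 2755 2755 || echo "unchanged"
```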
Re: [Gluster-devel] 3.5.1-beta2 Problems with suid and sgid bits on directories
On 2014-06-17 17:49, Shyamsundar Ranganathan wrote:

You may be looking at the problem being fixed here, [1]. On a lookup, an attribute mismatch was not being healed across directories, and this patch attempts to address the same. Currently the version of the patch does not heal the S_ISUID and S_ISGID bits, which is work in progress (but easy enough to incorporate and test based on the patch at [1]).

Thanks, will look into it tomorrow.

On a separate note, add-brick just adds a brick to the cluster; the lookup is where the heal (or creation of the directory across all subvolumes in the DHT xlator) is done.

Thanks for the clarification (I guess that a rebalance would trigger it as well?)

Shyam

[1] http://review.gluster.org/#/c/6983/

- Original Message -
From: Anders Blomdell anders.blomd...@control.lth.se
To: Gluster Devel gluster-devel@gluster.org
Sent: Tuesday, June 17, 2014 10:53:52 AM
Subject: [Gluster-devel] 3.5.1-beta2 Problems with suid and sgid bits on directories

With glusterfs-3.5.1-0.3.beta2.fc20.x86_64 with a reverted 3dc56cbd16b1074d7ca1a4fe4c5bf44400eb63ff (due to a local lack of IPv4 addresses), I get weird behavior if I:
1. Create a directory with suid/sgid/sticky bits set (/mnt/gluster/test)
2. Make a subdirectory of #1 (/mnt/gluster/test/dir1)
3. Do an add-brick

Before add-brick
755  /mnt/gluster
7775 /mnt/gluster/test
2755 /mnt/gluster/test/dir1

After add-brick
755  /mnt/gluster
1775 /mnt/gluster/test
755  /mnt/gluster/test/dir1

On the server it looks like this:
7775 /data/disk1/gluster/test
2755 /data/disk1/gluster/test/dir1
1775 /data/disk2/gluster/test
755  /data/disk2/gluster/test/dir1

Filed as bug: https://bugzilla.redhat.com/show_bug.cgi?id=1110262

If somebody can point me to where the logic of add-brick is placed, I can give it a shot (a find/grep on mkdir didn't immediately point me to the right place).

/Anders