[Gluster-devel] Regression tests have been put on-hold, manual intervention needed
Hi, it seems that there is a major breakage in the current master branch. Jeff already sent an email about this yesterday [1], but it seems that things have not settled down yet. In order to prevent (more) confusion, automated regression tests have been disabled for now. Patches that have been submitted recently will need a rebase when http://review.gluster.org/10105 gets merged.

Regression testing can be re-enabled by clicking the [Enable] button on http://build.gluster.org/job/rackspace-regression-2GB-triggered/

Thanks, Niels

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10142

___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] crypt xlator bug
On 04/02/2015 12:27 AM, Raghavendra Talur wrote: On Wed, Apr 1, 2015 at 10:34 PM, Justin Clift jus...@gluster.org wrote: On 1 Apr 2015, at 10:57, Emmanuel Dreyfus m...@netbsd.org wrote:

Hi, crypt.t was recently broken in the NetBSD regression. glusterfs returns a node with file type invalid to FUSE, and that breaks the test. After running a git bisect, I found the offending commit after which this behavior appeared: 8a2e2b88fc21dc7879f838d18cd0413dd88023b7 "mem-pool: invalidate memory on GF_FREE to aid debugging". This means the bug has always been there, but this debugging aid made it reliably reproducible.

Sounds like that commit is a good win then. :) Harsha/Pranith/Lala, your names are on the git blame for crypt.c... any ideas? :)

I found one issue: local is not allocated using GF_CALLOC with a mem-type. This is a patch which *might* fix it.

diff --git a/xlators/encryption/crypt/src/crypt-mem-types.h b/xlators/encryption/crypt/src/crypt-mem-types.h
index 2eab921..c417b67 100644
--- a/xlators/encryption/crypt/src/crypt-mem-types.h
+++ b/xlators/encryption/crypt/src/crypt-mem-types.h
@@ -24,6 +24,7 @@ enum gf_crypt_mem_types_ {
         gf_crypt_mt_key,
         gf_crypt_mt_iovec,
         gf_crypt_mt_char,
+        gf_crypt_mt_local,
         gf_crypt_mt_end,
 };

diff --git a/xlators/encryption/crypt/src/crypt.c b/xlators/encryption/crypt/src/crypt.c
index ae8cdb2..63c0977 100644
--- a/xlators/encryption/crypt/src/crypt.c
+++ b/xlators/encryption/crypt/src/crypt.c
@@ -48,7 +48,7 @@ static crypt_local_t *crypt_alloc_local(call_frame_t *frame, xlator_t *this,
 {
         crypt_local_t *local = NULL;
-        local = mem_get0(this->local_pool);
+        local = GF_CALLOC (sizeof (*local), 1, gf_crypt_mt_local);

local was using memory from the pool earlier (i.e. with mem_get0()), which seems OK to me. Changing it this way would add a memory allocation in the fop I/O path, which is why xlators generally use the mem-pool approach.
Pranith

(remaining context of the patch:)

         if (!local) {
                 gf_log(this->name, GF_LOG_ERROR, "out of memory");
                 return NULL;

Niels should be able to recognize whether this is a sufficient fix or not. Thanks, Raghavendra Talur

+ Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift

-- Raghavendra Talur
Re: [Gluster-devel] crypt xlator bug
On Thu, Apr 02, 2015 at 01:58:39PM +0530, Raghavendra Bhat wrote: On Thursday 02 April 2015 01:00 PM, Pranith Kumar Karampuri wrote: On 04/02/2015 12:27 AM, Raghavendra Talur wrote: On Wed, Apr 1, 2015 at 10:34 PM, Justin Clift jus...@gluster.org wrote: On 1 Apr 2015, at 10:57, Emmanuel Dreyfus m...@netbsd.org wrote:

Hi, crypt.t was recently broken in the NetBSD regression. glusterfs returns a node with file type invalid to FUSE, and that breaks the test. After running a git bisect, I found the offending commit after which this behavior appeared: 8a2e2b88fc21dc7879f838d18cd0413dd88023b7 "mem-pool: invalidate memory on GF_FREE to aid debugging". This means the bug has always been there, but this debugging aid made it reliably reproducible.

Sounds like that commit is a good win then. :) Harsha/Pranith/Lala, your names are on the git blame for crypt.c... any ideas? :)

I found one issue: local is not allocated using GF_CALLOC with a mem-type. This is a patch which *might* fix it.

diff --git a/xlators/encryption/crypt/src/crypt-mem-types.h b/xlators/encryption/crypt/src/crypt-mem-types.h
index 2eab921..c417b67 100644
--- a/xlators/encryption/crypt/src/crypt-mem-types.h
+++ b/xlators/encryption/crypt/src/crypt-mem-types.h
@@ -24,6 +24,7 @@ enum gf_crypt_mem_types_ {
         gf_crypt_mt_key,
         gf_crypt_mt_iovec,
         gf_crypt_mt_char,
+        gf_crypt_mt_local,
         gf_crypt_mt_end,
 };

diff --git a/xlators/encryption/crypt/src/crypt.c b/xlators/encryption/crypt/src/crypt.c
index ae8cdb2..63c0977 100644
--- a/xlators/encryption/crypt/src/crypt.c
+++ b/xlators/encryption/crypt/src/crypt.c
@@ -48,7 +48,7 @@ static crypt_local_t *crypt_alloc_local(call_frame_t *frame, xlator_t *this,
 {
         crypt_local_t *local = NULL;
-        local = mem_get0(this->local_pool);
+        local = GF_CALLOC (sizeof (*local), 1, gf_crypt_mt_local);

local was using memory from the pool earlier (i.e. with mem_get0()), which seems OK to me.
Changing it this way would add a memory allocation in the fop I/O path, which is why xlators generally use the mem-pool approach.

Pranith

I think the crypt xlator should do a mem_put of local after doing STACK_UNWIND, like other xlators that also use mem_get for local (such as AFR). I suspect crypt not doing mem_put might be the reason for the bug mentioned.

I've looked at this now too. The use of mem_get0() seems fine to me. But indeed, the call to mem_put() is missing. Whenever the crypt_local_t should be released, mem_put() should get called, just like any GF_CALLOC/GF_FREE combination.

HTH, Niels

Regards, Raghavendra Bhat

         if (!local) {
                 gf_log(this->name, GF_LOG_ERROR, "out of memory");
                 return NULL;

Niels should be able to recognize whether this is a sufficient fix or not. Thanks, Raghavendra Talur

+ Justin

-- Raghavendra Talur
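For anyone new to the mem-pool pattern under discussion, here is a toy sketch, NOT the real gluster API (names like toy_mem_get0 are made up for illustration), of why every mem_get0() needs a matching mem_put(): without the put, the pool's preallocated slots drain away, while switching to plain calloc()/free() would put an allocator call on the fop I/O path.

```c
#include <assert.h>
#include <string.h>

/* A tiny fixed-size pool: mem_get0-style handout of zeroed slots,
 * mem_put-style return of slots for reuse. */
#define POOL_SLOTS 4
#define SLOT_SIZE  64

struct toy_pool {
    char slots[POOL_SLOTS][SLOT_SIZE];
    int  in_use[POOL_SLOTS];
};

/* Hand out a zeroed slot, or NULL when the pool is exhausted. */
static void *toy_mem_get0(struct toy_pool *pool)
{
    int i;
    for (i = 0; i < POOL_SLOTS; i++) {
        if (!pool->in_use[i]) {
            pool->in_use[i] = 1;
            memset(pool->slots[i], 0, SLOT_SIZE);  /* the "get0" part */
            return pool->slots[i];
        }
    }
    return NULL;  /* what a leaked 'local' eventually causes */
}

/* Return a slot to the pool so a later fop can reuse it. */
static void toy_mem_put(struct toy_pool *pool, void *ptr)
{
    int i;
    for (i = 0; i < POOL_SLOTS; i++) {
        if ((void *)pool->slots[i] == ptr) {
            pool->in_use[i] = 0;
            return;
        }
    }
}
```

A balanced get/put cycle can run indefinitely on a handful of slots; skipping the put exhausts the pool after POOL_SLOTS allocations, which is the failure mode the missing mem_put() in crypt.c would lead to.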
Re: [Gluster-devel] crypt xlator bug
On Thursday 02 April 2015 01:00 PM, Pranith Kumar Karampuri wrote: On 04/02/2015 12:27 AM, Raghavendra Talur wrote: On Wed, Apr 1, 2015 at 10:34 PM, Justin Clift jus...@gluster.org wrote: On 1 Apr 2015, at 10:57, Emmanuel Dreyfus m...@netbsd.org wrote:

Hi, crypt.t was recently broken in the NetBSD regression. glusterfs returns a node with file type invalid to FUSE, and that breaks the test. After running a git bisect, I found the offending commit after which this behavior appeared: 8a2e2b88fc21dc7879f838d18cd0413dd88023b7 "mem-pool: invalidate memory on GF_FREE to aid debugging". This means the bug has always been there, but this debugging aid made it reliably reproducible.

Sounds like that commit is a good win then. :) Harsha/Pranith/Lala, your names are on the git blame for crypt.c... any ideas? :)

I found one issue: local is not allocated using GF_CALLOC with a mem-type. This is a patch which *might* fix it.

diff --git a/xlators/encryption/crypt/src/crypt-mem-types.h b/xlators/encryption/crypt/src/crypt-mem-types.h
index 2eab921..c417b67 100644
--- a/xlators/encryption/crypt/src/crypt-mem-types.h
+++ b/xlators/encryption/crypt/src/crypt-mem-types.h
@@ -24,6 +24,7 @@ enum gf_crypt_mem_types_ {
         gf_crypt_mt_key,
         gf_crypt_mt_iovec,
         gf_crypt_mt_char,
+        gf_crypt_mt_local,
         gf_crypt_mt_end,
 };

diff --git a/xlators/encryption/crypt/src/crypt.c b/xlators/encryption/crypt/src/crypt.c
index ae8cdb2..63c0977 100644
--- a/xlators/encryption/crypt/src/crypt.c
+++ b/xlators/encryption/crypt/src/crypt.c
@@ -48,7 +48,7 @@ static crypt_local_t *crypt_alloc_local(call_frame_t *frame, xlator_t *this,
 {
         crypt_local_t *local = NULL;
-        local = mem_get0(this->local_pool);
+        local = GF_CALLOC (sizeof (*local), 1, gf_crypt_mt_local);

local was using memory from the pool earlier (i.e. with mem_get0()), which seems OK to me. Changing it this way would add a memory allocation in the fop I/O path, which is why xlators generally use the mem-pool approach.
Pranith

I think the crypt xlator should do a mem_put of local after doing STACK_UNWIND, like other xlators that also use mem_get for local (such as AFR). I suspect crypt not doing mem_put might be the reason for the bug mentioned.

Regards, Raghavendra Bhat

         if (!local) {
                 gf_log(this->name, GF_LOG_ERROR, "out of memory");
                 return NULL;

Niels should be able to recognize whether this is a sufficient fix or not. Thanks, Raghavendra Talur

+ Justin

-- Raghavendra Talur
Re: [Gluster-devel] Gluster 3.6.2 On Xeon Phi
On 12 Feb 2015, at 08:54, Mohammed Rafi K C rkavu...@redhat.com wrote: On 02/12/2015 08:32 AM, Rudra Siva wrote: Rafi, I'm preparing the Phi RDMA patch for submission. If you can send a patch to support iWARP, that will be a great addition to gluster rdma.

Clearing out older email... did this patch get submitted and merged? :)

Regards and best wishes, Justin Clift
Re: [Gluster-devel] Security hardening RELRO PIE flags
On 31 Mar 2015, at 08:15, Niels de Vos nde...@redhat.com wrote: On Tue, Mar 31, 2015 at 12:20:19PM +0530, Kaushal M wrote:

IMHO, hardening and security should be left to the individual distributions and the package maintainers. Generally, each distribution has its own policies with regards to hardening and security. We as an upstream project cannot decide what a distribution should do, but we should be ready to fix bugs that could arise when distributions do hardened builds. So, I vote against having these hardening flags added to the base GlusterFS build. But we could add the flags to the Fedora spec files which we carry with our source.

Indeed, I agree that the compiler flags should be specified by the distributions. At least Fedora and Debian already do this and include (probably different) options within their packaging scripts. We should set the flags we need, but not more. It would be annoying to set default flags that can conflict with others, or which are not (yet) available on architectures that we normally do not test.

First thoughts: :)

* We provide our own packaging scripts + distribute rpms/debs from our own site too. Should we investigate/try these flags out for the packages we build + supply?

* Are there changes in our code + debugging practices that would be needed for these security hardening flags to work? If there are, and we don't make those changes ourselves, doesn't that mean we're telling distributions they need to carry their own patch set in order to have a more secure GlusterFS?

+ Justin
Re: [Gluster-devel] Regression tests have been put on-hold, manual intervention needed
On 04/02/2015 01:28 PM, Niels de Vos wrote: Hi, it seems that there is a major breakage in the current master branch. Jeff already sent an email about this yesterday [1], but it seems that things have not settled down yet. In order to prevent (more) confusion, automated regression tests have been disabled for now. Patches that have been submitted recently will need a rebase when http://review.gluster.org/10105 gets merged. Regression testing can be re-enabled by clicking the [Enable] button on http://build.gluster.org/job/rackspace-regression-2GB-triggered/

10105 has been merged and regression has been enabled again. Thanks everyone for the quick turnaround!

-Vijay
[Gluster-devel] replace-brick command modification
Hi all,

Since GlusterFS version 3.6.0, the command "gluster volume replace-brick VOLNAME SOURCE-BRICK NEW-BRICK {start [force]|pause|abort|status|commit}" has been deprecated; only "gluster volume replace-brick VOLNAME SOURCE-BRICK NEW-BRICK commit force" is supported. For bug https://bugzilla.redhat.com/show_bug.cgi?id=1094119, patch http://review.gluster.org/#/c/10101/ removes the cli/glusterd code for the "gluster volume replace-brick VOLNAME BRICK NEW-BRICK {start [force]|pause|abort|status|commit}" command, so only the "commit force" option remains for replace-brick.

Should we introduce a new command "gluster volume replace-brick VOLNAME SOURCE-BRICK NEW-BRICK" instead of keeping "gluster volume replace-brick VOLNAME SOURCE-BRICK NEW-BRICK commit force"?

Thanks & Regards, Gaurav Garg
Re: [Gluster-devel] Security hardening RELRO PIE flags
Hi,

Sorry for the top-post. Just to Amplify a but if what Niels has already said——

Yes, in Fedora, the glusterfs.spec file has a line "%global _hardened_build 1" at the top. This enables PIE and RELRO in Fedora and EPEL builds. This line exists in the glusterfs.spec.in file in the Gluster source tree too. Debian-based builds have something analogous. (We don't have the Debian packaging pieces in our source as we do for RPMs. I wanted it, but the community dictated otherwise.)

Using hardened builds gives us the belt-and-suspenders model. IOW, we fix things that Coverity finds as fast as we can, and then hardened builds (i.e. PIE + RELRO) close any gaps that Coverity hasn't found, or that Coverity has found but which haven't been fixed yet. We have a long list of Coverity issues that remain to be worked through.

I'm not aware that compiling with PIE and RELRO provides anything of value for mainline development. There are no extra warnings or errors — nothing that the developer would have to change or fix as a function of their ordinary development practices. The executables that a developer produces are meant for development and debugging. PIE and RELRO don't get in the way, but they don't help either. I don't see any reason to enable this in the autoconf config or build.

And, BTW, Coverity isn't the end-all solution to code quality and application security. Things like cppcheck, clang-analyze, or even just using the Intel, AMD, Clang, and `gcc -pedantic` compilers will find lots of potential bugs just by compiling with them — bugs that gcc doesn't even warn about.

Thanks, -- Kaleb

On 04/02/2015 07:58 AM, Venky Shankar wrote: On 03/31/2015 12:45 PM, Niels de Vos wrote: On Tue, Mar 31, 2015 at 12:20:19PM +0530, Kaushal M wrote:

IMHO, hardening and security should be left to the individual distributions and the package maintainers. Generally, each distribution has its own policies with regards to hardening and security. We as an upstream project cannot decide what a distribution should do, but we should be ready to fix bugs that could arise when distributions do hardened builds. So, I vote against having these hardening flags added to the base GlusterFS build. But we could add the flags to the Fedora spec files which we carry with our source.

Indeed, I agree that the compiler flags should be specified by the distributions. At least Fedora and Debian already do this and include (probably different) options within their packaging scripts. We should set the flags we need, but not more. It would be annoying to set default flags that can conflict with others, or which are not (yet) available on architectures that we normally do not test.

Niels

I echo the same. But just for educational purposes, it would be good to know what kind of attack(s) [buffer/heap overflows] GlusterFS is vulnerable to as of now, and probably fix them if possible (Coverity does the job for us to some extent, correct?). Are there any tools for this out in the open?

~kaushal

On Tue, Mar 31, 2015 at 11:49 AM, Atin Mukherjee amukh...@redhat.com wrote:

Folks,

There are some projects which use compiler/glibc features to strengthen their security claims. Popular distros suggest hardening daemons with RELRO/PIE flags; see [1] [2] [3].

Partial relro is when you have -Wl,-z,relro in the LDFLAGS for building libraries. Partial relro means that some ELF sections are reordered so that overflows in some likely sections don't affect others, and the local offset table is read-only. To get full relro, you also need -Wl,-z,bind_now added to LDFLAGS. What this does is make the Global Offset Table and Procedure Linkage Table read-only. This takes some time, so it's only worth it for apps that have a real possibility of being attacked. This would be setuid/setgid/setcap programs and daemons. There are some security-critical apps that can have this too. If an app likely parses files from an untrusted source (internet), then it might also want to have full relro.

To enable PIE, you would pass -fPIE -DPIE in the CFLAGS and -pie in the LDFLAGS. What PIE does is randomize the locations of important items such as the base address of the executable and the positions of libraries, heap, and stack in a process's address space. Sometimes this is called ASLR. It's designed to make buffer/heap overflow and return-into-libc attacks much harder. Part of the way it does this is to make a new section in the ELF image that is writable, to redirect function calls to the correct addresses (offsets). This has to be writable because each invocation will have a different layout and needs to be fixed up. So, when you have an application with PIE, you want full relro so that these sections become read-only and are not part of an attacker's target area.

I would like to hear from the community whether we should introduce these hardening flags in glusterfs as well.

[1] https://fedorahosted.org/fesco/ticket/563
[2] https://wiki.debian.org/Hardening
Re: [Gluster-devel] Security hardening RELRO PIE flags
On 04/02/2015 08:22 AM, Kaleb KEITHLEY wrote: Hi, Sorry for the top-post. Just to Amplify a but if what Niels...

Just to Amplify a bit of what Niels

(Naughty fingers.)

-- Kaleb
Re: [Gluster-devel] Adventures in building GlusterFS
On 2 Apr 2015, at 12:10, Vijay Bellur vbel...@redhat.com wrote: On 04/02/2015 06:27 AM, Jeff Darcy wrote: My recommendations: (1) Apply the -Wno-error=cpp and -Wno-error=maybe-uninitialized changes wherever they need to be applied so that they're effective during normal regression builds

Thanks, Jeff. Justin - would it be possible to do this change as well in build.sh?

The regression builds seem to be running again at the moment without removing -Werror, so I'm not sure if this needs adjusting any more?

+ Justin
Re: [Gluster-devel] Security hardening RELRO PIE flags
On Thu, Apr 02, 2015 at 01:21:57PM +0100, Justin Clift wrote: On 31 Mar 2015, at 08:15, Niels de Vos nde...@redhat.com wrote: On Tue, Mar 31, 2015 at 12:20:19PM +0530, Kaushal M wrote:

IMHO, hardening and security should be left to the individual distributions and the package maintainers. Generally, each distribution has its own policies with regards to hardening and security. We as an upstream project cannot decide what a distribution should do, but we should be ready to fix bugs that could arise when distributions do hardened builds. So, I vote against having these hardening flags added to the base GlusterFS build. But we could add the flags to the Fedora spec files which we carry with our source.

Indeed, I agree that the compiler flags should be specified by the distributions. At least Fedora and Debian already do this and include (probably different) options within their packaging scripts. We should set the flags we need, but not more. It would be annoying to set default flags that can conflict with others, or which are not (yet) available on architectures that we normally do not test.

First thoughts: :)

* We provide our own packaging scripts + distribute rpms/debs from our own site too. Should we investigate/try these flags out for the packages we build + supply?

At least for the RPMs, we try to follow the Fedora guidelines and their standard flags. With recent Fedora releases this includes additional hardening flags.

* Are there changes in our code + debugging practices that would be needed for these security hardening flags to work? If there are, and we don't make those changes ourselves, doesn't that mean we're telling distributions they need to carry their own patch set in order to have a more secure GlusterFS?

We have received several patches from the Debian maintainer that improve the handling of these options. When maintainers for distributions build GlusterFS and require changes, they either file bugs and/or send patches.
I think this works quite well.

Niels
Re: [Gluster-devel] Adventures in building GlusterFS
On 2 Apr 2015, at 01:57, Jeff Darcy jda...@redhat.com wrote: As many of you have undoubtedly noticed, we're now in a situation where *all* regression builds are failing, with something like this:

-
cc1: warnings being treated as errors
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c: In function ‘glusterd_snap_quorum_check_for_create’:
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c:2615: error: passing argument 2 of ‘does_gd_meet_server_quorum’ from incompatible pointer type
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-server-quorum.h:56: note: expected ‘struct list_head *’ but argument is of type ‘struct cds_list_head *’
-

The reason is that -Werror was turned on earlier today. I'm not quite sure how or where, because the version of build.sh that I thought builds would use doesn't seem to have changed since September 8, but then there's a lot about this system I don't understand. Vijay (who I believe made the change) knows it better than I ever will.

A. This was me. Noticed the lack of -Werror last night, and immediately fixed it. Then hit the sack shortly after.

Umm... Sorry? :/

+ Justin
Re: [Gluster-devel] Adventures in building GlusterFS
On 2 Apr 2015, at 12:10, Vijay Bellur vbel...@redhat.com wrote: On 04/02/2015 06:27 AM, Jeff Darcy wrote: My recommendations: (1) Apply the -Wno-error=cpp and -Wno-error=maybe-uninitialized changes wherever they need to be applied so that they're effective during normal regression builds

Thanks, Jeff. Justin - would it be possible to do this change as well in build.sh?

Sure. What needs changing from here?

https://github.com/justinclift/glusterfs_patch_acceptance_tests/blob/master/build.sh

+ Justin
Re: [Gluster-devel] Adventures in building GlusterFS
On 2 Apr 2015, at 01:57, Jeff Darcy jda...@redhat.com wrote: snip (1) Apply the -Wno-error=cpp and -Wno-error=maybe-uninitialized changes wherever they need to be applied so that they're effective during normal regression builds

The git repo which holds our CentOS build and regression testing scripts is here:

https://review.gerrithub.io/#/admin/projects/justinclift/glusterfs_patch_acceptance_tests
https://github.com/justinclift/glusterfs_patch_acceptance_tests

It's being used as a test bunny to try out GerritHub. (May end in rabbit soup. I do not like rabbit soup. :/)

The build bit in it is (bash script):

P=/build
./configure --prefix=$P/install --with-mountutildir=$P/install/sbin --with-initdir=$P/install/etc --localstatedir=/var --enable-bd-xlator=yes --enable-debug --silent
make install CFLAGS="-g -O0 -Wall -Werror" -j 4

With the -Werror added last night. Should we adjust it?

Regards and best wishes, Justin Clift
Re: [Gluster-devel] Security hardening RELRO PIE flags
I've got responses from a couple of folks; would also love to hear from others.

~Atin

On 03/31/2015 11:49 AM, Atin Mukherjee wrote:

Folks,

There are some projects which use compiler/glibc features to strengthen their security claims. Popular distros suggest hardening daemons with RELRO/PIE flags; see [1] [2] [3].

Partial relro is when you have -Wl,-z,relro in the LDFLAGS for building libraries. Partial relro means that some ELF sections are reordered so that overflows in some likely sections don't affect others, and the local offset table is read-only. To get full relro, you also need -Wl,-z,bind_now added to LDFLAGS. What this does is make the Global Offset Table and Procedure Linkage Table read-only. This takes some time, so it's only worth it for apps that have a real possibility of being attacked. This would be setuid/setgid/setcap programs and daemons. There are some security-critical apps that can have this too. If an app likely parses files from an untrusted source (internet), then it might also want to have full relro.

To enable PIE, you would pass -fPIE -DPIE in the CFLAGS and -pie in the LDFLAGS. What PIE does is randomize the locations of important items such as the base address of the executable and the positions of libraries, heap, and stack in a process's address space. Sometimes this is called ASLR. It's designed to make buffer/heap overflow and return-into-libc attacks much harder. Part of the way it does this is to make a new section in the ELF image that is writable, to redirect function calls to the correct addresses (offsets). This has to be writable because each invocation will have a different layout and needs to be fixed up. So, when you have an application with PIE, you want full relro so that these sections become read-only and are not part of an attacker's target area.

I would like to hear from the community whether we should introduce these hardening flags in glusterfs as well.
[1] https://fedorahosted.org/fesco/ticket/563
[2] https://wiki.debian.org/Hardening
[3] https://wiki.ubuntu.com/Security/Features#relro

-- ~Atin
Re: [Gluster-devel] [Gluster-infra] Regression tests have been put on-hold, manual intervention needed
Also, everyone please rebase your patches so that they get the fix and regression can be triggered on them again.

~kaushal

On Thu, Apr 2, 2015 at 3:42 PM, Vijay Bellur vbel...@redhat.com wrote: On 04/02/2015 01:28 PM, Niels de Vos wrote: Hi, it seems that there is a major breakage in the current master branch. Jeff already sent an email about this yesterday [1], but it seems that things have not settled down yet. In order to prevent (more) confusion, automated regression tests have been disabled for now. Patches that have been submitted recently will need a rebase when http://review.gluster.org/10105 gets merged. Regression testing can be re-enabled by clicking the [Enable] button on http://build.gluster.org/job/rackspace-regression-2GB-triggered/

10105 has been merged and regression has been enabled again. Thanks everyone for the quick turnaround!

-Vijay
Re: [Gluster-devel] Adventures in building GlusterFS
On 04/02/2015 06:27 AM, Jeff Darcy wrote: My recommendations: (1) Apply the -Wno-error=cpp and -Wno-error=maybe-uninitialized changes wherever they need to be applied so that they're effective during normal regression builds

Thanks, Jeff. Justin - would it be possible to do this change as well in build.sh?

Regards, Vijay
Re: [Gluster-devel] Multiple verify in gerrit
On 2 Apr 2015, at 05:18, Emmanuel Dreyfus m...@netbsd.org wrote:

Hi, I am now convinced the solution to our multiple regression problem is to introduce more Gluster Build System users: one for CentOS regression, another one for NetBSD regression (and one for each smoke test, as explained below). I just tested it on http://review.gluster.org/10052, and here is what gerrit displays in the verified column:

- if neither verified=+1 nor verified=-1 is cast: nothing
- if there is at least one verified=+1 and no verified=-1: verified
- if there is at least one verified=-1: failed

Therefore, if the CentOS regression uses bu...@review.gluster.org to report results and the NetBSD regression uses nb7bu...@review.gluster.org (the latter user should be created), we achieve this outcome:

- gerrit will display a change as verified if one regression reported it as verified and the other either also succeeded or failed to report
- gerrit will display a change as failed if one regression reported it as failed, regardless of what the other reported

There is still one minor problem: if one regression does not report, or reports late, we can get the feeling that a change is verified while it should not be, and its status can change later. But this is a minor issue compared to the current situation.

Other ideas:

- smoke builds should also report as different gerrit users, so that a verified=+1 regression result does not override a verified=-1 smoke build result
- when we get a regression failure, we could cast the verified vote to gerrit and immediately schedule another regression run. That way we could automatically work around spurious failures without the need for a retrigger in Jenkins.

You're probably right. :) I'll set up test/sandbox VMs today using last night's backup of our Gerrit setup, then we can try stuff out on it to make sure. Give me a few hours though. ;) It needs to be able to communicate with stuff on the internet for OpenID to work, but be unable to affect our Jenkins box, Forge/GitHub/etc.
Best way I've thought of for doing that (so far) is adding /etc/hosts entries that point the hosts we don't want it communicating with at bogus IP addresses. The other option might be to just use the built-in iptables firewall to disallow all communications except for whitelisted addresses. Will figure it out in a few hours. ;) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] crypt xlator bug
I think the crypt xlator should do a mem_put of local after doing STACK_UNWIND, like other xlators which also use mem_get for local (such as AFR). I suspect crypt not doing mem_put might be the reason for the bug mentioned. My understanding was that mem_put should be called automatically from FRAME_DESTROY, which is itself called from STACK_DESTROY when the fop completes (e.g. at FUSE or GFAPI). On the other hand, I see that AFR and others call mem_put themselves, without zeroing the local pointer. In my (possibly no longer relevant) experience, freeing local myself without zeroing the pointer would lead to a double free, and I don't see why that's not the case here. What am I missing?
Re: [Gluster-devel] Coredump in master :/
On 1 Apr 2015, at 19:47, Jeff Darcy jda...@redhat.com wrote: When doing an initial burn-in test (regression run on master head of GlusterFS git), it coredumped on the new slave23.cloud.gluster.org VM. (yeah, I'm reusing VM names) http://build.gluster.org/job/regression-test-burn-in/16/console Does anyone have time to check the coredump, and see if this is the bug we already know about? This is *not* the same as others I've seen. There are no threads in the usual connection-cleanup/list_del code. Rather, it looks like some are in generic malloc code, possibly indicating some sort of arena corruption. Is it ok to put slave23.cloud.gluster.org into general rotation, so it runs regression jobs along with the rest? + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift
Re: [Gluster-devel] Security hardening RELRO PIE flags
On 03/31/2015 12:45 PM, Niels de Vos wrote: On Tue, Mar 31, 2015 at 12:20:19PM +0530, Kaushal M wrote: IMHO, doing hardening and security should be left to the individual distributions and the package maintainers. Generally, each distribution has its own policies with regards to hardening and security. We as an upstream project cannot decide on what a distribution should do. But we should be ready to fix bugs that could arise when distributions do hardened builds. So, I vote against having these hardening flags added to the base GlusterFS build. But we could add the flags to the Fedora spec files which we carry with our source. Indeed, I agree that the compiler flags should be specified by the distributions. At least Fedora and Debian already do this and include (probably different) options within their packaging scripts. We should set the flags we need, but not more. It would be annoying to set default flags that can conflict with others, or which are not (yet) available on architectures that we normally do not test. Niels I echo the same. But, just for educational purposes it would be good to know what kind of attack(s) [buffer/heap overflows] GlusterFS is vulnerable to as of now, and probably fix them if possible (Coverity does the job for us to some extent, correct?). Are there any tools for this out in the open? ~kaushal On Tue, Mar 31, 2015 at 11:49 AM, Atin Mukherjee amukh...@redhat.com wrote: Folks, There are some projects which use compiler/glibc features to strengthen their security claims. Popular distros suggest hardening daemons with RELRO/PIE flags. You could see [1] [2] [3] Partial relro is when you have -Wl,-z,relro in the LDFLAGS for building libraries. Partial relro means that some ELF sections are reordered so that overflows in some likely sections don't affect others, and the non-PLT part of the global offset table is read-only. To get full relro, you also need to have -Wl,-z,bind_now added to LDFLAGS. 
What this does is make the Global Offset Table (GOT) and Procedure Linkage Table (PLT) read-only. This takes some time, so it's only worth it for apps that have a real possibility of being attacked. This would be setuid/setgid/setcap binaries and daemons. There are some security-critical apps that can have this too. If the app parses files from an untrusted source (internet), then it might also want to have full relro. To enable PIE, you would pass -fPIE -DPIE in the CFLAGS and -pie in the LDFLAGS. What PIE does is randomize the locations of important items such as the base address of an executable and the position of libraries, heap, and stack in a process's address space. Sometimes this is called ASLR. It's designed to make buffer/heap overflow and return-into-libc attacks much harder. Part of the way it does this is to make a new section in the ELF image that is writable, to redirect function calls to the correct addresses (offsets). This has to be writable because each invocation will have a different layout and needs to be fixed up. So, when you have an application with PIE, you want full relro so that these sections become read-only and are not part of an attacker's target areas. I would like to hear from the community whether we should introduce these hardening flags in glusterfs as well. [1] https://fedorahosted.org/fesco/ticket/563 [2] https://wiki.debian.org/Hardening [3] https://wiki.ubuntu.com/Security/Features#relro -- ~Atin
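As a reference, the flag combinations Atin describes would look roughly like this in a build configuration. This is a sketch of common distro practice, not flags the GlusterFS build currently sets, and the binary path in the comments is only an example.

```shell
# Partial RELRO: reorder some ELF sections and make the non-PLT part of
# the GOT read-only after relocation.
LDFLAGS="-Wl,-z,relro"

# Full RELRO: additionally resolve all symbols at startup (bind_now), so
# the entire GOT/PLT can be mapped read-only. Costs some startup time.
LDFLAGS="-Wl,-z,relro -Wl,-z,bind_now"

# PIE: build a position-independent executable, so ASLR also randomizes
# the executable's own base address.
CFLAGS="-fPIE -DPIE"
LDFLAGS="$LDFLAGS -pie"

# Inspecting a built binary (path is hypothetical):
#   readelf -l /usr/sbin/glusterd | grep GNU_RELRO   # relro present?
#   readelf -d /usr/sbin/glusterd | grep BIND_NOW    # full relro?
#   readelf -h /usr/sbin/glusterd | grep 'Type:'     # DYN indicates PIE
```

Note the interaction described above: -pie alone leaves the fixup tables writable, which is why full relro is usually paired with it.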
Re: [Gluster-devel] crypt xlator bug
On 04/02/2015 07:27 PM, Raghavendra Bhat wrote: On Thursday 02 April 2015 05:50 PM, Jeff Darcy wrote: I think the crypt xlator should do a mem_put of local after doing STACK_UNWIND, like other xlators which also use mem_get for local (such as AFR). I suspect crypt not doing mem_put might be the reason for the bug mentioned. My understanding was that mem_put should be called automatically from FRAME_DESTROY, which is itself called from STACK_DESTROY when the fop completes (e.g. at FUSE or GFAPI). On the other hand, I see that AFR and others call mem_put themselves, without zeroing the local pointer. In my (possibly no longer relevant) experience, freeing local myself without zeroing the pointer would lead to a double free, and I don't see why that's not the case here. What am I missing? As per my understanding, the xlators which get local by mem_get should do the below things in the callback function just before unwinding: 1) save the frame->local pointer (i.e. local = frame->local); 2) STACK_UNWIND 3) mem_put (local) After STACK_UNWIND and before mem_put, any reference to an fd or inode or dict that might be present in the local should be unrefed (also any allocated resources present in the local should be freed). So mem_put is done last. To avoid a double free in FRAME_DESTROY, frame->local is set to NULL before doing STACK_UNWIND. I suspect not doing one of the above three operations (maybe either the 1st or the 3rd) in the crypt xlator might be the reason for the bug. I still don't understand why http://review.gluster.org/10109 is working. Does anyone know the reason? How are you guys re-creating the crash? I ran crypt.t but got no crashes on my laptop. Could someone help me re-create this issue. 
Pranith Regards, Raghavendra Bhat
Re: [Gluster-devel] [Gluster-users] replace-brick command modification
Hi all, Thank you for your thoughts. force should be present in the command; I will keep it as commit force. The replace-brick command will be gluster volume replace-brick VOLNAME SOURCE-BRICK NEW-BRICK commit force Regards Gaurav - Original Message - From: Raghavendra Talur raghavendra.ta...@gmail.com To: Kaushal M kshlms...@gmail.com Cc: gluster-us...@gluster.org Sent: Thursday, 2 April, 2015 11:33:50 PM Subject: Re: [Gluster-users] replace-brick command modification On Thu, Apr 2, 2015 at 10:28 PM, Kaushal M kshlms...@gmail.com wrote: On Thu, Apr 2, 2015 at 7:20 PM, Kelvin Edmison kelvin.edmi...@alcatel-lucent.com wrote: Gaurav, I think that it is appropriate to keep the commit force options for replace-brick, just to prevent less experienced admins from self-inflicted data loss scenarios. The add-brick/remove-brick pair of operations is not an intuitive choice for admins who are trying to solve a problem with a specific brick. In this situation, admins are generally thinking 'how can I move the data from this brick to another one', and an admin that is casually surfing documentation might infer that the replace-brick operation is the correct one, rather than a sequence of commands that are somehow magically related. I believe that keeping the mandatory commit force options for replace-brick will help give these admins reason to pause and re-consider whether this is the right command for them, and prevent cases where new gluster admins start shouting 'gluster lost my data'. Regards, Kelvin On 04/02/2015 07:26 AM, Gaurav Garg wrote: Hi all, Since GlusterFS version 3.6.0, the gluster volume replace-brick VOLNAME SOURCE-BRICK NEW-BRICK {start [force]|pause|abort|status|commit } command has been deprecated. Only the gluster volume replace-brick VOLNAME SOURCE-BRICK NEW-BRICK commit force command is supported. 
For bug https://bugzilla.redhat.com/show_bug.cgi?id=1094119 , patch http://review.gluster.org/#/c/10101/ is removing the cli/glusterd code for the gluster volume replace-brick VOLNAME BRICK NEW-BRICK {start [force]|pause|abort|status|commit } command, so only the commit force option is supported for the replace-brick command. Should we have a new command gluster volume replace-brick VOLNAME SOURCE-BRICK NEW-BRICK instead of the gluster volume replace-brick VOLNAME SOURCE-BRICK NEW-BRICK commit force command? Thanks Regards Gaurav Garg ___ Gluster-users mailing list gluster-us...@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users AFAIK, it was never the plan to remove 'replace-brick commit force'. The plan was always to retain it while removing the unsupported and unneeded options, ie 'replace-brick (start|pause|abort|status)'. Gaurav, your change is attempting to do the correct thing already and needs no changes (other than any that arise via the review process). I agree with Kelvin and Kaushal. We should retain commit force; force brings the implicit meaning that I fully understand what I am asking to be done is not the norm, but do proceed and I hold myself responsible for anything bad that happens. ~kaushal -- Raghavendra Talur
Re: [Gluster-devel] Coredump in master :/
On 2 Apr 2015, at 14:42, Jeff Darcy jda...@redhat.com wrote: Is it ok to put slave23.cloud.gluster.org into general rotation, so it runs regression jobs along with the rest? Sounds OK to me. Do we have a place to store the core tarball, just in case we decide we need to go back to it some day? Yep. They're now here: http://ded.ninja/gluster/slave23.cloud.gluster.org/ Should be safe for a couple of months at least. In theory. ;) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift
Re: [Gluster-devel] Adventures in building GlusterFS
The regression builds seem to be running again at the moment without removing -Werror. So I'm not sure if this needs adjusting any more? Yeah, I'm not sure why either. Maybe a difference in compiler versions? In any case, if it's working now, I'd say let's not mess with it. I can add the extra flags in my own builds easily enough.
Re: [Gluster-devel] Security hardening RELRO PIE flags
On 2 Apr 2015, at 14:08, Niels de Vos nde...@redhat.com wrote: On Thu, Apr 02, 2015 at 01:21:57PM +0100, Justin Clift wrote: On 31 Mar 2015, at 08:15, Niels de Vos nde...@redhat.com wrote: On Tue, Mar 31, 2015 at 12:20:19PM +0530, Kaushal M wrote: IMHO, doing hardening and security should be left to the individual distributions and the package maintainers. Generally, each distribution has its own policies with regards to hardening and security. We as an upstream project cannot decide on what a distribution should do. But we should be ready to fix bugs that could arise when distributions do hardened builds. So, I vote against having these hardening flags added to the base GlusterFS build. But we could add the flags to the Fedora spec files which we carry with our source. Indeed, I agree that the compiler flags should be specified by the distributions. At least Fedora and Debian already do this and include (probably different) options within their packaging scripts. We should set the flags we need, but not more. It would be annoying to set default flags that can conflict with others, or which are not (yet) available on architectures that we normally do not test. First thoughts: :) * We provide our own packaging scripts + distribute RPMs/debs from our own site too. Should we investigate/try these flags out for the packages we build + supply? At least for the RPMs, we try to follow the Fedora guidelines and their standard flags. With recent Fedora releases this includes additional hardening flags. * Are there changes in our code + debugging practices that would be needed for these security hardening flags to work? If there are, and we don't make these changes ourselves, doesn't that mean we're telling distributions they need to carry their own patch set in order to have a more secure GlusterFS? We have received several patches from the Debian maintainer that improve the handling of these options. 
When maintainers for distributions build GlusterFS and require changes, they either file bugs and/or send patches. I think this works quite well. Thanks Niels. Sounds like we're already in good shape then. :) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift
Re: [Gluster-devel] Coredump in master :/
Is it ok to put slave23.cloud.gluster.org into general rotation, so it runs regression jobs along with the rest? Sounds OK to me. Do we have a place to store the core tarball, just in case we decide we need to go back to it some day?
Re: [Gluster-devel] crypt xlator bug
On Thursday 02 April 2015 05:50 PM, Jeff Darcy wrote: I think the crypt xlator should do a mem_put of local after doing STACK_UNWIND, like other xlators which also use mem_get for local (such as AFR). I suspect crypt not doing mem_put might be the reason for the bug mentioned. My understanding was that mem_put should be called automatically from FRAME_DESTROY, which is itself called from STACK_DESTROY when the fop completes (e.g. at FUSE or GFAPI). On the other hand, I see that AFR and others call mem_put themselves, without zeroing the local pointer. In my (possibly no longer relevant) experience, freeing local myself without zeroing the pointer would lead to a double free, and I don't see why that's not the case here. What am I missing? As per my understanding, the xlators which get local by mem_get should do the below things in the callback function just before unwinding: 1) save the frame->local pointer (i.e. local = frame->local); 2) STACK_UNWIND 3) mem_put (local) After STACK_UNWIND and before mem_put, any reference to an fd or inode or dict that might be present in the local should be unrefed (also any allocated resources present in the local should be freed). So mem_put is done last. To avoid a double free in FRAME_DESTROY, frame->local is set to NULL before doing STACK_UNWIND. I suspect not doing one of the above three operations (maybe either the 1st or the 3rd) in the crypt xlator might be the reason for the bug. Regards, Raghavendra Bhat