Re: [Gluster-devel] crypt xlator bug

2015-04-03 Thread Emmanuel Dreyfus
On Fri, Apr 03, 2015 at 11:26:40AM +0530, Pranith Kumar Karampuri wrote:
> I still don't understand why http://review.gluster.org/10109 is working.
> Does anyone know the reason? How are you guys re-creating the crash? I ran
> crypt.t but no crashes on my laptop. Could some one help me re-create this
> issue.

The problem is 100% reproductible on NetBSD. Give it a try on nbslave75
which is disabled in jenkins right now.

-- 
Emmanuel Dreyfus
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] crypt xlator bug

2015-04-02 Thread Pranith Kumar Karampuri


On 04/01/2015 03:27 PM, Emmanuel Dreyfus wrote:

Hi

crypt.t was recently broken in NetBSD regression. The glusterfs returns
a node with file type invalid to FUSE, and that breaks the test.

After running a git bisect, I found the offending commit after which
this behavior appeared:
 8a2e2b88fc21dc7879f838d18cd0413dd88023b7
 mem-pool: invalidate memory on GF_FREE to aid debugging

This means the bug has always been there, but this debugging aid
caused it to be reliable.

With the help of an assertion, I can detect when inode->ia_type gets
a corrupted value. It gives me this backtrace where in frame 4,
inode = 0xb9611880 and inode->ia_type = 12475 (which is wrong).
inode value comes from FUSE state->loc->inode and we get it from
frame 20 which is in crypt.c:

#4  0xb9bd2adf in mdc_inode_iatt_get (this=0xbb1df030,
 inode=0xb9611880, iatt=0xbf7fdfa0) at md-cache.c:471
#5  0xb9bd34e1 in mdc_lookup (frame=0xb9aa82b0, this=0xbb1df030,
 loc=0xb9608840, xdata=0x0) at md-cache.c:847
#6  0xb9bc216e in io_stats_lookup (frame=0xb9aa8200, this=0xbb1e0030,
 loc=0xb9608840, xdata=0x0) at io-stats.c:1934
#7  0xbb76755f in default_lookup (frame=0xb9aa8200, this=0xbb1d0030,
 loc=0xb9608840, xdata=0x0) at defaults.c:2138
#8  0xb9ba69cd in meta_lookup (frame=0xb9aa8200, this=0xbb1d0030,
 loc=0xb9608840, xdata=0x0) at meta.c:49
#9  0xbb277365 in fuse_lookup_resume (state=0xb9608830) at fuse-bridge.c:607
#10 0xbb276e07 in fuse_fop_resume (state=0xb9608830) at fuse-bridge.c:569
#11 0xbb274969 in fuse_resolve_done (state=0xb9608830) at fuse-resolve.c:644
#12 0xbb274a29 in fuse_resolve_all (state=0xb9608830) at fuse-resolve.c:671
#13 0xbb274941 in fuse_resolve (state=0xb9608830) at fuse-resolve.c:635
#14 0xbb274a06 in fuse_resolve_all (state=0xb9608830) at fuse-resolve.c:667
#15 0xbb274a8e in fuse_resolve_continue (state=0xb9608830) at fuse-resolve.c:687
#16 0xbb2731f4 in fuse_resolve_entry_cbk (frame=0xb9609688,
 cookie=0xb96140a0, this=0xbb193030, op_ret=0, op_errno=0,
 inode=0xb9611880, buf=0xb961e558, xattr=0xbb18a1a0,
 postparent=0xb961e628) at fuse-resolve.c:81
#17 0xb9bbd0c1 in io_stats_lookup_cbk (frame=0xb96140a0,
 cookie=0xb9614150, this=0xbb1e0030, op_ret=0, op_errno=0,
 inode=0xb9611880, buf=0xb961e558, xdata=0xbb18a1a0,
 postparent=0xb961e628) at io-stats.c:1512
#18 0xb9bd33ff in mdc_lookup_cbk (frame=0xb9614150, cookie=0xb9614410,
 this=0xbb1df030, op_ret=0, op_errno=0,
 inode=0xb9611880, stbuf=0xb961e558, dict=0xbb18a1a0,
  postparent=0xb961e628) at md-cache.c:816
#19 0xb9be2b10 in ioc_lookup_cbk (frame=0xb9614410, cookie=0xb96144c0,
 this=0xbb1de030, op_ret=0, op_errno=0,
 inode=0xb9611880, stbuf=0xb961e558, xdata=0xbb18a1a0,
 postparent=0xb961e628) at io-cache.c:260
#20 0xbb227fb5 in load_file_size (frame=0xb96144c0, cookie=0xb9aa8200,
 this=0xbb1db030, op_ret=0, op_errno=0,
 dict=0xbb18a470, xdata=0x0) at crypt.c:3830

In frame 20:
 case GF_FOP_LOOKUP:
STACK_UNWIND_STRICT(lookup,
frame,
op_ret,
op_errno,
op_ret >= 0 ? local->inode : NULL,
op_ret >= 0 ? &local->buf : NULL,
local->xdata,
op_ret >= 0 &local->postbuf : NULL);
  
Here is the problem, local->inode is not the 0xb9611880 value anymore,

which means local got corrupted:

(gdb) print local->inode
$2 = (inode_t *) 0x1db030de

I now suspect local has been freed, but I do not find where in crypt.c
this operation is done. There is a local = mem_get0(this->local_pool)
in crypt_alloc_local, but where is that structure freed? There is
no mem_put() call in crypt xlator.
I joined this thread after seeing raghavendra talur's patch which fixed 
the issue, which seemed extremely odd to me. Just checked this mail from 
you and
local->inode in crypt need not be same as state->loc->inode because, 
inode_link in fuse_resolve_entry_cbk will give address of already linked 
inode with same gfid if one exists. I see hardlink related commands in 
crypt.t so this could be part of looking up extra link may be? which is 
resolving to older inode that is already linked. It is still some memory 
problem, but may not be anything to do with crypt. Could you let me know 
the details of the setup where you saw this issue? I can take a look.


Pranith





___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] crypt xlator bug

2015-04-02 Thread Pranith Kumar Karampuri


On 04/02/2015 07:27 PM, Raghavendra Bhat wrote:

On Thursday 02 April 2015 05:50 PM, Jeff Darcy wrote:
I think, crypt xlator should do a mem_put of local after doing 
STACK_UNWIND

like other xlators which also use mem_get for local (such as AFR). I am
suspecting crypt not doing mem_put might be the reason for the bug
mentioned.

My understanding was that mem_put should be called automatically from
FRAME_DESTROY, which is itself called from STACK_DESTROY when the fop
completes (e.g. at FUSE or GFAPI).  On the other hand, I see that AFR
and others call mem_put themselves, without zeroing the local pointer.
In my (possibly no longer relevant) experience, freeing local myself
without zeroing the pointer would lead to a double free, and I don't
see why that's not the case here.  What am I missing?


As per my understanding, the xlators which get local by mem_get should 
be doing below things in callback funtion  just before unwinding:


1) save frame->local pointer (i.e. local = frame->local);
2) STACK_UNWIND
3) mem_put (local)

After STACK_UNWIND and before mem_put any reference to fd or inode or 
dict that might be present in the local should be unrefed (also any 
allocated resources that are present in local should be freed). So 
mem_put is done at last. To avoid double free in FRAME_DESTROY, 
frame->local is set to NULL before doing STACK_UNWIND.


I suspect not doing 1 of the above three operations (may be either 1st 
or 3rd) in crypt xlator might be the reason for the bug.
I still don't understand why http://review.gluster.org/10109 is working. 
Does anyone know the reason? How are you guys re-creating the crash? I 
ran crypt.t but no crashes on my laptop. Could some one help me 
re-create this issue.


Pranith


Regards,
Raghavendra Bhat


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] crypt xlator bug

2015-04-02 Thread Raghavendra Bhat

On Thursday 02 April 2015 05:50 PM, Jeff Darcy wrote:

I think, crypt xlator should do a mem_put of local after doing STACK_UNWIND
like other xlators which also use mem_get for local (such as AFR). I am
suspecting crypt not doing mem_put might be the reason for the bug
mentioned.

My understanding was that mem_put should be called automatically from
FRAME_DESTROY, which is itself called from STACK_DESTROY when the fop
completes (e.g. at FUSE or GFAPI).  On the other hand, I see that AFR
and others call mem_put themselves, without zeroing the local pointer.
In my (possibly no longer relevant) experience, freeing local myself
without zeroing the pointer would lead to a double free, and I don't
see why that's not the case here.  What am I missing?


As per my understanding, the xlators which get local by mem_get should 
be doing below things in callback funtion  just before unwinding:


1) save frame->local pointer (i.e. local = frame->local);
2) STACK_UNWIND
3) mem_put (local)

After STACK_UNWIND and before mem_put any reference to fd or inode or 
dict that might be present in the local should be unrefed (also any 
allocated resources that are present in local should be freed). So 
mem_put is done at last. To avoid double free in FRAME_DESTROY, 
frame->local is set to NULL before doing STACK_UNWIND.


I suspect not doing 1 of the above three operations (may be either 1st 
or 3rd) in crypt xlator might be the reason for the bug.


Regards,
Raghavendra Bhat


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] crypt xlator bug

2015-04-02 Thread Jeff Darcy
> I think, crypt xlator should do a mem_put of local after doing STACK_UNWIND
> like other xlators which also use mem_get for local (such as AFR). I am
> suspecting crypt not doing mem_put might be the reason for the bug
> mentioned.

My understanding was that mem_put should be called automatically from
FRAME_DESTROY, which is itself called from STACK_DESTROY when the fop
completes (e.g. at FUSE or GFAPI).  On the other hand, I see that AFR
and others call mem_put themselves, without zeroing the local pointer.
In my (possibly no longer relevant) experience, freeing local myself
without zeroing the pointer would lead to a double free, and I don't
see why that's not the case here.  What am I missing?
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] crypt xlator bug

2015-04-02 Thread Niels de Vos
On Thu, Apr 02, 2015 at 01:58:39PM +0530, Raghavendra Bhat wrote:
> On Thursday 02 April 2015 01:00 PM, Pranith Kumar Karampuri wrote:
> >
> >On 04/02/2015 12:27 AM, Raghavendra Talur wrote:
> >>
> >>
> >>On Wed, Apr 1, 2015 at 10:34 PM, Justin Clift  >>> wrote:
> >>
> >>On 1 Apr 2015, at 10:57, Emmanuel Dreyfus  >>> wrote:
> >>> Hi
> >>>
> >>> crypt.t was recently broken in NetBSD regression. The glusterfs
> >>returns
> >>> a node with file type invalid to FUSE, and that breaks the test.
> >>>
> >>> After running a git bisect, I found the offending commit after
> >>which
> >>> this behavior appeared:
> >>>8a2e2b88fc21dc7879f838d18cd0413dd88023b7
> >>>mem-pool: invalidate memory on GF_FREE to aid debugging
> >>>
> >>> This means the bug has always been there, but this debugging aid
> >>> caused it to be reliable.
> >>
> >>Sounds like that commit is a good win then. :)
> >>
> >>Harsha/Pranith/Lala, your names are on the git blame for crypt.c...
> >>any ideas? :)
> >>
> >>
> >>I found one issue that local is not allocated using GF_CALLOC and with a
> >>mem-type.
> >>This is a patch which *might* fix it.
> >>
> >>diff --git a/xlators/encryption/crypt/src/crypt-mem-types.h
> >>b/xlators/encryption/crypt/src/crypt-mem-types.h
> >>index 2eab921..c417b67 100644
> >>--- a/xlators/encryption/crypt/src/crypt-mem-types.h
> >>+++ b/xlators/encryption/crypt/src/crypt-mem-types.h
> >>@@ -24,6 +24,7 @@ enum gf_crypt_mem_types_ {
> >>gf_crypt_mt_key,
> >>gf_crypt_mt_iovec,
> >>gf_crypt_mt_char,
> >>+gf_crypt_mt_local,
> >>gf_crypt_mt_end,
> >> };
> >>diff --git a/xlators/encryption/crypt/src/crypt.c
> >>b/xlators/encryption/crypt/src/crypt.c
> >>index ae8cdb2..63c0977 100644
> >>--- a/xlators/encryption/crypt/src/crypt.c
> >>+++ b/xlators/encryption/crypt/src/crypt.c
> >>@@ -48,7 +48,7 @@ static crypt_local_t *crypt_alloc_local(call_frame_t
> >>*frame, xlator_t *this,
> >> {
> >>crypt_local_t *local = NULL;
> >>-   local = mem_get0(this->local_pool);
> >>+local = GF_CALLOC (sizeof (*local), 1, gf_crypt_mt_local);
> >local is using the memory from pool earlier(i.e. with mem_get0()). Which
> >seems ok to me. Changing it this way will include memory allocation in fop
> >I/O path which is why xlators generally use the mem-pool approach.
> >
> >Pranith
> 
> I think, crypt xlator should do a mem_put of local after doing STACK_UNWIND
> like other xlators which also use mem_get for local (such as AFR). I am
> suspecting crypt not doing mem_put might be the reason for the bug
> mentioned.

I've looked at this now too. The use of mem_get0() seems fine to me. But
indeed, calling mem_put() is missing. Whenever the crypt_local_t should
be released, mem_put() should get called, just like any
GF_CALLOC/GF_FREE combinations.

HTH,
Niels

> 
> Regards,
> Raghavendra Bat
> 
> >>if (!local) {
> >>gf_log(this->name, GF_LOG_ERROR, "out of memory");
> >>return NULL;
> >>
> >>
> >>Niels should be able to recognize if this is sufficient fix or not.
> >>
> >>Thanks,
> >>Raghavendra Talur
> >>
> >>+ Justin
> >>
> >>--
> >>GlusterFS - http://www.gluster.org
> >>
> >>An open source, distributed file system scaling to several
> >>petabytes, and handling thousands of clients.
> >>
> >>My personal twitter: twitter.com/realjustinclift
> >>
> >>
> >>___
> >>Gluster-devel mailing list
> >>Gluster-devel@gluster.org 
> >>http://www.gluster.org/mailman/listinfo/gluster-devel
> >>
> >>
> >>
> >>
> >>-- 
> >>*Raghavendra Talur *
> >>
> >
> >
> >
> >___
> >Gluster-devel mailing list
> >Gluster-devel@gluster.org
> >http://www.gluster.org/mailman/listinfo/gluster-devel
> 

> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel



pgp8J6DUYue6M.pgp
Description: PGP signature
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] crypt xlator bug

2015-04-02 Thread Raghavendra Bhat

On Thursday 02 April 2015 01:00 PM, Pranith Kumar Karampuri wrote:


On 04/02/2015 12:27 AM, Raghavendra Talur wrote:



On Wed, Apr 1, 2015 at 10:34 PM, Justin Clift > wrote:


On 1 Apr 2015, at 10:57, Emmanuel Dreyfus mailto:m...@netbsd.org>> wrote:
> Hi
>
> crypt.t was recently broken in NetBSD regression. The glusterfs
returns
> a node with file type invalid to FUSE, and that breaks the test.
>
> After running a git bisect, I found the offending commit after
which
> this behavior appeared:
>8a2e2b88fc21dc7879f838d18cd0413dd88023b7
>mem-pool: invalidate memory on GF_FREE to aid debugging
>
> This means the bug has always been there, but this debugging aid
> caused it to be reliable.

Sounds like that commit is a good win then. :)

Harsha/Pranith/Lala, your names are on the git blame for crypt.c...
any ideas? :)


I found one issue that local is not allocated using GF_CALLOC and 
with a mem-type.

This is a patch which *might* fix it.

diff --git a/xlators/encryption/crypt/src/crypt-mem-types.h 
b/xlators/encryption/crypt/src/crypt-mem-types.h

index 2eab921..c417b67 100644
--- a/xlators/encryption/crypt/src/crypt-mem-types.h
+++ b/xlators/encryption/crypt/src/crypt-mem-types.h
@@ -24,6 +24,7 @@ enum gf_crypt_mem_types_ {
gf_crypt_mt_key,
gf_crypt_mt_iovec,
gf_crypt_mt_char,
+gf_crypt_mt_local,
gf_crypt_mt_end,
 };
diff --git a/xlators/encryption/crypt/src/crypt.c 
b/xlators/encryption/crypt/src/crypt.c

index ae8cdb2..63c0977 100644
--- a/xlators/encryption/crypt/src/crypt.c
+++ b/xlators/encryption/crypt/src/crypt.c
@@ -48,7 +48,7 @@ static crypt_local_t 
*crypt_alloc_local(call_frame_t *frame, xlator_t *this,

 {
crypt_local_t *local = NULL;
-   local = mem_get0(this->local_pool);
+local = GF_CALLOC (sizeof (*local), 1, gf_crypt_mt_local);
local is using the memory from pool earlier(i.e. with mem_get0()). 
Which seems ok to me. Changing it this way will include memory 
allocation in fop I/O path which is why xlators generally use the 
mem-pool approach.


Pranith


I think, crypt xlator should do a mem_put of local after doing 
STACK_UNWIND like other xlators which also use mem_get for local (such 
as AFR). I am suspecting crypt not doing mem_put might be the reason for 
the bug mentioned.


Regards,
Raghavendra Bat


if (!local) {
gf_log(this->name, GF_LOG_ERROR, "out of memory");
return NULL;


Niels should be able to recognize if this is sufficient fix or not.

Thanks,
Raghavendra Talur

+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift


___
Gluster-devel mailing list
Gluster-devel@gluster.org 
http://www.gluster.org/mailman/listinfo/gluster-devel




--
*Raghavendra Talur *





___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] crypt xlator bug

2015-04-02 Thread Pranith Kumar Karampuri


On 04/02/2015 12:27 AM, Raghavendra Talur wrote:



On Wed, Apr 1, 2015 at 10:34 PM, Justin Clift > wrote:


On 1 Apr 2015, at 10:57, Emmanuel Dreyfus mailto:m...@netbsd.org>> wrote:
> Hi
>
> crypt.t was recently broken in NetBSD regression. The glusterfs
returns
> a node with file type invalid to FUSE, and that breaks the test.
>
> After running a git bisect, I found the offending commit after which
> this behavior appeared:
>8a2e2b88fc21dc7879f838d18cd0413dd88023b7
>mem-pool: invalidate memory on GF_FREE to aid debugging
>
> This means the bug has always been there, but this debugging aid
> caused it to be reliable.

Sounds like that commit is a good win then. :)

Harsha/Pranith/Lala, your names are on the git blame for crypt.c...
any ideas? :)


I found one issue that local is not allocated using GF_CALLOC and with 
a mem-type.

This is a patch which *might* fix it.

diff --git a/xlators/encryption/crypt/src/crypt-mem-types.h 
b/xlators/encryption/crypt/src/crypt-mem-types.h

index 2eab921..c417b67 100644
--- a/xlators/encryption/crypt/src/crypt-mem-types.h
+++ b/xlators/encryption/crypt/src/crypt-mem-types.h
@@ -24,6 +24,7 @@ enum gf_crypt_mem_types_ {
gf_crypt_mt_key,
gf_crypt_mt_iovec,
gf_crypt_mt_char,
+gf_crypt_mt_local,
gf_crypt_mt_end,
 };
diff --git a/xlators/encryption/crypt/src/crypt.c 
b/xlators/encryption/crypt/src/crypt.c

index ae8cdb2..63c0977 100644
--- a/xlators/encryption/crypt/src/crypt.c
+++ b/xlators/encryption/crypt/src/crypt.c
@@ -48,7 +48,7 @@ static crypt_local_t *crypt_alloc_local(call_frame_t 
*frame, xlator_t *this,

 {
crypt_local_t *local = NULL;
-   local = mem_get0(this->local_pool);
+local = GF_CALLOC (sizeof (*local), 1, gf_crypt_mt_local);
local is using the memory from pool earlier(i.e. with mem_get0()). Which 
seems ok to me. Changing it this way will include memory allocation in 
fop I/O path which is why xlators generally use the mem-pool approach.


Pranith

if (!local) {
gf_log(this->name, GF_LOG_ERROR, "out of memory");
return NULL;


Niels should be able to recognize if this is sufficient fix or not.

Thanks,
Raghavendra Talur

+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift


___
Gluster-devel mailing list
Gluster-devel@gluster.org 
http://www.gluster.org/mailman/listinfo/gluster-devel




--
*Raghavendra Talur *



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] crypt xlator bug

2015-04-01 Thread Emmanuel Dreyfus
Jeff Darcy  wrote:

> > It does. The memory corruption disapeared and the test can complete.
> 
> Interesting.  

FWIW I made 147 successful runs in a row with the patch, while it always
failed without.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] crypt xlator bug

2015-04-01 Thread Jeff Darcy
> > I found one issue that local is not allocated using GF_CALLOC and with a
> > mem-type.
> > This is a patch which *might* fix it.
> 
> It does. The memory corruption disapeared and the test can complete.

Interesting.  I suspect this means that we *are* in the case where the
previous comment came from.  Mem_get can allocate objects two ways:

* As one of many objects in a slab, tracking internally.

* As a singleton, directly via GF_*ALLOC.

In mem_put, we do some pretty nasty pointer arithmetic to figure out
which way an object was allocated.  If we get it wrong, and therefore
use the wrong *de*allocate method (either way I believe) then we'll
corrupt memory.  The symptoms so far suggest that an object was
allocated within a slab, then deallocated as a singleton (causing its
memory to be poisoned).  That sucks.
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] crypt xlator bug

2015-04-01 Thread Emmanuel Dreyfus
Raghavendra Talur  wrote:

> I found one issue that local is not allocated using GF_CALLOC and with a
> mem-type.
> This is a patch which *might* fix it.

It does. The memory corruption disapeared and the test can complete.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] crypt xlator bug

2015-04-01 Thread Jeff Darcy
> As I understand it, mem_get0 is a valid (and even more
> efficient) way to allocate such objects.  The frame cleanup
> code should recognize which method to use when deallocating.
> If that's broken, we're going to have more numerous and
> serious problems than this.  I'll look into it further.

I don't see anything obviously wrong, but I did find this
gem of a comment in mem_get:

* I am working around this by performing a regular allocation
* , just the way the caller would've done when not using the
* mem-pool. That also means, we're not padding the size with
* the list_head structure because, this will not be added to
* the list of chunks that belong to the mem-pool allocated
* initially.
*
* This is the best we can do without adding functionality for
* managing multiple slabs. That does not interest us at present
* because it is too much work knowing that a better slab
* allocator is coming RSN.

Now I'm curious to find out what effect your change will have,
but I suspect we'll still be a while figuring this out.
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] crypt xlator bug

2015-04-01 Thread Jeff Darcy
> - local = mem_get0(this->local_pool);
> + local = GF_CALLOC (sizeof (*local), 1, gf_crypt_mt_local);

As I understand it, mem_get0 is a valid (and even more
efficient) way to allocate such objects.  The frame cleanup
code should recognize which method to use when deallocating.
If that's broken, we're going to have more numerous and
serious problems than this.  I'll look into it further.

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] crypt xlator bug

2015-04-01 Thread Raghavendra Talur
On Wed, Apr 1, 2015 at 10:34 PM, Justin Clift  wrote:

> On 1 Apr 2015, at 10:57, Emmanuel Dreyfus  wrote:
> > Hi
> >
> > crypt.t was recently broken in NetBSD regression. The glusterfs returns
> > a node with file type invalid to FUSE, and that breaks the test.
> >
> > After running a git bisect, I found the offending commit after which
> > this behavior appeared:
> >8a2e2b88fc21dc7879f838d18cd0413dd88023b7
> >mem-pool: invalidate memory on GF_FREE to aid debugging
> >
> > This means the bug has always been there, but this debugging aid
> > caused it to be reliable.
>
> Sounds like that commit is a good win then. :)
>
> Harsha/Pranith/Lala, your names are on the git blame for crypt.c...
> any ideas? :)
>
>
I found one issue that local is not allocated using GF_CALLOC and with a
mem-type.
This is a patch which *might* fix it.

diff --git a/xlators/encryption/crypt/src/crypt-mem-types.h
b/xlators/encryption/crypt/src/crypt-mem-types.h
index 2eab921..c417b67 100644
--- a/xlators/encryption/crypt/src/crypt-mem-types.h
+++ b/xlators/encryption/crypt/src/crypt-mem-types.h
@@ -24,6 +24,7 @@ enum gf_crypt_mem_types_ {
gf_crypt_mt_key,
gf_crypt_mt_iovec,
gf_crypt_mt_char,
+gf_crypt_mt_local,
gf_crypt_mt_end,
 };

diff --git a/xlators/encryption/crypt/src/crypt.c
b/xlators/encryption/crypt/src/crypt.c
index ae8cdb2..63c0977 100644
--- a/xlators/encryption/crypt/src/crypt.c
+++ b/xlators/encryption/crypt/src/crypt.c
@@ -48,7 +48,7 @@ static crypt_local_t *crypt_alloc_local(call_frame_t
*frame, xlator_t *this,
 {
crypt_local_t *local = NULL;

-   local = mem_get0(this->local_pool);
+local = GF_CALLOC (sizeof (*local), 1, gf_crypt_mt_local);
if (!local) {
gf_log(this->name, GF_LOG_ERROR, "out of memory");
return NULL;


Niels should be able to recognize if this is sufficient fix or not.

Thanks,
Raghavendra Talur


> + Justin
>
> --
> GlusterFS - http://www.gluster.org
>
> An open source, distributed file system scaling to several
> petabytes, and handling thousands of clients.
>
> My personal twitter: twitter.com/realjustinclift
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>



-- 
*Raghavendra Talur *
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] crypt xlator bug

2015-04-01 Thread Justin Clift
On 1 Apr 2015, at 10:57, Emmanuel Dreyfus  wrote:
> Hi
> 
> crypt.t was recently broken in NetBSD regression. The glusterfs returns
> a node with file type invalid to FUSE, and that breaks the test.
> 
> After running a git bisect, I found the offending commit after which 
> this behavior appeared:
>8a2e2b88fc21dc7879f838d18cd0413dd88023b7
>mem-pool: invalidate memory on GF_FREE to aid debugging
> 
> This means the bug has always been there, but this debugging aid 
> caused it to be reliable.

Sounds like that commit is a good win then. :)

Harsha/Pranith/Lala, your names are on the git blame for crypt.c...
any ideas? :)

+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel