Re: [Gluster-users] Community Meeting, Sept 26 15:00 UTC in #gluster-meeting

2018-09-25 Thread Amar Tumballi
Updated the sheet with the details below; happy to hear more from our users:
-

   - Date: 2018-09-26
   - Host: Amar Tumballi
   - Location: #gluster-meeting on Freenode (webchat)
   - Time:
  - 15:00 UTC
  - or in your local shell/terminal: date -d "15:00 UTC"

Topics of Discussion

   - How can we do ‘triaging’ better (in both Bugzilla and GitHub issues)?
   - Release 5.0, and the focus on stability
   - What is the best way to update the community on progress across
   different projects?
  - Should we accept a project only if there is at least a monthly
  update on the project to the mailing list?

> This will be in #gluster-meeting on freenode, and our agenda (which is
> always editable!) is at https://bit.ly/gluster-community-meetings.

Update the above sheet if you want something to be discussed!

-- 
Amar Tumballi (amarts)

[Gluster-users] Community Meeting, Sept 26 15:00 UTC in #gluster-meeting

2018-09-25 Thread Amye Scavarda
Hey there!
We've got a community meeting coming up tomorrow at 15:00 UTC.
This will be in #gluster-meeting on freenode, and our agenda (which is
always editable!) is at https://bit.ly/gluster-community-meetings.

- amye

-- 
Amye Scavarda | a...@redhat.com | Gluster Community Lead

Re: [Gluster-users] [Gluster-devel] Crash in glusterfs!!!

2018-09-25 Thread ABHISHEK PALIWAL
Hi Pranith,

I have some questions if you can answer them:


- What in the LIBC exit() routine has resulted in SIGSEGV in this case?

- Why does the call trace always point to LIBC exit() in all these crash
instances on gluster?

- Can there be any connection between the LIBC exit() crash and SIGTERM
handling at the early start of gluster?



Regards,

Abhishek

On Tue, Sep 25, 2018 at 2:27 PM Pranith Kumar Karampuri <pkara...@redhat.com>
wrote:

>
>
> On Tue, Sep 25, 2018 at 2:17 PM ABHISHEK PALIWAL <abhishpali...@gmail.com>
> wrote:
>
>> I don't have the steps to reproduce, but it's a race condition: it seems
>> cleanup_and_exit() is accessing data structures which are not yet
>> initialised (as gluster is in its starting phase) because SIGTERM/SIGINT
>> is sent in between.
>>
>
> But the crash happened inside the exit() code, which is in libc and
> doesn't access any data structures in glusterfs.
>
>
>>
>> Regards,
>> Abhishek
>>
>> On Mon, Sep 24, 2018 at 9:11 PM Pranith Kumar Karampuri <
>> pkara...@redhat.com> wrote:
>>
>>>
>>>
>>> On Mon, Sep 24, 2018 at 5:16 PM ABHISHEK PALIWAL <
>>> abhishpali...@gmail.com> wrote:
>>>
 Hi Pranith,

 As we know, this problem is getting triggered at startup of the glusterd
 process when it receives SIGTERM.

 I think there is a problem in the glusterfs code: if someone sends
 SIGTERM at startup, the exit handler should not crash; instead it should
 exit with some information.

 Could you please let me know the possibility of fixing it from the
 glusterfs side?

>>>
>>> I am not as confident as you about the RC you provided. If you could
>>> give the steps to re-create, I will be happy to confirm that the RC is
>>> correct and then I will send out the fix.
>>>
>>>

 Regards,
 Abhishek

 On Mon, Sep 24, 2018 at 3:12 PM Pranith Kumar Karampuri <
 pkara...@redhat.com> wrote:

>
>
> On Mon, Sep 24, 2018 at 2:09 PM ABHISHEK PALIWAL <
> abhishpali...@gmail.com> wrote:
>
>> Could you please let me know about the bug in libc which you are
>> talking about.
>>
>
> No, I mean, if you give the steps to reproduce, we will be able to
> pinpoint whether the issue is with libc or glusterfs.
>
>
>>
>> On Mon, Sep 24, 2018 at 2:01 PM Pranith Kumar Karampuri <
>> pkara...@redhat.com> wrote:
>>
>>>
>>>
>>> On Mon, Sep 24, 2018 at 1:57 PM ABHISHEK PALIWAL <
>>> abhishpali...@gmail.com> wrote:
>>>
 If you look at the source code in cleanup_and_exit(), we are getting the
 SIGSEGV crash when 'exit(0)' is triggered.

>>>
>>> Yes, that is what I was mentioning earlier. It is crashing in libc. So
>>> either there is a bug in libc (glusterfs actually found one bug in libc
>>> so far, so I wouldn't rule out that possibility) or there is something
>>> happening in glusterfs which is leading to the problem.
>>> Valgrind/address-sanitizer would help find where the problem could be in
>>> some cases, so before reaching out to libc developers, it is better to
>>> figure out where the problem is. Do you have steps to recreate it?
>>>
>>>

 On Mon, Sep 24, 2018 at 1:41 PM Pranith Kumar Karampuri <
 pkara...@redhat.com> wrote:

>
>
> On Mon, Sep 24, 2018 at 1:36 PM ABHISHEK PALIWAL <
> abhishpali...@gmail.com> wrote:
>
>> Hi Sanju,
>>
>> Do you have any update on this?
>>
>
> This seems to happen while the process is dying, in libc. I am not
> completely sure if there is anything glusterfs is contributing to it from
> the bt at the moment. Do you have any steps to re-create this problem? It
> is probably better to run the steps with valgrind/address-sanitizer and
> see if it points to the problem in glusterfs.
>
>
>>
>> Regards,
>> Abhishek
>>
>> On Fri, Sep 21, 2018 at 4:07 PM ABHISHEK PALIWAL <
>> abhishpali...@gmail.com> wrote:
>>
>>> Hi Sanju,
>>>
>>> Output of 't a a bt full'
>>>
>>> (gdb) t a a bt full
>>>
>>>
>>>
>>> Thread 7 (LWP 1743):
>>>
>>> #0  0x3fffa3ea7e88 in __lll_lock_wait (futex=0x0, private=0)
>>> at lowlevellock.c:43
>>>
>>> r4 = 128
>>>
>>> r7 = 0
>>>
>>> arg2 = 128
>>>
>>> r5 = 2
>>>
>>> r8 = 1
>>>
>>> r0 = 221
>>>
>>> r3 = 0
>>>
>>> r6 = 0
>>>
>>> arg1 = 0
>>>
>>> __err = 221
>>>
>>> __ret = 0
>>>
>>> #1  0x3fffa3e9ef64 in __GI___pthread_mutex_lock
>>> (mutex=0x100272a8) at 

Re: [Gluster-users] [Gluster-devel] Crash in glusterfs!!!

2018-09-25 Thread Pranith Kumar Karampuri
On Tue, Sep 25, 2018 at 2:17 PM ABHISHEK PALIWAL <abhishpali...@gmail.com>
wrote:

> I don't have the steps to reproduce, but it's a race condition: it seems
> cleanup_and_exit() is accessing data structures which are not yet
> initialised (as gluster is in its starting phase) because SIGTERM/SIGINT
> is sent in between.
>

But the crash happened inside the exit() code, which is in libc and
doesn't access any data structures in glusterfs.
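
One note on why a crash can still surface inside exit(): libc's exit()
runs any functions registered with atexit(), as well as shared-library
destructors, so application code does execute "inside" exit(). A minimal,
hypothetical C sketch (not taken from the glusterfs sources):

/* Hypothetical sketch (not glusterfs code): exit() runs handlers
 * registered with atexit(), so application code can crash inside
 * libc's exit() if a handler touches uninitialised state. */
#include <stdlib.h>
#include <string.h>

static char *state;            /* normally set up during initialisation */

static void cleanup(void)
{
    strcpy(state, "done");     /* SIGSEGV: state is still NULL */
}

int main(void)
{
    atexit(cleanup);           /* registered before init completes */
    exit(0);                   /* libc's exit() invokes cleanup() */
}

A backtrace of such a crash points into libc (exit() and, on glibc,
__run_exit_handlers) even though the faulting code belongs to the
application.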


>
> Regards,
> Abhishek
>
> On Mon, Sep 24, 2018 at 9:11 PM Pranith Kumar Karampuri <
> pkara...@redhat.com> wrote:
>
>>
>>
>> On Mon, Sep 24, 2018 at 5:16 PM ABHISHEK PALIWAL <abhishpali...@gmail.com>
>> wrote:
>>
>>> Hi Pranith,
>>>
>>> As we know, this problem is getting triggered at startup of the glusterd
>>> process when it receives SIGTERM.
>>>
>>> I think there is a problem in the glusterfs code: if someone sends
>>> SIGTERM at startup, the exit handler should not crash; instead it should
>>> exit with some information.
>>>
>>> Could you please let me know the possibility of fixing it from the
>>> glusterfs side?
>>>
>>
>> I am not as confident as you about the RC you provided. If you could give
>> the steps to re-create, I will be happy to confirm that the RC is correct
>> and then I will send out the fix.
>>
>>
>>>
>>> Regards,
>>> Abhishek
>>>
>>> On Mon, Sep 24, 2018 at 3:12 PM Pranith Kumar Karampuri <
>>> pkara...@redhat.com> wrote:
>>>


 On Mon, Sep 24, 2018 at 2:09 PM ABHISHEK PALIWAL <
 abhishpali...@gmail.com> wrote:

> Could you please let me know about the bug in libc which you are
> talking about.
>

 No, I mean, if you give the steps to reproduce, we will be able to
 pinpoint whether the issue is with libc or glusterfs.


>
> On Mon, Sep 24, 2018 at 2:01 PM Pranith Kumar Karampuri <
> pkara...@redhat.com> wrote:
>
>>
>>
>> On Mon, Sep 24, 2018 at 1:57 PM ABHISHEK PALIWAL <
>> abhishpali...@gmail.com> wrote:
>>
>>> If you look at the source code in cleanup_and_exit(), we are getting the
>>> SIGSEGV crash when 'exit(0)' is triggered.
>>>
>>
>> Yes, that is what I was mentioning earlier. It is crashing in libc. So
>> either there is a bug in libc (glusterfs actually found one bug in libc
>> so far, so I wouldn't rule out that possibility) or there is something
>> happening in glusterfs which is leading to the problem.
>> Valgrind/address-sanitizer would help find where the problem could be in
>> some cases, so before reaching out to libc developers, it is better to
>> figure out where the problem is. Do you have steps to recreate it?
>>
>>
>>>
>>> On Mon, Sep 24, 2018 at 1:41 PM Pranith Kumar Karampuri <
>>> pkara...@redhat.com> wrote:
>>>


 On Mon, Sep 24, 2018 at 1:36 PM ABHISHEK PALIWAL <
 abhishpali...@gmail.com> wrote:

> Hi Sanju,
>
> Do you have any update on this?
>

 This seems to happen while the process is dying, in libc. I am not
 completely sure if there is anything glusterfs is contributing to it from
 the bt at the moment. Do you have any steps to re-create this problem? It
 is probably better to run the steps with valgrind/address-sanitizer and
 see if it points to the problem in glusterfs.


>
> Regards,
> Abhishek
>
> On Fri, Sep 21, 2018 at 4:07 PM ABHISHEK PALIWAL <
> abhishpali...@gmail.com> wrote:
>
>> Hi Sanju,
>>
>> Output of 't a a bt full'
>>
>> (gdb) t a a bt full
>>
>>
>>
>> Thread 7 (LWP 1743):
>>
>> #0  0x3fffa3ea7e88 in __lll_lock_wait (futex=0x0, private=0)
>> at lowlevellock.c:43
>>
>> r4 = 128
>>
>> r7 = 0
>>
>> arg2 = 128
>>
>> r5 = 2
>>
>> r8 = 1
>>
>> r0 = 221
>>
>> r3 = 0
>>
>> r6 = 0
>>
>> arg1 = 0
>>
>> __err = 221
>>
>> __ret = 0
>>
>> #1  0x3fffa3e9ef64 in __GI___pthread_mutex_lock
>> (mutex=0x100272a8) at ../nptl/pthread_mutex_lock.c:81
>>
>> __futex = 0x100272a8
>>
>> __PRETTY_FUNCTION__ = "__pthread_mutex_lock"
>>
>> type = <optimized out>
>>
>> id = <optimized out>
>>
>> #2  0x3fffa3f6ce8c in _gf_msg (domain=0x3fff98006c90
>> "c_glusterfs-client-0", file=0x3fff9fb34de0 "client.c",
>> function=0x3fff9fb34cd8 <__FUNCTION__.18849> "notify",
>>
>> line=<optimized out>, level=<optimized out>,
>> errnum=<optimized out>, trace=<optimized out>, msgid=114020,
>>
>> fmt=0x3fff9fb35350 "parent translators are ready, 

Re: [Gluster-users] [Gluster-devel] Crash in glusterfs!!!

2018-09-25 Thread ABHISHEK PALIWAL
I don't have the steps to reproduce, but it's a race condition: it seems
cleanup_and_exit() is accessing data structures which are not yet
initialised (as gluster is in its starting phase) because SIGTERM/SIGINT
is sent in between.
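
To make the suspected race concrete, here is a minimal, hypothetical C
sketch; none of the names below come from the glusterfs sources, and the
handler only mirrors the idea of cleanup_and_exit():

/* Hypothetical sketch of the suspected race (not glusterfs code):
 * the SIGTERM handler is installed before initialisation finishes,
 * so an early signal reaches the cleanup path with a NULL context.
 * Building with e.g. "cc -g -fsanitize=address race.c" makes
 * address-sanitizer report the bad access. */
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

struct ctx { char *name; };

static struct ctx *global_ctx;    /* NULL until initialisation completes */

static void on_sigterm(int sig)
{
    (void)sig;
    free(global_ctx->name);       /* SIGSEGV if init has not run yet */
    exit(0);
}

int main(void)
{
    signal(SIGTERM, on_sigterm);  /* handler installed early */

    raise(SIGTERM);               /* simulate SIGTERM arriving during startup */

    /* initialisation that would normally happen before any signal: */
    global_ctx = calloc(1, sizeof(*global_ctx));
    global_ctx->name = strdup("volume");
    pause();
}

Running a reproducer of this shape under valgrind or address-sanitizer, as
suggested earlier in the thread, points straight at the NULL dereference.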

Regards,
Abhishek

On Mon, Sep 24, 2018 at 9:11 PM Pranith Kumar Karampuri <pkara...@redhat.com>
wrote:

>
>
> On Mon, Sep 24, 2018 at 5:16 PM ABHISHEK PALIWAL <abhishpali...@gmail.com>
> wrote:
>
>> Hi Pranith,
>>
>> As we know, this problem is getting triggered at startup of the glusterd
>> process when it receives SIGTERM.
>>
>> I think there is a problem in the glusterfs code: if someone sends
>> SIGTERM at startup, the exit handler should not crash; instead it should
>> exit with some information.
>>
>> Could you please let me know the possibility of fixing it from the
>> glusterfs side?
>>
>
> I am not as confident as you about the RC you provided. If you could give
> the steps to re-create, I will be happy to confirm that the RC is correct
> and then I will send out the fix.
>
>
>>
>> Regards,
>> Abhishek
>>
>> On Mon, Sep 24, 2018 at 3:12 PM Pranith Kumar Karampuri <
>> pkara...@redhat.com> wrote:
>>
>>>
>>>
>>> On Mon, Sep 24, 2018 at 2:09 PM ABHISHEK PALIWAL <
>>> abhishpali...@gmail.com> wrote:
>>>
 Could you please let me know about the bug in libc which you are
 talking about.

>>>
>>> No, I mean, if you give the steps to reproduce, we will be able to
>>> pinpoint whether the issue is with libc or glusterfs.
>>>
>>>

 On Mon, Sep 24, 2018 at 2:01 PM Pranith Kumar Karampuri <
 pkara...@redhat.com> wrote:

>
>
> On Mon, Sep 24, 2018 at 1:57 PM ABHISHEK PALIWAL <
> abhishpali...@gmail.com> wrote:
>
>> If you look at the source code in cleanup_and_exit(), we are getting the
>> SIGSEGV crash when 'exit(0)' is triggered.
>>
>
> Yes, that is what I was mentioning earlier. It is crashing in libc. So
> either there is a bug in libc (glusterfs actually found one bug in libc
> so far, so I wouldn't rule out that possibility) or there is something
> happening in glusterfs which is leading to the problem.
> Valgrind/address-sanitizer would help find where the problem could be in
> some cases, so before reaching out to libc developers, it is better to
> figure out where the problem is. Do you have steps to recreate it?
>
>
>>
>> On Mon, Sep 24, 2018 at 1:41 PM Pranith Kumar Karampuri <
>> pkara...@redhat.com> wrote:
>>
>>>
>>>
>>> On Mon, Sep 24, 2018 at 1:36 PM ABHISHEK PALIWAL <
>>> abhishpali...@gmail.com> wrote:
>>>
 Hi Sanju,

 Do you have any update on this?

>>>
>>> This seems to happen while the process is dying, in libc. I am not
>>> completely sure if there is anything glusterfs is contributing to it from
>>> the bt at the moment. Do you have any steps to re-create this problem? It
>>> is probably better to run the steps with valgrind/address-sanitizer and
>>> see if it points to the problem in glusterfs.
>>>
>>>

 Regards,
 Abhishek

 On Fri, Sep 21, 2018 at 4:07 PM ABHISHEK PALIWAL <
 abhishpali...@gmail.com> wrote:

> Hi Sanju,
>
> Output of 't a a bt full'
>
> (gdb) t a a bt full
>
>
>
> Thread 7 (LWP 1743):
>
> #0  0x3fffa3ea7e88 in __lll_lock_wait (futex=0x0, private=0)
> at lowlevellock.c:43
>
> r4 = 128
>
> r7 = 0
>
> arg2 = 128
>
> r5 = 2
>
> r8 = 1
>
> r0 = 221
>
> r3 = 0
>
> r6 = 0
>
> arg1 = 0
>
> __err = 221
>
> __ret = 0
>
> #1  0x3fffa3e9ef64 in __GI___pthread_mutex_lock
> (mutex=0x100272a8) at ../nptl/pthread_mutex_lock.c:81
>
> __futex = 0x100272a8
>
> __PRETTY_FUNCTION__ = "__pthread_mutex_lock"
>
> type = <optimized out>
>
> id = <optimized out>
>
> #2  0x3fffa3f6ce8c in _gf_msg (domain=0x3fff98006c90
> "c_glusterfs-client-0", file=0x3fff9fb34de0 "client.c",
> function=0x3fff9fb34cd8 <__FUNCTION__.18849> "notify",
>
> line=<optimized out>, level=<optimized out>, errnum=<optimized out>,
> trace=<optimized out>, msgid=114020,
>
> fmt=0x3fff9fb35350 "parent translators are ready, attempting
> connect on transport") at logging.c:2058
>
> ret = <optimized out>
>
> msgstr = <optimized out>
>
> ap = <optimized out>
>
> this = 0x3fff980061f0
>
> ctx = 0x10027010
>
> callstr = '\000' 
>
> passcallstr = 0
>
> log_inited = 0