Re: [Gluster-devel] Update on mgmt_v3-locks.t failure in netbsd

2015-04-28 Thread Kaushal M
Posted changes for review,
https://review.gluster.org/10425
https://review.gluster.org/10426

~kaushal

On Tue, Apr 28, 2015 at 6:34 PM, Kaushal M kshlms...@gmail.com wrote:
 After a long discussion (not really), we've concluded that a 64-bit
 uint is really overkill for how we are using the generation number,
 and I'll change it to a 32-bit uint. I'll send a change to do this
 right away.

 The generation number we use is not permanent; it is valid only within
 a GlusterD process's lifetime. It starts fresh on every GlusterD
 start. We don't save it or restore it. The generation number is bumped
 for each peerinfo object allocated. Considering this, even if we were to
 do peer probes every second it'd be ages before we overflow with a
 32-bit generation number. If someone is somehow able to bombard us
 with 4 billion requests, I expect GlusterD to bork before the
 generation number ever reaches the overflow point.

 ~kaushal

 On Tue, Apr 28, 2015 at 6:13 PM, Emmanuel Dreyfus m...@netbsd.org wrote:
 On Tue, Apr 28, 2015 at 06:08:45PM +0530, Vijay Bellur wrote:
 Let us retain support for 32-bit platforms. Though as developers we do not
 run many tests on non 64-bit platforms, it would not be good to break
 compatibility for 32-bit platforms as we have many community users running
 this

 That is a point in favor of not migrating the NetBSD slave VMs running
 regression to 64-bit systems. Keeping them as is will catch any
 32-bit regression.

 --
 Emmanuel Dreyfus
 m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Update on mgmt_v3-locks.t failure in netbsd

2015-04-28 Thread Kaushal M
AFAIU we've not officially supported 32-bit architectures for some time
(possibly forever) as a community. But we had users running it
anyway.

3.7, as it currently stands, cannot run on 32-bit platforms. I've used
atomic operations, which depend on specific processor instructions, to
increment a 64-bit (uint64_t) generation number. This is not possible
on 32-bit platforms.

I'm currently debating with myself and others around me whether we can be
satisfied with using a 32-bit generation number. I used a 64-bit
generation number just to give us the widest possible range before
running out. But a 32-bit generation number should be more than enough
for normal usage.

I'll update the list once we come to a decision.

~kaushal

On Tue, Apr 28, 2015 at 5:37 PM, Justin Clift jus...@gluster.org wrote:
 Does this mean we're officially no longer supporting 32 bit architectures?

 (or is that just on x86?)

 + Justin


 On 28 Apr 2015, at 12:45, Kaushal M kshlms...@gmail.com wrote:
 Found the problem. The NetBSD slaves are running a 32-bit kernel and 
 userspace.
 ```
 nbslave7a# uname -p
 i386
 ```

 Because of this, CAA_BITS_PER_LONG is set to 32 and the case for size 8
 isn't compiled in uatomic_add_return. Even though the underlying
 (virtual) hardware has 64-bit support, and supports the required
 8-byte wide instruction, it cannot be used because we are running on a
 32-bit kernel with a 32-bit userspace.

 Manu, was there any particular reason why you use 32-bit NetBSD? If there
 is none, can you please replace the VMs with 64-bit NetBSD? Until
 then you can keep mgmt_v3-locks.t disabled.

 ~kaushal


Re: [Gluster-devel] Update on mgmt_v3-locks.t failure in netbsd

2015-04-28 Thread Joe Julian
No Raspberry Pi servers any more? 

On April 28, 2015 5:07:06 AM PDT, Justin Clift jus...@gluster.org wrote:
Does this mean we're officially no longer supporting 32 bit
architectures?

(or is that just on x86?)

+ Justin



Re: [Gluster-devel] Update on mgmt_v3-locks.t failure in netbsd

2015-04-28 Thread Atin Mukherjee


On 04/28/2015 06:08 PM, Vijay Bellur wrote:
 On 04/28/2015 05:48 PM, Kaushal M wrote:
 AFAIU we've not officially supported 32-bit architectures for some time
 (possibly forever) as a community. But we had users running it
 anyway.

 3.7, as it currently stands, cannot run on 32-bit platforms. I've used
 atomic operations, which depend on specific processor instructions, to
 increment a 64-bit (uint64_t) generation number. This is not possible
 on 32-bit platforms.

 I'm currently debating with myself and others around me whether we can be
 satisfied with using a 32-bit generation number. I used a 64-bit
 generation number just to give us the widest possible range before
 running out. But a 32-bit generation number should be more than enough
 for normal usage.

 I'll update the list once we come to a decision.

 
 Let us retain support for 32-bit platforms. Though as developers we do
 not run many tests on non 64-bit platforms, it would not be good to
 break compatibility for 32-bit platforms as we have many community users
 running this
To get rid of all this fuss, I think it's better to use uint32_t instead
of uint64_t.
 
 -Vijay
 

-- 
~Atin


Re: [Gluster-devel] Update on mgmt_v3-locks.t failure in netbsd

2015-04-28 Thread Vijay Bellur

On 04/28/2015 05:48 PM, Kaushal M wrote:

AFAIU we've not officially supported 32-bit architectures for some time
(possibly forever) as a community. But we had users running it
anyway.

3.7, as it currently stands, cannot run on 32-bit platforms. I've used
atomic operations, which depend on specific processor instructions, to
increment a 64-bit (uint64_t) generation number. This is not possible
on 32-bit platforms.

I'm currently debating with myself and others around me whether we can be
satisfied with using a 32-bit generation number. I used a 64-bit
generation number just to give us the widest possible range before
running out. But a 32-bit generation number should be more than enough
for normal usage.

I'll update the list once we come to a decision.



Let us retain support for 32-bit platforms. Though as developers we do
not run many tests on non 64-bit platforms, it would not be good to
break compatibility for 32-bit platforms as we have many community users
running this


-Vijay



Re: [Gluster-devel] Update on mgmt_v3-locks.t failure in netbsd

2015-04-28 Thread Kaushal M
After a long discussion (not really), we've concluded that a 64-bit
uint is really overkill for how we are using the generation number,
and I'll change it to a 32-bit uint. I'll send a change to do this
right away.

The generation number we use is not permanent; it is valid only within
a GlusterD process's lifetime. It starts fresh on every GlusterD
start. We don't save it or restore it. The generation number is bumped
for each peerinfo object allocated. Considering this, even if we were to
do peer probes every second it'd be ages before we overflow with a
32-bit generation number. If someone is somehow able to bombard us
with 4 billion requests, I expect GlusterD to bork before the
generation number ever reaches the overflow point.

~kaushal

On Tue, Apr 28, 2015 at 6:13 PM, Emmanuel Dreyfus m...@netbsd.org wrote:
 On Tue, Apr 28, 2015 at 06:08:45PM +0530, Vijay Bellur wrote:
 Let us retain support for 32-bit platforms. Though as developers we do not
 run many tests on non 64-bit platforms, it would not be good to break
 compatibility for 32-bit platforms as we have many community users running
 this

 That is a point in favor of not migrating the NetBSD slave VMs running
 regression to 64-bit systems. Keeping them as is will catch any
 32-bit regression.

 --
 Emmanuel Dreyfus
 m...@netbsd.org


Re: [Gluster-devel] Update on mgmt_v3-locks.t failure in netbsd

2015-04-28 Thread Justin Clift
Does this mean we're officially no longer supporting 32 bit architectures?

(or is that just on x86?)

+ Justin


On 28 Apr 2015, at 12:45, Kaushal M kshlms...@gmail.com wrote:
 Found the problem. The NetBSD slaves are running a 32-bit kernel and 
 userspace.
 ```
 nbslave7a# uname -p
 i386
 ```
 
 Because of this, CAA_BITS_PER_LONG is set to 32 and the case for size 8
 isn't compiled in uatomic_add_return. Even though the underlying
 (virtual) hardware has 64-bit support, and supports the required
 8-byte wide instruction, it cannot be used because we are running on a
 32-bit kernel with a 32-bit userspace.
 
 Manu, was there any particular reason why you use 32-bit NetBSD? If there
 is none, can you please replace the VMs with 64-bit NetBSD? Until
 then you can keep mgmt_v3-locks.t disabled.
 
 ~kaushal
 

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift



Re: [Gluster-devel] Update on mgmt_v3-locks.t failure in netbsd

2015-04-28 Thread Emmanuel Dreyfus
On Tue, Apr 28, 2015 at 05:15:30PM +0530, Kaushal M wrote:
 Because of this, CAA_BITS_PER_LONG is set to 32 and the case for size 8
 isn't compiled in uatomic_add_return. Even though the underlying
 (virtual) hardware has 64-bit support, and supports the required
 8-byte wide instruction, it cannot be used because we are running on a
 32-bit kernel with a 32-bit userspace.

NetBSD has a non-standard atomic_add_64() available on alpha, amd64,
ia64 and, surprisingly, i386. It is also available for some sparc, mips
and powerpc CPUs.

 Manu, was there any particular reason why you use 32-bit NetBSD? If there
 is none, can you please replace the VMs with 64-bit NetBSD? Until
 then you can keep mgmt_v3-locks.t disabled.

Well, I was not even aware that 32-bit machines were not supported. That is
a huge blow to portability, especially considering that i386 seems capable
of handling 64-bit atomic operations.

At a minimum we should introduce a gf_atomic_add() macro that expands
to the appropriate non-standard stuff. But why do we need 64 bits here? What
is the requirement?


-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: [Gluster-devel] Update on mgmt_v3-locks.t failure in netbsd

2015-04-28 Thread Kaushal M
I seem to have found the issue.

The uatomic_add_return function is defined in urcu/uatomic.h as
```
/* uatomic_add_return */

static inline __attribute__((always_inline))
unsigned long __uatomic_add_return(void *addr, unsigned long val,
                                   int len)
{
    switch (len) {
    case 1:
    {
        unsigned char result = val;

        __asm__ __volatile__(
        "lock; xaddb %1, %0"
            : "+m"(*__hp(addr)), "+q" (result)
            :
            : "memory");
        return result + (unsigned char)val;
    }
    case 2:
    {
        unsigned short result = val;

        __asm__ __volatile__(
        "lock; xaddw %1, %0"
            : "+m"(*__hp(addr)), "+r" (result)
            :
            : "memory");
        return result + (unsigned short)val;
    }
    case 4:
    {
        unsigned int result = val;

        __asm__ __volatile__(
        "lock; xaddl %1, %0"
            : "+m"(*__hp(addr)), "+r" (result)
            :
            : "memory");
        return result + (unsigned int)val;
    }
#if (CAA_BITS_PER_LONG == 64)
    case 8:
    {
        unsigned long result = val;

        __asm__ __volatile__(
        "lock; xaddq %1, %0"
            : "+m"(*__hp(addr)), "+r" (result)
            :
            : "memory");
        return result + (unsigned long)val;
    }
#endif
    }
    /*
     * generate an illegal instruction. Cannot catch this with
     * linker tricks when optimizations are disabled.
     */
    __asm__ __volatile__("ud2");
    return 0;
}
```

As we can see, uatomic_add_return uses different assembly instructions
to perform the add based on the size of the datatype of the value. If
the size of the value doesn't exactly match one of the sizes in the
switch case, it deliberately generates a SIGILL.

The case for size 8 is conditionally compiled, as we can see above.
From the backtrace Atin provided earlier, we see that the size of the
value is indeed 8 (we use uint64_t). Because we had a SIGILL, we can
conclude that the case for size 8 wasn't compiled.

I don't know why this compilation didn't (or as this is in a header
file, doesn't) happen on the NetBSD slaves and this is something I'd
like to find out.

~kaushal

On Tue, Apr 28, 2015 at 1:50 PM, Anand Nekkunti anekk...@redhat.com wrote:

 On 04/28/2015 01:40 PM, Emmanuel Dreyfus wrote:

 On Tue, Apr 28, 2015 at 01:37:42PM +0530, Anand Nekkunti wrote:

 __asm__ is used to write assembly code in C (gcc);
 __volatile__(:::) is a
 compiler-level barrier that forces the compiler not to reorder the
 instructions (to avoid optimization).

 Sure, but the gory details should be of no interest to the developer
 engaged in debugging: if it crashes, this is probably because it is called
 with wrong arguments, hence the question:
   ccing gluster-devel

  new_peer->generation = uatomic_add_return (&conf->generation, 1);
 Are new_peer->generation and conf->generation sane?




Re: [Gluster-devel] Update on mgmt_v3-locks.t failure in netbsd

2015-04-28 Thread Niels de Vos
On Tue, Apr 28, 2015 at 12:15:11PM +0530, Atin Mukherjee wrote:
 I see the NetBSD regression doesn't execute peer probe from any test
 apart from mgmt_v3-locks.t; if it had, that would have failed as well. So
 the conclusion is that peer probe doesn't work on NetBSD. Glusterd crashes
 with the following bt when peer probe is executed:
 
 #0  __uatomic_add_return (len=8, val=1, addr=<optimized out>) at
 /usr/pkg/include/urcu/uatomic.h:233
 233 __asm__ __volatile__("ud2");
 (gdb) bt
 #0  __uatomic_add_return (len=8, val=1, addr=<optimized out>) at
 /usr/pkg/include/urcu/uatomic.h:233
 #1  glusterd_peerinfo_new (state=state@entry=GD_FRIEND_STATE_DEFAULT,
 uuid=uuid@entry=0x0,
 hostname=<optimized out>, hostname@entry=0xb8b040e0 "127.1.1.2",
 port=port@entry=24007)
 at glusterd-peer-utils.c:308
 #2  0xb91e0068 in glusterd_friend_add (hoststr=hoststr@entry=0xb8b040e0
 "127.1.1.2", port=port@entry=24007,
 state=state@entry=GD_FRIEND_STATE_DEFAULT, uuid=uuid@entry=0x0,
 friend=friend@entry=0xb89fff30,
 restore=restore@entry=_gf_false, args=args@entry=0xb89fff38) at
 glusterd-handler.c:3212
 #3  0xb91e2927 in glusterd_probe_begin (req=req@entry=0xb8f40040,
 hoststr=0xb8b040e0 "127.1.1.2", port=24007,
 dict=0xb9c013b0, op_errno=op_errno@entry=0xb89fff9c) at
 glusterd-handler.c:3320
 #4  0xb91e2de2 in __glusterd_handle_cli_probe (req=0xb8f40040) at
 glusterd-handler.c:1078
 #5  0xb91dc932 in glusterd_big_locked_handler (req=req@entry=0xb8f40040,
 actor_fn=actor_fn@entry=0xb91e294d <__glusterd_handle_cli_probe>) at
 glusterd-handler.c:83
 #6  0xb91dc9e8 in glusterd_handle_cli_probe (req=0xb8f40040) at
 glusterd-handler.c:1105
 #7  0xbb787c82 in synctask_wrap (old_task=0xb8f66000) at syncop.c:375
 #8  0xbb39c630 in ?? () from /usr/lib/libc.so.12
 http://review.gluster.org/#/c/10147/ is the cause of it. I will
 continue to investigate this; however, I am not able to understand
 what this line __asm__ __volatile__("ud2") indicates. Any experts :) ?

IIRC, the Linux kernel executes this when a BUG() statement is hit and
logging or a kernel panic should happen.

You probably need someone that understands userspace-rcu to get an
understanding of when/how ud2 is used there.

HTH,
Niels




Re: [Gluster-devel] Update on mgmt_v3-locks.t failure in netbsd

2015-04-28 Thread Emmanuel Dreyfus
On Tue, Apr 28, 2015 at 12:15:11PM +0530, Atin Mukherjee wrote:
 I see the NetBSD regression doesn't execute peer probe from any test
 apart from mgmt_v3-locks.t; if it had, that would have failed as well. So
 the conclusion is that peer probe doesn't work on NetBSD.

It does not work anymore: it worked in release-3.6, and it broke
quite recently in master.

 http://review.gluster.org/#/c/10147/ is the cause of it. I will
 continue to investigate this; however, I am not able to understand
 what this line __asm__ __volatile__("ud2") indicates. Any experts :) ?

This is the implementation for that in glusterd_peerinfo_new():
new_peer->generation = uatomic_add_return (&conf->generation, 1);

Are new_peer->generation and conf->generation sane?

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: [Gluster-devel] Update on mgmt_v3-locks.t failure in netbsd

2015-04-28 Thread Atin Mukherjee


On 04/28/2015 01:24 PM, Emmanuel Dreyfus wrote:
 On Tue, Apr 28, 2015 at 12:15:11PM +0530, Atin Mukherjee wrote:
 I see the NetBSD regression doesn't execute peer probe from any test
 apart from mgmt_v3-locks.t; if it had, that would have failed as well. So
 the conclusion is that peer probe doesn't work on NetBSD.
 
 It does not work anymore: it worked in release-3.6, and it broke
 quite recently in master.
 
 http://review.gluster.org/#/c/10147/ is the cause of it. I will
 continue to investigate this; however, I am not able to understand
 what this line __asm__ __volatile__("ud2") indicates. Any experts :) ?
 
 This is the implementation for that in glusterd_peerinfo_new():
 new_peer->generation = uatomic_add_return (&conf->generation, 1);
 
 Are new_peer->generation and conf->generation sane?
The expectation from the above statement is to execute
new_peer->generation = ++conf->generation atomically.
 

-- 
~Atin


Re: [Gluster-devel] Update on mgmt_v3-locks.t failure in netbsd

2015-04-28 Thread Anand Nekkunti


On 04/28/2015 01:40 PM, Emmanuel Dreyfus wrote:

On Tue, Apr 28, 2015 at 01:37:42PM +0530, Anand Nekkunti wrote:

__asm__ is used to write assembly code in C (gcc); __volatile__(:::) is a
compiler-level barrier that forces the compiler not to reorder the
instructions (to avoid optimization).

Sure, but the gory details should be of no interest to the developer
engaged in debugging: if it crashes, this is probably because it is called
with wrong arguments, hence the question:
  ccing gluster-devel

 new_peer->generation = uatomic_add_return (&conf->generation, 1);
Are new_peer->generation and conf->generation sane?

