Re: [PATCH for v4.2 v18 1/3] sys_membarrier(): system-wide memory barrier (generic, x86)

2015-05-31 Thread Mathieu Desnoyers
- On May 30, 2015, at 12:40 AM, Andrew Morton a...@linux-foundation.org wrote:

> On Sat, 16 May 2015 19:48:18 -0400 Mathieu Desnoyers
> <mathieu.desnoy...@efficios.com> wrote:
> 
>> Here is an implementation of a new system call, sys_membarrier(), which
>> executes a memory barrier on all threads running on the system. It is
>> implemented by calling synchronize_sched(). It can be used to distribute
>> the cost of user-space memory barriers asymmetrically by transforming
>> pairs of memory barriers into pairs consisting of sys_membarrier() and a
>> compiler barrier. For synchronization primitives that distinguish
>> between read-side and write-side (e.g. userspace RCU [1], rwlocks), the
>> read-side can be accelerated significantly by moving the bulk of the
>> memory barrier overhead to the write-side.
>>
>> ...
>>
> 
> It would be nice to hear about the real world value of this syscall to
> our users.  I'm seeing test results for a microbenchmark but so what.
> What actual applications or application classes are calling for this and
> what results can they expect to see?

AFAIK, the existing open source applications that would be improved by this
system call are as follows:

* Through Userspace RCU library (http://urcu.so)
  - DNS server (Knot DNS) https://www.knot-dns.cz/
  - Network sniffer (http://netsniff-ng.org/)
  - Distributed object storage (https://sheepdog.github.io/sheepdog/)
  - User-space tracing (http://lttng.org)
  - Network storage system (https://www.gluster.org/)

Those projects use RCU in userspace to increase read-side speed and
scalability compared to locking. Especially in the case of RCU used
by libraries, sys_membarrier can speed up the read-side by moving the
bulk of the memory barrier cost to synchronize_rcu().

* Direct users of sys_membarrier
  - core dotnet garbage collector (https://github.com/dotnet/coreclr/issues/198)

Microsoft core dotnet GC developers are planning to use the mprotect()
side-effect of issuing memory barriers through IPIs as a way to implement
Windows FlushProcessWriteBuffers() on Linux. They are referring to
sys_membarrier in their github thread, specifically stating that
sys_membarrier() is what they are looking for.
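
A minimal user-space sketch of the asymmetric pairing described above, under
stated assumptions: the command values (QUERY = 0, SHARED = 1 << 0) are taken
to match the uapi header, every ex_* name is hypothetical, and this is not how
liburcu or the dotnet GC actually implement it. Readers pay only a compiler
barrier when the writer can issue the system-wide barrier; otherwise both
sides fall back to a full fence.

```c
#define _GNU_SOURCE
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed to match the uapi header values; local names are hypothetical. */
#define EX_CMD_QUERY  0
#define EX_CMD_SHARED (1 << 0)

static int ex_use_membarrier;        /* set once by ex_init(), before threads start */
static atomic_int ex_reader_active;  /* flag written by the reader, read by the writer */
static atomic_int ex_data;           /* value written by the writer, read by the reader */

static long ex_sys_membarrier(int cmd, int flags)
{
#ifdef __NR_membarrier
	return syscall(__NR_membarrier, cmd, flags);
#else
	(void)cmd; (void)flags;
	return -1;                   /* headers predate the syscall: treat as unsupported */
#endif
}

/* Decide once whether the cheap read-side barrier may be used. */
static void ex_init(void)
{
	long mask = ex_sys_membarrier(EX_CMD_QUERY, 0);

	ex_use_membarrier = mask >= 0 && (mask & EX_CMD_SHARED);
}

/* Read side: compiler barrier only when the writer compensates. */
static inline void ex_reader_barrier(void)
{
	if (ex_use_membarrier)
		atomic_signal_fence(memory_order_seq_cst);
	else
		atomic_thread_fence(memory_order_seq_cst);
}

/* Write side: carries the bulk of the barrier cost. */
static inline void ex_writer_barrier(void)
{
	if (ex_use_membarrier)
		ex_sys_membarrier(EX_CMD_SHARED, 0);
	else
		atomic_thread_fence(memory_order_seq_cst);
}

/* Reader: announce presence, then read the data. */
static int ex_read(void)
{
	int v;

	atomic_store_explicit(&ex_reader_active, 1, memory_order_relaxed);
	ex_reader_barrier();
	v = atomic_load_explicit(&ex_data, memory_order_relaxed);
	ex_reader_barrier();
	atomic_store_explicit(&ex_reader_active, 0, memory_order_relaxed);
	return v;
}

/* Writer: publish the data, then check whether a reader is active. */
static int ex_publish_and_check(int v)
{
	atomic_store_explicit(&ex_data, v, memory_order_relaxed);
	ex_writer_barrier();
	return atomic_load_explicit(&ex_reader_active, memory_order_relaxed);
}
```

The choice between the two modes has to be made once, before any reader runs:
a reader using only the compiler barrier is safe only if every writer really
issues the system call.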

> 
>> 
>> membarrier(2) man page:
>> --- snip ---
>> MEMBARRIER(2)          Linux Programmer's Manual          MEMBARRIER(2)
>>
>> NAME
>>        membarrier - issue memory barriers on a set of threads
>>
>> SYNOPSIS
>>        #include <linux/membarrier.h>
>>
>>        int membarrier(int cmd, int flags);
>>
>> DESCRIPTION
>>        The cmd argument is one of the following:
>>
>>        MEMBARRIER_CMD_QUERY
>>               Query the set of supported commands. It returns a bitmask of
>>               supported commands.
>>
>>        MEMBARRIER_CMD_SHARED
>>               Execute a memory barrier on all threads running on the system.
>>               Upon return from system call, the caller thread is ensured that
>>               all running threads have passed through a state where all memory
>>               accesses to user-space addresses match program order between
>>               entry to and return from the system call (non-running threads
>>               are de facto in such a state). This covers threads from all
>>               processes running on the system. This command returns 0.
>>
>>        The flags argument needs to be 0. For future extensions.
>>
>>        All memory accesses performed in program order from each targeted
>>        thread is guaranteed to be ordered with respect to sys_membarrier(). If
>>        we use the semantic "barrier()" to represent a compiler barrier forcing
>>        memory accesses to be performed in program order across the barrier,
>>        and smp_mb() to represent explicit memory barriers forcing full memory
>>        ordering across the barrier, we have the following ordering table for
>>        each pair of barrier(), sys_membarrier() and smp_mb():
>>
>>        The pair ordering is detailed as (O: ordered, X: not ordered):
>>
>>                               barrier()   smp_mb()   sys_membarrier()
>>               barrier()          X           X              O
>>               smp_mb()           X           O              O
>>               sys_membarrier()   O           O              O
>>
>> RETURN VALUE
>>        On success, these system calls return zero. On error, -1 is returned,
>>        and errno is set appropriately. For a given command, with flags
>>        argument set to 0, this system call is guaranteed to always return the
>>        same value until reboot.
> 
> I suggest "with flags argument set to MEMBARRIER_CMD_QUERY" here.

No, the enum is for the "cmd" argument (see above) not the flags argument. We
really mean flags = 0 (the value) here.
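
To make the cmd/flags distinction concrete, here is a minimal user-space
sketch, assuming the command values match the uapi header (QUERY = 0,
SHARED = 1 << 0) and that no libc wrapper is available, so the raw syscall(2)
interface is used. The ex_* names are hypothetical. The command goes in the
first argument and flags is always the literal value 0.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed to match the uapi header values; local names are hypothetical. */
enum {
	EX_MEMBARRIER_CMD_QUERY  = 0,
	EX_MEMBARRIER_CMD_SHARED = 1 << 0,
};

/* Thin wrapper: returns the syscall result, or -1 with errno set. */
static int ex_membarrier(int cmd, int flags)
{
#ifdef __NR_membarrier
	return (int)syscall(__NR_membarrier, cmd, flags);
#else
	(void)cmd; (void)flags;
	errno = ENOSYS;              /* headers predate the syscall */
	return -1;
#endif
}

/*
 * Returns 1 if MEMBARRIER_CMD_SHARED is reported by the QUERY command,
 * 0 otherwise (including when the syscall itself is missing).
 */
static int ex_membarrier_shared_supported(void)
{
	int mask = ex_membarrier(EX_MEMBARRIER_CMD_QUERY, 0);

	return mask >= 0 && (mask & EX_MEMBARRIER_CMD_SHARED);
}
```

Since the result for a given command with flags = 0 does not change until
reboot, a caller can run ex_membarrier_shared_supported() once at startup and
cache the answer.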

> 
>> 
>> ERRORS
>>        ENOSYS System call is not implemented.
>>
>>        EINVAL Invalid arguments.
>> 
>> ...
>>
>> 

Re: [PATCH for v4.2 v18 1/3] sys_membarrier(): system-wide memory barrier (generic, x86)

2015-05-29 Thread Andrew Morton
On Sat, 16 May 2015 19:48:18 -0400 Mathieu Desnoyers
<mathieu.desnoy...@efficios.com> wrote:

> Here is an implementation of a new system call, sys_membarrier(), which
> executes a memory barrier on all threads running on the system. It is
> implemented by calling synchronize_sched(). It can be used to distribute
> the cost of user-space memory barriers asymmetrically by transforming
> pairs of memory barriers into pairs consisting of sys_membarrier() and a
> compiler barrier. For synchronization primitives that distinguish
> between read-side and write-side (e.g. userspace RCU [1], rwlocks), the
> read-side can be accelerated significantly by moving the bulk of the
> memory barrier overhead to the write-side.
>
> ...
>

It would be nice to hear about the real world value of this syscall to
our users.  I'm seeing test results for a microbenchmark but so what. 
What actual applications or application classes are calling for this and
what results can they expect to see?

> 
> membarrier(2) man page:
> --- snip ---
> MEMBARRIER(2)          Linux Programmer's Manual          MEMBARRIER(2)
>
> NAME
>        membarrier - issue memory barriers on a set of threads
>
> SYNOPSIS
>        #include <linux/membarrier.h>
>
>        int membarrier(int cmd, int flags);
>
> DESCRIPTION
>        The cmd argument is one of the following:
>
>        MEMBARRIER_CMD_QUERY
>               Query the set of supported commands. It returns a bitmask of
>               supported commands.
>
>        MEMBARRIER_CMD_SHARED
>               Execute a memory barrier on all threads running on the system.
>               Upon return from system call, the caller thread is ensured that
>               all running threads have passed through a state where all memory
>               accesses to user-space addresses match program order between
>               entry to and return from the system call (non-running threads
>               are de facto in such a state). This covers threads from all
>               processes running on the system. This command returns 0.
>
>        The flags argument needs to be 0. For future extensions.
>
>        All memory accesses performed in program order from each targeted
>        thread is guaranteed to be ordered with respect to sys_membarrier(). If
>        we use the semantic "barrier()" to represent a compiler barrier forcing
>        memory accesses to be performed in program order across the barrier,
>        and smp_mb() to represent explicit memory barriers forcing full memory
>        ordering across the barrier, we have the following ordering table for
>        each pair of barrier(), sys_membarrier() and smp_mb():
>
>        The pair ordering is detailed as (O: ordered, X: not ordered):
>
>                               barrier()   smp_mb()   sys_membarrier()
>               barrier()          X           X              O
>               smp_mb()           X           O              O
>               sys_membarrier()   O           O              O
>
> RETURN VALUE
>        On success, these system calls return zero. On error, -1 is returned,
>        and errno is set appropriately. For a given command, with flags
>        argument set to 0, this system call is guaranteed to always return the
>        same value until reboot.

I suggest "with flags argument set to MEMBARRIER_CMD_QUERY" here.

> 
> ERRORS
>        ENOSYS System call is not implemented.
>
>        EINVAL Invalid arguments.
> 
> ...
>
> +SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
> +{
> +	if (flags)
> +		return -EINVAL;

I'm not a huge fan of this "add a flags arg to syscalls" rule.  Is
there any realistic expectation that we'll ever *use* this thing?  If
not, why add it?

You may as well put an unlikely() in there btw.

> +	switch (cmd) {
> +	case MEMBARRIER_CMD_QUERY:
> +		return MEMBARRIER_CMD_BITMASK;
> +	case MEMBARRIER_CMD_SHARED:
> +		if (num_online_cpus() > 1)
> +			synchronize_sched();
> +		return 0;
> +	default:
> +		return -EINVAL;
> +	}
> +}
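
For reference, a user-space sketch of the quoted dispatch with the unlikely()
hint applied to the flags check. It assumes unlikely() expands to
__builtin_expect(!!(x), 0) as in the kernel, that MEMBARRIER_CMD_BITMASK
covers only MEMBARRIER_CMD_SHARED, and it replaces synchronize_sched() with a
counting stand-in. All ex_* names are hypothetical; this is an illustration,
not the patch itself.

```c
#include <errno.h>

/* Same shape as the kernel's unlikely(); needs GCC or Clang. */
#define ex_unlikely(x) __builtin_expect(!!(x), 0)

/* Assumed command values and bitmask; local names are hypothetical. */
enum {
	EX_CMD_QUERY  = 0,
	EX_CMD_SHARED = 1 << 0,
};
#define EX_CMD_BITMASK EX_CMD_SHARED

/* Stand-in for synchronize_sched(): only counts how often it is called. */
static int ex_wait_calls;

static void ex_wait_for_all_cpus(void)
{
	ex_wait_calls++;
}

/* Mirrors the quoted switch; online_cpus replaces num_online_cpus(). */
static int ex_membarrier_dispatch(int cmd, int flags, int online_cpus)
{
	if (ex_unlikely(flags))      /* non-zero flags is the rare error path */
		return -EINVAL;

	switch (cmd) {
	case EX_CMD_QUERY:
		return EX_CMD_BITMASK;
	case EX_CMD_SHARED:
		if (online_cpus > 1) /* a single CPU needs no cross-CPU barrier */
			ex_wait_for_all_cpus();
		return 0;
	default:
		return -EINVAL;
	}
}
```

The hint only affects code layout for the branch; the return values are the
same with or without it.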

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

