Re: [lng-odp] 32-bit support in examples

2017-01-24 Thread Savolainen, Petri (Nokia - FI/Espoo)


From: Ola Liljedahl [mailto:ola.liljed...@linaro.org] 
Sent: Tuesday, January 24, 2017 12:15 PM
To: Savolainen, Petri (Nokia - FI/Espoo) <petri.savolai...@nokia-bell-labs.com>
Cc: Brian Brooks <brian.bro...@linaro.org>; Francois Ozog 
<francois.o...@linaro.org>; nd <n...@arm.com>; lng-odp@lists.linaro.org
Subject: Re: [lng-odp] 32-bit support in examples



On 23 January 2017 at 09:50, Savolainen, Petri (Nokia - FI/Espoo) 
<petri.savolai...@nokia-bell-labs.com> wrote:


> -Original Message-
> From: Brian Brooks [mailto:brian.bro...@linaro.org]
> Sent: Friday, January 20, 2017 7:47 PM
> To: Francois Ozog <francois.o...@linaro.org>
> Cc: Bill Fischofer <bill.fischo...@linaro.org>; Joe Savage
> <joe.sav...@arm.com>; Maxim Uvarov <maxim.uva...@linaro.org>; Savolainen,
> Petri (Nokia - FI/Espoo) <petri.savolai...@nokia-bell-labs.com>; lng-
> o...@lists.linaro.org; nd <n...@arm.com>
> Subject: Re: [lng-odp] 32-bit support in examples
>
> CAS is a universal primitive in the sense that you can construct those
> RMW ops by speculatively computing the updated value and the CAS to
> atomically update the value (in a retry loop).  LL/SC also universal,
> but different behavior.  Both are not the same as an atomic op
> performed deeper in the memory system.
>
> To Petri's point about ODP not supporting 128b atomics, which compiler
> does not support the __atomic_xxx built-ins or the __int128 128b
> variable?  This has impact on portability and should be explicitly
> known; is it the microblaze compiler?



Any atomics can be emulated in SW (using compiler built-ins or locks directly). 
The point here is the missing HW support:
Atomic operations can be emulated/implemented but not lock-free behaviour. GCC 
does provide a lock-based implementation of e.g. 128-bit atomics in libatomic 
so functionally all targets should support 128-bit atomics.
 
 * E.g. MIPS, Power, ARMv7 do not have 128 bit CAS
 * 128 bit fetch-and-add is not supported in any of the architectures
MIPS64r6 has Load Linked DoubleWord Paired/Store Conditional DoubleWord Paired 
(LLDP/SCDP) so identical to ARMv8/AArch64. This is all you need.

>> HTML ... HTML ..
>>
>> MIPS64r6 is pretty recent spec. Are there any SoCs in ODP scope? E.g. 
>>Octeons are older MIPS, 
>> but very much in scope of ODP.
>>
>> My comment was twofold. First lacking 128 bit CAS support. Second, even with 
>> CAS support 
>> increments/decrements/adds/subs are not HW optimized - can be done but 
>> without proper HW support 
>> should not be promoted (into ODP API). If 128 bit support is added, maybe 
>> it'll be only CAS.
>>

We need to ensure on any operations added that those can be implemented 
efficiently on most of the targets.
I think we should let 32-bit platforms wither (e.g. suffer with non-ideal 
performance), how relevant are they? Why should we be limited (in an ODP 
example) in what we can do by targets that are less and less relevant?

>> 
>> This is something that we need to consider (continuously).
>>
>> Anyway, I think there are still relevant non-ARM and ARMv7 SoCs around that 
>> can run ODP very efficiently.
>> 
>> -Petri
>>



Re: [lng-odp] 32-bit support in examples

2017-01-24 Thread Ola Liljedahl
On 24 January 2017 at 10:53, Ola Liljedahl <ola.liljed...@linaro.org> wrote:

>
>
> On 20 January 2017 at 13:15, Savolainen, Petri (Nokia - FI/Espoo) <
> petri.savolai...@nokia-bell-labs.com> wrote:
>
>>
>>
>> > -Original Message-
>> > From: Joe Savage [mailto:joe.sav...@arm.com]
>> > Sent: Friday, January 20, 2017 1:51 PM
>> > To: Savolainen, Petri (Nokia - FI/Espoo) <petri.savolainen@nokia-bell-
>> > labs.com>; Maxim Uvarov <maxim.uva...@linaro.org>; lng-
>> > o...@lists.linaro.org; Bill Fischofer <bill.fischo...@linaro.org>
>> > Cc: nd <n...@arm.com>
>> > Subject: Re: [lng-odp] 32-bit support in examples
>> >
>> > > Agree with Maxim. In which way is the application not 32-bit compliant?
>> >
>> > It uses 128-bit atomics, and so is really designed for execution on
>> 64-bit
>> > machines. It is possible to provide lockless 32-bit support in this
>> case,
>> > though, and I have an implementation that does so. Since the pointer
>> size
>> > is
>> > halved and there is a pointer in the 128-bit struct, I just have to
>> squash
>> > a
>> > few of the other fields down (managing them carefully) so that 64-bit
>> > atomics
>> > can be used instead.
>>
>> Unfortunately, ODP atomics API does not support 128 bit atomics - at
>> least currently. So, your example could not use those anyway. Not all
>> 64-bit CPUs have 128 bit atomic instructions.
>>
>> >
>> > On reflection, I think that providing 32-bit support is probably
>> > worthwhile
>> > here, so I will do so. It does add a little complexity to the code, but
>> > it's
>> > not actually that much, and there are clear benefits from having the
>> > example
>> > be better supported on different platforms.
>> >
>> > I do think that having a place for 64-bit only examples in the future
>> > (e.g.
>> > an "example_64" directory as Bill outlined) might be useful though. It
>> > isn't
>> > always so easy to add 32-bit support.
>>
>> Good. ODP provides 32 and 64 bit atomics (also on 32 bit CPUs), so you
>> can still utilize those. In addition, synchronization / critical sections
>> should touch only a small portion of the application code base and
>> preferably in a modular way (inside enqueue() / dequeue(), push() / pop(),
>> etc functions).
>>
> This code can't use the ODP atomics (at least not as defined today). We
> are not doing atomic operations on e.g. integer counters. The (preferably
> lock-free) atomic operations are done on a structure with multiple fields.
>
Well I need to correct myself. Using unions, we can use atomics that
operate on e.g. __int128 operands. This is how we utilise ARM load/store
exclusives (ldxp/stxp). But the actual operand is not an 128-bit scalar.

Should we decide to support 128-bit atomics in ODP, we need CAS and SWAP
but nothing much else (e.g. we are *not* using 128-bit counters). Load is
preferably done non-atomically as an atomic 128-bit load is rather
expensive (needs to be implemented using CAS).
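
A minimal sketch of the two operations named above, CAS and SWAP, applied to a
multi-field structure through a union. It assumes GCC/Clang __atomic
built-ins. The names (union pair, pair_cas, pair_swap) are made up for
illustration, and a 64-bit word is used so that it builds without libatomic or
-mcx16; the same shape would apply with unsigned __int128 as the raw member.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-field record that must always change as one unit. */
union pair {
    struct {
        uint32_t index;
        uint32_t tag;
    } f;
    uint64_t raw; /* the scalar view handed to the atomic built-ins */
};

/* CAS: install 'desired' only if *p still equals *expected.
 * On failure, *expected is refreshed with the contents that were observed. */
static inline bool pair_cas(union pair *p, union pair *expected,
                            union pair desired)
{
    assert(p != NULL && expected != NULL);
    return __atomic_compare_exchange_n(&p->raw, &expected->raw, desired.raw,
                                       false /* strong */,
                                       __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
}

/* SWAP: unconditionally install 'desired' and return the previous contents. */
static inline union pair pair_swap(union pair *p, union pair desired)
{
    union pair old;

    assert(p != NULL);
    old.raw = __atomic_exchange_n(&p->raw, desired.raw, __ATOMIC_ACQ_REL);
    return old;
}
```

The fields are read and written through .f, while only .raw is ever passed to
the atomic operation, so the operand is a scalar as far as the compiler is
concerned.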



>
>
>>
>> -Petri
>>
>>
>>
>


Re: [lng-odp] 32-bit support in examples

2017-01-24 Thread Ola Liljedahl
On 23 January 2017 at 09:50, Savolainen, Petri (Nokia - FI/Espoo) <
petri.savolai...@nokia-bell-labs.com> wrote:

>
>
> > -Original Message-
> > From: Brian Brooks [mailto:brian.bro...@linaro.org]
> > Sent: Friday, January 20, 2017 7:47 PM
> > To: Francois Ozog <francois.o...@linaro.org>
> > Cc: Bill Fischofer <bill.fischo...@linaro.org>; Joe Savage
> > <joe.sav...@arm.com>; Maxim Uvarov <maxim.uva...@linaro.org>;
> Savolainen,
> > Petri (Nokia - FI/Espoo) <petri.savolai...@nokia-bell-labs.com>; lng-
> > o...@lists.linaro.org; nd <n...@arm.com>
> > Subject: Re: [lng-odp] 32-bit support in examples
> >
> > CAS is a universal primitive in the sense that you can construct those
> > RMW ops by speculatively computing the updated value and the CAS to
> > atomically update the value (in a retry loop).  LL/SC also universal,
> > but different behavior.  Both are not the same as an atomic op
> > performed deeper in the memory system.
> >
> > To Petri's point about ODP not supporting 128b atomics, which compiler
> > does not support the __atomic_xxx built-ins or the __int128 128b
> > variable?  This has impact on portability and should be explicitly
> > known; is it the microblaze compiler?
>
>
>
> Any atomics can be emulated in SW (using compiler built-ins or locks
> directly). The point here is the missing HW support:
>
Atomic operations can be emulated/implemented but not lock-free behaviour.
GCC does provide a lock-based implementation of e.g. 128-bit atomics in
libatomic so functionally all targets should support 128-bit atomics.
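
One way to make the lock-free versus lock-based distinction visible at build
time is the __atomic_always_lock_free built-in of GCC/Clang. The sketch below
is only an illustration; the macro and function names are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if atomics on objects of 'size' bytes (with typical alignment) are
 * always lock-free on the compilation target. 'size' must be a compile-time
 * constant. A false result means the compiler may fall back to the lock-based
 * routines in libatomic. */
#define ALWAYS_LOCK_FREE(size) __atomic_always_lock_free((size), 0)

static inline bool lock_free_64(void)
{
    return ALWAYS_LOCK_FREE(8);
}

static inline bool lock_free_128(void)
{
    return ALWAYS_LOCK_FREE(16);
}

/* Debug-build sanity check for code that relies on word-sized atomics. */
static inline void require_lock_free_word(void)
{
    assert(ALWAYS_LOCK_FREE(sizeof(int)));
}
```

The answer depends on the target and on compiler flags, so the results are
best treated as a per-build property rather than a fixed fact about an
architecture.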


>  * E.g. MIPS, Power, ARMv7 do not have 128 bit CAS
>  * 128 bit fetch-and-add is not supported in any of the architectures
>
MIPS64r6 has Load Linked DoubleWord Paired/Store Conditional DoubleWord
Paired (LLDP/SCDP) so identical to ARMv8/AArch64. This is all you need.


>
> We need to ensure on any operations added that those can be implemented
> efficiently on most of the targets.
>
I think we should let 32-bit platforms wither (e.g. suffer with non-ideal
performance), how relevant are they? Why should we be limited (in an ODP
example) in what we can do by targets that are less and less relevant?


>
> -Petri
>
>
> >
> > On Fri, Jan 20, 2017 at 7:36 AM, Francois Ozog <francois.o...@linaro.org
> >
> > wrote:
> > > well, yes. But that is the only atomic operation supported. No add,
> sub,
> > > inc, xadd, bit operations
> > >
> > > On Fri, 20 Jan 2017 at 14:31, Joe Savage <joe.sav...@arm.com> wrote:
> > >
> > >> > I wonder what processor supports 128 bits atomics. As far as I know
> > Intel
> > >>
> > >> > does not support it. Lock prefix is not allowed on SSE instructions.
> > >>
> > >>
> > >>
> > >> Actually, Intel does support them through a locked cmpxchg16b. And
> > ARMv8
> > >>
> > >> through load exclusive pair and store exclusive pair.
> > >>
> > >>
>


Re: [lng-odp] 32-bit support in examples

2017-01-24 Thread Ola Liljedahl
On 20 January 2017 at 14:36, Francois Ozog  wrote:

> well, yes. But that is the only atomic operation supported. No add, sub,
> inc, xadd, bit operations
>
Using lock cmpxchg16b (i.e. atomic CAS), GCC implements all of the __atomic
operations on 128-bit operands.
Intel cheats a little bit because 128-bit __atomic_load_n() is also
implemented using cmpxchg16b, so it does a write to the location.
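
To illustrate why a load built from CAS implies a write, here is a sketch of
the pattern on a 64-bit word (so that it builds without -mcx16 or libatomic).
The function name is hypothetical, and this is not a copy of what GCC emits,
only the same idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Atomic load expressed as a compare-and-swap.
 * If *addr is 0, the CAS succeeds and stores 0 again (no visible change).
 * Otherwise the CAS fails and copies the current value into 'observed'.
 * Either way 'observed' ends up holding the value of *addr, but the location
 * must be writable, which is the drawback mentioned above. */
static inline uint64_t load_via_cas_u64(uint64_t *addr)
{
    uint64_t observed = 0;

    assert(addr != NULL);
    (void)__atomic_compare_exchange_n(addr, &observed, 0,
                                      false /* strong */,
                                      __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE);
    return observed;
}
```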


>
> On Fri, 20 Jan 2017 at 14:31, Joe Savage wrote:
>
> > > I wonder what processor supports 128 bits atomics. As far as I know
> Intel
> >
> > > does not support it. Lock prefix is not allowed on SSE instructions.
> >
> >
> >
> > Actually, Intel does support them through a locked cmpxchg16b. And ARMv8
> >
> > through load exclusive pair and store exclusive pair.
> >
> >
>


Re: [lng-odp] 32-bit support in examples

2017-01-24 Thread Ola Liljedahl
On 20 January 2017 at 13:15, Savolainen, Petri (Nokia - FI/Espoo) <
petri.savolai...@nokia-bell-labs.com> wrote:

>
>
> > -Original Message-
> > From: Joe Savage [mailto:joe.sav...@arm.com]
> > Sent: Friday, January 20, 2017 1:51 PM
> > To: Savolainen, Petri (Nokia - FI/Espoo) <petri.savolainen@nokia-bell-
> > labs.com>; Maxim Uvarov <maxim.uva...@linaro.org>; lng-
> > o...@lists.linaro.org; Bill Fischofer <bill.fischo...@linaro.org>
> > Cc: nd <n...@arm.com>
> > Subject: Re: [lng-odp] 32-bit support in examples
> >
> > > Agree with Maxim. In which way is the application not 32-bit compliant?
> >
> > It uses 128-bit atomics, and so is really designed for execution on
> 64-bit
> > machines. It is possible to provide lockless 32-bit support in this case,
> > though, and I have an implementation that does so. Since the pointer size
> > is
> > halved and there is a pointer in the 128-bit struct, I just have to
> squash
> > a
> > few of the other fields down (managing them carefully) so that 64-bit
> > atomics
> > can be used instead.
>
> Unfortunately, ODP atomics API does not support 128 bit atomics - at least
> currently. So, your example could not use those anyway. Not all 64-bit CPUs
> have 128 bit atomic instructions.
>
> >
> > On reflection, I think that providing 32-bit support is probably
> > worthwhile
> > here, so I will do so. It does add a little complexity to the code, but
> > it's
> > not actually that much, and there are clear benefits from having the
> > example
> > be better supported on different platforms.
> >
> > I do think that having a place for 64-bit only examples in the future
> > (e.g.
> > an "example_64" directory as Bill outlined) might be useful though. It
> > isn't
> > always so easy to add 32-bit support.
>
> Good. ODP provides 32 and 64 bit atomics (also on 32 bit CPUs), so you can
> still utilize those. In addition, synchronization / critical sections
> should touch only a small portion of the application code base and
> preferably in a modular way (inside enqueue() / dequeue(), push() / pop(),
> etc functions).
>
This code can't use the ODP atomics (at least not as defined today). We are
not doing atomic operations on e.g. integer counters. The (preferably
lock-free) atomic operations are done on a structure with multiple fields.


>
> -Petri
>
>
>


Re: [lng-odp] 32-bit support in examples

2017-01-23 Thread Francois Ozog
The kernel has used a set of spinlocks to help dealing with atomics in
certain conditions (architecture, length of bitfield...).
A discussion on what atomic operations are needed may be of value to ODP too:
https://lwn.net/Articles/695257/


On 23 January 2017 at 10:49, Joe Savage  wrote:

> > Any atomics can be emulated in SW (using compiler built-ins or locks
> directly). The point here is the missing HW support:
> >  * E.g. MIPS, Power, ARMv7 do not have 128 bit CAS
> >  * 128 bit fetch-and-add is not supported in any of the architectures
> >
> > We need to ensure on any operations added that those can be implemented
> efficiently on most of the targets.
>
> Yes, I totally appreciate that this is important with respect to adding
> 128-bit support to the ODP atomics API. In terms of this example in
> particular, though, I would think that the more important factor is having
> support for the built-ins at all. After all, this isn't for performance
> measurement, but is merely an illustrative demonstration of fragmentation
> and
> reassembly using ODP. And we don't want to break builds without support.
>
> That being said, it would be nice if the performance was relatively
> acceptable in general in order to provide a more realistic view of what
> might
> be used in real systems. I believe that to be the case here, even for
> 32-bit
> machines, where 64-bit rather than 128-bit atomics are used. (This is good
> news for ARMv7 at least — I'm not sure if it helps MIPS out.)
>
> Unfortunately, there is a further complication here in that the doubleword
> (ARMv7) and quadword (ARMv8) atomic primitives aren't always there by
> default
> either. In my working copy, I'm currently bundling along lock-free 64-bit
> and
> 128-bit CAS implementations to fill this purpose for ARMv7 and ARMv8
> respectively. This is a slight annoyance, but saves a dependency on the
> external "libatomic" and gives a more efficient implementation than the
> lock-based solution used within this.
>



-- 
François-Frédéric Ozog | *Director Linaro Networking Group*
T: +33.67221.6485
francois.o...@linaro.org | Skype: ffozog


Re: [lng-odp] 32-bit support in examples

2017-01-23 Thread Joe Savage
> Any atomics can be emulated in SW (using compiler built-ins or locks 
> directly). The point here is the missing HW support:
>  * E.g. MIPS, Power, ARMv7 do not have 128 bit CAS
>  * 128 bit fetch-and-add is not supported in any of the architectures
> 
> We need to ensure on any operations added that those can be implemented 
> efficiently on most of the targets.

Yes, I totally appreciate that this is important with respect to adding
128-bit support to the ODP atomics API. In terms of this example in
particular, though, I would think that the more important factor is having
support for the built-ins at all. After all, this isn't for performance
measurement, but is merely an illustrative demonstration of fragmentation and
reassembly using ODP. And we don't want to break builds without support.

That being said, it would be nice if the performance was relatively
acceptable in general in order to provide a more realistic view of what might
be used in real systems. I believe that to be the case here, even for 32-bit
machines, where 64-bit rather than 128-bit atomics are used. (This is good
news for ARMv7 at least — I'm not sure if it helps MIPS out.)

Unfortunately, there is a further complication here in that the doubleword
(ARMv7) and quadword (ARMv8) atomic primitives aren't always there by default
either. In my working copy, I'm currently bundling along lock-free 64-bit and
128-bit CAS implementations to fill this purpose for ARMv7 and ARMv8
respectively. This is a slight annoyance, but saves a dependency on the
external "libatomic" and gives a more efficient implementation than the
lock-based solution used within this.


Re: [lng-odp] 32-bit support in examples

2017-01-23 Thread Savolainen, Petri (Nokia - FI/Espoo)


> -Original Message-
> From: Brian Brooks [mailto:brian.bro...@linaro.org]
> Sent: Friday, January 20, 2017 7:47 PM
> To: Francois Ozog <francois.o...@linaro.org>
> Cc: Bill Fischofer <bill.fischo...@linaro.org>; Joe Savage
> <joe.sav...@arm.com>; Maxim Uvarov <maxim.uva...@linaro.org>; Savolainen,
> Petri (Nokia - FI/Espoo) <petri.savolai...@nokia-bell-labs.com>; lng-
> o...@lists.linaro.org; nd <n...@arm.com>
> Subject: Re: [lng-odp] 32-bit support in examples
> 
> CAS is a universal primitive in the sense that you can construct those
> RMW ops by speculatively computing the updated value and the CAS to
> atomically update the value (in a retry loop).  LL/SC also universal,
> but different behavior.  Both are not the same as an atomic op
> performed deeper in the memory system.
> 
> To Petri's point about ODP not supporting 128b atomics, which compiler
> does not support the __atomic_xxx built-ins or the __int128 128b
> variable?  This has impact on portability and should be explicitly
> known; is it the microblaze compiler?



Any atomics can be emulated in SW (using compiler built-ins or locks directly). 
The point here is the missing HW support:
 * E.g. MIPS, Power, ARMv7 do not have 128 bit CAS
 * 128 bit fetch-and-add is not supported in any of the architectures

We need to ensure on any operations added that those can be implemented 
efficiently on most of the targets.

-Petri


> 
> On Fri, Jan 20, 2017 at 7:36 AM, Francois Ozog <francois.o...@linaro.org>
> wrote:
> > well, yes. But that is the only atomic operation supported. No add, sub,
> > inc, xadd, bit operations
> >
> > On Fri, 20 Jan 2017 at 14:31, Joe Savage <joe.sav...@arm.com> wrote:
> >
> >> > I wonder what processor supports 128 bits atomics. As far as I know
> Intel
> >>
> >> > does not support it. Lock prefix is not allowed on SSE instructions.
> >>
> >>
> >>
> >> Actually, Intel does support them through a locked cmpxchg16b. And
> ARMv8
> >>
> >> through load exclusive pair and store exclusive pair.
> >>
> >>


Re: [lng-odp] 32-bit support in examples

2017-01-20 Thread Brian Brooks
CAS is a universal primitive in the sense that you can construct those
RMW ops by speculatively computing the updated value and the CAS to
atomically update the value (in a retry loop).  LL/SC also universal,
but different behavior.  Both are not the same as an atomic op
performed deeper in the memory system.

To Petri's point about ODP not supporting 128b atomics, which compiler
does not support the __atomic_xxx built-ins or the __int128 128b
variable?  This has impact on portability and should be explicitly
known; is it the microblaze compiler?

On Fri, Jan 20, 2017 at 7:36 AM, Francois Ozog  wrote:
> well, yes. But that is the only atomic operation supported. No add, sub,
> inc, xadd, bit operations
>
> On Fri, 20 Jan 2017 at 14:31, Joe Savage wrote:
>
>> > I wonder what processor supports 128 bits atomics. As far as I know Intel
>>
>> > does not support it. Lock prefix is not allowed on SSE instructions.
>>
>>
>>
>> Actually, Intel does support them through a locked cmpxchg16b. And ARMv8
>>
>> through load exclusive pair and store exclusive pair.
>>
>>


Re: [lng-odp] 32-bit support in examples

2017-01-20 Thread Francois Ozog
well, yes. But that is the only atomic operation supported. No add, sub,
inc, xadd, bit operations

On Fri, 20 Jan 2017 at 14:31, Joe Savage wrote:

> > I wonder what processor supports 128 bits atomics. As far as I know Intel
>
> > does not support it. Lock prefix is not allowed on SSE instructions.
>
>
>
> Actually, Intel does support them through a locked cmpxchg16b. And ARMv8
>
> through load exclusive pair and store exclusive pair.
>
>


Re: [lng-odp] 32-bit support in examples

2017-01-20 Thread Joe Savage
> I wonder what processor supports 128 bits atomics. As far as I know Intel
> does not support it. Lock prefix is not allowed on SSE instructions.

Actually, Intel does support them through a locked cmpxchg16b. And ARMv8
through load exclusive pair and store exclusive pair.


Re: [lng-odp] 32-bit support in examples

2017-01-20 Thread Francois Ozog
I wonder what processor supports 128 bits atomics. As far as I know Intel
does not support it. Lock prefix is not allowed on SSE instructions.

FF

On Fri, 20 Jan 2017 at 13:59, Joe Savage wrote:

> > Unfortunately, ODP atomics API does not support 128 bit atomics - at
> least
>
> > currently. So, your example could not use those anyway. Not all 64-bit CPUs
>
> > have 128 bit atomic instructions.
>
>
>
> Yes, this is unfortunate. For the time being I am using GCC's __atomic
>
> operations such as __atomic_compare_exchange, which can use lock-based
>
> implementations if necessary. I did look at using the ODP atomics API for
> the
>
> 64-bit case, but doing so would require even more structural changes and
>
> exceptions for the 32-bit case to the point of madness. Assuming 128-bit
>
> atomics get introduced in future, it probably makes sense to port the full
>
> example over to using the official API at that point.
>
>


Re: [lng-odp] 32-bit support in examples

2017-01-20 Thread Joe Savage
> Unfortunately, ODP atomics API does not support 128 bit atomics - at least
> currently. So, your example could not use those anyway. Not all 64-bit CPUs
> have 128 bit atomic instructions.

Yes, this is unfortunate. For the time being I am using GCC's __atomic
operations such as __atomic_compare_exchange, which can use lock-based
implementations if necessary. I did look at using the ODP atomics API for the
64-bit case, but doing so would require even more structural changes and
exceptions for the 32-bit case to the point of madness. Assuming 128-bit
atomics get introduced in future, it probably makes sense to port the full
example over to using the official API at that point.


Re: [lng-odp] 32-bit support in examples

2017-01-20 Thread Savolainen, Petri (Nokia - FI/Espoo)


> -Original Message-
> From: Joe Savage [mailto:joe.sav...@arm.com]
> Sent: Friday, January 20, 2017 1:51 PM
> To: Savolainen, Petri (Nokia - FI/Espoo) <petri.savolainen@nokia-bell-
> labs.com>; Maxim Uvarov <maxim.uva...@linaro.org>; lng-
> o...@lists.linaro.org; Bill Fischofer <bill.fischo...@linaro.org>
> Cc: nd <n...@arm.com>
> Subject: Re: [lng-odp] 32-bit support in examples
> 
> > Agree with Maxim. In which way is the application not 32-bit compliant?
> 
> It uses 128-bit atomics, and so is really designed for execution on 64-bit
> machines. It is possible to provide lockless 32-bit support in this case,
> though, and I have an implementation that does so. Since the pointer size
> is
> halved and there is a pointer in the 128-bit struct, I just have to squash
> a
> few of the other fields down (managing them carefully) so that 64-bit
> atomics
> can be used instead.

Unfortunately, ODP atomics API does not support 128 bit atomics - at least 
currently. So, your example could not use those anyway. Not all 64-bit CPUs have 
128 bit atomic instructions.

> 
> On reflection, I think that providing 32-bit support is probably
> worthwhile
> here, so I will do so. It does add a little complexity to the code, but
> it's
> not actually that much, and there are clear benefits from having the
> example
> be better supported on different platforms.
> 
> I do think that having a place for 64-bit only examples in the future
> (e.g.
> an "example_64" directory as Bill outlined) might be useful though. It
> isn't
> always so easy to add 32-bit support.

Good. ODP provides 32 and 64 bit atomics (also on 32 bit CPUs), so you can 
still utilize those. In addition, synchronization / critical sections should 
touch only a small portion of the application code base and preferably in a 
modular way (inside enqueue() / dequeue(), push() / pop(), etc functions).

-Petri




Re: [lng-odp] 32-bit support in examples

2017-01-20 Thread Joe Savage
> Agree with Maxim. In which way is the application not 32-bit compliant?

It uses 128-bit atomics, and so is really designed for execution on 64-bit
machines. It is possible to provide lockless 32-bit support in this case,
though, and I have an implementation that does so. Since the pointer size is
halved and there is a pointer in the 128-bit struct, I just have to squash a
few of the other fields down (managing them carefully) so that 64-bit atomics
can be used instead.
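
A rough sketch of the idea, assuming GCC/Clang: the atomic word and the
non-pointer fields both shrink with the pointer size, so that the whole
structure still fits in one CAS operand. The field names and widths below are
invented for illustration and are not the layout used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if UINTPTR_MAX > 0xffffffffu && defined(__SIZEOF_INT128__)
/* 64-bit target: 8-byte pointer + 8 bytes of state = 128 bits. */
typedef unsigned __int128 raw_t;
typedef uint32_t field_t;
#else
/* 32-bit target: 4-byte pointer + 4 bytes of state = 64 bits. */
typedef uint64_t raw_t;
typedef uint16_t field_t;
#endif

/* Hypothetical fields; only the widths matter for this sketch. */
struct fields {
    void *head;
    field_t count;
    field_t tag;
};

union slot {
    struct fields f; /* view used by ordinary code */
    raw_t raw;       /* view that would be handed to the CAS */
};

/* Fail the build loudly if the fields do not exactly fill the atomic word. */
_Static_assert(sizeof(struct fields) == sizeof(raw_t),
               "fields must exactly fill the atomic word");

/* Pack the fields into the single word that a CAS would operate on.
 * Values wider than field_t are truncated, which is the part that needs
 * careful management on 32-bit targets. */
static inline raw_t slot_pack(void *head, uint32_t count, uint32_t tag)
{
    union slot s;

    s.raw = 0;
    s.f.head = head;
    s.f.count = (field_t)count;
    s.f.tag = (field_t)tag;
    assert(s.f.count == count && s.f.tag == tag); /* catch truncation */
    return s.raw;
}
```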

On reflection, I think that providing 32-bit support is probably worthwhile
here, so I will do so. It does add a little complexity to the code, but it's
not actually that much, and there are clear benefits from having the example
be better supported on different platforms.

I do think that having a place for 64-bit only examples in the future (e.g.
an "example_64" directory as Bill outlined) might be useful though. It isn't
always so easy to add 32-bit support.

Re: [lng-odp] 32-bit support in examples

2017-01-20 Thread Savolainen, Petri (Nokia - FI/Espoo)


> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Maxim
> Uvarov
> Sent: Thursday, January 19, 2017 3:27 PM
> To: lng-odp@lists.linaro.org
> Subject: Re: [lng-odp] 32-bit support in examples
> 
> On 01/19/17 16:23, Joe Savage wrote:
> > Hey,
> >
> > I'm just finalising a patch to add a lock-free IPv4 fragmentation and
> > reassembly example to ODP, and it has occurred to me that as the whole
> > examples directory is compiled with odp-linux by default, I ought to be
> > careful about supporting a wide variety of platforms (ensuring that my
> change
> > doesn't break any builds).
> >
> > I was initially planning to submit the example with 64-bit support only.
> This
> > simplifies the implementation, and should cater for most of the
> important
> > cases, but could cause issues on 32-bit platforms. How valuable does the
> > community view 32-bit support in an example? If it's worthwhile, I can
> add
> > the support at the cost of complexity, but if not, perhaps I can apply
> some
> > form of conditional compilation such that my example only builds on
> > supported platforms.
> >
> > Joe
> >
> 
> 
> It's better to support both. At least for now we support both everywhere
> and CI also triggers build for 32.
> 
> Maxim.

Agree with Maxim. In which way is the application not 32-bit compliant?

-Petri




Re: [lng-odp] 32-bit support in examples

2017-01-19 Thread Bill Fischofer
On Thu, Jan 19, 2017 at 7:23 AM, Joe Savage  wrote:
> Hey,
>
> I'm just finalising a patch to add a lock-free IPv4 fragmentation and
> reassembly example to ODP, and it has occurred to me that as the whole
> examples directory is compiled with odp-linux by default, I ought to be
> careful about supporting a wide variety of platforms (ensuring that my change
> doesn't break any builds).
>
> I was initially planning to submit the example with 64-bit support only. This
> simplifies the implementation, and should cater for most of the important
> cases, but could cause issues on 32-bit platforms. How valuable does the
> community view 32-bit support in an example? If it's worthwhile, I can add
> the support at the cost of complexity, but if not, perhaps I can apply some
> form of conditional compilation such that my example only builds on
> supported platforms.
>
> Joe

All ODP APIs are 32/64-bit agnostic, as is the odp-linux reference
implementation, however there is no requirement that ODP applications
support both. So a 64-bit only example is perfectly fine, as long as
it is documented as such so there's no confusion.

32-bit is still important in the embedded space even as it's become
largely irrelevant in the datacenter, which is why ODP APIs and our
implementations are careful to support both.

If we're going to start accepting modal examples, we should probably
have an example_64 directory or such as a place to put them so it's
clear to all that items in this directory are only for 64-bit systems.


Re: [lng-odp] 32-bit support in examples

2017-01-19 Thread Mike Holmes
On 19 January 2017 at 09:37, Joe Savage  wrote:
>> Is it a true very direct example of the concept, or more of a mini
>> application ?
>>
>> We are trying to lift out of examples those things that are more
>> complex mini apps to a new apps directory.
>> One distinction is the requirement to use external traffic generation
>> for example would likely make it a mini app, another is size, or the
>> need to use many ODP features making it  less clear as an example of a
>> specific ODP concept, and more of a mini application example showing
>> a networking use case.
>
> Well, originally it was supposed to be a direct example, but I think it may
> have ballooned into a mini-application. Though it handles artificially
> generated traffic, and is not intended to be a feature complete out of the
> box solution for fragmentation and reassembly (it asserts on fragment overlap
> or duplication, doesn't invoke cleanup on hitting memory limits, etc.), it's
> more complex than a simple "look, you can reassemble fragments with ODP!"
> solution. For instance, it's lock-free, which introduces some complexity in
> itself.
>
> It's around 1900 lines in total, so make of that what you will.
>
> Joe

Ok, I would not stop it going in examples, better to have it in,
especially if it can run self contained and thus contribute to
regression testing easily.

-- 
Mike Holmes
Program Manager - Linaro Networking Group
Linaro.org │ Open source software for ARM SoCs
"Work should be fun and collaborative, the rest follows"


Re: [lng-odp] 32-bit support in examples

2017-01-19 Thread Joe Savage
> Is it a true very direct example of the concept, or more of a mini
> application ?
> 
> We are trying to lift out of examples those things that are more
> complex mini apps to a new apps directory.
> One distinction is the requirement to use external traffic generation
> for example would likely make it a mini app, another is size, or the
> need to use many ODP features making it  less clear as an example of a
> specific ODP concept, and more of a mini application example showing
> a networking use case.

Well, originally it was supposed to be a direct example, but I think it may
have ballooned into a mini-application. Though it handles artificially
generated traffic, and is not intended to be a feature complete out of the
box solution for fragmentation and reassembly (it asserts on fragment overlap
or duplication, doesn't invoke cleanup on hitting memory limits, etc.), it's
more complex than a simple "look, you can reassemble fragments with ODP!"
solution. For instance, it's lock-free, which introduces some complexity in
itself.

It's around 1900 lines in total, so make of that what you will.

Joe


Re: [lng-odp] 32-bit support in examples

2017-01-19 Thread Mike Holmes
Joe,

Is it a true very direct example of the concept, or more of a mini application ?

We are trying to lift out of examples those things that are more
complex mini apps to a new apps directory.
One distinction is the requirement to use external traffic generation
for example would likely make it a mini app, another is size, or the
need to use many ODP features making it  less clear as an example of a
specific ODP concept, and more of a mini application example showing
a networking use case.

Mike




On 19 January 2017 at 08:27, Maxim Uvarov  wrote:
> On 01/19/17 16:23, Joe Savage wrote:
>> Hey,
>>
>> I'm just finalising a patch to add a lock-free IPv4 fragmentation and
>> reassembly example to ODP, and it has occurred to me that as the whole
>> examples directory is compiled with odp-linux by default, I ought to be
>> careful about supporting a wide variety of platforms (ensuring that my change
>> doesn't break any builds).
>>
>> I was initially planning to submit the example with 64-bit support only. This
>> simplifies the implementation, and should cater for most of the important
>> cases, but could cause issues on 32-bit platforms. How valuable does the
>> community view 32-bit support in an example? If it's worthwhile, I can add
>> the support at the cost of complexity, but if not, perhaps I can apply some
>> form of conditional compilation such that my example only builds on
>> supported platforms.
>>
>> Joe
>>
>
>
> It's better to support both. At least for now we support both everywhere
> and CI also triggers build for 32.
>
> Maxim.



-- 
Mike Holmes
Program Manager - Linaro Networking Group
Linaro.org │ Open source software for ARM SoCs
"Work should be fun and collaborative, the rest follows"


Re: [lng-odp] 32-bit support in examples

2017-01-19 Thread Maxim Uvarov
On 01/19/17 16:23, Joe Savage wrote:
> Hey,
> 
> I'm just finalising a patch to add a lock-free IPv4 fragmentation and
> reassembly example to ODP, and it has occurred to me that as the whole
> examples directory is compiled with odp-linux by default, I ought to be
> careful about supporting a wide variety of platforms (ensuring that my change
> doesn't break any builds).
> 
> I was initially planning to submit the example with 64-bit support only. This
> simplifies the implementation, and should cater for most of the important
> cases, but could cause issues on 32-bit platforms. How valuable does the
> community view 32-bit support in an example? If it's worthwhile, I can add
> the support at the cost of complexity, but if not, perhaps I can apply some
> form of conditional compilation such that my example only builds on
> supported platforms.
> 
> Joe
> 


It's better to support both. At least for now we support both everywhere
and CI also triggers build for 32.

Maxim.


[lng-odp] 32-bit support in examples

2017-01-19 Thread Joe Savage
Hey,

I'm just finalising a patch to add a lock-free IPv4 fragmentation and
reassembly example to ODP, and it has occurred to me that as the whole
examples directory is compiled with odp-linux by default, I ought to be
careful about supporting a wide variety of platforms (ensuring that my change
doesn't break any builds).

I was initially planning to submit the example with 64-bit support only. This
simplifies the implementation, and should cater for most of the important
cases, but could cause issues on 32-bit platforms. How valuable does the
community view 32-bit support in an example? If it's worthwhile, I can add
the support at the cost of complexity, but if not, perhaps I can apply some
form of conditional compilation such that my example only builds on
supported platforms.
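
For reference, the conditional compilation option could look roughly like the
sketch below. The macro and function names are invented here and are not ODP
definitions; the check assumes 8-bit bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Decide at compile time whether the 64-bit-only code path is built. */
#if UINTPTR_MAX > 0xffffffffu
#define EXAMPLE_PTR_BITS 64
#else
#define EXAMPLE_PTR_BITS 32
#endif

/* Returns 1 when the 64-bit-only code path is compiled in, 0 otherwise.
 * On an unsupported target the build still succeeds; the example would just
 * report that it is not supported and exit. */
static inline int example_is_supported(void)
{
    /* Sanity check that the macro matches the real pointer width. */
    assert(sizeof(void *) * 8 == EXAMPLE_PTR_BITS);
#if EXAMPLE_PTR_BITS == 64
    return 1;
#else
    return 0;
#endif
}
```

Keeping a stub on unsupported targets, rather than dropping the file from the
build, means the examples directory would still compile everywhere.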

Joe