On 12 May 2015 at 18:06, Ola Liljedahl <ola.liljed...@linaro.org> wrote:
>
> On 12 May 2015 at 14:28, Bala Manoharan <bala.manoha...@linaro.org> wrote:
>>
>> IMO, using an atomic variable instead of atomic queues will work for this
>> IPsec example use-case, since here the critical section is only the
>> sequence number update. In a generic use-case, however, the atomicity has
>> to cover "a region of code" which the application wants executed in
>> ingress order.
>
> Isn't this the atomic queues/scheduling of ODP (and several SoC's)?

Yes. But switching to an atomic queue is currently only possible through the
scheduler, and that involves scheduler overhead.
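
Roughly, the pattern looks like this today (just a sketch, not the actual
example code; queue creation, SA lookup and error handling are omitted, and
the queue handles and the SA struct are stand-ins):

#include <stdint.h>
#include <odp.h>

/* Stand-ins, assumed to be created during init (not the real example code) */
extern odp_queue_t seqno_q;             /* ODP_SCHED_SYNC_ATOMIC queue */
extern odp_queue_t crypto_q;            /* next pipeline stage */
static struct { uint32_t seq; } sa;     /* placeholder for the per-SA context */

static void worker_loop(void)
{
        odp_queue_t from;
        odp_event_t ev;

        for (;;) {
                ev = odp_schedule(&from, ODP_SCHED_WAIT);

                if (from != seqno_q) {
                        /* ordered ingress stage: classification, SA lookup... */
                        odp_queue_enq(seqno_q, ev);  /* extra scheduler round trip */
                } else {
                        /* atomic stage: the scheduler provides mutual exclusion,
                         * so a plain (non-atomic) increment is enough */
                        sa.seq++;
                        odp_queue_enq(crypto_q, ev);
                }
        }
}

So every packet takes an extra scheduler dispatch just to get a serialized
seqno update.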

>
>
>>
>> If the HW is capable, we should add additional APIs for scheduler-based
>> locking, which the application can use when the critical section is so
>> small that going through the scheduler would hurt performance.
>
> Isn't there a "release atomic scheduling" function? A CPU processing stage
> can start out atomic (HW-provided mutual exclusion) and convert to
> "ordered" scheduling once the mutual exclusion is no longer necessary. Or
> is this missing from the ODP API today?

"release atomic scheduling" only talks from "Atomic --> ordered"
scenario, I believe the scenario here dictates
"Ordered --> Atomic --> Ordered" sequence. Currently this sequence
goes through the scheduler and the intent is to avoid scheduler
overhead.
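
To make that concrete, here is a purely hypothetical sketch, reusing the
stand-ins from the sketch above. odp_schedule_ordered_lock()/unlock() do not
exist in the ODP API today; the names just follow Jerin's suggestion further
down the thread:

/* Hypothetical prototypes - NOT part of <odp.h> today */
void odp_schedule_ordered_lock(void);
void odp_schedule_ordered_unlock(void);

static void worker_iteration(void)
{
        odp_queue_t from;
        odp_event_t ev = odp_schedule(&from, ODP_SCHED_WAIT); /* ordered queue */

        /* classification, SA lookup, ... still runs in parallel */

        odp_schedule_ordered_lock();    /* hypothetical: serialize in ingress order */
        sa.seq++;                       /* short critical section, no atomics */
        odp_schedule_ordered_unlock();  /* hypothetical: resume ordered processing */

        odp_queue_enq(crypto_q, ev);    /* no extra atomic-queue hop */
}

The critical section stays in the same scheduling context, so the extra
scheduler round trip disappears.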

Regards,
Bala

>
>
>>
>> Regards,
>> Bala
>>
>> On 12 May 2015 at 17:31, Ola Liljedahl <ola.liljed...@linaro.org> wrote:
>>>
>>> Yes, the seqno counter is a shared resource and the atomic_fetch_and_add
>>> will eventually become a bottleneck for per-SA throughput. But one could
>>> hope that it scales better than using atomic queues, although this depends
>>> on the actual microarchitecture.
>>>
>>> Freescale PPC has "decorated storage"; can't it be used for
>>> atomic_fetch_and_add()?
>>> ARMv8.1 has support for "far atomics", which are supposed to scale better
>>> (again, this depends on the actual implementation).
>>>
>>> On 12 May 2015 at 13:47, Alexandru Badicioiu 
>>> <alexandru.badici...@linaro.org> wrote:
>>>>
>>>> Atomic increment performance degrades as the number of cores increases - see
>>>> https://www.kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.2015.01.31a.pdf
>>>> (chapter 5) for some measurements on a conventional Intel machine.
>>>> This overhead may become bigger than the one associated with the atomic
>>>> queue.
>>>>
>>>> Alex
>>>>
>>>> On 12 May 2015 at 14:20, Ola Liljedahl <ola.liljed...@linaro.org> wrote:
>>>>>
>>>>> I think it should be OK to use ordered queues instead of atomic queues,
>>>>> e.g. for sequence number allocation (which will need an atomic variable
>>>>> to hold the seqno counter). Packet ingress order will be maintained but
>>>>> might not always correspond to sequence number order. This is not a
>>>>> problem, as the sliding window in the replay protection will take care of
>>>>> it; the sliding window can be hundreds of entries (sequence numbers)
>>>>> large, since each entry only takes one bit. Packet ingress order is the
>>>>> important characteristic that must be maintained. The IPsec sequence
>>>>> number is not used for packet ordering or order restoration; it is only
>>>>> used for replay protection.
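>>>>>
>>>>> To make it concrete, the transmit-side allocation could be as simple as
>>>>> this (just a sketch; the per-SA state is a stand-in and the atomic is
>>>>> assumed to be initialized with odp_atomic_init_u32() at SA setup):
>>>>>
>>>>> #include <stdint.h>
>>>>> #include <odp.h>
>>>>>
>>>>> struct sa_ctx {
>>>>>         odp_atomic_u32_t seq;   /* init: odp_atomic_init_u32(&seq, 0) */
>>>>> };
>>>>>
>>>>> /* Called from an ordered scheduling context: packet order is restored at
>>>>>  * the next enqueue, but the seq values may not match ingress order. */
>>>>> static uint32_t alloc_seqno(struct sa_ctx *sa)
>>>>> {
>>>>>         /* ESP sequence numbers conventionally start at 1 */
>>>>>         return odp_atomic_fetch_inc_u32(&sa->seq) + 1;
>>>>> }
>>>>>
>>>>> On the receive side the anti-replay check just tests and sets one bit per
>>>>> sequence number in the per-SA window bitmap (256 bits is only 32 bytes),
>>>>> so a seqno that arrives a few positions "out of order" is still accepted.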
>>>>>
>>>>> Someone with a platform which supports both ordered and atomic scheduling 
>>>>> could benchmark both designs and see how performance scales when using 
>>>>> ordered queues (and that atomic fetch_and_add) for some relevant traffic 
>>>>> patterns.
>>>>>
>>>>> On 8 May 2015 at 13:53, Bill Fischofer <bill.fischo...@linaro.org> wrote:
>>>>>>
>>>>>> Jerrin,
>>>>>>
>>>>>> Can you propose such a set of APIs for further discussion?  This would 
>>>>>> be good to discuss at the Tuesday call.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> Bill
>>>>>>
>>>>>> On Fri, May 8, 2015 at 12:07 AM, Jacob, Jerin 
>>>>>> <jerin.ja...@caviumnetworks.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>> I agree with Ola here on preserving the ingress order.
>>>>>>> However, I have experienced the same performance issue as Nikhil pointed
>>>>>>> out (atomic queues have too much overhead for short critical sections).
>>>>>>>
>>>>>>> I am not sure about other HW, but Cavium has support for introducing a
>>>>>>> critical section while maintaining the ingress order, as a HW scheduler
>>>>>>> feature.
>>>>>>>
>>>>>>> IMO, if such support is available in other HW, then an
>>>>>>> odp_schedule_ordered_lock()/unlock() kind of API would solve the
>>>>>>> performance issue for short critical sections in ordered flows.
>>>>>>>
>>>>>>> /Jerin.
>>>>>>>
>>>>>>>
>>>>>>> From: lng-odp <lng-odp-boun...@lists.linaro.org> on behalf of Ola 
>>>>>>> Liljedahl <ola.liljed...@linaro.org>
>>>>>>> Sent: Thursday, May 7, 2015 9:06 PM
>>>>>>> To: nikhil.agar...@freescale.com
>>>>>>> Cc: lng-odp@lists.linaro.org
>>>>>>> Subject: Re: [lng-odp] Query regarding sequence number update in IPSEC 
>>>>>>> application
>>>>>>>
>>>>>>>
>>>>>>> Using atomic queues will preserve the ingress order when allocating and
>>>>>>> assigning the sequence number. Also, you don't need an expensive atomic
>>>>>>> operation for updating the sequence number, as the atomic queue and
>>>>>>> scheduling will provide mutual exclusion.
>>>>>>>
>>>>>>>
>>>>>>> If the packets that require a sequence number came from parallel or
>>>>>>> ordered queues, there would be no guarantee that the sequence numbers
>>>>>>> would be allocated in packet (ingress) order. Just using an atomic
>>>>>>> operation (e.g. fetch_and_add or similar) only guarantees proper update
>>>>>>> of the sequence number variable, not any specific ordering.
>>>>>>>
>>>>>>>
>>>>>>> If you are ready to trade absolute "correctness" for performance, you
>>>>>>> could use ordered or maybe even parallel (questionable for other
>>>>>>> reasons) queues and then allocate the sequence number using an atomic
>>>>>>> fetch_and_add. Sometimes the packet egress order will then not match the
>>>>>>> sequence number order (for a flow/SA). For IPsec, this might affect the
>>>>>>> replay window check & update at the receiving end, but as the replay
>>>>>>> protection uses a sliding window of sequence numbers (to handle
>>>>>>> misordered packets), there might not be any adverse effects in practice.
>>>>>>> The most important aspect is probably to preserve the original packet
>>>>>>> order.
>>>>>>>
>>>>>>>
>>>>>>> -- Ola
>>>>>>>
>>>>>>>
>>>>>>> On 6 May 2015 at 11:29,  nikhil.agar...@freescale.com 
>>>>>>> <nikhil.agar...@freescale.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> In the IPsec example application, queues are used to update the sequence
>>>>>>> number. I was wondering why we have used queues to update the sequence
>>>>>>> number, since this adds scheduling delays and adversely hits throughput.
>>>>>>> Is there any specific advantage of using queues over atomic variables?
>>>>>>>
>>>>>>> Thanks in advance
>>>>>>>  Nikhil
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>
>
_______________________________________________
lng-odp mailing list
lng-odp@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/lng-odp
