Atomic increment performance degrades as the number of cores increases (see
https://www.kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.2015.01.31a.pdf,
chapter 5, for some measurements on a conventional Intel machine).
This overhead may become bigger than the one associated with the atomic
queue.
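
To make the operation under discussion concrete, here is a minimal sketch of
sequence number allocation with an atomic fetch_and_add, using C11
<stdatomic.h>. The names sa_seq_t, sa_seq_init and sa_seq_alloc are made up
for illustration and are not taken from the ODP IPsec example. Every core
that calls sa_seq_alloc() has to obtain exclusive access to the cache line
holding the counter, which is where the scaling cost comes from.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-SA state; only the shared counter matters here. */
typedef struct {
    _Atomic uint32_t next_seq;
} sa_seq_t;

/* IPsec sequence numbers start at 1. */
static inline void sa_seq_init(sa_seq_t *sa)
{
    atomic_init(&sa->next_seq, 1);
}

/*
 * Returns a unique sequence number. Relaxed ordering is enough to make
 * the numbers unique; it gives no guarantee about the order in which
 * concurrent callers receive them.
 */
static inline uint32_t sa_seq_alloc(sa_seq_t *sa)
{
    return atomic_fetch_add_explicit(&sa->next_seq, 1,
                                     memory_order_relaxed);
}
```

Note that this only guarantees uniqueness of the numbers, not that they are
handed out in packet ingress order.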

Alex

On 12 May 2015 at 14:20, Ola Liljedahl <ola.liljed...@linaro.org> wrote:

> I think it should be OK to use ordered queues instead of atomic queues,
> e.g. for sequence number allocation (which will need an atomic variable to
> hold the seqno counter). Packet ingress order will be maintained but might
> not always correspond to sequence number order. This is not a problem, as
> the sliding window in the replay protection will take care of that. The
> sliding window could be hundreds of entries (sequence numbers) large (each
> entry takes only one bit). Packet ingress order is the important
> characteristic that must be maintained. The IPsec sequence number is not
> used for packet ordering and order restoration; it is only used for replay
> protection.
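
For reference, a sliding-window replay check of the kind described above can
be sketched as below. This is a simplified illustration, not code from the
ODP example: it assumes a fixed 64-entry window (one bit per sequence
number), 32-bit sequence numbers without ESN, and no handling of counter
wraparound. The names replay_window_t and replay_check_and_update are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define REPLAY_WINDOW_BITS 64u

typedef struct {
    uint32_t top;    /* highest sequence number accepted so far */
    uint64_t bitmap; /* bit i set => sequence number (top - i) was seen */
} replay_window_t;

/* Returns true if seq is acceptable and records it, false otherwise. */
static bool replay_check_and_update(replay_window_t *w, uint32_t seq)
{
    if (seq == 0)
        return false; /* sequence numbers start at 1 */

    if (seq > w->top) {
        /* New highest number: slide the window forward. */
        uint32_t shift = seq - w->top;

        w->bitmap = (shift >= REPLAY_WINDOW_BITS) ? 0 : (w->bitmap << shift);
        w->bitmap |= 1u; /* bit 0 represents the new top */
        w->top = seq;
        return true;
    }

    uint32_t offset = w->top - seq;

    if (offset >= REPLAY_WINDOW_BITS)
        return false; /* older than the window: reject */

    uint64_t mask = (uint64_t)1 << offset;

    if (w->bitmap & mask)
        return false; /* already seen: replay */

    w->bitmap |= mask; /* late but new: accept */
    return true;
}
```

A packet that arrives a few positions "late" relative to its sequence number
still falls inside the window and is accepted, which is why moderate
misordering between ingress order and sequence number order is tolerated.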
>
> Someone with a platform that supports both ordered and atomic scheduling
> could benchmark both designs and see how performance scales when using
> ordered queues (and the atomic fetch_and_add) for some relevant traffic
> patterns.
>
> On 8 May 2015 at 13:53, Bill Fischofer <bill.fischo...@linaro.org> wrote:
>
>> Jerin,
>>
>> Can you propose such a set of APIs for further discussion?  This would be
>> good to discuss at the Tuesday call.
>>
>> Thanks.
>>
>> Bill
>>
>> On Fri, May 8, 2015 at 12:07 AM, Jacob, Jerin <
>> jerin.ja...@caviumnetworks.com> wrote:
>>
>>>
>>> I agree with Ola here on preserving the ingress order.
>>> However, I have experienced the same performance issue that Nikhil pointed
>>> out (atomic queues have too much overhead for a short critical section).
>>>
>>> I am not sure about other HW, but Cavium supports introducing a critical
>>> section while maintaining the ingress order as a HW scheduler feature.
>>>
>>> IMO, if such support is available in other HW, then an
>>> odp_schedule_ordered_lock()/unlock()
>>> kind of API will solve the performance issue for short critical
>>> sections in an ordered flow.
>>>
>>> /Jerin.
>>>
>>>
>>> From: lng-odp <lng-odp-boun...@lists.linaro.org> on behalf of Ola
>>> Liljedahl <ola.liljed...@linaro.org>
>>> Sent: Thursday, May 7, 2015 9:06 PM
>>> To: nikhil.agar...@freescale.com
>>> Cc: lng-odp@lists.linaro.org
>>> Subject: Re: [lng-odp] Query regarding sequence number update in IPSEC
>>> application
>>>
>>>
>>> Using atomic queues will preserve the ingress order when allocating and
>>> assigning the sequence number. Also, you don't need to use an expensive
>>> atomic operation for updating the sequence number, as the atomic queue and
>>> scheduling will provide mutual exclusion.
>>>
>>>
>>> If the packets that require a sequence number came from parallel or
>>> ordered queues, there would be no guarantee that the sequence numbers would
>>> be allocated in packet (ingress) order. Just using an atomic operation
>>> (e.g. fetch_and_add or similar) only guarantees proper update of the
>>> sequence number variable, not any specific ordering.
>>>
>>>
>>> If you are ready to trade absolute "correctness" for performance, you
>>> could use ordered or maybe even parallel (questionable for other reasons)
>>> queues and then allocate the sequence number using an atomic fetch_and_add.
>>> Sometimes the packet egress order will then not match the sequence number
>>> order (for a flow/SA). For IPsec, this might affect the replay window check
>>> & update at the receiving end, but as the replay protection uses a sliding
>>> window of sequence numbers (to handle misordered packets), there might not
>>> be any adverse effects in practice. The most important aspect is probably
>>> to preserve the original packet order.
>>>
>>>
>>> -- Ola
>>>
>>>
>>> On 6 May 2015 at 11:29, nikhil.agar...@freescale.com <
>>> nikhil.agar...@freescale.com> wrote:
>>>
>>>
>>> Hi,
>>>
>>> In the IPSEC example application, queues are used to update the sequence
>>> number. I was wondering why we have used queues to update the sequence
>>> number, which will add scheduling delays and adversely affect
>>> throughput. Is there any specific advantage of using queues over atomic
>>> variables?
>>>
>>> Thanks in advance
>>>  Nikhil
>>>
>>>
>>> _______________________________________________
>>> lng-odp mailing list
>>> lng-odp@lists.linaro.org
>>> https://lists.linaro.org/mailman/listinfo/lng-odp
>>>
>>>
>>>
>>
>>
>