On 25/04/2017, 12:26, "Savolainen, Petri (Nokia - FI/Espoo)" <petri.savolai...@nokia-bell-labs.com> wrote:



-----Original Message-----
From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Brian Brooks
Sent: Monday, April 24, 2017 11:59 PM
To: lng-odp@lists.linaro.org
Cc: Ola Liljedahl <ola.liljed...@arm.com>
Subject: [lng-odp] [PATCH] test: odp_sched_latency: robust draining of queues

From: Ola Liljedahl <ola.liljed...@arm.com>

In order to robustly drain all queues when the benchmark has
ended, we enqueue a special event on every queue and invoke
the scheduler until all such events have been received.

    odp_schedule_pause();

    while (1) {
        ev = odp_schedule(&src_queue, ODP_SCHED_NO_WAIT);

        if (ev == ODP_EVENT_INVALID)
            break;

        if (odp_queue_enq(src_queue, ev)) {
            LOG_ERR("[%i] Queue enqueue failed.\n", thr);
            odp_event_free(ev);
            return -1;
        }
    }

    odp_schedule_resume();

    odp_barrier_wait(&globals->barrier);

    clear_sched_queues();


What is the issue that this patch fixes?
The issue is that odp_schedule() (even with a timeout) can return 
ODP_EVENT_INVALID while the queues are not actually empty. In a loosely 
synchronised queue and scheduler implementation (e.g. one using weak ordering), 
odp_schedule() can spuriously return ODP_EVENT_INVALID. This happens 
infrequently on some A57 targets.

This sequence should be quite robust already, since no new enqueues happen after 
the barrier. In simple test code like this, the latency from the last enq() 
(through the barrier) to the schedule loop (in clear_sched_queues()) could be 
overcome simply by exiting not after the first EVENT_INVALID from the scheduler, 
but after N EVENT_INVALIDs in a row.
In the scalable scheduler & queue implementation, it can take some time before 
enqueued events become visible and the corresponding ODP queues are pushed to 
some scheduler queue. So odp_schedule() can return ODP_EVENT_INVALID, even when 
called with a timeout. There is no timeout and no number of ODP_EVENT_INVALID 
returns that *guarantees* that the queues have been drained.


Also, in your patch, a thread should exit only after the scheduler returns 
EVENT_INVALID.
Since the cool_down event is the last event on every queue (they are enqueued 
after all threads have passed the barrier), once we have received all cool_down 
events we know that there are no other events on these queues. There is no need 
to call odp_schedule() until it returns ODP_EVENT_INVALID (which can happen 
spuriously anyway, so it doesn’t signify anything).
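
To make the argument concrete, here is a small self-contained sketch of the two draining strategies discussed above. It does not use the ODP API: sim_queue_t, sim_enqueue(), sim_schedule() and VISIBILITY_DELAY are hypothetical stand-ins that model a scheduler in which a queue looks empty for a few polls after an enqueue. The real implementation behaves differently in detail, so treat this only as an illustration of why counting cool_down events is robust while counting invalid returns is not.

```c
#include <assert.h>
#include <string.h>

#define NUM_QUEUES       4
#define QUEUE_CAP        8
#define VISIBILITY_DELAY 3 /* assumed: polls before an enqueue becomes visible */

typedef enum { EV_INVALID = 0, EV_SAMPLE, EV_COOL_DOWN } event_t;

typedef struct {
    event_t ev[QUEUE_CAP];
    int     head, tail;
    int     delay; /* polls left during which the queue still looks empty */
} sim_queue_t;

static sim_queue_t queues[NUM_QUEUES];

/* Hypothetical enqueue: the event is stored but not yet visible. */
void sim_enqueue(sim_queue_t *q, event_t ev)
{
    assert(q->tail < QUEUE_CAP);
    q->ev[q->tail++] = ev;
    q->delay = VISIBILITY_DELAY;
}

/* Stand-in for a non-blocking schedule call. It returns EV_INVALID
 * spuriously while pending events have not become visible yet. */
event_t sim_schedule(void)
{
    for (int i = 0; i < NUM_QUEUES; i++) {
        sim_queue_t *q = &queues[i];

        if (q->head == q->tail)
            continue;           /* really empty */
        if (q->delay > 0) {
            q->delay--;         /* non-empty, but not visible yet */
            continue;
        }
        return q->ev[q->head++];
    }
    return EV_INVALID;
}

/* Refill every queue with the given number of sample events. */
void sim_reset(int samples_per_queue)
{
    memset(queues, 0, sizeof(queues));
    for (int i = 0; i < NUM_QUEUES; i++)
        for (int j = 0; j < samples_per_queue; j++)
            sim_enqueue(&queues[i], EV_SAMPLE);
}

/* Number of events still sitting in the queues. */
int events_left(void)
{
    int n = 0;

    for (int i = 0; i < NUM_QUEUES; i++)
        n += queues[i].tail - queues[i].head;
    return n;
}

/* Heuristic: stop after n invalid returns in a row. */
int drain_until_n_invalid(int n)
{
    int drained = 0, invalid_run = 0;

    while (invalid_run < n) {
        if (sim_schedule() == EV_INVALID) {
            invalid_run++;
        } else {
            drained++;
            invalid_run = 0;
        }
    }
    return drained;
}

/* Sentinel approach: enqueue one cool_down event per queue and keep
 * scheduling until all of them have come back. Invalid returns are
 * simply ignored. Returns the number of sample events drained. */
int drain_with_sentinels(void)
{
    int drained = 0, cool_downs = 0;

    for (int i = 0; i < NUM_QUEUES; i++)
        sim_enqueue(&queues[i], EV_COOL_DOWN);

    while (cool_downs < NUM_QUEUES) {
        event_t ev = sim_schedule();

        if (ev == EV_INVALID)
            continue;           /* may be spurious, carries no information */
        if (ev == EV_COOL_DOWN)
            cool_downs++;
        else
            drained++;
    }
    return drained;
}
```

In this model, with VISIBILITY_DELAY set to 3, drain_until_n_invalid(3) gives up while every event is still queued, whereas drain_with_sentinels() returns only once all queues are empty. Raising N helps only if the delay is known and bounded, which is exactly what a real implementation does not promise.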



-Petri

