Re: Event order

2007-07-26 Thread Robert Collins
On Fri, 2007-07-27 at 09:08 +1000, Robert Collins wrote:
> On Thu, 2007-07-26 at 09:31 -0600, Alex Rousskov wrote:
> > Folks,
> > 
> > There are at least two known difficult-to-reproduce bugs that may be
> > explained by asynchronous calls being called out of order. I do not know
> > whether out-of-order execution is indeed their cause, but they prompted
> > me to investigate event scheduling further.
> > 
> > Currently, asynchronous calls are implemented using addEvent with
> > 'when' parameter set to zero. This means that the event time is set to
> > current_dtime in EventScheduler::schedule. However, current_dtime may
> > _decrease_ when the system clock is adjusted. If such a decrease happens
> > between the two asynchronous call submissions, the later call will be
> > fired first.
> > 
> > I see two ways of fixing this:
> > 
> > 1) Stop using addEvent for asynchronous calls. Add a special queue for
> > them and drain the queue every select loop. Pros: straightforward design
> > that is probably a little faster than addEvent because we will always
> > append the new call instead of searching for the right place in the
> > queue. This design will help treating asynchronous calls specially in
> > the future (e.g., debugging and exception trapping). Cons: lots of work
> > and current code changes.

Oh, I should note - its dead easy to add a new dispatcher that just
implements a FIFO of queued calls per event loop, using the current
event infrastructure. It may be overengineered, but it makes such
additions trivial.

-Rob
-- 
GPG key available at: .


signature.asc
Description: This is a digitally signed message part


Re: Event order

2007-07-26 Thread Robert Collins
On Thu, 2007-07-26 at 09:31 -0600, Alex Rousskov wrote:
> Folks,
> 
>   There are at least two known difficult-to-reproduce bugs that may be
> explained by asynchronous calls being called out of order. I do not know
> whether out-of-order execution is indeed their cause, but they prompted
> me to investigate event scheduling further.
> 
>   Currently, asynchronous calls are implemented using addEvent with
> 'when' parameter set to zero. This means that the event time is set to
> current_dtime in EventScheduler::schedule. However, current_dtime may
> _decrease_ when the system clock is adjusted. If such a decrease happens
> between the two asynchronous call submissions, the later call will be
> fired first.
> 
>   I see two ways of fixing this:
> 
> 1) Stop using addEvent for asynchronous calls. Add a special queue for
> them and drain the queue every select loop. Pros: straightforward design
> that is probably a little faster than addEvent because we will always
> append the new call instead of searching for the right place in the
> queue. This design will help treating asynchronous calls specially in
> the future (e.g., debugging and exception trapping). Cons: lots of work
> and current code changes.
> 
> 2) Treat when=0 events specially in addEvent. Always place them in the
> beginning of the queue, but after other special events. To mark an event
> as special, we can set its absolute timestamp to zero, for example.
> Pros: much easier to implement. Cons: it is a hack.
> 
> I am leaning towards (2) for now because it minimizes the modifications
> and risk. The attached patch implements that option.
> 
> Any comments or better ideas?

I think 1) would be good, but your hack for 2) looks eminently sane.

-Rob
-- 
GPG key available at: .


signature.asc
Description: This is a digitally signed message part


Re: Event order

2007-07-26 Thread Alex Rousskov
On Thu, 2007-07-26 at 10:30 -0600, Duane Wessels wrote:
> (1) still sounds like a good idea, although we might find some
> problems with it after implementation. 

I will probably implement (1) as a part of main loop and async call
overhaul, after v3.0 is released.

Thank you,

Alex.




Re: Event order

2007-07-26 Thread Duane Wessels




On Thu, 26 Jul 2007, Alex Rousskov wrote:


I am leaning towards (2) for now because it minimizes the modifications
and risk. The attached patch implements that option.


Your patch is simple enough that I do not find it offensive :-) My
only suggestion is to add more comments, in particular about
protecting "backwards" clocks, and maybe a note that it is okay for
timestamp to be zero.

(1) still sounds like a good idea, although we might find some
problems with it after implementation.

DW


Event order

2007-07-26 Thread Alex Rousskov
Folks,

There are at least two known difficult-to-reproduce bugs that may be
explained by asynchronous calls being called out of order. I do not know
whether out-of-order execution is indeed their cause, but they prompted
me to investigate event scheduling further.

Currently, asynchronous calls are implemented using addEvent with
'when' parameter set to zero. This means that the event time is set to
current_dtime in EventScheduler::schedule. However, current_dtime may
_decrease_ when the system clock is adjusted. If such a decrease happens
between the two asynchronous call submissions, the later call will be
fired first.

I see two ways of fixing this:

1) Stop using addEvent for asynchronous calls. Add a special queue for
them and drain the queue every select loop. Pros: straightforward design
that is probably a little faster than addEvent because we will always
append the new call instead of searching for the right place in the
queue. This design will help treating asynchronous calls specially in
the future (e.g., debugging and exception trapping). Cons: lots of work
and current code changes.

2) Treat when=0 events specially in addEvent. Always place them in the
beginning of the queue, but after other special events. To mark an event
as special, we can set its absolute timestamp to zero, for example.
Pros: much easier to implement. Cons: it is a hack.

I am leaning towards (2) for now because it minimizes the modifications
and risk. The attached patch implements that option.

Any comments or better ideas?

Thank you,

Alex.

Index: src/event.cc
===
RCS file: /cvsroot/squid/squid3/src/event.cc,v
retrieving revision 1.5.12.2
diff -u -r1.5.12.2 event.cc
--- src/event.cc	16 Apr 2007 19:47:42 -	1.5.12.2
+++ src/event.cc	26 Jul 2007 15:28:25 -
@@ -305,8 +305,9 @@
 void
 EventScheduler::schedule(const char *name, EVH * func, void *arg, double when, int weight, bool cbdata)
 {
-
-struct ev_entry *event = new ev_entry(name, func, arg, current_dtime + when, weight, cbdata);
+// Treat when=0 events specially: many are async calls that preserve order
+const double timestamp = when > 0.0 ? current_dtime + when : 0;
+struct ev_entry *event = new ev_entry(name, func, arg, timestamp, weight, cbdata);
 
 struct ev_entry **E;
 debugs(41, 7, HERE << "schedule: Adding '" << name << "', in " << when << " seconds");