Re: Build failed in Jenkins: river-qa-refactor-win #45

Patricia Shanahan Sat, 04 Jan 2014 00:59:54 -0800

Just before Christmas, you were discussing whether to fix concurrency problems based on theoretical analysis, or to only fix those problems for which there is experimental evidence.

I believe the PMC will be at cross-purposes until you resolve that issue, and strongly advise discussing and voting on it.

This is an example of a question whose answer would be obvious and non-controversial if you had agreement, either way, on that general issue. "When do you claim that this happens? And what currently happens now that is unacceptable? What is the concrete, observable problem that you’re trying to solve, that justifies introducing failures that require further work?" is a valid, and important, set of questions if you are only going to fix concurrency bugs for which there is experimental evidence. It is irrelevant if you are going to fix concurrency bugs based on theoretical analysis.


Patricia

On 1/3/2014 10:14 PM, Greg Trasuk wrote:


On Jan 4, 2014, at 12:52 AM, Peter Firmstone <[email protected]> wrote:

On 4/01/2014 3:18 PM, Greg Trasuk wrote:

I’ll also point out Patricia’s recent statement that TaskManager should be 
reasonably efficient for small task queues, but less efficient for larger task 
queues.  We don’t have solid evidence that the task queues ever get large.  
Hence, the assertion that “TaskManager doesn’t scale” is meaningless.


No, it's not about scalability, it's about the window of time when a task is removed from 
the queue in TaskManager for execution but fails and needs to be retried later.  
Task.runAfter doesn't contain the task that "should have executed" so dependent 
tasks proceed before their dependencies.
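
The window described above can be sketched roughly as follows. This is a minimal, single-threaded illustration and not the actual TaskManager code: the class RunAfterWindowSketch, the simplified Task interface, NamedTask, and mayRun are all hypothetical names invented for this sketch, and the real runAfter signature may differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a queue whose tasks declare ordering via runAfter. */
class RunAfterWindowSketch {

    /** Simplified (assumed) contract: return true if this task must wait for a task in 'pending'. */
    interface Task {
        boolean runAfter(List<Task> pending);
    }

    /** A task identified by name that depends on at most one other task. */
    static final class NamedTask implements Task {
        final String name;
        final String dependsOn; // null when there is no dependency

        NamedTask(String name, String dependsOn) {
            this.name = name;
            this.dependsOn = dependsOn;
        }

        @Override
        public boolean runAfter(List<Task> pending) {
            for (Task t : pending) {
                if (t instanceof NamedTask && ((NamedTask) t).name.equals(dependsOn)) {
                    return true; // dependency is still queued, so wait
                }
            }
            return false; // dependency is not visible in the queue
        }
    }

    /** Returns true if 'dependent' would be allowed to run given the visible queue. */
    static boolean mayRun(Task dependent, List<Task> pending) {
        return !dependent.runAfter(pending);
    }

    public static void main(String[] args) {
        NamedTask register = new NamedTask("register", null);
        NamedTask lookup = new NamedTask("lookup", "register");

        List<Task> pending = new ArrayList<>();
        pending.add(register);
        pending.add(lookup);
        System.out.println("register queued, lookup may run: " + mayRun(lookup, pending));

        // The queue hands 'register' to a worker, removing it from the list.
        pending.remove(register);
        // Suppose 'register' now fails and will be retried later. Until it is
        // re-queued, nothing in 'pending' tells 'lookup' to wait.
        System.out.println("register in flight, lookup may run: " + mayRun(lookup, pending));
    }
}
```

In this sketch the dependent task is held back only while its dependency is visible in the pending list; once the dependency has been removed for execution, the check no longer sees it.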

This code comment from ServiceDiscoveryManager might help:

       /** This task class, when executed, first registers to receive
         *  ServiceEvents from the given ServiceRegistrar. If the registration
         *  process succeeds (no RemoteExceptions), it then executes the
         *  LookupTask to query the given ServiceRegistrar for a "snapshot"
         *  of its current state with respect to services that match the
         *  given template.
         *
         *  Note that the order of execution of the two tasks is important.
         *  That is, the LookupTask must be executed only after registration
         *  for events has completed. This is because when an entity registers
         *  with the event mechanism of a ServiceRegistrar, the entity will
         *  only receive notification of events that occur "in the future",
         *  after the registration is made. The entity will not receive events
         *  about changes to the state of the ServiceRegistrar that may have
         *  occurred before or during the registration process.
         *
         *  Thus, if the order of these tasks were reversed and the LookupTask
         *  were to be executed prior to the RegisterListenerTask, then the
         *  possibility exists for the occurrence of a change in the
         *  ServiceRegistrar's state between the time the LookupTask retrieves
         *  a snapshot of that state, and the time the event registration
         *  process has completed, resulting in an incorrect view of the
         *  current state of the ServiceRegistrar.
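
The ordering argument in the comment above can be illustrated with a small deterministic sketch. It does not use the real ServiceRegistrar or ServiceDiscoveryManager APIs: Registry, buildView, and the concurrentChange hook are hypothetical names, and the racing update is modelled as a callback that runs between the two steps.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

class RegisterThenSnapshotSketch {

    /** Hypothetical registry: listeners are told only about additions made after they register. */
    static final class Registry {
        private final Set<String> services = new HashSet<>();
        private final List<Consumer<String>> listeners = new ArrayList<>();

        void addService(String name) {
            services.add(name);
            for (Consumer<String> listener : listeners) {
                listener.accept(name);
            }
        }

        void addListener(Consumer<String> listener) {
            listeners.add(listener);
        }

        Set<String> snapshot() {
            return new HashSet<>(services);
        }
    }

    /**
     * Builds a client-side view of the registry. 'concurrentChange' runs
     * between the two steps to model an update that races with them.
     */
    static Set<String> buildView(Registry registry, boolean registerFirst, Runnable concurrentChange) {
        Set<String> view = new HashSet<>();
        if (registerFirst) {
            registry.addListener(view::add);   // events from now on reach the view
            concurrentChange.run();
            view.addAll(registry.snapshot());  // snapshot covers everything earlier
        } else {
            view.addAll(registry.snapshot());
            concurrentChange.run();            // neither in the snapshot nor delivered as an event
            registry.addListener(view::add);
        }
        return view;
    }

    public static void main(String[] args) {
        Registry first = new Registry();
        first.addService("A");
        Set<String> registerFirst = buildView(first, true, () -> first.addService("B"));

        Registry second = new Registry();
        second.addService("A");
        Set<String> snapshotFirst = buildView(second, false, () -> second.addService("B"));

        System.out.println("register then snapshot sees B: " + registerFirst.contains("B"));
        System.out.println("snapshot then register sees B: " + snapshotFirst.contains("B"));
    }
}
```

Registering first can deliver the same change twice (once as an event and once in the snapshot), which is why the view here is a set; taking the snapshot first can lose the change entirely.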


When do you claim that this happens?  And what currently happens now that is 
unacceptable?  What is the concrete, observable problem that you’re trying to 
solve, that justifies introducing failures that require further work?
