@Patricia - I understand your interpretation, and I think I’ve misled you by including the words “concrete, observable” in my question. I’m not talking about concurrency bugs here. I’m perfectly OK with making changes based on analysis that shows a possible race condition. However, I spent most of yesterday reading the Java Language Specification, and I’m now more convinced than ever that statements like “we’re using final variables, therefore all our code has to change” (paraphrasing) are no substitute for reasoned analysis.

Basically, I’m asserting the very common professional view that changes to existing code should be traceable to a requirement or a problem. Further, those changes should be carefully considered and designed to minimize the chance of breaking anything else. I don’t find the argument that “I fixed a concurrency problem, so 35 other failures are to be expected” convincing. From the outside, that looks a lot like introducing failures in an attempt to fix a poorly understood or poorly stated problem.
I teach programming. I see this all the time. When people make changes based on what they “think” “might” be happening, it’s always a disaster.

@Peter - You’re asking the wrong question. The right question is whether we should have a review-then-commit policy for existing code that traces a well-defined (and ideally limited) package of changes to a JIRA issue, and includes the community in decisions around how to fix the stated problem. And I will suggest pre-emptively that a JIRA issue that says “concurrency problems exist” is not specific enough to drive development.

Greg.

On Jan 4, 2014, at 4:59 AM, Peter Firmstone <j...@zeus.net.au> wrote:

> Please provide your thoughts on the following:
>
> How do we develop code for River?
>
> Do we use only experiment-based development, or do we also allow theoretical development?
> Do we fix only bugs that can be demonstrated with a test case, or do we also fix bugs identified by FindBugs and manual code auditing?
>
> Should we allow theoretical development based on standards like the Java Memory Model, with visual auditing and static analysis with FindBugs, or should we prohibit fixing bugs that don’t include a test case demonstrating the failure?
>
> Regards,
>
> Peter.
>
> On 4/01/2014 6:58 PM, Patricia Shanahan wrote:
>> Just before Christmas, you were discussing whether to fix concurrency problems based on theoretical analysis, or to fix only those problems for which there is experimental evidence.
>>
>> I believe the PMC will be at cross-purposes until you resolve that issue, and strongly advise discussing and voting on it.
>>
>> This is an example of a question whose answer would be obvious and non-controversial if you had agreement, either way, on that general issue. “When do you claim that this happens? And what happens now that is unacceptable? What is the concrete, observable problem that you’re trying to solve, that justifies introducing failures that require further work?” is a valid, and important, set of questions if you are only going to fix concurrency bugs for which there is experimental evidence. It is irrelevant if you are going to fix concurrency bugs based on theoretical analysis.
>>
>> Patricia
>>
>> On 1/3/2014 10:14 PM, Greg Trasuk wrote:
>>> On Jan 4, 2014, at 12:52 AM, Peter Firmstone <j...@zeus.net.au> wrote:
>>>> On 4/01/2014 3:18 PM, Greg Trasuk wrote:
>>>>> I’ll also point out Patricia’s recent statement that TaskManager should be reasonably efficient for small task queues, but less efficient for larger task queues. We don’t have solid evidence that the task queues ever get large. Hence, the assertion that “TaskManager doesn’t scale” is meaningless.
>>>>
>>>> No, it’s not about scalability. It’s about the window of time when a task has been removed from the queue in TaskManager for execution but fails and needs to be retried later. The task list that Task.runAfter inspects no longer contains the task that “should have executed”, so dependent tasks proceed before their dependencies.
>>>>
>>>> This code comment from ServiceDiscoveryManager might help:
>>>>
>>>> /** This task class, when executed, first registers to receive
>>>>  * ServiceEvents from the given ServiceRegistrar. If the registration
>>>>  * process succeeds (no RemoteExceptions), it then executes the
>>>>  * LookupTask to query the given ServiceRegistrar for a "snapshot"
>>>>  * of its current state with respect to services that match the
>>>>  * given template.
>>>>  *
>>>>  * Note that the order of execution of the two tasks is important.
>>>>  * That is, the LookupTask must be executed only after registration
>>>>  * for events has completed. This is because when an entity registers
>>>>  * with the event mechanism of a ServiceRegistrar, the entity will
>>>>  * only receive notification of events that occur "in the future",
>>>>  * after the registration is made. The entity will not receive events
>>>>  * about changes to the state of the ServiceRegistrar that may have
>>>>  * occurred before or during the registration process.
>>>>  *
>>>>  * Thus, if the order of these tasks were reversed and the LookupTask
>>>>  * were to be executed prior to the RegisterListenerTask, then the
>>>>  * possibility exists for the occurrence of a change in the
>>>>  * ServiceRegistrar's state between the time the LookupTask retrieves
>>>>  * a snapshot of that state, and the time the event registration
>>>>  * process has completed, resulting in an incorrect view of the
>>>>  * current state of the ServiceRegistrar.
>>>>  */
>>>
>>> When do you claim that this happens? And what happens now that is unacceptable? What is the concrete, observable problem that you’re trying to solve, that justifies introducing failures that require further work?
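To make the window Peter describes concrete, here is a minimal sketch of a TaskManager-style queue with retry. The names (SketchTaskManager, workerLoop, the Task interface shown) are hypothetical illustrations, not River's actual TaskManager API:

import java.util.ArrayList;
import java.util.List;

/* Minimal sketch of a TaskManager-style queue with retry (hypothetical
 * names, not River's actual API). The hazard: a task is invisible to
 * runAfter() between being removed for execution and being re-queued
 * after a failure, so a dependent task can be dispatched too early. */
class SketchTaskManager {
    interface Task {
        void run() throws Exception;
        /* true if this task must wait for one of tasks[0..upTo-1] */
        boolean runAfter(List<Task> tasks, int upTo);
    }

    private final List<Task> tasks = new ArrayList<>();

    synchronized void add(Task t) {
        tasks.add(t);
        notifyAll();
    }

    void workerLoop() throws InterruptedException {
        while (true) {
            Task t;
            synchronized (this) {
                while (tasks.isEmpty()) wait();
                // pick the first task with no dependency queued ahead of it
                int i = 0;
                while (i < tasks.size() && tasks.get(i).runAfter(tasks, i)) i++;
                if (i == tasks.size()) { wait(); continue; }
                t = tasks.remove(i);   // <-- the task leaves the queue here
            }
            try {
                t.run();               // may fail...
            } catch (Exception e) {
                add(t);                // ...and be re-queued for retry
            }
            // Window: between remove(i) and the retry add(t), the failed
            // task is in neither the queue nor any state runAfter() can
            // observe, so a dependent task scanning 'tasks' sees no
            // dependency and can run before its dependency has completed.
        }
    }
}

Note that the failure mode is independent of queue size, which is why this reads as a correctness question rather than a scalability one.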
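The ordering constraint in the quoted ServiceDiscoveryManager comment can be sketched the same way. Registrar and Listener below are simplified stand-ins, not the actual net.jini interfaces:

import java.util.List;

/* Sketch of the register-then-snapshot ordering (simplified stand-ins,
 * not the actual net.jini interfaces). */
interface Listener { void serviceChanged(String event); }

interface Registrar {
    void registerListener(Listener l); // delivers only *future* events
    List<String> snapshot();           // current matching services
}

class CacheInit {
    /* Register first, then snapshot: every state change is either in
     * the snapshot or delivered later as an event. */
    static List<String> correctOrder(Registrar r, Listener l) {
        r.registerListener(l);
        return r.snapshot();
    }

    /* Reversed order: a change occurring between snapshot() and
     * registerListener() is in neither the snapshot nor the event
     * stream, leaving a stale view of the registrar's state. */
    static List<String> racyOrder(Registrar r, Listener l) {
        List<String> snap = r.snapshot();
        // ...a state change here is silently lost...
        r.registerListener(l);
        return snap;
    }
}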