[ 
https://issues.apache.org/jira/browse/OOZIE-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16890073#comment-16890073
 ] 

Peter Bacsko commented on OOZIE-2566:
-------------------------------------

[~asalamon74] can't you replace this 200ms delay with some more reliable 
wait/notify logic? In the past, this kind of static delay has caused a lot of 
headaches. I know it solves the problem here, but if there's a better way, we 
had better try that.
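
To illustrate what such wait/notify logic could look like, here is a minimal sketch that uses a {{CountDownLatch}} from java.util.concurrent in place of a fixed sleep. The class and method names ({{LatchWaitSketch}}, {{CountingTask}}, {{runAndAwait}}) are hypothetical and are not taken from the Oozie code base or from the attached patch; the test would need its own hook to count the latch down.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: wait for background work with a latch, not a fixed delay.
class LatchWaitSketch {

    /** Stand-in for a command whose execution the test wants to observe. */
    static class CountingTask implements Runnable {
        private final AtomicInteger executions;
        private final CountDownLatch done;

        CountingTask(AtomicInteger executions, CountDownLatch done) {
            this.executions = executions;
            this.done = done;
        }

        @Override
        public void run() {
            try {
                executions.incrementAndGet();
            } finally {
                // Signal completion even if the body throws.
                done.countDown();
            }
        }
    }

    /** Submits the tasks and blocks until all have finished or the timeout expires. */
    static int runAndAwait(int tasks, long timeoutSeconds) throws InterruptedException {
        AtomicInteger executions = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(tasks);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(new CountingTask(executions, done));
            }
            // The timeout is only an upper bound; await returns as soon as all tasks signal.
            if (!done.await(timeoutSeconds, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish within the timeout");
            }
            return executions.get();
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("executions = " + runAndAwait(3, 10));
    }
}
```

Unlike a 200ms sleep, the timeout here does not slow down the passing case, and a slow build machine only fails the test if the upper bound is exceeded.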

> TestCoordActionInputCheckXCommand.testCoordActionInputCheckXCommandUniqueness()
>  is flaky
> ----------------------------------------------------------------------------------------
>
>                 Key: OOZIE-2566
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2566
>             Project: Oozie
>          Issue Type: Sub-task
>          Components: core
>            Reporter: Peter Bacsko
>            Assignee: Andras Salamon
>            Priority: Major
>         Attachments: OOZIE-2566-01.patch
>
>
> The testcase testCoordActionInputCheckXCommandUniqueness is unstable.
> We add three XCommands with the same actionId (entityKeys are different) into 
> the CallableQueueService. Only the first XCommand is expected to run.
> Sometimes either the 2nd or 3rd XCommand executes as well, because as soon 
> as the first starts to run, it is removed from the {{uniqueCallables}} 
> map immediately. If the first scheduled task runs quickly, then either the 
> 2nd or 3rd XCommand has the chance to get scheduled.
> Step by step:
> 1. Schedule first XCommand
> 2. XCommand is added to {{uniqueCallables}}
> 3. Schedule second XCommand
> 4. First XCommand starts to run in the thread pool and removes itself from 
> {{uniqueCallables}} (see {{CallableWrapper.run()}})
> 5. Second XCommand can successfully add itself to {{uniqueCallables}}
> 6. Second XCommand starts to run
> Please clarify whether this is the expected behavior of CallableQueueService.
> If not, then moving {{removeFromUniqueCallables()}} to the finally block 
> solves the problem.
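
The sequence in the quoted description can be modelled with a small, single-threaded sketch. This is a simplified illustration and not Oozie's actual CallableQueueService: the class {{UniqueQueueSketch}}, its {{offer}} method, and the {{removeBeforeRun}} flag are assumptions made up for the example. It compares removing the uniqueness key before the body runs with removing it in a finally block.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of a queue that rejects duplicates by key.
class UniqueQueueSketch {
    private final Set<String> uniqueKeys = ConcurrentHashMap.newKeySet();
    private final boolean removeBeforeRun;

    UniqueQueueSketch(boolean removeBeforeRun) {
        this.removeBeforeRun = removeBeforeRun;
    }

    /** Returns a runnable wrapper if the key was not yet registered, otherwise null. */
    Runnable offer(String key, Runnable body) {
        if (!uniqueKeys.add(key)) {
            return null; // duplicate rejected
        }
        return () -> {
            if (removeBeforeRun) {
                // Behaviour described in the issue: key is freed as soon as the wrapper starts.
                uniqueKeys.remove(key);
            }
            try {
                body.run();
            } finally {
                if (!removeBeforeRun) {
                    // Proposed alternative: key is freed only after the body has finished.
                    uniqueKeys.remove(key);
                }
            }
        };
    }

    /** Reports whether a second command with the same key is accepted while the first runs. */
    static boolean duplicateAcceptedWhileRunning(boolean removeBeforeRun) {
        UniqueQueueSketch queue = new UniqueQueueSketch(removeBeforeRun);
        boolean[] accepted = new boolean[1];
        Runnable first = queue.offer("action-1", () -> {
            // Simulates step 5: a second command arrives while the first is executing.
            accepted[0] = queue.offer("action-1", () -> { }) != null;
        });
        first.run();
        return accepted[0];
    }

    public static void main(String[] args) {
        System.out.println("remove before run: " + duplicateAcceptedWhileRunning(true));
        System.out.println("remove in finally: " + duplicateAcceptedWhileRunning(false));
    }
}
```

In this model the duplicate is accepted only when the key is removed before the body runs. Whether that matches the intended semantics of the real service is the open question raised in the description.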



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
