[jira] [Commented] (OOZIE-2566) TestCoordActionInputCheckXCommand.testCoordActionInputCheckXCommandUniqueness() is flaky
[ https://issues.apache.org/jira/browse/OOZIE-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890156#comment-16890156 ]

Peter Bacsko commented on OOZIE-2566:

[~asalamon74] I don't have a definite answer to these questions right now. It could be that there's no added value, but we have to double-check this. If it's unnecessary, then we can just remove it. Let's examine this together next week.

> TestCoordActionInputCheckXCommand.testCoordActionInputCheckXCommandUniqueness() is flaky
>
> Key: OOZIE-2566
> URL: https://issues.apache.org/jira/browse/OOZIE-2566
> Project: Oozie
> Issue Type: Sub-task
> Components: core
> Reporter: Peter Bacsko
> Assignee: Andras Salamon
> Priority: Major
> Attachments: OOZIE-2566-01.patch
>
> The testcase testCoordActionInputCheckXCommandUniqueness is unstable.
> We add three XCommands with the same actionId (entityKeys are different) into the CallableQueueService. Only the first XCommand is expected to run.
> The reason why sometimes either the 2nd or 3rd XCommand executes is that as soon as the first starts to run, it's removed from the {{uniqueCallables}} map immediately. If the first scheduled task runs quickly, then either the 2nd or 3rd XCommand has the chance to get scheduled.
> Step by step:
> 1. Schedule first XCommand
> 2. XCommand is added to {{uniqueCallables}}
> 3. Schedule second XCommand
> 4. First XCommand starts to run in the thread pool and removes itself from {{uniqueCallables}} (see {{CallableWrapper.run()}})
> 5. Second XCommand can successfully add itself to {{uniqueCallables}}
> 6. Second XCommand starts to run
> Please clarify whether this is the expected behavior of CallableQueueService. If not, then moving {{removeFromUniqueCallables()}} to the finally block solves the problem.

--
This message was sent by Atlassian JIRA (v7.6.14#76016)
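The step-by-step race in the issue description can be illustrated with a small, single-threaded sketch. This is not Oozie's code: {{UniqueQueueSketch}}, {{offer()}}, {{run()}} and the key {{action-1}} are hypothetical names, and the "second command arrives while the first is running" moment is simulated by offering the duplicate from inside the first command's body. It only shows how releasing the key before the body runs differs from releasing it in a finally block.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical, simplified stand-in for uniqueness bookkeeping; not Oozie's actual class. */
class UniqueQueueSketch {
    private final Set<String> uniqueKeys = ConcurrentHashMap.newKeySet();
    private final boolean removeInFinally;

    UniqueQueueSketch(boolean removeInFinally) {
        this.removeInFinally = removeInFinally;
    }

    /** Returns true if the key was accepted, false if the same key is already tracked. */
    boolean offer(String key) {
        return uniqueKeys.add(key);
    }

    /** Runs the body for an accepted key, releasing the key either before or after the body. */
    void run(String key, Runnable body) {
        if (!removeInFinally) {
            uniqueKeys.remove(key); // early removal: the key is free while the body still runs
        }
        try {
            body.run();
        } finally {
            if (removeInFinally) {
                uniqueKeys.remove(key); // late removal: the key stays reserved until the body ends
            }
        }
    }

    /** Simulates a duplicate being offered while the first command runs; returns whether it got in. */
    static boolean secondAcceptedWhileFirstRuns(boolean removeInFinally) {
        UniqueQueueSketch queue = new UniqueQueueSketch(removeInFinally);
        boolean[] secondAccepted = new boolean[1];
        queue.offer("action-1");
        queue.run("action-1", () -> secondAccepted[0] = queue.offer("action-1"));
        return secondAccepted[0];
    }
}
```

With early removal the duplicate is admitted while the first command is still in progress; with removal in the finally block it is rejected. Whether the real service is meant to behave one way or the other is exactly the open question in the issue.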
[jira] [Commented] (OOZIE-2566) TestCoordActionInputCheckXCommand.testCoordActionInputCheckXCommandUniqueness() is flaky
[ https://issues.apache.org/jira/browse/OOZIE-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890122#comment-16890122 ]

Andras Salamon commented on OOZIE-2566:

Thanks for checking the patch [~pbacsko]. To be honest, I'm not sure about the whole purpose of this specific test, and I collected too many questions, so I uploaded a simple fix.
* It seems to me there are two similar tests: {{TestCoordActionInputCheckXCommand.testCoordActionInputCheckXCommandUniqueness}} and {{TestCoordKillXCommand.testCoordKillXCommandUniqueness}}. Why do we test these two methods specifically? During the tests we replace the {{execute()}} method with a simple sleep, so it's more like a {{CallableQueue}} test. We have several {{TestCallableQueue.testQueueUniqueness*}} tests, so what is the added value here?
* If we only want to test the uniqueness, then we could simplify the tests. We don't really need to execute the commands; we could just queue them and check the {{getUniqueDump()}} output. But again, why is that better than the {{TestCallableQueue}} tests?
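The "queue only, then inspect the dump" idea from the second bullet could look roughly like the sketch below. It is an assumption-laden illustration: {{QueueOnlyUniquenessSketch}}, {{queue()}} and {{uniqueDump()}} are hypothetical stand-ins and do not mirror the signatures of {{CallableQueueService}} or its {{getUniqueDump()}} method. Nothing is executed, so no timing is involved.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a queue with uniqueness tracking; names do not mirror Oozie's API. */
class QueueOnlyUniquenessSketch {
    private final Map<String, String> uniqueEntries = new LinkedHashMap<>();

    /** Accepts a command only if no command with the same key is already queued. */
    synchronized boolean queue(String key, String commandName) {
        return uniqueEntries.putIfAbsent(key, commandName) == null;
    }

    /** Snapshot of the tracked entries, loosely analogous to a "unique dump". */
    synchronized List<String> uniqueDump() {
        List<String> dump = new ArrayList<>();
        uniqueEntries.forEach((key, name) -> dump.add(key + "=" + name));
        return dump;
    }

    /** Queues three commands sharing one key and returns the dump, without executing anything. */
    static List<String> queueThreeWithSameKey() {
        QueueOnlyUniquenessSketch queue = new QueueOnlyUniquenessSketch();
        queue.queue("action-1", "first");
        queue.queue("action-1", "second");
        queue.queue("action-1", "third");
        return queue.uniqueDump();
    }
}
```

A test built this way would assert on the dump alone, which removes the dependency on how fast the first command runs, but it also tests less of the command classes themselves.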
[jira] [Commented] (OOZIE-2566) TestCoordActionInputCheckXCommand.testCoordActionInputCheckXCommandUniqueness() is flaky
[ https://issues.apache.org/jira/browse/OOZIE-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890117#comment-16890117 ]

Hadoop QA commented on OOZIE-2566:

Testing JIRA OOZIE-2566

Cleaning local git workspace

{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:green}+1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:green}+1{color} the patch does not introduce any star imports
.{color:green}+1{color} the patch does not introduce any line longer than 132
.{color:green}+1{color} the patch adds/modifies 1 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} Javadoc generation succeeded with the patch
.{color:green}+1{color} the patch does not seem to introduce new Javadoc warning(s)
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac warnings
{color:green}+1{color} There are no new bugs found in total.
.{color:green}+1{color} There are no new bugs found in [fluent-job/fluent-job-api].
.{color:green}+1{color} There are no new bugs found in [tools].
.{color:green}+1{color} There are no new bugs found in [sharelib/oozie].
.{color:green}+1{color} There are no new bugs found in [sharelib/streaming].
.{color:green}+1{color} There are no new bugs found in [sharelib/spark].
.{color:green}+1{color} There are no new bugs found in [sharelib/pig].
.{color:green}+1{color} There are no new bugs found in [sharelib/sqoop].
.{color:green}+1{color} There are no new bugs found in [sharelib/git].
.{color:green}+1{color} There are no new bugs found in [sharelib/hive].
.{color:green}+1{color} There are no new bugs found in [sharelib/hcatalog].
.{color:green}+1{color} There are no new bugs found in [sharelib/distcp].
.{color:green}+1{color} There are no new bugs found in [sharelib/hive2].
.{color:green}+1{color} There are no new bugs found in [server].
.{color:green}+1{color} There are no new bugs found in [examples].
.{color:green}+1{color} There are no new bugs found in [core].
.{color:green}+1{color} There are no new bugs found in [client].
.{color:green}+1{color} There are no new bugs found in [docs].
.{color:green}+1{color} There are no new bugs found in [webapp].
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:green}+1 TESTS{color}
.Tests run: 3175
{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch

{color:green}*+1 Overall result, good!, no -1s*{color}

The full output of the test-patch run is available at
. https://builds.apache.org/job/PreCommit-OOZIE-Build/1190/
[jira] [Commented] (OOZIE-2566) TestCoordActionInputCheckXCommand.testCoordActionInputCheckXCommandUniqueness() is flaky
[ https://issues.apache.org/jira/browse/OOZIE-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890073#comment-16890073 ]

Peter Bacsko commented on OOZIE-2566:

[~asalamon74] can't you replace this 200ms delay with some more reliable wait/notify logic? In the past, these kinds of static delays caused a lot of headaches. I know it solves the problem here, but if there's a better way, we'd better try that.
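One possible shape for such signalling, sketched under stated assumptions: instead of raw {{wait()}}/{{notify()}}, this uses {{java.util.concurrent.CountDownLatch}}, which is a swapped-in technique and not something the patch is known to use. The class and method names are hypothetical, and a plain {{ExecutorService}} stands in for the queue service. The test thread learns from a latch that the first task is running and decides itself when that task may finish, so no fixed delay is needed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical test helper; none of these names come from the Oozie code base. */
class LatchInsteadOfDelaySketch {

    /** Runs one task under latch control and returns how many task bodies completed. */
    static int runWithLatches() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch firstStarted = new CountDownLatch(1);
        CountDownLatch releaseFirst = new CountDownLatch(1);
        AtomicInteger executed = new AtomicInteger();
        try {
            pool.execute(() -> {
                firstStarted.countDown();      // signal: the task is now running
                try {
                    releaseFirst.await();      // stay "in progress" until the test releases us
                    executed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // Wait for the signal instead of sleeping; the timeout only guards against hangs.
            if (!firstStarted.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("first task did not start in time");
            }
            // A real test would queue the duplicate commands here, while the first is known to run.
            releaseFirst.countDown();
            pool.shutdown();
            if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not terminate in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            pool.shutdownNow();                // no-op after a clean shutdown, cleanup otherwise
        }
        return executed.get();
    }
}
```

The timeouts here are upper bounds for failure detection, not part of the ordering logic, which is the main difference from a static 200ms delay.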
[jira] [Commented] (OOZIE-2566) TestCoordActionInputCheckXCommand.testCoordActionInputCheckXCommandUniqueness() is flaky
[ https://issues.apache.org/jira/browse/OOZIE-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890064#comment-16890064 ]

Hadoop QA commented on OOZIE-2566:

PreCommit-OOZIE-Build started
[jira] [Commented] (OOZIE-2566) TestCoordActionInputCheckXCommand.testCoordActionInputCheckXCommandUniqueness() is flaky
[ https://issues.apache.org/jira/browse/OOZIE-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890027#comment-16890027 ]

Andras Salamon commented on OOZIE-2566:

I've encountered the same problem. I hope you don't mind, [~pbacsko], if I take this over.

> TestCoordActionInputCheckXCommand.testCoordActionInputCheckXCommandUniqueness() is flaky
>
> Key: OOZIE-2566
> URL: https://issues.apache.org/jira/browse/OOZIE-2566
> Project: Oozie
> Issue Type: Sub-task
> Components: core
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko
> Priority: Major
[jira] [Commented] (OOZIE-2566) TestCoordActionInputCheckXCommand.testCoordActionInputCheckXCommandUniqueness() is flaky
[ https://issues.apache.org/jira/browse/OOZIE-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406639#comment-15406639 ]

Robert Kanter commented on OOZIE-2566:

I read through the test. It seems very brittle. There's non-atomic access to a {{long}}, and there are multiple cases of reliance on perfect timing. I'd have to think about it some more, but I think this test needs a more substantial refactoring, maybe using some fancier synchronization objects like what you did in OOZIE-2584.

> TestCoordActionInputCheckXCommand.testCoordActionInputCheckXCommandUniqueness() is flaky
>
> Key: OOZIE-2566
> URL: https://issues.apache.org/jira/browse/OOZIE-2566
> Project: Oozie
> Issue Type: Bug
> Components: core
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
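On the non-atomic {{long}} point: a plain, non-volatile {{long}} field written by a pool thread and read by the test thread has no visibility guarantee, and the Java Language Specification permits such a write to be split into two 32-bit halves. A minimal sketch of one remedy follows, assuming the test records some per-command timestamp; the class {{ExecutionMarkerSketch}} and its methods are hypothetical and are not taken from the test.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical holder for a command's execution timestamp; not taken from the Oozie test. */
class ExecutionMarkerSketch {
    // 0 means "not executed yet"; AtomicLong gives atomic, visible reads and writes.
    private final AtomicLong executedAt = new AtomicLong(0L);

    /** Records the timestamp once; later calls do not overwrite the first value. */
    void markExecuted(long timestampMillis) {
        executedAt.compareAndSet(0L, timestampMillis);
    }

    /** True once any thread has recorded a non-zero timestamp. */
    boolean wasExecuted() {
        return executedAt.get() != 0L;
    }

    /** The first recorded timestamp, or 0 if none was recorded. */
    long executedAt() {
        return executedAt.get();
    }
}
```

This addresses only the data access; the reliance on timing mentioned in the same comment needs separate synchronization.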
[jira] [Commented] (OOZIE-2566) TestCoordActionInputCheckXCommand.testCoordActionInputCheckXCommandUniqueness() is flaky
[ https://issues.apache.org/jira/browse/OOZIE-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404702#comment-15404702 ]

Peter Bacsko commented on OOZIE-2566:

[~rkanter], [~puru], [~rohini], [~abhishekbafna] - any ideas regarding this problem? Testcase or oozie-core issue?
[jira] [Commented] (OOZIE-2566) TestCoordActionInputCheckXCommand.testCoordActionInputCheckXCommandUniqueness() is flaky
[ https://issues.apache.org/jira/browse/OOZIE-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327232#comment-15327232 ]

Peter Bacsko commented on OOZIE-2566:

If this is a test failure, the fix is very simple:
{code}
for (MyCoordActionInputCheckXCommand c : callables) {
    queueservice.queue(c, 200); // add 200ms delay before execution
}
{code}