[ https://issues.apache.org/jira/browse/OOZIE-2385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949048#comment-14949048 ]
Robert Kanter commented on OOZIE-2385: -------------------------------------- Yes. However, in the real world, the user isn't specifying the created time of a specific action and then trying to check that the filtered time matches that action's created time. In the real world, the user will enter some time, with minutes granularity, and any actions that are after that time will be included. So, going with my example from above, if the user used Oct 06 16:27 for the filter, it should return both \@1 and @\2 because Oct 06 16:27:24 and Oct 06 16:27:25 are both after Oct 06 16:27:00. The issue is just in the test where it's trying to check that the filter value is Oct 06 16:27:25 when it's actually the earliest action after Oct 06 16:27:00, which is Oct 06 16:27:24. Makes sense? > org.apache.oozie.TestCoordinatorEngineStreamLog.testCoordLogStreaming is > flakey > ------------------------------------------------------------------------------- > > Key: OOZIE-2385 > URL: https://issues.apache.org/jira/browse/OOZIE-2385 > Project: Oozie > Issue Type: Bug > Components: tests > Affects Versions: trunk > Reporter: Robert Kanter > Assignee: Robert Kanter > Fix For: trunk > > Attachments: OOZIE-2385.001.patch > > > {{org.apache.oozie.TestCoordinatorEngineStreamLog.testCoordLogStreaming}} is > flakey. > It can fail with an error like this: > {noformat} > junit.framework.ComparisonFailure: expected:<Tue Oct 06 16:27:2[5] PDT 2015> > but was:<Tue Oct 06 16:27:2[4] PDT 2015> > at junit.framework.Assert.assertEquals(Assert.java:85) > at junit.framework.Assert.assertEquals(Assert.java:91) > at > org.apache.oozie.TestCoordinatorEngineStreamLog.__CLR4_0_38acukov0f(TestCoordinatorEngineStreamLog.java:202) > at > org.apache.oozie.TestCoordinatorEngineStreamLog.testCoordLogStreaming(TestCoordinatorEngineStreamLog.java:103) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at junit.framework.TestCase.runTest(TestCase.java:168) > at junit.framework.TestCase.runBare(TestCase.java:134) > at junit.framework.TestResult$1.protect(TestResult.java:110) > at junit.framework.TestResult.runProtected(TestResult.java:128) > at junit.framework.TestResult.run(TestResult.java:113) > at junit.framework.TestCase.run(TestCase.java:124) > at junit.framework.TestSuite.runTest(TestSuite.java:243) > at junit.framework.TestSuite.run(TestSuite.java:238) > at > org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83) > at org.junit.runners.Suite.runChild(Suite.java:128) > at org.junit.runners.Suite.runChild(Suite.java:24) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:744) > {noformat} > This was tricky to track down. The problem is essentially that the filtering > API uses minute granularity but the actions are seconds, or even milliseconds > apart, so there are some edge cases here. The code takes minute granularity, > goes back to milliseconds when querying the DB, and then uses seconds when > checking that the test worked correctly. > This can lead to a situation like this example: > \@1 created/nominal time: Oct 06 16:27:24 <and some milliseconds> > \@2 created/nominal time: Oct 06 16:27:25 <and some milliseconds> > The test passes \@2's created time as the starting time argument to the > filter, converted with {{DateUtill's}} UTC MASK (i.e. minutes granularity): > - Oct 06 16:27 > This gets parsed back to a Date object, which seconds/milliseconds assumed to > be 0's > - Oct 06 16:27:00 0000 > This gets turned into a TimeStamp object, and used in the DB filtering query > - gets all actions with created time > Oct 06 16:27:00, which includes \@1 > CoordinatorEngine sorts by action ID (i.e. \@#) and uses created time of > smallest, which is \@1 > - Oct 06 16:27:24 > The test checks that \@2’s created time is equal to previously determined > time from filter (i.e. \@1’s created time) > - Does a toString(), which uses seconds —-> not equal > Note: The filter querying is actually done against the nominal times, but the > example is simpler without all that and conveys the same idea. > The actions get created within 100s of milliseconds during the test (at least > on my machine), so most of them time, they have the same seconds, and the > toString() comparison is equal. When the timing works out that the first and > second action are created at different clock seconds (even if less than a > second apart), it fails. I couldn’t reproduce this on my machine, but is > more likely to happen on the generally slower Jenkins machines we run because > the time between action creations will be longer. > Basically the problem is that the filtering works at a minute granularity but > the actions are seconds, if not milliseconds, apart. In the real world this > isn’t a big deal because the API takes minutes and we simply use that, but to > verify the test is working we should simply make the actions minutes apart. > To fix this, we can just directly insert the job and actions into the DB > instead of actually running it so we can space the actions apart by minutes; > this should also make the test faster as well (the test takes about 6 minutes > on my machine as-is, but only 1 minute and 30 seconds with the fix). -- This message was sent by Atlassian JIRA (v6.3.4#6332)