[jira] [Commented] (METRON-1778) Out-of-order timestamps may delay flush in Storm Profiler
[ https://issues.apache.org/jira/browse/METRON-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16613672#comment-16613672 ]

ASF GitHub Bot commented on METRON-1778:

GitHub user nickwallen opened a pull request:

    https://github.com/apache/metron/pull/1197

    METRON-1778 Out-of-order timestamps may delay flush in Storm Profiler

    In the Storm Profiler, when timestamps are received out-of-order, a flush signal can be either delayed or fired prematurely. The smaller the profile period is, the more likely this is to impact the results.

    This is more likely to impact things like integration tests that run with small profile period values. I would not expect this to greatly impact the results of the Profiler under normal usage.

    ## Changes

    The previous implementation of `FixedFrequencyFlushSignal` set the flush time based only on the first timestamp it saw. With out-of-order timestamps the flush time can effectively change as additional data arrives.

    * See `FixedFrequencyFlushSignalTest.testOutOfOrderTimestamps` for an example of how this logic can cause a problem.
    * The fix in this PR tracks the min and max timestamps that have been seen. If max > min + flushFrequency, then it is time to flush. Otherwise, it is not time to flush.
    * I added a fair number of unit tests and described in the comments what should happen.

    ## Testing

    There is not an easy way to manually test this, as it requires a very specific sequence of timestamps to trigger. I feel the updates that were made to the unit tests, along with regression testing the environment as described in the Profiler README, are sufficient.

    ## Pull Request Checklist

    - [ ] Is there a JIRA ticket associated with this PR? If not, one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
    - [ ] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
    - [ ] Has your PR been rebased against the latest commit within the target branch (typically master)?
    - [ ] Have you included steps to reproduce the behavior or problem that is being changed or addressed?
    - [ ] Have you included steps or a guide to how the change may be verified and tested manually?
    - [ ] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:
    - [ ] Have you written or updated unit tests and/or integration tests to verify your changes?
    - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
    - [ ] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/nickwallen/metron METRON-1778

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/metron/pull/1197.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1197

commit afa7e7cde4cf2335b5491c7e39a46bb03f4eb643
Author: Nick Allen
Date: 2018-09-13T14:20:06Z

    METRON-1778 Out-of-order timestamps may delay flush in Spark Profiler

commit 0e708b07a9e2f061d92717463618908532d476f7
Author: Nick Allen
Date: 2018-09-13T15:39:54Z

    Added back check for out-of-order timestamps

> Out-of-order timestamps may delay flush in Storm Profiler
>
> Key: METRON-1778
> URL: https://issues.apache.org/jira/browse/METRON-1778
> Project: Metron
> Issue Type: Bug
> Reporter: Nick Allen
> Assignee: Nick Allen
> Priority: Major
>
> When timestamps are received out-of-order there can be cases where a flush
> signal should have been signalled, but is not. The flush signal can be
> either delayed slightly or occur prematurely.
> The smaller the profile period is, the more likely this is to impact the
> results. I would not expect this to greatly impact the results of the
> Profiler under normal usage.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
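
The min/max rule described in the pull request above can be sketched roughly as follows. This is a minimal illustration and not the code in PR #1197: the class name `MinMaxFlushSignal`, its method names, and the use of millisecond timestamps are assumptions made for the example, and the real `FixedFrequencyFlushSignal` may differ in its API and edge-case handling.

```java
/**
 * Hypothetical sketch of a flush signal that tracks the smallest and largest
 * timestamps seen so far. It is NOT the actual Metron implementation; names
 * and units are assumptions made for illustration only.
 */
final class MinMaxFlushSignal {

    // How much event time must elapse between the oldest and newest
    // timestamps before a flush is signalled (assumed to be milliseconds).
    private final long flushFrequencyMillis;

    // Sentinel values meaning "no timestamp seen yet".
    private long minTime = Long.MAX_VALUE;
    private long maxTime = Long.MIN_VALUE;

    MinMaxFlushSignal(long flushFrequencyMillis) {
        if (flushFrequencyMillis < 0) {
            throw new IllegalArgumentException("flush frequency must be non-negative");
        }
        this.flushFrequencyMillis = flushFrequencyMillis;
    }

    /** Records a timestamp; arrival order does not matter. */
    void update(long timestamp) {
        minTime = Math.min(minTime, timestamp);
        maxTime = Math.max(maxTime, timestamp);
    }

    /**
     * Returns true when max > min + flushFrequency. The comparison is written
     * as a subtraction so that min + flushFrequency cannot overflow for
     * ordinary (non-negative) epoch timestamps.
     */
    boolean isTimeToFlush() {
        if (minTime > maxTime) {
            return false; // nothing has been seen yet
        }
        return maxTime - minTime > flushFrequencyMillis;
    }

    /** Forgets all timestamps, for example after a flush has been performed. */
    void reset() {
        minTime = Long.MAX_VALUE;
        maxTime = Long.MIN_VALUE;
    }
}
```

With a flush frequency of 1000 ms and timestamps arriving in the order 5000, 4500, 5600, this sketch signals a flush only after the third update, because 5600 - 4500 exceeds 1000. A signal anchored only on the first timestamp would instead place the flush time at 6000 and would not yet fire.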
[jira] [Commented] (METRON-1778) Out-of-order timestamps may delay flush in Storm Profiler
[ https://issues.apache.org/jira/browse/METRON-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652075#comment-16652075 ]

ASF GitHub Bot commented on METRON-1778:

Github user JonZeolla commented on the issue:

    https://github.com/apache/metron/pull/1197

Any idea why this is failing?

```
Running org.apache.metron.profiler.storm.integration.ProfilerIntegrationTest
2018-09-28 14:02:45 ERROR FluxTopologyComponent:198 - NotAliveException(msg:profiler is not alive)
	at org.apache.storm.daemon.nimbus$check_storm_active_BANG_.invoke(nimbus.clj:1017)
	at org.apache.storm.daemon.nimbus$fn__9109$exec_fn__1371__auto__$reify__9138.killTopologyWithOpts(nimbus.clj:1550)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93)
	at clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:28)
	at org.apache.storm.LocalCluster$_killTopologyWithOpts.invoke(LocalCluster.clj:90)
	at org.apache.storm.LocalCluster.killTopologyWithOpts(Unknown Source)
	at org.apache.metron.integration.components.FluxTopologyComponent.killTopology(FluxTopologyComponent.java:216)
	at org.apache.metron.integration.components.FluxTopologyComponent.stop(FluxTopologyComponent.java:174)
	at org.apache.metron.integration.ComponentRunner.stop(ComponentRunner.java:136)
	at org.apache.metron.profiler.storm.integration.ProfilerIntegrationTest.tearDownAfterClass(ProfilerIntegrationTest.java:399)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 289.466 sec <<< FAILURE! - in org.apache.metron.profiler.storm.integration.ProfilerIntegrationTest
testProcessingTimeWithTimeToLiveFlush(org.apache.metron.profiler.storm.integration.ProfilerIntegrationTest)  Time elapsed: 136.722 sec  <<< FAILURE!
java.lang.AssertionError
	at org.junit.Assert.fail(Assert.java:86)
	at org.junit.Assert.assertTrue(Assert.java:41)
	at org.junit.Assert.assertTrue(Assert.java:52)
	at org.apache.metron.profiler.storm.integration.ProfilerIntegrationTest.testProcessingTimeWithTimeToLiveFlush(ProfilerIntegrationTest.java:210)
...
```
[jira] [Commented] (METRON-1778) Out-of-order timestamps may delay flush in Storm Profiler
[ https://issues.apache.org/jira/browse/METRON-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686957#comment-16686957 ]

ASF GitHub Bot commented on METRON-1778:

Github user nickwallen commented on the issue:

    https://github.com/apache/metron/pull/1197

@JonZeolla That error is a pre-existing condition that is tracked here: [METRON-1810](https://issues.apache.org/jira/browse/METRON-1810). I am not sure what the root cause is.