[jira] [Commented] (METRON-1778) Out-of-order timestamps may delay flush in Storm Profiler

2018-09-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16613672#comment-16613672
 ] 

ASF GitHub Bot commented on METRON-1778:


GitHub user nickwallen opened a pull request:

https://github.com/apache/metron/pull/1197

METRON-1778 Out-of-order timestamps may delay flush in Storm Profiler

In the Storm Profiler, when timestamps are received out of order, a flush 
signal can be either delayed or occur prematurely.  The 
smaller the profile period is, the more likely this is to impact the results.  
This is more likely to impact things like integration tests that run with small 
profile period values.  I would not expect this to greatly impact the results 
of the Profiler under normal usage.

## Changes

The previous implementation of `FixedFrequencyFlushSignal` set the flush 
time based only on the first timestamp it saw.  With out-of-order timestamps 
the flush time can effectively change as additional data arrives.  

* See `FixedFrequencyFlushSignalTest.testOutOfOrderTimestamps` for an 
example of how this logic can cause a problem

* The fix in this PR tracks the min and max timestamps that have been seen. 
If `max > min + flushFrequency`, then it is time to flush.  Otherwise, it is not 
time to flush.  

* I added a fair number of unit tests and described in the comments what 
should happen.  
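
As a rough illustration of the min/max check described above, here is a minimal sketch. It is not the actual `FixedFrequencyFlushSignal` code: the class name `MinMaxFlushSignal`, its method names, and the millisecond units are assumptions made only for this example.

```java
/**
 * Hypothetical sketch of a flush signal that tracks the min and max
 * timestamps seen so far. Names and units are illustrative assumptions,
 * not the Metron implementation.
 */
class MinMaxFlushSignal {

    private final long flushFrequencyMillis;
    private long minTime = Long.MAX_VALUE;   // smallest timestamp seen
    private long maxTime = Long.MIN_VALUE;   // largest timestamp seen

    MinMaxFlushSignal(long flushFrequencyMillis) {
        if (flushFrequencyMillis < 0) {
            throw new IllegalArgumentException("flush frequency must be >= 0");
        }
        this.flushFrequencyMillis = flushFrequencyMillis;
    }

    /** Record a timestamp; arrival order does not matter. */
    void update(long timestamp) {
        minTime = Math.min(minTime, timestamp);
        maxTime = Math.max(maxTime, timestamp);
    }

    /** True once max > min + flushFrequency. */
    boolean isTimeToFlush() {
        if (maxTime < minTime) {
            return false;   // no timestamps recorded yet
        }
        return maxTime > minTime + flushFrequencyMillis;
    }

    /** Forget all timestamps, e.g. after a flush. */
    void reset() {
        minTime = Long.MAX_VALUE;
        maxTime = Long.MIN_VALUE;
    }
}
```

With a flush frequency of 1000 and timestamps arriving as 5000, 4500, 5400, the sketch does not signal a flush (5400 is not greater than 4500 + 1000).  Once 5600 arrives it does, whereas a flush time derived only from the first timestamp (5000 + 1000) would not yet have been reached.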

## Testing

There is not an easy way to manually test this as it requires a very 
specific sequence of timestamps to trigger.  I feel the updates that were made 
to the unit tests, along with regression testing the environment as described 
in the Profiler README, are sufficient.

## Pull Request Checklist

- [ ] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [ ] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?
- [ ] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [ ] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [ ] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
- [ ] Have you written or updated unit tests and or integration tests to 
verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/nickwallen/metron METRON-1778

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/1197.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1197


commit afa7e7cde4cf2335b5491c7e39a46bb03f4eb643
Author: Nick Allen 
Date:   2018-09-13T14:20:06Z

METRON-1778 Out-of-order timestamps may delay flush in Storm Profiler

commit 0e708b07a9e2f061d92717463618908532d476f7
Author: Nick Allen 
Date:   2018-09-13T15:39:54Z

Added back check for out-of-order timestamps




> Out-of-order timestamps may delay flush in Storm Profiler
> -
>
> Key: METRON-1778
> URL: https://issues.apache.org/jira/browse/METRON-1778
> Project: Metron
>  Issue Type: Bug
>Reporter: Nick Allen
>Assignee: Nick Allen
>Priority: Major
>
> When timestamps are received out-of-order there can be cases where a flush 
> signal should have been signalled, but is not.  The flush signal can be 
> either delayed slightly or occur prematurely.
> The smaller the profile period is, the more likely this is to impact the 
> results.  I would not expect this to greatly impact the results of the 
> Profiler under normal usage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (METRON-1778) Out-of-order timestamps may delay flush in Storm Profiler

2018-10-16 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652075#comment-16652075
 ] 

ASF GitHub Bot commented on METRON-1778:


Github user JonZeolla commented on the issue:

https://github.com/apache/metron/pull/1197
  
Any idea why this is failing?
```
Running org.apache.metron.profiler.storm.integration.ProfilerIntegrationTest
2018-09-28 14:02:45 ERROR FluxTopologyComponent:198 - NotAliveException(msg:profiler is not alive)
at org.apache.storm.daemon.nimbus$check_storm_active_BANG_.invoke(nimbus.clj:1017)
at org.apache.storm.daemon.nimbus$fn__9109$exec_fn__1371__auto__$reify__9138.killTopologyWithOpts(nimbus.clj:1550)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93)
at clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:28)
at org.apache.storm.LocalCluster$_killTopologyWithOpts.invoke(LocalCluster.clj:90)
at org.apache.storm.LocalCluster.killTopologyWithOpts(Unknown Source)
at org.apache.metron.integration.components.FluxTopologyComponent.killTopology(FluxTopologyComponent.java:216)
at org.apache.metron.integration.components.FluxTopologyComponent.stop(FluxTopologyComponent.java:174)
at org.apache.metron.integration.ComponentRunner.stop(ComponentRunner.java:136)
at org.apache.metron.profiler.storm.integration.ProfilerIntegrationTest.tearDownAfterClass(ProfilerIntegrationTest.java:399)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 289.466 sec <<< FAILURE! - in org.apache.metron.profiler.storm.integration.ProfilerIntegrationTest

testProcessingTimeWithTimeToLiveFlush(org.apache.metron.profiler.storm.integration.ProfilerIntegrationTest)  Time elapsed: 136.722 sec  <<< FAILURE!
java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at org.apache.metron.profiler.storm.integration.ProfilerIntegrationTest.testProcessingTimeWithTimeToLiveFlush(ProfilerIntegrationTest.java:210)
...
```




[jira] [Commented] (METRON-1778) Out-of-order timestamps may delay flush in Storm Profiler

2018-11-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686957#comment-16686957
 ] 

ASF GitHub Bot commented on METRON-1778:


Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/1197
  
@JonZeolla That error is a pre-existing condition that is tracked here: 
[METRON-1810](https://issues.apache.org/jira/browse/METRON-1810).  I am not sure 
what the root cause is.

