[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-09-16 Thread srdo
Github user srdo commented on the issue:

https://github.com/apache/storm/pull/2789
  
@zd-project I'm happy to take a look at implementing 
https://github.com/apache/storm/pull/2764 once the non-static PR is merged. I 
don't think it makes sense to mix more stuff into the current PR, and if it 
doesn't get approved, it'll be wasted work.


---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-09-15 Thread zd-project
Github user zd-project commented on the issue:

https://github.com/apache/storm/pull/2789
  
#2764 also depends on resolution of this issue. Please consider merging 
that as well as part of the patch.


---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-09-12 Thread srdo
Github user srdo commented on the issue:

https://github.com/apache/storm/pull/2789
  
The change is here https://github.com/apache/storm/pull/2805 and hasn't 
been merged yet.


---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-09-12 Thread agresch
Github user agresch commented on the issue:

https://github.com/apache/storm/pull/2789
  
what was the status of this? I think work @srdo was doing addressed 
STORM-3173 possibly?  Was that change committed?




---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-08-10 Thread Ethanlm
Github user Ethanlm commented on the issue:

https://github.com/apache/storm/pull/2789
  
sure. Take your time please.


---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-08-10 Thread srdo
Github user srdo commented on the issue:

https://github.com/apache/storm/pull/2789
  
Once the two PRs with new metrics are merged, I'll try to update the 
non-static branch and get a PR opened. It might take me a little while.


---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-08-10 Thread Ethanlm
Github user Ethanlm commented on the issue:

https://github.com/apache/storm/pull/2789
  
That's right. Will take a look at non-static registry after #2764 and #2764


---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-08-10 Thread srdo
Github user srdo commented on the issue:

https://github.com/apache/storm/pull/2789
  
Before spending too long on it, consider if we should fix this issue and 
https://github.com/apache/storm/pull/2714 by making the metrics registry 
non-static instead (https://github.com/apache/storm/pull/2783). I think it will 
be much easier to do.


---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-08-10 Thread Ethanlm
Github user Ethanlm commented on the issue:

https://github.com/apache/storm/pull/2789
  
Tested it manually on a single-node cluster. `nimbus:num-shutdown-calls` 
was flushed now with this patch. Still need to figure out travis build failure 
before merging this in.


---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-08-10 Thread Ethanlm
Github user Ethanlm commented on the issue:

https://github.com/apache/storm/pull/2789
  
I am also getting ` 
org.apache.maven.surefire.booter.SurefireBooterForkException: Error occurred in 
starting fork, check output in log` on my VM. And the tests passed without this 
patch. Still need to investigate the issue


---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-08-07 Thread srdo
Github user srdo commented on the issue:

https://github.com/apache/storm/pull/2789
  
I've added calls to report to the branch, the tests are running at 
https://travis-ci.org/srdo/storm/builds/413332618.

Unless the test specifies another configuration file, I would expect us to 
be using https://github.com/apache/storm/blob/master/conf/defaults.yaml.

I also don't understand why heartbeatTimer would still be running in 
AsyncLocalizerTest. JUnit doesn't shut down the JVM between tests, so it would 
make sense that a thread could leak from another test into AsyncLocalizerTest, 
but it would require someone opening a Supervisor in a test somewhere and 
forgetting to close it.

Happy to help out.


---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-08-07 Thread zd-project
Github user zd-project commented on the issue:

https://github.com/apache/storm/pull/2789
  
Thanks again for helping me out of this issue though. Hope this don't 
create too much trouble for you.


---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-08-07 Thread zd-project
Github user zd-project commented on the issue:

https://github.com/apache/storm/pull/2789
  
Also, I just dug into the error code you suggested. It turned out to be 
caused by heartbeatTimer (instance of StormTimer) in Supervisor in 
AsyncLocalizerTest.testKeyNotFoundException. But I don't see where Supervisor 
is created. Super weird.

> 2018-08-07 16:38:11.217 [heartbeatTimer] ERROR 
org.apache.storm.daemon.supervisor.DefaultUncaughtExceptionHandler - Error when 
processing event on thread Thread[heartbeatTimer
,10,main]

...

> 2018-08-07 16:38:11.218 [heartbeatTimer] ERROR 
org.apache.storm.utils.Utils - val 202
2018-08-07 16:38:11.218 [heartbeatTimer] ERROR org.apache.storm.utils.Utils 
- Halting process: Error when processing an event
java.lang.RuntimeException: Halting process: Error when processing an event
at org.apache.storm.utils.Utils.exitProcess(Utils.java:474) 
[storm-client-2.0.0-SNAPSHOT.jar:?]
at 
org.apache.storm.daemon.supervisor.DefaultUncaughtExceptionHandler.uncaughtException(DefaultUncaughtExceptionHandler.java:25)
 [classes/:?]
at 
org.apache.storm.StormTimer$StormTimerTask.run(StormTimer.java:253) 
[storm-client-2.0.0-SNAPSHOT.jar:?]


---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-08-07 Thread zd-project
Github user zd-project commented on the issue:

https://github.com/apache/storm/pull/2789
  
Okay that alerted me one thing. What's our config file for the testing? If 
it doesn't use ScheduledReporter for testing, this PR shouldn't make a matter 
at all.


---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-08-07 Thread zd-project
Github user zd-project commented on the issue:

https://github.com/apache/storm/pull/2789
  
I can't see changes in your link, so I looked into your latest change in 
branch `non-static-metrics`. Please correct me if I interpret your code wrong.

Your implementation didn't actually flush the metrics for ScheduledReporter 
on exit. You want to add `reporter.report()` in ConsolePreparableReporter or 
CsvPreparableReporter and test it out. See my commit [STORM-3173: Addressing 
nits](https://github.com/apache/storm/pull/2789/commits/e5fb9d80f4af8fee2b3cff915c5401a1e0e0cde2#diff-65a7d8c48dacfbb072a99f50df9775c6R40)


---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-08-07 Thread srdo
Github user srdo commented on the issue:

https://github.com/apache/storm/pull/2789
  
It should. Take a look at 
https://github.com/srdo/storm/compare/892179527b5a...0916fae11cb2#diff-9c4945e8e924565ca1f071435f4711e9.
 If you compare the daemon classes (Nimbus, Supervisor etc.) there should be a 
call to stopMetricsReporters in all of them, that should stop the reporters on 
shutdown.


---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-08-07 Thread zd-project
Github user zd-project commented on the issue:

https://github.com/apache/storm/pull/2789
  
Interesting. Does your implementation also guarantee flushing all metrics 
upon exit?


---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-08-07 Thread srdo
Github user srdo commented on the issue:

https://github.com/apache/storm/pull/2789
  
Unless it turns out to be easy to fix the test errors, I'd like to propose 
not merging this and fixing this issue as part of making the metrics registry 
non-static. I added reporter shutdowns to all the daemons on the 
non-static-metrics branch, and it works with no need for bandaid fixes 
https://travis-ci.org/srdo/storm/builds/413283316. 


---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-08-07 Thread srdo
Github user srdo commented on the issue:

https://github.com/apache/storm/pull/2789
  
Right, sorry, didn't recheck the logs. 


---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-08-07 Thread zd-project
Github user zd-project commented on the issue:

https://github.com/apache/storm/pull/2789
  
The lastest failure didn't provide an error code though. 


---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-08-07 Thread srdo
Github user srdo commented on the issue:

https://github.com/apache/storm/pull/2789
  
No, but like I said, I suspect that the exit code is being set by a call to 
`Utils.exitProcess`. If you try changing some of the exit codes to be unique, 
it's likely you'll be able to tell which call to `Utils.exitProcess` is 
responsible for killing the JVM, which could help you tell what's happening.


---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-08-07 Thread zd-project
Github user zd-project commented on the issue:

https://github.com/apache/storm/pull/2789
  
I managed to fix the original timeout error by wrapping around a nimbus 
gauge with exception handling code. However the PR still failed the 
storm-server test due to error in starting fork. I don't really know what 
caused this. @srdo Have you seen anything like this before?

Additionally, I think our tests have some flaws in manipulating zookeeper 
as I constantly see a component tries to connect to Zookeeper that is not set 
up in the test, causing timeout warning in 
https://issues.apache.org/jira/browse/STORM-3128


---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-08-06 Thread srdo
Github user srdo commented on the issue:

https://github.com/apache/storm/pull/2789
  
The exit code is 20. Try looking for uses of `Utils.exitProcess` with `20` 
as the argument. If you change the values to be unique and rerun the tests, you 
should be able to nail down where the exit is happening.


---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-08-06 Thread zd-project
Github user zd-project commented on the issue:

https://github.com/apache/storm/pull/2789
  
Build logs report test crash [INFO] Running 
org.apache.storm.scheduler.resource.TestResourceAwareScheduler, but it's 
passing on my local machine. Don't know what's happening. It'll be great if you 
guys can offer some insight. @Ethanlm @srdo @HeartSaVioR 



---


[GitHub] storm issue #2789: STORM-3173: flush metrics to ScheduledReporter on shutdow...

2018-08-06 Thread zd-project
Github user zd-project commented on the issue:

https://github.com/apache/storm/pull/2789
  
https://issues.apache.org/jira/browse/STORM-3128 This bug in test appeared 
again and broke the test in this PR but I don't actually know the cause of it.


---