[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-19 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 I merged this into master and squashed the commits so I am closing the pull request, but will leave the JIRA open so we can look into pulling it back into 1.x as well. --- If your project is set up

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-19 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 @harshach it has been 6 hours so I am going to check this in because I don't want to have to try and keep it up to date as the code base changes. If you do find anything as time goes on I will be ve

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-19 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 @harshach I have +1 from enough people to merge this in and would really like to move forward with it, but if you are still looking at it I want to be sure you get a chance to finish. Please let me

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-15 Thread srdo
Github user srdo commented on the issue: https://github.com/apache/storm/pull/1642 @revans2 Thanks, looks great. +1 for merging. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature e

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-15 Thread ptgoetz
Github user ptgoetz commented on the issue: https://github.com/apache/storm/pull/1642 I have a few style-related nits, but I've refrained from pointing them out because the style(s) in the codebase are all over the place, and the style is inherited in some places. If we want to get ni

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-15 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 Not sure why travis has so many issues with the build and maven but here is my travis setup on the same branch with the build passing. https://travis-ci.org/revans2/incubator-storm/builds/160

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-15 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 @srdo Pushed a new version that hopefully addresses your review comments. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your proje

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-15 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 @srdo on the `newAssignment.compareAndSet` you are 100% correct. I originally had it in there to clear out the assignment when it is processed, but then realized that each time a new assignment is

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-15 Thread ptgoetz
Github user ptgoetz commented on the issue: https://github.com/apache/storm/pull/1642 @revans2 I'm also in the process of reviewing this. I hope to finish by the end of the day. So far everything looks good to me. --- If your project is set up for it, you can reply to this email and

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-15 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 @harshach did you get a chance to take a look yet? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this f

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-13 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 The Travis failures are from hive dependencies not being downloaded. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project do

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-13 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1642 @revans2 Thanks for the great work. Really appreciated. +1 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-13 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 @HeartSaVioR I found the original comment and put it the change. I also fixed some other review comments from @abellina and also fixed an annoying failure with drpc_auth_test.clj not being able to b

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-13 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1642 @harshach There's already an another pull request for fixing race condition on old supervisor. Given that we can't set a due date for 2.0 (and maintenance date for 1.x) it would be better to have

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-13 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 @HeartSaVioR @harshach Once this goes in I would be happy to make a version for 1.x In fact we are in the process of pulling it back for the version we use internally, and I was able to keep

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-13 Thread harshach
Github user harshach commented on the issue: https://github.com/apache/storm/pull/1642 @HeartSaVioR do we want this for 1.x branch? Given that it's a java migration and pretty much a new feature that can get into 1.x-branch. I would rather keep 1.x-branch as it is. @revans2 will g

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-13 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1642 And I saw some places using lambda which can't be applied to 1.x. @revans2 Could you craft a pull request for 1.x branch (might also 1.0.x) too when we're sure it's good to merge? --- If yo

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-13 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1642 I've left a comment of regression issue, but other manual tests passed including failed things. I'd be +1 once regression issue is resolved. Great works. --- If your project is set up for it

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-06 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 @knusbaum @srdo @harshach @HeartSaVioR @abellina I really would appreciate it if you could take another look at this patch. I think everything is ready to go except a final squash before merging it.

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-05 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 Not sure what has been happening with travis not being able to get to the apache maven repo all the time, but my build in travis passed https://travis-ci.org/revans2/incubator-storm/builds/157725399

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-05 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 I could not reproduce the one failure and the rat failure is fixed by STORM-2054 https://github.com/apache/storm/pull/1648 --- If your project is set up for it, you can reply to this email and have

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-05 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 Looks like there were two failures in travis. One is a rat issue with storm-submit-tools the other is an integration test that timed out. I'll try to reproduce the issues and see if I can fix them.

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-05 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 I recovered it and fixed some issues with integration tests/rat. I think it should be good to go. --- If your project is set up for it, you can reply to this email and have your reply appear on Git

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-05 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 @srdo Ya that is what I did. Thanks for the advice. Not sure what happened somehow when I upmerged all but 2 of my commits disappeared. --- If your project is set up for it, you can reply to thi

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-05 Thread srdo
Github user srdo commented on the issue: https://github.com/apache/storm/pull/1642 Check the ref log, maybe you can recover https://git-scm.com/docs/git-reflog --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your proje

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-05 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 Sorry, git just ate all of my more recent changes, and I have no idea what happened. I'll try to see if I can recover them, but I might have to start over... --- If your project is set up for it,

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-05 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 I just pushed in a much of new fixes and addressed all of the outstanding review comments. I think it is good to go. Please take another look/test it and let me know. I have run tests with

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-03 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 @HeartSaVioR I figured out what happened to make the supervisor crash. In the transition from RUNNING to KILL to speed things up slot starts localizing resources for the new assignment. In the tran

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-02 Thread d2r
Github user d2r commented on the issue: https://github.com/apache/storm/pull/1642 I was able to review most of the code. I skimmed the tests and didn't see anything strange. @revans2 knows that I will not be able to log on again for awhile. I want to leave a note for other

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-02 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 @HeartSaVioR Yes it looks like I need to think through recovery and any races with the AsyncLocalizer a bit more. I'll try to reproduce your error. --- If your project is set up for it, you can rep

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-02 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1642 I found another issue: When I rebalance 3 workers into 1 worker, all workers are killed first (expected) and AsyncLocalizer clear out topology codes since all workers are killed. But Su

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-02 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1642 With my patch (symlink issue and NPE issue) I can see workers launched and killed by Supervisor V2. (remote) Tested: - kill worker process with -9 - rebalance with different worke

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-02 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1642 Found other issue: Constructor of BasicContainer (in fact constructor of Container) throws FNF when topology dist files are deleted. So even though Slot is creating with recover mode, Superv

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-02 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1642 I merged this in my local, and do some tests, and see: 1. integration test fails from my dev. machine - I'm using OSX 10.11, Java 1.8.0_66 - I got wrong IP address from integr

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-09-01 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 I think I have addressed most of the issues so far. I have been running some manual tests and have run a cluster with run as user and cgroup enforcement enabled. I plan on doing some more tests, bu

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-08-31 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 @HeartSaVioR take your time. I want to be sure that I have plenty of eyeballs looking at this. We are doing this mostly because we started to run into a lot of race conditions in the supervisor. W

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-08-31 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1642 I'm still reviewing this. Nice work to refactor to introduce greater readability. Will leave comment here once I'm done with first pass. --- If your project is set up for it, you can reply to th

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-08-30 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 @abellina I addressed all of your comments. I also have manually run all of the unit tests and they are passing for me, despite the travis issues. --- If your project is set up for it, you can repl

[GitHub] storm issue #1642: STORM-2018: Supervisor V2.

2016-08-29 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1642 @d2r @abellina @harshach @knusbaum I believe that I am done and that the code is ready to merge. I have addressed all of the review comments to date. I have finished with updating the uni