[GitHub] storm issue #2800: STORM-3162: Fix concurrent modification bug

2018-08-16 Thread srdo
Github user srdo commented on the issue: https://github.com/apache/storm/pull/2800 Fixed the nimbus_test https://github.com/zd-project/storm/pull/1 ---

[GitHub] storm issue #2800: STORM-3162: Fix concurrent modification bug

2018-08-25 Thread danny0405
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2800 Please make sure again why the `ConcurrentModificationException` happens and attach the stack trace. ---

[GitHub] storm issue #2800: STORM-3162: Fix concurrent modification bug

2018-08-25 Thread srdo
Github user srdo commented on the issue: https://github.com/apache/storm/pull/2800 I'm sorry if this turns a bit verbose, but I'm going to write down what I see as the issue here, so we can hopefully come to a common understanding (and so I don't forget and have to look at this again)

[GitHub] storm issue #2800: STORM-3162: Fix concurrent modification bug

2018-08-25 Thread danny0405
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2800 @srdo @zd-project Thx for your explanation, that make sense for me. One thing needs to clarify is that `executorBeats` parameter for `StatsUtil#updateHeartbeatCache` is null for every

[GitHub] storm issue #2800: STORM-3162: Fix concurrent modification bug

2018-08-26 Thread srdo
Github user srdo commented on the issue: https://github.com/apache/storm/pull/2800 I don't think fixing the `executorBeats == null` branch is enough. As far as I can tell, two supervisors/workers can be in the https://github.com/apache/storm/blob/4c42ee3d259d5d90a4e7d3445d1c119601eec6

[GitHub] storm issue #2800: STORM-3162: Fix concurrent modification bug

2018-08-26 Thread srdo
Github user srdo commented on the issue: https://github.com/apache/storm/pull/2800 Regarding performance, consider that Nimbus is already copying `heartbeatCache` on writes everywhere else https://github.com/apache/storm/blob/master/storm-server/src/main/java/org/apache/storm/daemon/n

[GitHub] storm issue #2800: STORM-3162: Fix concurrent modification bug

2018-08-26 Thread danny0405
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2800 @srdo Agree that copy is a better solution. ---

[GitHub] storm issue #2800: STORM-3162: Fix concurrent modification bug

2018-08-26 Thread srdo
Github user srdo commented on the issue: https://github.com/apache/storm/pull/2800 @danny0405 Could you elaborate on why fixing the `executorBeats == null` branch is enough? My concern is that the other branch modifies a HashMap (the `cache` parameter) from multiple threads with no sy

[GitHub] storm issue #2800: STORM-3162: Fix concurrent modification bug

2018-08-26 Thread danny0405
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2800 @srdo Cause only modify it through multi-threads but not iterator over it, and the cache key is executor-id, which will only have conflict between master and supervisor. ---

[GitHub] storm issue #2800: STORM-3162: Fix concurrent modification bug

2018-08-26 Thread srdo
Github user srdo commented on the issue: https://github.com/apache/storm/pull/2800 @danny0405 That doesn't sound safe to me. I think you're right that it works fine most of the time, but if there are key collisions or an insert leads the map to get resized, I would think that two thre

[GitHub] storm issue #2800: STORM-3162: Fix concurrent modification bug

2018-09-11 Thread srdo
Github user srdo commented on the issue: https://github.com/apache/storm/pull/2800 @zd-project I'd like to finish this up. Let me know if you want to make the last couple of fixes, otherwise I'll open a new PR containing this fix. @danny0405 I thought about it a bit more, and

[GitHub] storm issue #2800: STORM-3162: Fix concurrent modification bug

2018-09-11 Thread zd-project
Github user zd-project commented on the issue: https://github.com/apache/storm/pull/2800 Agreed. I think atomic reference is really just for Clojure compatibility there. I’ll finish this up. ---

[GitHub] storm issue #2800: STORM-3162: Fix concurrent modification bug

2018-09-11 Thread danny0405
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2800 @srdo I think the only difference is that compared to `AtomicReference`, `ConcurrentHashMap` can keep thread safe but can not ensure that the value read is up to date, which will cause some

[GitHub] storm issue #2800: STORM-3162: Fix concurrent modification bug

2018-09-12 Thread srdo
Github user srdo commented on the issue: https://github.com/apache/storm/pull/2800 @danny0405 Thanks for explaining. I'm not sure I understand why the scheduling thread will see older values with `ConcurrentHashMap` than with `AtomicReference`? It was my understanding that `Concurrent

[GitHub] storm issue #2800: STORM-3162: Fix concurrent modification bug

2018-09-12 Thread danny0405
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2800 @srdo Okey, i checkout the AtomicReference for jdk8 and get the code snippet: ```java public final V getAndUpdate(UnaryOperator updateFunction) { V prev, next; do

[GitHub] storm issue #2800: STORM-3162: Fix concurrent modification bug

2018-09-15 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2800 Please take a look at #2836 as an alternative. It is a much bigger patch, but I think the refactoring it does will make things much easier longer term. Having done the other patch I think this one

[GitHub] storm issue #2800: STORM-3162: Fix concurrent modification bug

2018-09-15 Thread zd-project
Github user zd-project commented on the issue: https://github.com/apache/storm/pull/2800 As far as my understanding of the code goes I believe this fix should resolve this particular issue. However my understanding in HB cache was limited and the PR was pushed hastily during my last f