[ https://issues.apache.org/jira/browse/STORM-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Petr Janeček updated STORM-3422:
--------------------------------
    Description:
Marking this as Major, but the problem lies in testing code. This makes integration testing hard, but the issue does not affect any production code.

First, let me show you a stack trace for Storm 2.0.0:
{{java.lang.RuntimeException: java.lang.NullPointerException}}
{{at org.apache.storm.executor.Executor.accept(Executor.java:282) ~[storm-client-2.0.0.jar:2.0.0]}}
{{at org.apache.storm.utils.JCQueue.consumeImpl(JCQueue.java:133) ~[storm-client-2.0.0.jar:2.0.0]}}
{{at org.apache.storm.utils.JCQueue.consume(JCQueue.java:110) ~[storm-client-2.0.0.jar:2.0.0]}}
{{at org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:171) ~[storm-client-2.0.0.jar:2.0.0]}}
{{at org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:158) ~[storm-client-2.0.0.jar:2.0.0]}}
{{at org.apache.storm.utils.Utils$1.run(Utils.java:388) [storm-client-2.0.0.jar:2.0.0]}}
{{at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]}}
{{Caused by: java.lang.NullPointerException}}
{{at org.apache.storm.testing.TupleCaptureBolt.execute(TupleCaptureBolt.java:45) ~[storm-client-2.0.0.jar:2.0.0]}}
{{at org.apache.storm.executor.bolt.BoltExecutor.tupleActionFn(BoltExecutor.java:234) ~[storm-client-2.0.0.jar:2.0.0]}}
{{at org.apache.storm.executor.Executor.accept(Executor.java:275) ~[storm-client-2.0.0.jar:2.0.0]}}
{{... 6 more}}

Here's the same for Storm 1.2.2:
{{java.lang.RuntimeException: java.lang.NullPointerException}}
{{at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:522) ~[storm-core-1.2.2.jar:1.2.2]}}
{{at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:487) ~[storm-core-1.2.2.jar:1.2.2]}}
{{at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:74) ~[storm-core-1.2.2.jar:1.2.2]}}
{{at org.apache.storm.daemon.executor$fn__10795$fn__10808$fn__10861.invoke(executor.clj:861) ~[storm-core-1.2.2.jar:1.2.2]}}
{{at org.apache.storm.util$async_loop$fn__553.invoke(util.clj:484) [storm-core-1.2.2.jar:1.2.2]}}
{{at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]}}
{{at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]}}
{{Caused by: java.lang.NullPointerException}}
{{at org.apache.storm.testing.TupleCaptureBolt.execute(TupleCaptureBolt.java:50) ~[storm-core-1.2.2.jar:1.2.2]}}
{{at org.apache.storm.daemon.executor$fn__10795$tuple_action_fn__10797.invoke(executor.clj:739) ~[storm-core-1.2.2.jar:1.2.2]}}
{{at org.apache.storm.daemon.executor$mk_task_receiver$fn__10716.invoke(executor.clj:468) ~[storm-core-1.2.2.jar:1.2.2]}}
{{at org.apache.storm.disruptor$clojure_handler$reify__10135.onEvent(disruptor.clj:41) ~[storm-core-1.2.2.jar:1.2.2]}}
{{at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:509) ~[storm-core-1.2.2.jar:1.2.2]}}
{{... 6 more}}

This is a topology running as our integration test using {{Testing.completeTopology()}}. Both the stack traces point to the same code in the {{TupleCaptureBolt}} - its {{name}} field is not safely published (it should be marked {{final}}), and the internal {{HashMap}} does not safely store the data put in it. Perhaps it should be a {{ConcurrentHashMap}}?

Would you accept a PR with a more detailed analysis, or are you going to investigate on your side?
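To illustrate the kind of change suggested above, here is a minimal standalone sketch. It does not use Storm's classes: {{CaptureStore}}, {{CAPTURED}} and {{capture()}} are hypothetical names standing in for the bolt, its shared map and {{execute()}}, and plain strings stand in for tuples. It only shows the two proposed ingredients, a {{final}} name field and concurrent collections, and is not meant as the actual patch.

```java
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for a capturing bolt; not Storm's actual TupleCaptureBolt.
final class CaptureStore {
    // Shared registry keyed by instance name; ConcurrentHashMap makes put/get safe across threads.
    static final Map<String, Map<String, List<String>>> CAPTURED = new ConcurrentHashMap<>();

    // final: covered by the Java memory model's initialization-safety guarantee,
    // so other threads cannot observe a null name once the constructor has finished.
    private final String name;

    CaptureStore() {
        name = UUID.randomUUID().toString();
        CAPTURED.put(name, new ConcurrentHashMap<>());
    }

    String name() {
        return name;
    }

    // Plays the role of execute(): record one value under its source component.
    void capture(String component, String value) {
        CAPTURED.get(name)
                // Atomic "create the list if missing" instead of containsKey + put.
                .computeIfAbsent(component, k -> new CopyOnWriteArrayList<>())
                .add(value);
    }

    public static void main(String[] args) throws InterruptedException {
        CaptureStore store = new CaptureStore();
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    store.capture("spout", "v" + i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        // 4 threads x 1000 captures each; nothing is lost with the concurrent collections.
        System.out.println(CAPTURED.get(store.name()).get("spout").size());
    }
}
```

Running {{main}} should print 4000. {{CopyOnWriteArrayList}} is chosen here only for brevity; a synchronized list would likely suit a write-heavy capture path better.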
> TupleCaptureBolt seems to be not thread-safe
> --------------------------------------------
>
>                 Key: STORM-3422
>                 URL: https://issues.apache.org/jira/browse/STORM-3422
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-client
>    Affects Versions: 2.0.0, 1.2.2
>            Reporter: Petr Janeček
>            Priority: Major