[ https://issues.apache.org/jira/browse/STORM-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ethan Li reassigned STORM-3620:
-------------------------------

    Assignee: Ethan Li

> OutputCollector in Storm 2.x is not thread-safe
> -----------------------------------------------
>
>                 Key: STORM-3620
>                 URL: https://issues.apache.org/jira/browse/STORM-3620
>             Project: Apache Storm
>          Issue Type: Bug
>            Reporter: Ethan Li
>            Assignee: Ethan Li
>            Priority: Major
>
> OutputCollector is not thread-safe in 2.x.
> It can cause data corruption if multiple threads in the same executor call
> OutputCollector to emit data at the same time:
> 1. Every executor has an instance of ExecutorTransfer
> https://github.com/apache/storm/blob/00f48d60e75b28e11a887baba02dc77876b2bb3d/storm-client/src/jvm/org/apache/storm/executor/Executor.java#L146
> 2. Every ExecutorTransfer has its own serializer
> https://github.com/apache/storm/blob/00f48d60e75b28e11a887baba02dc77876b2bb3d/storm-client/src/jvm/org/apache/storm/executor/ExecutorTransfer.java#L44
> 3. Every executor has its own outputCollector
> https://github.com/apache/storm/blob/00f48d60e75b28e11a887baba02dc77876b2bb3d/storm-client/src/jvm/org/apache/storm/executor/bolt/BoltExecutor.java#L146-L147
> 4. When outputCollector is called to emit to remote workers, it uses
> ExecutorTransfer to transfer data
> https://github.com/apache/storm/blob/00f48d60e75b28e11a887baba02dc77876b2bb3d/storm-client/src/jvm/org/apache/storm/executor/ExecutorTransfer.java#L66
> 5. which will try to serialize data
> https://github.com/apache/storm/blob/00f48d60e75b28e11a887baba02dc77876b2bb3d/storm-client/src/jvm/org/apache/storm/daemon/worker/WorkerTransfer.java#L116
> 6. But serializer is not thread-safe
> https://github.com/apache/storm/blob/00f48d60e75b28e11a887baba02dc77876b2bb3d/storm-client/src/jvm/org/apache/storm/serialization/KryoTupleSerializer.java#L33-L43
> ----
> But in the doc, http://storm.apache.org/releases/2.1.0/Concepts.html, it says
> outputCollector is thread-safe.
> {code:java}
> Its perfectly fine to launch new threads in bolts that do processing
> asynchronously. OutputCollector is thread-safe and can be called at any time.
> {code}
> We should either fix it to make it thread-safe, or update the document to not
> mislead users



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
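
For readers hitting this in the meantime, one possible workaround (not something the issue itself prescribes) is to route every emit made from helper threads through a single lock, so that the shared per-executor serializer is never entered concurrently. The sketch below is a minimal, self-contained illustration of that idea. It deliberately does not use the Storm API: `Emitter`, `UnsafeEmitter`, `LockedEmitter` and `emitFromThreads` are hypothetical names invented for this example, and `UnsafeEmitter` only stands in for a collector whose emit path is not thread-safe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SynchronizedEmitSketch {

    /** Hypothetical stand-in for a collector interface; not a Storm type. */
    interface Emitter {
        void emit(List<Object> tuple);
    }

    /** Not thread-safe: appends to a plain ArrayList with no synchronization. */
    static final class UnsafeEmitter implements Emitter {
        private final List<List<Object>> sent = new ArrayList<>();

        @Override
        public void emit(List<Object> tuple) {
            sent.add(tuple);
        }

        int sentCount() {
            return sent.size();
        }
    }

    /** Wrapper that lets only one thread at a time into the delegate's emit. */
    static final class LockedEmitter implements Emitter {
        private final Emitter delegate;
        private final Object lock = new Object();

        LockedEmitter(Emitter delegate) {
            this.delegate = delegate;
        }

        @Override
        public void emit(List<Object> tuple) {
            synchronized (lock) {
                delegate.emit(tuple);
            }
        }
    }

    /** Emits from several threads through the locked wrapper; returns the tuple count. */
    static int emitFromThreads(int threads, int perThread) {
        UnsafeEmitter raw = new UnsafeEmitter();
        Emitter safe = new LockedEmitter(raw);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.submit(() -> {
                for (int i = 0; i < perThread; i++) {
                    safe.emit(Arrays.<Object>asList(id, i));
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("emitter threads did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return raw.sentCount();
    }

    public static void main(String[] args) {
        System.out.println("tuples emitted: " + emitFromThreads(4, 10_000));
    }
}
```

In a real bolt the same pattern would presumably mean that every thread, including the executor thread running `execute`, takes the same lock before touching the collector. Whether that is sufficient for a given Storm 2.x version is an assumption that would need to be checked against the code paths listed in the issue.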