Github user d2r commented on a diff in the pull request:

https://github.com/apache/storm/pull/349#discussion_r27413373

--- Diff: storm-core/src/clj/backtype/storm/daemon/worker.clj ---
@@ -302,9 +301,8 @@
                     port)
                   ]
                 )))
-          (write-locked (:endpoint-socket-lock worker)
-            (reset! (:cached-task->node+port worker)
-                    (HashMap. my-assignment)))
--- End diff --

OK, I am convinced we do need the lock here. The lock is not so much protecting `:cached-node+port->socket` as it is guaranteeing (along with the read-lock below) that anyone accessing the atom after we enter this lock will get the data in the new HashMap that has been swapped into it.

In fact, it seems there could still be a race where we send to an endpoint after it is closed, but before it is removed. We might want to extend the write lock to protect readers from getting a mapping with non-null, closed connections:

```Diff
           (write-locked (:endpoint-socket-lock worker)
             (reset! (:cached-task->node+port worker)
-                    (HashMap. my-assignment)))
-          (doseq [endpoint remove-connections]
-            (.close (get @(:cached-node+port->socket worker) endpoint)))
-          (apply swap!
-                 (:cached-node+port->socket worker)
-                 #(HashMap. (apply dissoc (into {} %1) %&))
-                 remove-connections)
+                    (HashMap. my-assignment))
+            (doseq [endpoint remove-connections]
+              (.close (get @(:cached-node+port->socket worker) endpoint)))
+            (apply swap!
+                   (:cached-node+port->socket worker)
+                   #(HashMap. (apply dissoc (into {} %1) %&))
+                   remove-connections))
```

It is OK if the packets are sent to a null IConnection, since TransferDrainer does a [null check](https://github.com/apache/storm/blob/36e99fa2dfdd13cd43d8fa8c558c670cd7750ed0/storm-core/src/jvm/backtype/storm/utils/TransferDrainer.java#L50). But it is not OK to try to send with a closed, but non-null, connection.

Maybe it would be good to add documentation near `write-locked` explaining that it is a barrier to anyone else reading from the node+port->socket mapping. Otherwise, it is not clear how it is helping.
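To make the barrier idea concrete, here is a minimal Java sketch of the pattern I have in mind. It is not the worker code: `EndpointCache` and `FakeConnection` are hypothetical names standing in for the node+port->socket mapping and for `IConnection`, and I am assuming the lock behaves like a plain `java.util.concurrent.locks.ReentrantReadWriteLock`. The point is only that closing and removing happen inside one write-locked section, while senders look up and send under the read lock, so a sender sees either a live connection or null.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical cache playing the role of the node+port->socket mapping. */
class EndpointCache {

    /** Hypothetical stand-in for a connection; this is not Storm's IConnection. */
    static class FakeConnection {
        private volatile boolean closed = false;
        private final AtomicInteger sent = new AtomicInteger();

        void send(String msg) {
            if (closed) {
                // This is the failure the write lock is meant to rule out.
                throw new IllegalStateException("send on closed connection");
            }
            sent.incrementAndGet();
        }

        void close() { closed = true; }
        boolean isClosed() { return closed; }
        int sentCount() { return sent.get(); }
    }

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Only read or replaced while holding the lock.
    private Map<String, FakeConnection> sockets = new HashMap<>();

    void put(String endpoint, FakeConnection conn) {
        lock.writeLock().lock();
        try {
            Map<String, FakeConnection> next = new HashMap<>(sockets);
            next.put(endpoint, conn);
            sockets = next;
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Close and remove in ONE write-locked section: no reader can see a closed, non-null entry. */
    void closeAndRemove(List<String> endpoints) {
        lock.writeLock().lock();
        try {
            Map<String, FakeConnection> next = new HashMap<>(sockets);
            for (String endpoint : endpoints) {
                FakeConnection conn = next.remove(endpoint);
                if (conn != null) {
                    conn.close();
                }
            }
            sockets = next;
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Returns true if the message was sent, false if the endpoint is gone (the null case). */
    boolean send(String endpoint, String msg) {
        lock.readLock().lock();
        try {
            FakeConnection conn = sockets.get(endpoint);
            if (conn == null) {
                return false; // a missing endpoint is tolerated, as with the null check
            }
            conn.send(msg);
            return true;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

With this shape, a `send` that starts after `closeAndRemove` has taken the write lock can only observe the endpoint as missing. If the close were done outside the lock, as in the current diff, a sender could still pick up the closed connection before the map is updated.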