Github user d2r commented on a diff in the pull request:

    https://github.com/apache/storm/pull/349#discussion_r27413373
  
    --- Diff: storm-core/src/clj/backtype/storm/daemon/worker.clj ---
    @@ -302,9 +301,8 @@
                                port)
                               ]
                              )))
    -              (write-locked (:endpoint-socket-lock worker)
    -                (reset! (:cached-task->node+port worker)
    -                        (HashMap. my-assignment)))
    --- End diff --
    
    OK, I am convinced we do need the lock here.  The lock is not so much 
protecting `:cached-node+port->socket` as it is guaranteeing (along with the 
read-lock below) that anyone accessing the atom after we enter this lock will 
get the data in the new HashMap that has been swapped into it.
    
    In fact, it seems there could still be a race in which we send to an 
endpoint after it is closed, but before it is removed.  We might want to extend 
the write lock so that readers cannot get a mapping with non-null, closed 
connections:
    
    ```Diff
                   (write-locked (:endpoint-socket-lock worker)
                     (reset! (:cached-task->node+port worker)
    -                        (HashMap. my-assignment)))
    -              (doseq [endpoint remove-connections]
    -                (.close (get @(:cached-node+port->socket worker) endpoint)))
    -              (apply swap!
    -                     (:cached-node+port->socket worker)
    -                     #(HashMap. (apply dissoc (into {} %1) %&))
    -                     remove-connections)
    +                        (HashMap. my-assignment))
    +                (doseq [endpoint remove-connections]
    +                  (.close (get @(:cached-node+port->socket worker) endpoint)))
    +                (apply swap!
    +                       (:cached-node+port->socket worker)
    +                       #(HashMap. (apply dissoc (into {} %1) %&))
    +                       remove-connections))
    ```
    It is OK if packets are sent to a null IConnection, since TransferDrainer 
does a [null check](https://github.com/apache/storm/blob/36e99fa2dfdd13cd43d8fa8c558c670cd7750ed0/storm-core/src/jvm/backtype/storm/utils/TransferDrainer.java#L50).
  But it is not OK to try to send on a closed, but non-null, connection.
    
    Maybe it would be good to add documentation near `write-locked` explaining 
that it is a barrier to anyone else reading from the node+port->socket mapping. 
 Otherwise, it is not clear how it is helping.
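
    To make the intended invariant concrete, here is a minimal Java sketch of 
the locking pattern I have in mind. It is not the worker code: 
`EndpointLockSketch`, `Conn`, `connect`, `removeConnections` and `send` are 
hypothetical names, and `Conn` is only a stand-in for IConnection. The 
assumption is that senders hold the read lock while they look up and use a 
connection, and that close plus removal happen together under the write lock.

    ```java
    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    class EndpointLockSketch {
        /** Hypothetical stand-in for IConnection; sending after close is an error. */
        static final class Conn {
            private volatile boolean closed = false;
            void close() { closed = true; }
            void send(String msg) {
                if (closed) throw new IllegalStateException("send on closed connection");
            }
        }

        // Barrier: the write lock covers close AND removal, so a sender holding
        // the read lock sees either an open connection or no connection at all.
        private final ReentrantReadWriteLock endpointSocketLock = new ReentrantReadWriteLock();
        private volatile Map<String, Conn> nodePortToSocket = new HashMap<>();

        void connect(String endpoint) {
            endpointSocketLock.writeLock().lock();
            try {
                Map<String, Conn> next = new HashMap<>(nodePortToSocket);
                next.put(endpoint, new Conn());
                nodePortToSocket = next; // swap in a fresh map, like reset! on the atom
            } finally {
                endpointSocketLock.writeLock().unlock();
            }
        }

        void removeConnections(Iterable<String> endpoints) {
            endpointSocketLock.writeLock().lock();
            try {
                Map<String, Conn> next = new HashMap<>(nodePortToSocket);
                for (String endpoint : endpoints) {
                    Conn c = next.remove(endpoint);
                    if (c != null) c.close(); // closed while no reader can hold it
                }
                nodePortToSocket = next;
            } finally {
                endpointSocketLock.writeLock().unlock();
            }
        }

        /** Returns false when the endpoint has no connection (the null-check case). */
        boolean send(String endpoint, String msg) {
            endpointSocketLock.readLock().lock();
            try {
                Conn c = nodePortToSocket.get(endpoint);
                if (c == null) return false;
                c.send(msg); // cannot be closed here: closing needs the write lock
                return true;
            } finally {
                endpointSocketLock.readLock().unlock();
            }
        }

        public static void main(String[] args) {
            EndpointLockSketch worker = new EndpointLockSketch();
            worker.connect("node1:6700");
            System.out.println("sent=" + worker.send("node1:6700", "tuple"));
            worker.removeConnections(Arrays.asList("node1:6700"));
            System.out.println("sent=" + worker.send("node1:6700", "tuple"));
        }
    }
    ```

    With close and removal both inside the write lock, a sender can only ever 
see an open connection or a missing one. If the close happened outside the 
lock, as in the current diff, `send` could find a non-null connection that is 
already closed.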


