Github user clockfly commented on the pull request:
https://github.com/apache/storm/pull/268#issuecomment-67047625
@nathanmarz ,
I'd like to explain why I need to change worker.clj.
This was also motivated by a legacy TODO in in zmq.clj.
https://github.com/nathanmarz/storm/blob/0.8.2/src/clj/backtype/storm/messaging/zmq.clj#L43
```
(send [this task message]
...
(mq/send socket message ZMQ/NOBLOCK)) ;; TODO: how to do backpressure
if doing noblock?... need to only unblock if the target disappears
```
As we can see, zeromq transport will send message in non-blocking way.
If I understand this TODO correctly, it wants,
a) When target worker is not booted yet, the source worker should not send
message to target. Otherwise, as there is no backpressure, there will be
message loss during the bootup phase. If it is un unacked topology, the message
loss is permanent; if it is an acked topology, we will need to do unnecessary
replay.
b) When target worker disappears in the middle(crash?), the source worker
should drop the messages directly.
The problem is that: transport layer don't know by itself whether the
target worker is "booting up" or "crashed in the running phase", so it cannot
smartly choose between "back pressure" or "drop".
If the transport simplifiy choose "block", it is good for "booting up"
phase, but bad for "running phase". If one connection is down, it may block
messages sent to other connections.
If the transport simplify choose "drop", it is good for "running phase",
but bad for "booting up" phase. If the target worker is booted 30 seconds
later, all message between this 30 seconds will be dropped.
The changes in "worker.clj" is targeted to solve this problem.
Worker knows when the target worker connections are ready.
In the bootup phase, worker.clj will wait target worker connection is
ready, then it will activate the source worker tasks.
In the âruntime phase", the transport will simply drop the messages if
target worker crashed in the middle.
There will be several benefits:
1. During cluster bootup, for unacked topology, there will be no strange
message loss.
2. During cluster bootup, for acked topology, it can take less time to
reach the normal throughput, as there is no message loss, timeout, and replay.
3. For transport layer, the design is simplified. We can just drop the
messages if target worker is not available.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---