Github user HeartSaVioR commented on the pull request:
https://github.com/apache/storm/pull/286#issuecomment-61719213
I've got a change to discuss about this PR with @clockfly , and he also
stated if subprocess is too busy, subprocess cannot send heartbeat in time,
which I've stated first of this PR.
Actually it's better to let subprocess have heartbeat thread and send
heartbeat periodically.
But there're two things to consider.
1. ShellSpout runs with PING-PONG communication, and ShellSpout must wait
"sync" from nextTuple(). So if we change ShellSpout to have reader thread, we
should implement nextTuple() to wait for reading "sync" from reader thread,
which is a little complex than current.
2. We should ensure that main thread and heartbeat thread don't write
stdout (maybe Pipe) at the same time. GIL could let us feel free, but there
will be other languages that support real (?) thread. Writing operation should
be with lock.
Since I'm not a Javascript (nodejs) guy, and I'm a beginner to Ruby, I
cannot cover two things with .js.
So I wish to implement it to other PR when we think we can't stand its
limitation, or I have some more time.
Btw, Nimbus / Supervisor can find dead process due to subprocess hang up to
SUPERVISOR_WORKER_TIMEOUT_SECS * 2 + a (maybe), cause there're two heartbeat
check, ShellProcess checks subprocess (and suicide if subprocess cannot
respond), Nimbus / Supervisor checks ShellProcess.
(Just for @clockfly )
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---