[
https://issues.apache.org/jira/browse/STORM-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508140#comment-14508140
]
ASF GitHub Bot commented on STORM-756:
--------------------------------------
GitHub user HeartSaVioR opened a pull request:
https://github.com/apache/storm/pull/532
[STORM-756] ShellBolt can delay sending tuples on demand from subprocess
Please refer to https://issues.apache.org/jira/browse/STORM-756 for more
details.
* introduce new multilang protocol: delay [seconds]
** command: "delay", msg: "seconds(number format)"
** python: storm.delay(seconds)
** ruby: Storm::Protocol.delay(seconds)
* expose pending queue size to let users check and request delay
** python: storm.getPendingQueueSize()
** ruby: Storm::Protocol.get_pending_queue_size()
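As a rough illustration of the message described above, the following sketch frames a "delay" request the way other multilang messages are framed: a JSON object followed by a line containing only "end". The helper name encode_delay is hypothetical and is not part of storm.py. Sending the seconds value as a string in "msg" is an assumption taken from the description above, not a confirmed wire format.

```python
import json


def encode_delay(seconds):
    """Build the text a subprocess might write to stdout to request a delay.

    Hypothetical helper: the field names follow the description
    (command: "delay", msg: seconds); the string encoding of the
    number is an assumption.
    """
    if seconds < 0:
        raise ValueError("delay must be non-negative")
    payload = {"command": "delay", "msg": str(seconds)}
    # Multilang messages are terminated by a line containing only "end".
    return json.dumps(payload) + "\nend\n"
```

A subprocess would write the returned text to stdout and flush it. In the patch itself this is presumably what storm.delay(seconds) does on the user's behalf.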
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/HeartSaVioR/storm STORM-756
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/storm/pull/532.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #532
----
commit 41b10d7f41801bfd12344c73f4c51383949521b0
Author: Jungtaek Lim <[email protected]>
Date: 2015-04-22T23:20:24Z
ShellBolt can delay sending tuples on demand from subprocess
* introduce new multilang protocol: delay [seconds]
** command: "delay", msg: "<seconds>"
** python: storm.delay(seconds)
** ruby: Storm::Protocol.delay(seconds)
* expose pending queue size to let users check and request delay
** python: storm.getPendingQueueSize()
** ruby: Storm::Protocol.get_pending_queue_size()
----
> [multilang] Introduce overflow control mechanism
> ------------------------------------------------
>
> Key: STORM-756
> URL: https://issues.apache.org/jira/browse/STORM-756
> Project: Apache Storm
> Issue Type: Improvement
> Affects Versions: 0.10.0, 0.9.4, 0.11.0
> Reporter: Jungtaek Lim
> Assignee: Jungtaek Lim
>
> This comes from a comment on STORM-738:
> https://issues.apache.org/jira/browse/STORM-738?focusedCommentId=14394106&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14394106
> A. ShellBolt side control
> We can modify ShellBolt to keep a list of sent tuple ids and stop sending tuples
> when the list exceeds a configured maximum size. To achieve this, the subprocess
> must notify ShellBolt when a tuple id is complete.
> * It introduces a new multi-lang command, "proceed" (or a better name).
> * ShellBolt stores a list of tuples that are still being processed.
> * The overhead could be large, since the subprocess must notify ShellBolt
> every time a tuple is processed.
> B. subprocess side control
> We can modify the subprocess to check its pending queue after reading a tuple.
> If the queue exceeds a configured maximum size, the subprocess can send "delay" to
> ShellBolt to slow it down.
> When ShellBolt receives "delay", BoltWriterRunnable should stop polling the
> pending queue and resume polling later.
> How long should ShellBolt wait before resuming? The unit could be "delay time" or
> "tuple count". I don't know yet which is better.
> * It introduces a new multi-lang command, "delay" (or a better name).
> * I don't think it will be introduced soon, but the subprocess could request a delay
> based on its own statistics (e.g. pending tuple count * average tuple processing
> time for the time unit, average pending tuple count for the count unit).
> ** We can leave when and how much "delay" to request up to the user. Users can write
> their own algorithm to control flooding.
> In my opinion B seems more natural, because the current issue originates on the
> subprocess side, so it would be better to let the subprocess handle it.
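The time-unit heuristic mentioned in option B (pending tuple count * average tuple processing time) could be sketched roughly as follows. The function name, its parameters, and the max_pending threshold are all hypothetical. The issue text leaves the actual policy to the user, so this is only one possible reading.

```python
def suggest_delay_seconds(pending_count, avg_process_secs, max_pending):
    """Return how many seconds to ask ShellBolt to pause, or 0.0 for none.

    Hypothetical policy: request a delay only once the pending queue
    exceeds max_pending, sized as the estimated time needed to drain
    the tuples currently waiting.
    """
    if pending_count <= max_pending:
        return 0.0
    # Estimate from the issue text: pending tuples times the average
    # per-tuple processing time.
    return pending_count * avg_process_secs
```

With 20 pending tuples, an average of 0.5 seconds per tuple, and a limit of 10, this would suggest a 10-second delay. A subprocess could feed it the value exposed by storm.getPendingQueueSize() and pass a non-zero result to storm.delay(seconds).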
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)