[jira] [Commented] (STORM-167) proposal for storm topology online update
[ https://issues.apache.org/jira/browse/STORM-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696542#comment-14696542 ]

Samuel Hsieh commented on STORM-167:

Hi Parth Brahmbhatt,
If you need any help with validation, please feel free to let us know. Many thanks for your great contribution!

proposal for storm topology online update

Key: STORM-167
URL: https://issues.apache.org/jira/browse/STORM-167
Project: Apache Storm
Issue Type: New Feature
Reporter: James Xu
Assignee: Parth Brahmbhatt
Priority: Minor

https://github.com/nathanmarz/storm/issues/540

Currently, topology code can only be updated by killing the topology and re-submitting a new one. During the kill and re-submit process, some requests may be delayed or fail, which is not good for an online service. So we are considering adding online topology update.

Mission
Update the code of a running topology gracefully, one worker after another, without totally interrupting the service. Only the topology code is updated, not the topology DAG structure (components, streams, and task numbers).

Proposal
* The client uses "storm update topology-name new-jar-file" to submit an update request for new-jar-file.
* Nimbus updates the stormdist dir and links topology-dir to the new one.
* Nimbus updates the topology version on zk.
* The supervisors running this topology update it:
** Check the topology version on zk; if it is not the same as the local version, a topology update begins.
** Each supervisor schedules the update of the topology's workers at a rand(expect-max-update-time) time point.
** sync-supervisor downloads the latest code from nimbus.
** sync-process checks the local worker heartbeat version (to be added); if it is not the same as the version downloaded by sync-supervisor, it kills the worker.
** sync-process restarts the killed worker.
** The new worker heartbeats to zk with its version (to be added), which can be displayed on the web UI to check update progress.

This feature is deployed in our production clusters. It is really useful for topologies that handle online requests waiting for a response: the topology jar can be updated without taking the entire service offline. We hope that this feature is useful for others too.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
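
The supervisor-side steps of the proposal can be sketched roughly as follows. This is only an in-memory illustration under stated assumptions, not code from the patch or from Storm itself: the names Supervisor, Worker, rolling_update and expect_max_update_secs are hypothetical, the topology version on zk is modelled as a plain integer, and "download", "kill" and "restart" are simulated by updating fields.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Worker:
    port: int
    heartbeat_version: int  # version the worker reports in its heartbeat (hypothetical field)


@dataclass
class Supervisor:
    name: str
    local_version: int
    workers: List[Worker] = field(default_factory=list)

    def plan_update(self, zk_version: int, expect_max_update_secs: float,
                    rng: random.Random) -> Optional[float]:
        """Return a random delay before updating, or None if already up to date."""
        if zk_version == self.local_version:
            return None
        return rng.uniform(0, expect_max_update_secs)

    def sync_supervisor(self, zk_version: int) -> None:
        """Stand-in for downloading the latest code from nimbus."""
        self.local_version = zk_version

    def sync_process(self) -> List[int]:
        """Replace workers whose heartbeat version differs from the downloaded one."""
        restarted = []
        for i, worker in enumerate(self.workers):
            if worker.heartbeat_version != self.local_version:
                # "Kill" the stale worker and "restart" it on the same port
                # with the new version in its heartbeat.
                self.workers[i] = Worker(worker.port, self.local_version)
                restarted.append(worker.port)
        return restarted


def rolling_update(supervisors: List[Supervisor], zk_version: int,
                   expect_max_update_secs: float,
                   seed: Optional[int] = None) -> List[Tuple[float, str, List[int]]]:
    """Simulate the update; returns (delay, supervisor name, restarted ports) in time order."""
    rng = random.Random(seed)
    schedule = []
    for sup in supervisors:
        delay = sup.plan_update(zk_version, expect_max_update_secs, rng)
        if delay is not None:
            schedule.append((delay, sup))
    log = []
    # Supervisors act in order of their randomly chosen time points.
    for delay, sup in sorted(schedule, key=lambda item: item[0]):
        sup.sync_supervisor(zk_version)
        log.append((delay, sup.name, sup.sync_process()))
    return log


if __name__ == "__main__":
    cluster = [
        Supervisor("s1", 1, [Worker(6700, 1), Worker(6701, 1)]),
        Supervisor("s2", 1, [Worker(6700, 1)]),
    ]
    for delay, name, ports in rolling_update(cluster, zk_version=2,
                                             expect_max_update_secs=60.0):
        print(f"t+{delay:.1f}s {name} restarted workers on ports {ports}")
```

Spreading the supervisors over a random time window is what keeps the service from being interrupted all at once: at any moment only some of the workers are restarting. In this sketch a supervisor restarts all of its stale workers together; staggering individual workers would need a per-worker delay instead.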
[jira] [Commented] (STORM-167) proposal for storm topology online update
[ https://issues.apache.org/jira/browse/STORM-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680283#comment-14680283 ]

Parth Brahmbhatt commented on STORM-167:

Sorry, I have been busy with some other stuff at work. Let me see if I can finish the work in the coming 2 weeks.
[jira] [Commented] (STORM-167) proposal for storm topology online update
[ https://issues.apache.org/jira/browse/STORM-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14679476#comment-14679476 ]

Samuel Hsieh commented on STORM-167:

Hi Parth Brahmbhatt,
Is there any news on this? It's an important feature; may we raise its priority? Or what support can we offer for the feature/design approval? Thanks.
[jira] [Commented] (STORM-167) proposal for storm topology online update
[ https://issues.apache.org/jira/browse/STORM-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564443#comment-14564443 ]

Roland Jungnickel commented on STORM-167:

Hi [~parth.brahmbhatt],
Just wondering if there is any update on this feature?
Thanks
Roland
[jira] [Commented] (STORM-167) proposal for storm topology online update
[ https://issues.apache.org/jira/browse/STORM-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14565030#comment-14565030 ]

Parth Brahmbhatt commented on STORM-167:

Hey,
I haven't had time to look into this yet, and I am busy with some other stuff, so I probably won't get to it in the next 2 months. If someone else wants to take it up, please feel free to do so.
[jira] [Commented] (STORM-167) proposal for storm topology online update
[ https://issues.apache.org/jira/browse/STORM-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14369823#comment-14369823 ]

Parth Brahmbhatt commented on STORM-167:

This is a useful feature and I see a lot of user interest. [~xiaokang] Thanks for the original patch and I am not sure why it was not reviewed. Do you think you can upmerge this with storm/master? If not, do you mind if I take over this task?
[jira] [Commented] (STORM-167) proposal for storm topology online update
[ https://issues.apache.org/jira/browse/STORM-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162240#comment-14162240 ]

Dane Hammer commented on STORM-167:

I need this. There's already a lot of work done. What can I do to get this merged to master? Looks like some rebasing and correcting merge conflicts is in order, but what about approval of the design - is this how we want to do this?