[ 
https://issues.apache.org/jira/browse/STORM-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15001353#comment-15001353
 ] 

ASF GitHub Bot commented on STORM-794:
--------------------------------------

Github user HeartSaVioR commented on the pull request:

    https://github.com/apache/storm/pull/542#issuecomment-155947307
  
    @revans2 Upmerged.


> Trident Topology with some situation seems not handle deactivate during 
> graceful shutdown
> -----------------------------------------------------------------------------------------
>
>                 Key: STORM-794
>                 URL: https://issues.apache.org/jira/browse/STORM-794
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-core
>    Affects Versions: 0.9.3
>            Reporter: Jungtaek Lim
>            Assignee: Jungtaek Lim
>
> I met an issue from Trident Topology in production env.
> Normally, when we kill a topology via UI, Nimbus changes Topology status to 
> "killed", and when Spout determines new status, it becomes deactivated so 
> bolts can handle remain tuples within wait-time.
> AFAIK that's how Storm guarantees graceful shutdown.
> But, Trident Topology seems not handle "deactivate" while we try shutdown 
> topology gracefully.
> MasterBatchCoordinator never stops making next transaction, so Trident Spout 
> never stops emitting, bolts (function) always take care of tuples.
> Topology setting
> - 1 worker, 1 acker
> - max spout pending: 1
> - TOPOLOGY_TRIDENT_BATCH_EMIT_INTERVAL_MILLIS : 5
> -- It may be weird but MasterBatchCoordinator's default value is 1
> * Nimbus log
> {code}
> 2015-04-20 09:59:07.954 INFO  [pool-5-thread-41][nimbus] Delaying event 
> :remove for 120 secs for BFDC-topology-DynamicCollect-68c9d7b4-72-1429491015
> ...
> 2015-04-20 09:59:07.955 INFO  [pool-5-thread-41][nimbus] Updated 
> BFDC-topology-DynamicCollect-68c9d7b4-72-1429491015 with status {:type 
> :killed, :kill-time-secs 120}
> ...
> 2015-04-20 10:01:07.956 INFO  [timer][nimbus] Killing topology: 
> BFDC-topology-DynamicCollect-68c9d7b4-72-1429491015
> ...
> 2015-04-20 10:01:14.448 INFO  [timer][nimbus] Cleaning up 
> BFDC-topology-DynamicCollect-68c9d7b4-72-1429491015
> {code}
> * Supervisor log
> {code}
> 2015-04-20 10:01:07.960 INFO  [Thread-1][supervisor] Removing code for storm 
> id BFDC-topology-DynamicCollect-68c9d7b4-72-1429491015
> 2015-04-20 10:01:07.962 INFO  [Thread-2][supervisor] Shutting down and 
> clearing state for id 9719259e-528c-4336-abf9-592c1bb9a00b. Current 
> supervisor time: 1429491667. State: :disallowed, Heartbeat: 
> #backtype.storm.daemon.common.WorkerHeartbeat{:time-secs 1429491667, 
> :storm-id "BFDC-topology-DynamicCollect-68c9d7b4-72-1429491015", :executors 
> #{[2 2] [3 3] [4 4] [5 5] [6 6] [7 7] [8 8] [9 9] [10 10] [11 11] [12 12] [13 
> 13] [14 14] [-1 -1] [1 1]}, :port 6706}
> 2015-04-20 10:01:07.962 INFO  [Thread-2][supervisor] Shutting down 
> 5bc084a2-b668-4610-86f6-9b93304d40a8:9719259e-528c-4336-abf9-592c1bb9a00b
> 2015-04-20 10:01:08.974 INFO  [Thread-2][supervisor] Shut down 
> 5bc084a2-b668-4610-86f6-9b93304d40a8:9719259e-528c-4336-abf9-592c1bb9a00b
> {code}
> * Worker log
> {code}
> 2015-04-20 10:01:07.985 INFO  [Thread-33][worker] Shutting down worker 
> BFDC-topology-DynamicCollect-68c9d7b4-72-1429491015 
> 5bc084a2-b668-4610-86f6-9b93304d40a8 6706
> 2015-04-20 10:01:07.985 INFO  [Thread-33][worker] Shutting down receive thread
> 2015-04-20 10:01:07.988 WARN  [Thread-33][ExponentialBackoffRetry] maxRetries 
> too large (300). Pinning to 29
> 2015-04-20 10:01:07.988 INFO  
> [Thread-33][StormBoundedExponentialBackoffRetry] The baseSleepTimeMs [100] 
> the maxSleepTimeMs [1000] the maxRetries [300]
> 2015-04-20 10:01:07.988 INFO  [Thread-33][Client] New Netty Client, connect 
> to localhost, 6706, config: , buffer_size: 5242880
> 2015-04-20 10:01:07.991 INFO  [client-schedule-service-1][Client] Reconnect 
> started for Netty-Client-localhost/127.0.0.1:6706... [0]
> 2015-04-20 10:01:07.996 INFO  [Thread-33][loader] Shutting down 
> receiving-thread: [BFDC-topology-DynamicCollect-68c9d7b4-72-1429491015, 6706]
> ...
> 2015-04-20 10:01:08.044 INFO  [Thread-33][Client] Closing Netty Client 
> Netty-Client-localhost/127.0.0.1:6706
> 2015-04-20 10:01:08.044 INFO  [Thread-33][Client] Waiting for pending batchs 
> to be sent with Netty-Client-localhost/127.0.0.1:6706..., timeout: 600000ms, 
> pendings: 1
> {code}
> I found activating log, but cannot find deactivating log.
> {code}
> 2015-04-20 09:50:24.556 INFO  [Thread-30-$mastercoord-bg0][executor] 
> Activating spout $mastercoord-bg0:(1)
> {code}
> Please note that it doesn't work when I just push button to "deactivate" 
> topology via UI.
> We're changing our Topology to normal Spout-Bolt, but personally I'd like to 
> see it resolved. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to