[
https://issues.apache.org/jira/browse/SPARK-19001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyukjin Kwon updated SPARK-19001:
---------------------------------
Labels: bulk-closed (was: )
> Worker will submit multiply CleanWorkDir and SendHeartbeat task with each
> RegisterWorker response
> -------------------------------------------------------------------------------------------------
>
> Key: SPARK-19001
> URL: https://issues.apache.org/jira/browse/SPARK-19001
> Project: Spark
> Issue Type: Bug
> Components: Deploy
> Affects Versions: 1.6.1
> Reporter: liujianhui
> Priority: Minor
> Labels: bulk-closed
>
> h2. scene
> worker will submit multiply task of SendHeartbeat and CleanWorkDir if the
> worker register itself again, this code as follow
> {code}
> private def handleRegisterResponse(msg: RegisterWorkerResponse): Unit =
> synchronized {
> msg match {
> case RegisteredWorker(masterRef, masterWebUiUrl) =>
> logInfo("Successfully registered with master " +
> masterRef.address.toSparkURL)
> registered = true
> changeMaster(masterRef, masterWebUiUrl)
> forwordMessageScheduler.scheduleAtFixedRate(new Runnable {
> override def run(): Unit = Utils.tryLogNonFatalError {
> self.send(SendHeartbeat)
> }
> }, 0, HEARTBEAT_MILLIS, TimeUnit.MILLISECONDS)
> if (CLEANUP_ENABLED) {
> logInfo(
> s"Worker cleanup enabled; old application directories will be
> deleted in: $workDir")
> forwordMessageScheduler.scheduleAtFixedRate(new Runnable {
> override def run(): Unit = Utils.tryLogNonFatalError {
> self.send(WorkDirCleanup)
> }
> }, CLEANUP_INTERVAL_MILLIS, CLEANUP_INTERVAL_MILLIS,
> TimeUnit.MILLISECONDS)
> }
> {code}
> log as follow
> {code}
> 2016-12-20 20:23:30,030 | Successfully registered with master
> spark://polaris-1:8090|org.apache.spark.Logging$class.logInfo(Logging.scala:58)
> 2016-12-26 20:41:58,058 | Successfully registered with master
> spark://polaris-1:8090|org.apache.spark.Logging$class.logInfo(Logging.scala:58)
> {code}
> if the worker Send RegisterWorker event multiply many times when the master
> found the heartbeat of the worker expired, then it will submit task multiply
> times
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]