[
https://issues.apache.org/jira/browse/SPARK-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyukjin Kwon updated SPARK-1747:
--------------------------------
Labels: bulk-closed (was: )
> check for Spark on Yarn ApplicationMaster split brain
> -----------------------------------------------------
>
> Key: SPARK-1747
> URL: https://issues.apache.org/jira/browse/SPARK-1747
> Project: Spark
> Issue Type: Bug
> Components: YARN
> Affects Versions: 1.0.0
> Reporter: Thomas Graves
> Priority: Major
> Labels: bulk-closed
>
> On YARN, applications can run into a problem referred to as "split brain".
> One ApplicationMaster (AM) is running, and then something such as a network
> split leaves it unable to talk to the ResourceManager (RM). After some time
> the RM assumes the old attempt failed and starts a new application attempt,
> so you end up with two ApplicationMasters. Note that although the network
> split prevents the first AM from talking to the RM, that AM could still be
> running and contacting its executors.
> If the previous AM does not need any more resources from the RM, it could try
> to commit. This causes problems when the second AM finishes and tries to
> commit too, and it could potentially result in data corruption.
> I believe the same issue can happen in Spark, since it uses the Hadoop
> output formats. One component with this issue is the FileOutputCommitter.
> It first writes to a temporary directory (task commit) and then moves the
> files to the final directory (job commit). The first AM could finish the job
> commit and tell the user it's done; the user then starts a downstream job,
> but the second AM comes in to do the job commit again, and files the
> downstream job is processing could disappear until the second AM finishes
> the job commit.
> This was fixed in MR by https://issues.apache.org/jira/browse/MAPREDUCE-4832
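
To make the race in the description concrete, here is a minimal sketch of a guard that lets only one application attempt perform the job commit. It is not taken from Spark or from the MAPREDUCE-4832 patch. The function name `try_job_commit`, the marker file names, and the parameters are all hypothetical. The sketch also assumes a local filesystem, where exclusive file creation is atomic; a shared filesystem would need an equivalent atomic create.

```python
import os
import tempfile

# Hypothetical marker file names, chosen for this sketch only.
COMMIT_STARTED = "COMMIT_STARTED"
COMMIT_SUCCESS = "COMMIT_SUCCESS"


def try_job_commit(staging_dir, attempt_id, seconds_since_rm_contact,
                   rm_expiry_secs, do_commit):
    """Run do_commit() only if this attempt may safely own the job commit.

    Returns one of: "committed", "already-committed", "commit-in-doubt",
    "stale-attempt".
    """
    # An AM that has not heard from the RM within the expiry window has to
    # assume a newer attempt may exist, so it refuses to commit.
    if seconds_since_rm_contact >= rm_expiry_secs:
        return "stale-attempt"

    success = os.path.join(staging_dir, COMMIT_SUCCESS)
    started = os.path.join(staging_dir, COMMIT_STARTED)

    # A previous attempt already finished the job commit: do not redo it.
    if os.path.exists(success):
        return "already-committed"

    try:
        # O_CREAT | O_EXCL fails if the file exists, so exactly one attempt
        # can claim the commit (assumption: local filesystem semantics).
        fd = os.open(started, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        # Another attempt started a commit but no success marker exists yet.
        return "commit-in-doubt"
    with os.fdopen(fd, "w") as marker:
        marker.write(str(attempt_id))

    do_commit()

    with open(success, "w") as marker:
        marker.write(str(attempt_id))
    return "committed"


if __name__ == "__main__":
    staging = tempfile.mkdtemp()
    ran = []
    # First attempt is in recent contact with the RM and wins the commit.
    print(try_job_commit(staging, 1, 5.0, 600.0, lambda: ran.append(1)))
    # Second attempt sees the success marker and skips the commit.
    print(try_job_commit(staging, 2, 5.0, 600.0, lambda: ran.append(2)))
```

With this guard, a second AM that arrives after the first one has committed sees the success marker and leaves the final directory alone, so files a downstream job is reading are not moved out from under it. The "commit-in-doubt" case is deliberately left to the caller, since whether to fail the job or retry is a policy decision.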