[ https://issues.apache.org/jira/browse/SPARK-33029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255682#comment-17255682 ]

Baohe Zhang commented on SPARK-33029:
-------------------------------------

With the blacklist feature enabled, a node is excluded by default once 2 
executors on that node have been excluded. When that happens, every executor 
on the node is marked as excluded as well. Since we are running standalone 
mode on a single node, the driver and all executors share the same hostname, 
so the driver gets marked as excluded in AppStatusListener when it handles 
the "SparkListenerNodeExcludedForStage" event. We can fix it by filtering out 
the driver entity when handling this event, so the UI won't show the driver 
as excluded.
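
A minimal sketch of that filter, assuming the handler walks the live 
executors on the excluded host (names such as liveExecutors and 
setStageExcludedStatus are illustrative, not necessarily the exact 
AppStatusListener internals):

{code:java}
// Sketch only: mark executors on the excluded node, but skip the driver
// entity so the UI no longer shows the driver as excluded.
// liveExecutors and setStageExcludedStatus are illustrative names.
override def onNodeExcludedForStage(event: SparkListenerNodeExcludedForStage): Unit = {
  liveExecutors.values
    .filter { exec =>
      exec.host == event.hostId &&
        exec.executorId != SparkContext.DRIVER_IDENTIFIER // filter out the driver
    }
    .foreach { exec =>
      setStageExcludedStatus(exec, event.stageId, event.stageAttemptId)
    }
}
{code}

SparkContext.DRIVER_IDENTIFIER is the well-known "driver" executor id, which 
is what lets the listener tell the driver entity apart from real executors 
that share its hostname.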

> Standalone mode blacklist executors page UI marks driver as blacklisted
> -----------------------------------------------------------------------
>
>                 Key: SPARK-33029
>                 URL: https://issues.apache.org/jira/browse/SPARK-33029
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: Thomas Graves
>            Priority: Major
>         Attachments: Screen Shot 2020-09-29 at 1.52.09 PM.png, Screen Shot 
> 2020-09-29 at 1.53.37 PM.png
>
>
> I am running a spark shell on a 1-node standalone cluster.  I noticed that 
> the executors page UI was marking the driver as blacklisted for the stage 
> that is running.  Screen shots are attached.
> Also, in my case one of the executors died and it doesn't seem like the 
> scheduler picked up the new one.  It doesn't show up on the stages page; the 
> executors page just shows it as active, but none of the tasks ran there.
>  
> You can reproduce this by starting a master and a slave on a single node, 
> then launching a shell where you will get multiple executors (in this case 
> I got 3):
> $SPARK_HOME/bin/spark-shell --master spark://yourhost:7077 --executor-cores 4 
> --conf spark.blacklist.enabled=true
>  
> From shell run:
> {code:java}
> import org.apache.spark.TaskContext
> 
> val rdd = sc.makeRDD(1 to 1000, 5).mapPartitions { it =>
>   val context = TaskContext.get()
>   // Fail the first two attempts of each task so executors get blacklisted.
>   if (context.attemptNumber() < 2) {
>     throw new Exception("test attempt num")
>   }
>   it
> }
> rdd.count() // run an action so the failing tasks actually execute
> {code}


