[ https://issues.apache.org/jira/browse/SPARK-21829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16144867#comment-16144867 ]

Luca Canali commented on SPARK-21829:
-------------------------------------

Thanks Jerry for the heads-up on using node labels. Unfortunately I cannot test 
this right away, as the cluster in question currently runs Hadoop 2.6 with the 
fair scheduler. There is another thought I wanted to bring forward on the 
motivation behind this JIRA, and it has more to do with the operational 
aspects of the issue.
As you rightly noticed, the problem I want to solve does not originate in 
Spark; it rather comes from the interaction with an external source. On one 
hand, it makes a lot of sense to solve it at the external source, if possible, 
or to address it at the cluster manager level (either by taking the nodes out 
of the cluster altogether, as suggested by Sean, or by using node labels in 
the case of YARN, as you suggested).
However, I have the additional constraint that I want to run the workload on a 
shared production YARN cluster. This makes pushing changes to the cluster a 
relatively slow process, because of the need to get the proper validation in 
place. The key point for me is that the more "knobs" I have in Spark to 
control the behavior of my jobs, the faster I (and the other users of the 
system) can troubleshoot such issues on my (their) own, without needing to 
further involve the cluster admins.
This comment is of course tied to my particular experience and to the way we 
run Spark on our clusters at the moment; perceptions and needs may be 
completely different for others. As mentioned, for the particular issue I had, 
I could work around it simply by starting the jobs and then killing the 
executors on the 2 slow nodes. My proposed patch is a simple attempt to make 
this type of troubleshooting available as a configuration in Spark, for 
reasons of speed and convenience, which I guess places the proposed change 
more in the "debug" domain than among new product features.
An additional comment: the idea of implementing this functionality on top of 
the blacklist mechanism is just for convenience, as most of the code is 
already there in the standard blacklist mechanism and the patch is very small 
in size. The change, if needed, can be implemented differently.



> Enable config to permanently blacklist a list of nodes
> ------------------------------------------------------
>
>                 Key: SPARK-21829
>                 URL: https://issues.apache.org/jira/browse/SPARK-21829
>             Project: Spark
>          Issue Type: New Feature
>          Components: Scheduler, Spark Core
>    Affects Versions: 2.1.1, 2.2.0
>            Reporter: Luca Canali
>            Priority: Minor
>
> The idea for this proposal comes from a performance incident in a local 
> cluster where a job was found to be very slow because of a long tail of 
> stragglers due to 2 nodes in the cluster being slow to access a remote filesystem.
> The issue was limited to the 2 machines and was related to external 
> configurations: the 2 machines that performed badly when accessing the remote 
> file system were behaving normally for other jobs in the cluster (a shared 
> YARN cluster).
> With this new feature I propose to introduce a mechanism to allow users to 
> specify a list of nodes in the cluster where executors/tasks should not run 
> for a specific job.
> The proposed implementation that I tested (see PR) uses the Spark blacklist 
> mechanism. With the parameter spark.blacklist.alwaysBlacklistedNodes, a list 
> of user-specified nodes is added to the blacklist at the start of the Spark 
> Context and it is never expired. 
> I have tested this on a YARN cluster on a case taken from the original 
> production problem and I confirm a performance improvement of about 5x for 
> the specific test case I have. I imagine that there can be other cases where 
> Spark users may want to blacklist a set of nodes. This can be used for 
> troubleshooting, including cases where certain nodes/executors are slow for a 
> given workload and this is caused by external agents, so the anomaly is not 
> picked up by the cluster manager.


