[jira] [Updated] (SPARK-26498) Integrate barrier execution with MMLSpark's LightGBM

2020-03-17 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-26498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-26498:
--
Affects Version/s: 3.1.0 (was: 3.0.0)

> Integrate barrier execution with MMLSpark's LightGBM
> 
>
> Key: SPARK-26498
> URL: https://issues.apache.org/jira/browse/SPARK-26498
> Project: Spark
> Issue Type: New Feature
> Components: ML, MLlib
> Affects Versions: 3.1.0
> Reporter: Ilya Matiach
> Priority: Major
>
> I would like to use the barrier execution mode introduced in Spark 2.4 
> with LightGBM in the Spark package MMLSpark, but I ran into some issues.
>
> Currently, the LightGBM distributed learner tries to figure out the number of 
> cores on the cluster, then does a coalesce followed by a mapPartitions. Inside 
> the mapPartitions we call NetworkInit (where the address:port of all workers 
> needs to be passed in the constructor) and pass the data in memory to the 
> native layer of the distributed LightGBM learner.
>
> With barrier execution mode, I think the code would become much more robust. 
> However, I am running into several issues when trying to move my code over 
> to the new barrier execution mode scheduler:
>
> - It does not support dynamic allocation. I think it would be convenient 
>   if it restarted the job when the number of workers decreases, and let 
>   the developer decide whether to restart the job when the number of workers 
>   increases.
> - It does not work with the DataFrame or Dataset API; it would be much 
>   more convenient if it did.
> - How does barrier execution mode deal with #partitions > #tasks? If the 
>   number of partitions is larger than the number of “tasks” or workers, can 
>   barrier execution mode automatically coalesce the dataset so that 
>   #partitions == #tasks?
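
For readers following the thread, the flow described above (coalesce, then one
coordinated task per partition that learns every peer's address before the
native NetworkInit) might look roughly like this with the PySpark barrier API.
This is only a sketch under assumptions, not MMLSpark's actual code: the helper
names plan_partitions, init_and_train and run_barrier_stage, and the
available_slots argument, are hypothetical, and the native LightGBM call is
left as a placeholder. It also assumes the caller has to coalesce manually so
that the stage fits in the available task slots.

```python
def plan_partitions(num_partitions, available_slots):
    """Pick a partition count that one barrier stage can run all at once.

    Hypothetical helper: a barrier stage launches all of its tasks
    together, so this sketch caps the partition count at the slot count.
    """
    if available_slots < 1:
        raise ValueError("a barrier stage needs at least one task slot")
    return min(num_partitions, available_slots)


def init_and_train(rows):
    """Runs inside each barrier task; `rows` iterates over one partition."""
    # Imported lazily so the planning helper above works without Spark.
    from pyspark import BarrierTaskContext

    ctx = BarrierTaskContext.get()
    # Executor address of every task in this barrier stage.
    # Assumption: a real learner would still pick its own listening port.
    workers = [info.address for info in ctx.getTaskInfos()]
    data = list(rows)  # keep the partition in memory for the native layer
    ctx.barrier()      # wait until every task has loaded its data
    # Placeholder: NetworkInit and native training would happen here.
    return [(ctx.partitionId(), len(data), ",".join(workers))]


def run_barrier_stage(df, available_slots):
    """Coalesce a DataFrame by hand, then run one barrier stage over its RDD."""
    target = plan_partitions(df.rdd.getNumPartitions(), available_slots)
    rdd = df.rdd.coalesce(target)
    return rdd.barrier().mapPartitions(init_and_train).collect()
```

With an existing DataFrame df, a call might be run_barrier_stage(df, 8), where
8 stands for however many task slots the cluster can offer at the same time.
Going through df.rdd is the workaround for the missing DataFrame/Dataset
support mentioned in the description.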



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26498) Integrate barrier execution with MMLSpark's LightGBM

2019-07-16 Thread Dongjoon Hyun (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-26498:
--
Affects Version/s: 3.0.0 (was: 2.4.0)

> Integrate barrier execution with MMLSpark's LightGBM
> 
>
> Key: SPARK-26498
> URL: https://issues.apache.org/jira/browse/SPARK-26498
> Project: Spark
> Issue Type: New Feature
> Components: ML, MLlib
> Affects Versions: 3.0.0
> Reporter: Ilya Matiach
> Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
