[jira] [Updated] (SPARK-26498) Integrate barrier execution with MMLSpark's LightGBM
[ https://issues.apache.org/jira/browse/SPARK-26498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-26498:
----------------------------------
    Affects Version/s:     (was: 3.0.0)
                           3.1.0

> Integrate barrier execution with MMLSpark's LightGBM
> ----------------------------------------------------
>
>                 Key: SPARK-26498
>                 URL: https://issues.apache.org/jira/browse/SPARK-26498
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML, MLlib
>    Affects Versions: 3.1.0
>            Reporter: Ilya Matiach
>            Priority: Major
>
> I would like to use the new barrier execution mode introduced in Spark 2.4 with LightGBM in the Spark package MMLSpark, but I ran into some issues.
>
> Currently, the LightGBM distributed learner tries to figure out the number of cores on the cluster and then does a coalesce and a mapPartitions; inside the mapPartitions we do a NetworkInit (where the address:port of every worker needs to be passed to the constructor) and pass the data in memory to the native layer of the distributed LightGBM learner.
>
> With barrier execution mode, I think the code would become much more robust. However, there are several issues I am running into when trying to move my code over to the new barrier execution mode scheduler:
> * It does not support dynamic allocation. However, I think it would be convenient if it restarted the job when the number of workers has decreased, and let the developer decide whether to restart the job when the number of workers has increased.
> * It does not work with the DataFrame or Dataset API, but I think it would be much more convenient if it did.
> * How does barrier execution mode deal with #partitions > #tasks? If the number of partitions is larger than the number of "tasks" (workers), can barrier execution mode automatically coalesce the dataset to have #partitions == #tasks?

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
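The migration the reporter describes can be sketched roughly as follows. This is an illustrative Scala sketch, not MMLSpark's actual code: `NetworkInit` and the port choice are stand-ins for LightGBM's native network setup, and a running Spark 3.0+ cluster is assumed (the `allGather` call was added in Spark 3.0).

```scala
import org.apache.spark.BarrierTaskContext
import org.apache.spark.rdd.RDD

// Hypothetical sketch: replace the manual coalesce + mapPartitions +
// address-exchange dance with a barrier stage. A barrier stage guarantees
// all tasks launch together, and BarrierTaskContext lets them exchange
// their host:port strings before native training begins.
def trainDistributed(data: RDD[Array[Double]]): Unit = {
  data.barrier().mapPartitions { rows =>
    val ctx = BarrierTaskContext.get()
    // Each task advertises its own address; 12400 is an arbitrary
    // placeholder port for this sketch.
    val myAddress =
      s"${java.net.InetAddress.getLocalHost.getHostAddress}:12400"
    // allGather returns one entry per task, in task order, so every
    // worker sees the full address list up front.
    val allAddresses = ctx.allGather(myAddress)
    // NetworkInit stands in for LightGBM's native network setup, which
    // needs every worker's address:port in its constructor:
    // NetworkInit(allAddresses.mkString(","), ctx.partitionId())
    // ... feed `rows` to the native distributed learner here ...
    ctx.barrier() // wait until every worker has finished training
    Iterator.empty
  }.count() // an action is needed to trigger the barrier stage
}
```

Note that this sketch inherits the limitations the reporter lists: the stage fails if the scheduler cannot launch all tasks at once (e.g. under dynamic allocation), and the caller must still coalesce so that the partition count does not exceed the available slots.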
[jira] [Updated] (SPARK-26498) Integrate barrier execution with MMLSpark's LightGBM
[ https://issues.apache.org/jira/browse/SPARK-26498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-26498:
----------------------------------
    Affects Version/s:     (was: 2.4.0)
                           3.0.0

(Issue description unchanged; see the notification above.)

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org