[jira] [Updated] (SPARK-25309) Sci-kit Learn like Auto Pipeline Parallelization in Spark
[ https://issues.apache.org/jira/browse/SPARK-25309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated SPARK-25309:
------------------------------
    Priority: Minor  (was: Critical)

> Sci-kit Learn like Auto Pipeline Parallelization in Spark
> ---------------------------------------------------------
>
>                 Key: SPARK-25309
>                 URL: https://issues.apache.org/jira/browse/SPARK-25309
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML, PySpark
>    Affects Versions: 2.3.1
>            Reporter: Ravi
>            Priority: Minor
>
> SPARK-19357 and SPARK-21911 have helped parallelize Pipelines in Spark.
> However, instead of setting the parallelism Parameter in the CrossValidator
> it would be good to have something like njobs=-1 (like Scikit Learn) where
> the Pipeline DAG could be automatically parallelized and scheduled based on
> the resources allocated to the Spark Session instead of having the user pick
> the integer value for this parameter.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-25309) Sci-kit Learn like Auto Pipeline Parallelization in Spark
[ https://issues.apache.org/jira/browse/SPARK-25309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Cutler updated SPARK-25309:
---------------------------------
    Component/s: ML

> Sci-kit Learn like Auto Pipeline Parallelization in Spark
> ---------------------------------------------------------
>
>                 Key: SPARK-25309
>                 URL: https://issues.apache.org/jira/browse/SPARK-25309
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML, PySpark
>    Affects Versions: 2.3.1
>            Reporter: Ravi
>            Priority: Critical
>
> SPARK-19357 and SPARK-21911 have helped parallelize Pipelines in Spark.
> However, instead of setting the parallelism Parameter in the CrossValidator
> it would be good to have something like njobs=-1 (like Scikit Learn) where
> the Pipeline DAG could be automatically parallelized and scheduled based on
> the resources allocated to the Spark Session instead of having the user pick
> the integer value for this parameter.
[jira] [Updated] (SPARK-25309) Sci-kit Learn like Auto Pipeline Parallelization in Spark
[ https://issues.apache.org/jira/browse/SPARK-25309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi updated SPARK-25309:
-------------------------
    Description: SPARK-19357 and SPARK-21911 have helped parallelize Pipelines in Spark. However, instead of setting the parallelism Parameter in the CrossValidator it would be good to have something like njobs=-1 (like Scikit Learn) where the Pipeline DAG could be automatically parallelized and scheduled based on the resources allocated to the Spark Session instead of having the user pick the integer value for this parameter.

  (was: SPARK-19357 and SPARK-21911 have helped parallelize Pipelines in Spark. However, instead of setting the parallelism Parameter in the CrossValidator it would be good to have something like njobs=-1 (like Scikit Learn) where the Pipleline DAG could be automatically parallelized and scheduled based on the resources allocated to the Spark Session instead of having the user pick the integer value for this parameter.)

> Sci-kit Learn like Auto Pipeline Parallelization in Spark
> ---------------------------------------------------------
>
>                 Key: SPARK-25309
>                 URL: https://issues.apache.org/jira/browse/SPARK-25309
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 2.3.1
>            Reporter: Ravi
>            Priority: Critical
>
> SPARK-19357 and SPARK-21911 have helped parallelize Pipelines in Spark.
> However, instead of setting the parallelism Parameter in the CrossValidator
> it would be good to have something like njobs=-1 (like Scikit Learn) where
> the Pipeline DAG could be automatically parallelized and scheduled based on
> the resources allocated to the Spark Session instead of having the user pick
> the integer value for this parameter.
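
Until something like the request in the description exists, a user could approximate it on the caller side. The following is only a rough sketch, not anything Spark provides: `resolve_parallelism` is a hypothetical helper, and the assumption that usable parallelism equals `spark.executor.instances` times `spark.executor.cores` is a crude heuristic (it ignores dynamic allocation, and each model fit already spawns its own Spark tasks, so the result is better read as an upper bound). Negative values follow the joblib convention used by scikit-learn's `n_jobs`, where -1 means all slots and -2 means all but one.

```python
from typing import Mapping


def resolve_parallelism(n_jobs: int, conf: Mapping[str, str]) -> int:
    """Translate a scikit-learn style n_jobs value into an integer that
    could be passed as CrossValidator's `parallelism` parameter.

    Hypothetical helper, not part of Spark or PySpark.
    """
    if n_jobs == 0:
        raise ValueError("n_jobs must not be 0")
    if n_jobs > 0:
        # A positive value is taken literally, as with `parallelism` today.
        return n_jobs
    # Assumption: task slots = executor instances * cores per executor.
    # Missing settings fall back to 1, which yields serial evaluation.
    executors = int(conf.get("spark.executor.instances", "1"))
    cores = int(conf.get("spark.executor.cores", "1"))
    slots = max(1, executors * cores)
    # joblib convention: -1 -> all slots, -2 -> all but one, and so on.
    return max(1, slots + 1 + n_jobs)


if __name__ == "__main__":
    demo_conf = {"spark.executor.instances": "4", "spark.executor.cores": "2"}
    print(resolve_parallelism(-1, demo_conf))
    # With a live session the settings could be read and used like this
    # (pipeline, grid and evaluator are placeholders defined elsewhere):
    #   conf = dict(spark.sparkContext.getConf().getAll())
    #   cv = CrossValidator(estimator=pipeline, estimatorParamMaps=grid,
    #                       evaluator=evaluator,
    #                       parallelism=resolve_parallelism(-1, conf))
```

This only picks the integer automatically; it does not schedule the Pipeline DAG itself, which is the larger part of what the issue asks for.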