[
https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiao Li updated SPARK-20392:
Component/s: SQL
> Slow performance when calling fit on ML pipeline for dataset with many
> columns but fe
[
https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiao Li updated SPARK-20392:
Priority: Major (was: Blocker)
> Slow performance when calling fit on ML pipeline for dataset with many
>
[
https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wenchen Fan updated SPARK-20392:
Target Version/s: 2.3.0
> Slow performance when calling fit on ML pipeline for dataset with many
>
[
https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wenchen Fan updated SPARK-20392:
Priority: Blocker (was: Major)
> Slow performance when calling fit on ML pipeline for dataset with
[
https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wenchen Fan updated SPARK-20392:
Issue Type: Improvement (was: Bug)
> Slow performance when calling fit on ML pipeline for dataset
[
https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wenchen Fan updated SPARK-20392:
Fix Version/s: (was: 2.3.0)
> Slow performance when calling fit on ML pipeline for dataset with
[
https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Barry Becker updated SPARK-20392:
Attachment: model_9756.zip
blockbuster_fewCols.csv
attaching blockbuster_fewCols.
[
https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Barry Becker updated SPARK-20392:
Attachment: model_9754.zip
Attaching the parquet pipeline (as zip).
> Slow performance when call
[
https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Barry Becker updated SPARK-20392:
Attachment: giant_query_plan_for_fitting_pipeline.txt
Giant nested query plan using when calling
[
https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Barry Becker updated SPARK-20392:
Attachment: blockbuster.csv
Attaching blockbuster.csv data file with many columns, but few rows.