[
https://issues.apache.org/jira/browse/SPARK-34591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julian King updated SPARK-34591:
Attachment: Reproducible example of Spark bug - no 2.pdf
> Pyspark undertakes pruning of decision
[
https://issues.apache.org/jira/browse/SPARK-34591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17359321#comment-17359321
]
Julian King commented on SPARK-34591:
-
To address any concerns about the example above being
[
https://issues.apache.org/jira/browse/SPARK-34591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17358956#comment-17358956
]
Julian King commented on SPARK-34591:
-
The fact that there's no signal isn't the issue. The issue is
[
https://issues.apache.org/jira/browse/SPARK-34591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17358953#comment-17358953
]
Julian King commented on SPARK-34591:
-
Here is a reproducible example of this bug which demonstrates
[
https://issues.apache.org/jira/browse/SPARK-34591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julian King updated SPARK-34591:
Attachment: Reproducible example of Spark bug.pdf
> Pyspark undertakes pruning of decision trees
[
https://issues.apache.org/jira/browse/SPARK-34591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julian King updated SPARK-34591:
Attachment: Reproducible example of Spark bug.pdf
> Pyspark undertakes pruning of decision trees
[
https://issues.apache.org/jira/browse/SPARK-34591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julian King updated SPARK-34591:
Attachment: (was: Reproducible example of Spark bug.pdf)
> Pyspark undertakes pruning of
[
https://issues.apache.org/jira/browse/SPARK-34591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julian King updated SPARK-34591:
Priority: Major (was: Minor)
> Pyspark undertakes pruning of decision trees and random forests
[
https://issues.apache.org/jira/browse/SPARK-34591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julian King updated SPARK-34591:
Affects Version/s: 2.4.4
> Pyspark undertakes pruning of decision trees and random forests
[
https://issues.apache.org/jira/browse/SPARK-34591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julian King updated SPARK-34591:
Priority: Critical (was: Major)
> Pyspark undertakes pruning of decision trees and random
[
https://issues.apache.org/jira/browse/SPARK-34591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julian King updated SPARK-34591:
Affects Version/s: 3.1.1
> Pyspark undertakes pruning of decision trees and random forests
[
https://issues.apache.org/jira/browse/SPARK-34591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julian King updated SPARK-34591:
Description:
*History of the issue*
SPARK-3159 implemented a method designed to reduce the
[
https://issues.apache.org/jira/browse/SPARK-34591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17293484#comment-17293484
]
Julian King commented on SPARK-34591:
-
FYI [~asolimando]
> Pyspark undertakes pruning of decision
Julian King created SPARK-34591:
---
Summary: Pyspark undertakes pruning of decision trees and random
forests outside the control of the user, leading to undesirable and unexpected
outcomes that are challenging to diagnose and impossible to correct
[
https://issues.apache.org/jira/browse/SPARK-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17293259#comment-17293259
]
Julian King commented on SPARK-3159:
I also need the probability estimates for the tree, not the
[
https://issues.apache.org/jira/browse/SPARK-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569855#comment-16569855
]
Julian King commented on SPARK-9478:
Has there been any progress on this in recent times? It looks
Julian King created SPARK-23730:
---
Summary: Save and expose "in bag" tracking for random forest model
Key: SPARK-23730
URL: https://issues.apache.org/jira/browse/SPARK-23730
Project: Spark
Julian King created SPARK-23704:
---
Summary: PySpark access of individual trees in random forest is
slow
Key: SPARK-23704
URL: https://issues.apache.org/jira/browse/SPARK-23704
Project: Spark