[jira] [Comment Edited] (SPARK-37913) Null Pointer Exception when Loading ML Pipeline Model with Custom Transformer

2024-01-14 Thread APeng Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795697#comment-17795697 ] APeng Zhang edited comment on SPARK-37913 at 1/15/24 12:35 AM: ---

[jira] [Commented] (SPARK-46362) calculation error

2023-12-12 Thread APeng Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795741#comment-17795741 ] APeng Zhang commented on SPARK-46362: - [~paridee] This is not a defect in Spark; it's related to the

[jira] [Comment Edited] (SPARK-37913) Null Pointer Exception when Loading ML Pipeline Model with Custom Transformer

2023-12-12 Thread APeng Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795697#comment-17795697 ] APeng Zhang edited comment on SPARK-37913 at 12/12/23 12:36 PM:

[jira] [Commented] (SPARK-45154) Pyspark DecisionTreeClassifier: results and tree structure in spark3 very different from that of the spark2 version on the same data and with the same hyperparameters.

2023-12-12 Thread APeng Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795716#comment-17795716 ] APeng Zhang commented on SPARK-45154: - [~oumarnour] I think you need to set the _seed_ param of

[jira] [Commented] (SPARK-37913) Null Pointer Exception when Loading ML Pipeline Model with Custom Transformer

2023-12-12 Thread APeng Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795697#comment-17795697 ] APeng Zhang commented on SPARK-37913: - I can reproduce this issue. h2. Solution: A simple approach

[jira] [Updated] (SPARK-30408) orderBy in sortBy clause is removed by EliminateSorts

2020-01-07 Thread APeng Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] APeng Zhang updated SPARK-30408: Description: OrderBy in sortBy clause will be removed by EliminateSorts. code to reproduce:

[jira] [Updated] (SPARK-30408) orderBy in sortBy clause is removed by EliminateSorts

2020-01-02 Thread APeng Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] APeng Zhang updated SPARK-30408: Description: OrderBy in sortBy clause will be removed by EliminateSorts. code to reproduce:

[jira] [Created] (SPARK-30408) orderBy in sortBy clause is removed by EliminateSorts

2020-01-02 Thread APeng Zhang (Jira)
APeng Zhang created SPARK-30408: --- Summary: orderBy in sortBy clause is removed by EliminateSorts Key: SPARK-30408 URL: https://issues.apache.org/jira/browse/SPARK-30408 Project: Spark Issue

[jira] [Created] (SPARK-21410) In RangePartitioner(partitions: Int, rdd: RDD[]), RangePartitioner.numPartitions is wrong if the number of elements in RDD (rdd.count()) is less than number of partition

2017-07-13 Thread APeng Zhang (JIRA)
APeng Zhang created SPARK-21410: --- Summary: In RangePartitioner(partitions: Int, rdd: RDD[]), RangePartitioner.numPartitions is wrong if the number of elements in RDD (rdd.count()) is less than number of partitions (partitions in constructor).

[jira] [Commented] (SPARK-20765) Cannot load persisted PySpark ML Pipeline that includes 3rd party stage (Transformer or Estimator) if the package name of stage is not "org.apache.spark" and "pyspark"

2017-05-16 Thread APeng Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012465#comment-16012465 ] APeng Zhang commented on SPARK-20765: - Yes, the class is on the classpath. The problem is the current

[jira] [Commented] (SPARK-20765) Cannot load persisted PySpark ML Pipeline that includes 3rd party stage (Transformer or Estimator) if the package name of stage is not "org.apache.spark" and "pyspark"

2017-05-16 Thread APeng Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012405#comment-16012405 ] APeng Zhang commented on SPARK-20765: - PySpark will get the Python class name from Scala class name

[jira] [Created] (SPARK-20765) Cannot load persisted PySpark ML Pipeline that includes 3rd party stage (Transformer or Estimator) if the package name of stage is not "org.apache.spark" and "pyspark"

2017-05-16 Thread APeng Zhang (JIRA)
APeng Zhang created SPARK-20765: --- Summary: Cannot load persisted PySpark ML Pipeline that includes 3rd party stage (Transformer or Estimator) if the package name of stage is not "org.apache.spark" and "pyspark" Key: SPARK-20765