[jira] [Resolved] (SPARK-48970) Avoid using SparkSession.getActiveSession in spark ML reader/writer

2024-07-23 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-48970. Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47453 [https://github

[jira] [Assigned] (SPARK-48970) Avoid using SparkSession.getActiveSession in spark ML reader/writer

2024-07-22 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-48970: -- Assignee: Weichen Xu > Avoid using SparkSession.getActiveSession in spark ML reader/writer >

[jira] [Updated] (SPARK-48970) Avoid using SparkSession.getActiveSession in spark ML reader/writer

2024-07-22 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-48970: --- Issue Type: Bug (was: Improvement) > Avoid using SparkSession.getActiveSession in spark ML reader/w

[jira] [Created] (SPARK-48970) Avoid using SparkSession.getActiveSession in spark ML reader/writer

2024-07-22 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-48970: -- Summary: Avoid using SparkSession.getActiveSession in spark ML reader/writer Key: SPARK-48970 URL: https://issues.apache.org/jira/browse/SPARK-48970 Project: Spark

[jira] [Assigned] (SPARK-48941) PySparkML: Replace RDD read / write API invocation with Dataframe read / write API

2024-07-22 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-48941: -- Assignee: Weichen Xu > PySparkML: Replace RDD read / write API invocation with Dataframe read

[jira] [Resolved] (SPARK-48941) PySparkML: Replace RDD read / write API invocation with Dataframe read / write API

2024-07-22 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-48941. Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47411 [https://github

[jira] [Created] (SPARK-48941) PySparkML: Replace RDD read / write API invocation with Dataframe read / write API

2024-07-18 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-48941: -- Summary: PySparkML: Replace RDD read / write API invocation with Dataframe read / write API Key: SPARK-48941 URL: https://issues.apache.org/jira/browse/SPARK-48941 Proje

[jira] [Assigned] (SPARK-48883) In spark ML, replace RDD read / write API invocation with Dataframe read / write API

2024-07-12 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-48883: -- Assignee: Weichen Xu > In spark ML, replace RDD read / write API invocation with Dataframe re

[jira] [Resolved] (SPARK-48883) In spark ML, replace RDD read / write API invocation with Dataframe read / write API

2024-07-12 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-48883. Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47328 [https://github

[jira] [Created] (SPARK-48883) In spark ML, replace RDD read / write API invocation with Dataframe read / write API

2024-07-12 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-48883: -- Summary: In spark ML, replace RDD read / write API invocation with Dataframe read / write API Key: SPARK-48883 URL: https://issues.apache.org/jira/browse/SPARK-48883 Proj

[jira] [Commented] (SPARK-48463) MLLib function unable to handle nested data

2024-06-21 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17856713#comment-17856713 ] Weichen Xu commented on SPARK-48463: I will try to do it this sprint. (and then cher

[jira] [Reopened] (SPARK-48463) MLLib function unable to handle nested data

2024-06-14 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reopened SPARK-48463: Assignee: Weichen Xu > MLLib function unable to handle nested data > ---

[jira] [Resolved] (SPARK-48463) MLLib function unable to handle nested data

2024-06-14 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-48463. Resolution: Not A Problem > MLLib function unable to handle nested data >

[jira] [Commented] (SPARK-48463) MLLib function unable to handle nested data

2024-06-11 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854214#comment-17854214 ] Weichen Xu commented on SPARK-48463: ah got it. then it is not supported :)    as

[jira] [Commented] (SPARK-48463) MLLib function unable to handle nested data

2024-06-11 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854053#comment-17854053 ] Weichen Xu commented on SPARK-48463: I think you don’t need to flatten the original

[jira] [Commented] (SPARK-48084) pyspark.ml.connect.evaluation not working in 3.5 client <> 4.0 server

2024-05-06 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17844112#comment-17844112 ] Weichen Xu commented on SPARK-48084: This test error {{pyspark.ml.connect.evaluation

[jira] [Commented] (SPARK-48083) session.copyFromLocalToFs failure with 3.5 client <> 4.0 server

2024-05-06 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17844111#comment-17844111 ] Weichen Xu commented on SPARK-48083: this is not an issue, {{copyFromLocalToFs}} req

[jira] [Resolved] (SPARK-47663) Add an end to end tests for checking if spark task works well with resources

2024-04-02 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-47663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-47663. Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45794 [https://github

[jira] [Assigned] (SPARK-47663) Add an end to end tests for checking if spark task works well with resources

2024-04-02 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-47663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-47663: -- Assignee: Bobby Wang > Add an end to end tests for checking if spark task works well with res

[jira] [Resolved] (SPARK-46812) Make `mapInPandas` / `mapInArrow` support ResourceProfile (Stage-Level scheduling)

2024-02-18 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-46812. Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44852 [https://github

[jira] [Assigned] (SPARK-46812) Make `mapInPandas` / `mapInArrow` support ResourceProfile (Stage-Level scheduling)

2024-02-18 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-46812: -- Assignee: Bobby Wang > Make `mapInPandas` / `mapInArrow` support ResourceProfile (Stage-Level

[jira] [Resolved] (SPARK-46361) Add spark dataset chunk read API (python only)

2024-01-05 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-46361. Resolution: Won't Do > Add spark dataset chunk read API (python only) > --

[jira] [Updated] (SPARK-46361) Add spark dataset chunk read API (python only)

2023-12-12 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-46361: --- Description: *Design doc:* h1. [https://docs.google.com/document/d/1LHzwCjm2SluHkta_08cM3jxFSgfF-ni

[jira] [Assigned] (SPARK-46361) Add spark dataset chunk read API (python only)

2023-12-12 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-46361: -- Assignee: Weichen Xu > Add spark dataset chunk read API (python only) > -

[jira] [Updated] (SPARK-46361) Add spark dataset chunk read API (python only)

2023-12-11 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-46361: --- Description: *Proposed API:* {code:java} def persist_dataframe_as_chunks(dataframe: DataFrame) -> li

[jira] [Created] (SPARK-46361) Add spark dataset chunk read API (python only)

2023-12-11 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-46361: -- Summary: Add spark dataset chunk read API (python only) Key: SPARK-46361 URL: https://issues.apache.org/jira/browse/SPARK-46361 Project: Spark Issue Type: Improv

[jira] [Resolved] (SPARK-45397) Add vector assembler feature transformer

2023-10-11 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-45397. Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43199 [https://github

[jira] [Assigned] (SPARK-45397) Add vector assembler feature transformer

2023-10-11 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-45397: -- Assignee: Weichen Xu > Add vector assembler feature transformer > ---

[jira] [Created] (SPARK-45397) Add vector assembler feature transformer

2023-10-03 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-45397: -- Summary: Add vector assembler feature transformer Key: SPARK-45397 URL: https://issues.apache.org/jira/browse/SPARK-45397 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-45396) Add doc entry for `pyspark.ml.connect` module

2023-10-02 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-45396: -- Summary: Add doc entry for `pyspark.ml.connect` module Key: SPARK-45396 URL: https://issues.apache.org/jira/browse/SPARK-45396 Project: Spark Issue Type: Sub-tas

[jira] [Created] (SPARK-45130) Avoid Spark connect ML model to change input pandas dataframe

2023-09-12 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-45130: -- Summary: Avoid Spark connect ML model to change input pandas dataframe Key: SPARK-45130 URL: https://issues.apache.org/jira/browse/SPARK-45130 Project: Spark Is

[jira] [Updated] (SPARK-45130) Avoid Spark connect ML model to change input pandas dataframe

2023-09-12 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-45130: --- Description: Currently,  > Avoid Spark connect ML model to change input pandas dataframe > -

[jira] [Created] (SPARK-45129) Add pyspark "ml-connect" extras dependencies

2023-09-12 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-45129: -- Summary: Add pyspark "ml-connect" extras dependencies Key: SPARK-45129 URL: https://issues.apache.org/jira/browse/SPARK-45129 Project: Spark Issue Type: Sub-task

[jira] [Resolved] (SPARK-44908) Fix spark connect ML crossvalidator "foldCol" param

2023-08-23 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-44908. Fix Version/s: 3.5.0 4.0.0 Resolution: Fixed Issue resolved by pull requ

[jira] [Resolved] (SPARK-44909) Skip starting torch distributor log streaming server when it is not available

2023-08-23 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-44909. Fix Version/s: 3.5.0 4.0.0 Resolution: Fixed Issue resolved by pull requ

[jira] [Assigned] (SPARK-44909) Skip starting torch distributor log streaming server when it is not available

2023-08-23 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-44909: -- Assignee: Weichen Xu > Skip starting torch distributor log streaming server when it is not av

[jira] [Created] (SPARK-44909) Skip starting torch distributor log streaming server when it is not available

2023-08-22 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-44909: -- Summary: Skip starting torch distributor log streaming server when it is not available Key: SPARK-44909 URL: https://issues.apache.org/jira/browse/SPARK-44909 Project: Sp

[jira] [Created] (SPARK-44908) Fix spark connect ML crossvalidator "foldCol" param

2023-08-22 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-44908: -- Summary: Fix spark connect ML crossvalidator "foldCol" param Key: SPARK-44908 URL: https://issues.apache.org/jira/browse/SPARK-44908 Project: Spark Issue Type: B

[jira] [Assigned] (SPARK-44908) Fix spark connect ML crossvalidator "foldCol" param

2023-08-22 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-44908: -- Assignee: Weichen Xu > Fix spark connect ML crossvalidator "foldCol" param >

[jira] [Updated] (SPARK-44374) Add example code

2023-07-11 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-44374: --- Fix Version/s: 3.5.0 > Add example code > > > Key: SPARK-44374 >

[jira] [Resolved] (SPARK-44374) Add example code

2023-07-11 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-44374. Resolution: Done > Add example code > > > Key: SPARK-44374 >

[jira] [Created] (SPARK-44374) Add example code

2023-07-11 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-44374: -- Summary: Add example code Key: SPARK-44374 URL: https://issues.apache.org/jira/browse/SPARK-44374 Project: Spark Issue Type: Sub-task Components: Conne

[jira] [Assigned] (SPARK-44374) Add example code

2023-07-11 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-44374: -- Assignee: Weichen Xu > Add example code > > > Key: SPARK-443

[jira] [Assigned] (SPARK-42471) Distributed ML <> spark connect

2023-07-11 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-42471: -- Assignee: Weichen Xu > Distributed ML <> spark connect > --- > >

[jira] [Resolved] (SPARK-43983) Implement cross validator estimator

2023-07-10 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-43983. Fix Version/s: 3.5.0 Resolution: Fixed Issue resolved by pull request 41881 [https://github

[jira] [Resolved] (SPARK-44250) Implement classification evaluator

2023-07-03 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-44250. Resolution: Done > Implement classification evaluator > -- > >

[jira] [Created] (SPARK-44250) Implement classification evaluator

2023-06-29 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-44250: -- Summary: Implement classification evaluator Key: SPARK-44250 URL: https://issues.apache.org/jira/browse/SPARK-44250 Project: Spark Issue Type: Sub-task

[jira] [Assigned] (SPARK-44250) Implement classification evaluator

2023-06-29 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-44250: -- Assignee: Weichen Xu > Implement classification evaluator > -

[jira] [Resolved] (SPARK-44100) Move namespace from `pyspark.mlv2` to `pyspark.ml.connect`

2023-06-20 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-44100. Fix Version/s: 3.5.0 Resolution: Fixed Issue resolved by pull request 41666 [https://github

[jira] [Assigned] (SPARK-44100) Move namespace from `pyspark.mlv2` to `pyspark.ml.connect`

2023-06-19 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-44100: -- Assignee: Weichen Xu > Move namespace from `pyspark.mlv2` to `pyspark.ml.connect` > -

[jira] [Created] (SPARK-44100) Move namespace from `pyspark.mlv2` to `pyspark.ml.connect`

2023-06-19 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-44100: -- Summary: Move namespace from `pyspark.mlv2` to `pyspark.ml.connect` Key: SPARK-44100 URL: https://issues.apache.org/jira/browse/SPARK-44100 Project: Spark Issue

[jira] [Updated] (SPARK-42501) High level design doc for Distributed ML <> spark connect

2023-06-19 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-42501: --- Description: Design doc: https://docs.google.com/document/d/1LHzwCjm2SluHkta_08cM3jxFSgfF-niaCZbtIT

[jira] [Resolved] (SPARK-42501) High level design doc for Distributed ML <> spark connect

2023-06-19 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-42501. Resolution: Done > High level design doc for Distributed ML <> spark connect > ---

[jira] [Resolved] (SPARK-42412) Initial prototype implementation for PySparkML

2023-06-19 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-42412. Resolution: Done > Initial prototype implementation for PySparkML > --

[jira] [Resolved] (SPARK-43982) Implement pipeline estimator

2023-06-19 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-43982. Fix Version/s: 3.5.0 Resolution: Fixed Issue resolved by pull request 41479 [https://github

[jira] [Resolved] (SPARK-43981) Basic saving / loading implementation

2023-06-13 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-43981. Fix Version/s: 3.5.0 Resolution: Fixed Issue resolved by pull request 41478 [https://github

[jira] [Resolved] (SPARK-43790) Add API `copyLocalFileToHadoopFS`

2023-06-07 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-43790. Fix Version/s: 3.5.0 Resolution: Fixed Issue resolved by pull request 41357 [https://github

[jira] [Resolved] (SPARK-43097) Implement pyspark ML logistic regression estimator on top of torch distributor

2023-06-06 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-43097. Fix Version/s: 3.5.0 Resolution: Fixed Issue resolved by pull request 41383 [https://github

[jira] [Assigned] (SPARK-43982) Implement pipeline estimator

2023-06-06 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43982: -- Assignee: Weichen Xu > Implement pipeline estimator > > >

[jira] [Created] (SPARK-43982) Implement pipeline estimator

2023-06-06 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-43982: -- Summary: Implement pipeline estimator Key: SPARK-43982 URL: https://issues.apache.org/jira/browse/SPARK-43982 Project: Spark Issue Type: Sub-task Compo

[jira] [Assigned] (SPARK-43983) Implement cross validator estimator

2023-06-06 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43983: -- Assignee: Weichen Xu > Implement cross validator estimator >

[jira] [Created] (SPARK-43983) Implement cross validator estimator

2023-06-06 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-43983: -- Summary: Implement cross validator estimator Key: SPARK-43983 URL: https://issues.apache.org/jira/browse/SPARK-43983 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-43981) Basic saving / loading implementation

2023-06-06 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43981: --- Description: Support saving/loading  for estimator / transformer / evaluator / model. We have some

[jira] [Created] (SPARK-43981) Basic saving / loading implementation

2023-06-06 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-43981: -- Summary: Basic saving / loading implementation Key: SPARK-43981 URL: https://issues.apache.org/jira/browse/SPARK-43981 Project: Spark Issue Type: Sub-task

[jira] [Assigned] (SPARK-43981) Basic saving / loading implementation

2023-06-06 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43981: -- Assignee: Weichen Xu > Basic saving / loading implementation > --

[jira] [Updated] (SPARK-43981) Basic saving / loading implementation

2023-06-06 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43981: --- Component/s: Connect ML > Basic saving / loading implementation > -

[jira] [Resolved] (SPARK-43715) Add spark DataFrame binary file format writer

2023-06-06 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-43715. Resolution: Won't Do > Add spark DataFrame binary file format writer > ---

[jira] [Assigned] (SPARK-43788) Enable SummarizerTests.test_summarize_dataframe for pandas 2.0.0.

2023-06-05 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43788: -- Assignee: Weichen Xu > Enable SummarizerTests.test_summarize_dataframe for pandas 2.0.0. > --

[jira] [Resolved] (SPARK-43788) Enable SummarizerTests.test_summarize_dataframe for pandas 2.0.0.

2023-06-05 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-43788. Fix Version/s: 3.5.0 Resolution: Fixed Issue resolved by pull request 41456 [https://github

[jira] [Resolved] (SPARK-43784) Enable FeatureTests.test_max_abs_scaler for pandas 2.0.0.

2023-06-05 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-43784. Fix Version/s: 3.5.0 Resolution: Fixed Issue resolved by pull request 41456 [https://github

[jira] [Assigned] (SPARK-43784) Enable FeatureTests.test_max_abs_scaler for pandas 2.0.0.

2023-06-05 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43784: -- Assignee: Weichen Xu > Enable FeatureTests.test_max_abs_scaler for pandas 2.0.0. > --

[jira] [Resolved] (SPARK-43783) Enable FeatureTests.test_standard_scaler for pandas 2.0.0.

2023-06-05 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-43783. Fix Version/s: 3.5.0 Resolution: Fixed Issue resolved by pull request 41456 [https://github

[jira] [Assigned] (SPARK-43783) Enable FeatureTests.test_standard_scaler for pandas 2.0.0.

2023-06-05 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43783: -- Assignee: Weichen Xu > Enable FeatureTests.test_standard_scaler for pandas 2.0.0. > -

[jira] [Updated] (SPARK-43790) Add API `copyLocalFileToHadoopFS`

2023-05-24 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43790: --- Description: In new distributed spark ML module (designed to support spark connect and support loca

[jira] [Assigned] (SPARK-43790) Add API `copyLocalFileToHadoopFS`

2023-05-24 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43790: -- Assignee: Weichen Xu > Add API `copyLocalFileToHadoopFS` > -

[jira] [Created] (SPARK-43790) Add API `copyLocalFileToHadoopFS`

2023-05-24 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-43790: -- Summary: Add API `copyLocalFileToHadoopFS` Key: SPARK-43790 URL: https://issues.apache.org/jira/browse/SPARK-43790 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-43516) Basic estimator / transformer / model / evaluator interfaces and basic transformer / evaluator implementation

2023-05-24 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43516: --- Description: * Define basic interfaces of Evaluator / Transformer / Model / Evaluator, these interf

[jira] [Updated] (SPARK-43516) Basic estimator / transformer / model / evaluator interfaces and basic transformer / evaluator implementation

2023-05-24 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43516: --- Summary: Basic estimator / transformer / model / evaluator interfaces and basic transformer / evalua

[jira] [Resolved] (SPARK-43516) Basic estimator / transformer / model / evaluator interfaces

2023-05-24 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-43516. Fix Version/s: 3.5.0 Resolution: Fixed Issue resolved by pull request 41176 [https://github

[jira] [Commented] (SPARK-42501) High level design doc for Distributed ML <> spark connect

2023-05-23 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725614#comment-17725614 ] Weichen Xu commented on SPARK-42501: doc is linked. > High level design doc for Dis

[jira] [Updated] (SPARK-42501) High level design doc for Distributed ML <> spark connect

2023-05-23 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-42501: --- Summary: High level design doc for Distributed ML <> spark connect (was: High level design doc for

[jira] [Updated] (SPARK-42471) Distributed ML <> spark connect

2023-05-23 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-42471: --- Summary: Distributed ML <> spark connect (was: Feature parity: ML API in Spark Connect) > Distribu

[jira] [Updated] (SPARK-43715) Add spark DataFrame binary file format writer

2023-05-23 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43715: --- Description: In new distributed spark ML module (designed to support spark connect and support loca

[jira] [Updated] (SPARK-43715) Add spark DataFrame binary file format writer

2023-05-23 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43715: --- Summary: Add spark DataFrame binary file format writer (was: Add spark DataFrame binary file reader

[jira] [Assigned] (SPARK-43715) Add spark DataFrame binary file reader / writer

2023-05-22 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43715: -- Assignee: Weichen Xu > Add spark DataFrame binary file reader / writer >

[jira] [Updated] (SPARK-43715) Add spark DataFrame binary file reader / writer

2023-05-22 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43715: --- Description: In new distributed spark ML module (designed to support spark connect and support loca

[jira] [Updated] (SPARK-43715) Add spark DataFrame binary file reader / writer

2023-05-22 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43715: --- Description: In new distributed spark ML module (designed to support spark connect and support loca

[jira] [Updated] (SPARK-43715) Add spark DataFrame binary file reader / writer

2023-05-22 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43715: --- Description: In new distributed spark ML module (designed to support spark connect and support loca

[jira] [Updated] (SPARK-43715) Add spark DataFrame binary file writer

2023-05-22 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43715: --- Description: In new distributed spark ML module (designed to support spark connect and support loca

[jira] [Updated] (SPARK-43715) Add spark DataFrame binary file reader / writer

2023-05-22 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43715: --- Summary: Add spark DataFrame binary file reader / writer (was: Add spark DataFrame binary file writ

[jira] [Updated] (SPARK-43715) Add spark DataFrame binary file writer

2023-05-22 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43715: --- Description: In distributed  > Add spark DataFrame binary file writer >

[jira] [Created] (SPARK-43715) Add spark DataFrame binary file writer

2023-05-22 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-43715: -- Summary: Add spark DataFrame binary file writer Key: SPARK-43715 URL: https://issues.apache.org/jira/browse/SPARK-43715 Project: Spark Issue Type: Sub-task

[jira] [Assigned] (SPARK-43516) Basic estimator / transformer / model / evaluator interfaces

2023-05-15 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43516: -- Assignee: Weichen Xu > Basic estimator / transformer / model / evaluator interfaces > ---

[jira] [Created] (SPARK-43516) Basic estimator / transformer / model / evaluator interfaces

2023-05-15 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-43516: -- Summary: Basic estimator / transformer / model / evaluator interfaces Key: SPARK-43516 URL: https://issues.apache.org/jira/browse/SPARK-43516 Project: Spark Iss

[jira] [Resolved] (SPARK-43081) Add torch distributor data loader that loads data from spark partition data

2023-04-30 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-43081. Target Version/s: 3.5.0 Resolution: Done > Add torch distributor data loader that loads

[jira] [Assigned] (SPARK-43289) PySpark UDF supports python package dependencies

2023-04-25 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43289: -- Assignee: Weichen Xu > PySpark UDF supports python package dependencies > ---

[jira] [Created] (SPARK-43289) PySpark UDF supports python package dependencies

2023-04-25 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-43289: -- Summary: PySpark UDF supports python package dependencies Key: SPARK-43289 URL: https://issues.apache.org/jira/browse/SPARK-43289 Project: Spark Issue Type: New

[jira] [Assigned] (SPARK-43097) Implement pyspark ML logistic regression estimator on top of torch distributor

2023-04-11 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43097: -- Assignee: Weichen Xu > Implement pyspark ML logistic regression estimator on top of torch dis

[jira] [Created] (SPARK-43097) Implement pyspark ML logistic regression estimator on top of torch distributor

2023-04-11 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-43097: -- Summary: Implement pyspark ML logistic regression estimator on top of torch distributor Key: SPARK-43097 URL: https://issues.apache.org/jira/browse/SPARK-43097 Project: S

[jira] [Assigned] (SPARK-43081) Add torch distributor data loader that loads data from spark partition data

2023-04-10 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43081: -- Assignee: Weichen Xu > Add torch distributor data loader that loads data from spark partition
