[jira] [Commented] (SPARK-48084) pyspark.ml.connect.evaluation not working in 3.5 client <> 4.0 server

2024-05-06 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17844112#comment-17844112 ] Weichen Xu commented on SPARK-48084: This test error {{pyspark.ml.connect.evaluation not working in

[jira] [Commented] (SPARK-48083) session.copyFromLocalToFs failure with 3.5 client <> 4.0 server

2024-05-06 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17844111#comment-17844111 ] Weichen Xu commented on SPARK-48083: this is not an issue, {{copyFromLocalToFs}} requires spark

[jira] [Resolved] (SPARK-47663) Add an end to end tests for checking if spark task works well with resources

2024-04-02 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-47663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-47663. Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45794

[jira] [Assigned] (SPARK-47663) Add an end to end tests for checking if spark task works well with resources

2024-04-02 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-47663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-47663: -- Assignee: Bobby Wang > Add an end to end tests for checking if spark task works well with

[jira] [Resolved] (SPARK-46812) Make `mapInPandas` / mapInArrow` support ResourceProfile (Stage-Level scheduling)

2024-02-18 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-46812. Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44852

[jira] [Assigned] (SPARK-46812) Make `mapInPandas` / mapInArrow` support ResourceProfile (Stage-Level scheduling)

2024-02-18 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-46812: -- Assignee: Bobby Wang > Make `mapInPandas` / mapInArrow` support ResourceProfile (Stage-Level

[jira] [Resolved] (SPARK-46361) Add spark dataset chunk read API (python only)

2024-01-05 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-46361. Resolution: Won't Do > Add spark dataset chunk read API (python only) >

[jira] [Updated] (SPARK-46361) Add spark dataset chunk read API (python only)

2023-12-12 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-46361: --- Description: *Design doc:* h1.

[jira] [Assigned] (SPARK-46361) Add spark dataset chunk read API (python only)

2023-12-12 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-46361: -- Assignee: Weichen Xu > Add spark dataset chunk read API (python only) >

[jira] [Updated] (SPARK-46361) Add spark dataset chunk read API (python only)

2023-12-11 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-46361: --- Description: *Proposed API:* {code:java} def persist_dataframe_as_chunks(dataframe: DataFrame) ->

[jira] [Created] (SPARK-46361) Add spark dataset chunk read API (python only)

2023-12-11 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-46361: -- Summary: Add spark dataset chunk read API (python only) Key: SPARK-46361 URL: https://issues.apache.org/jira/browse/SPARK-46361 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-45397) Add vector assembler feature transformer

2023-10-11 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-45397. Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43199

[jira] [Assigned] (SPARK-45397) Add vector assembler feature transformer

2023-10-11 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-45397: -- Assignee: Weichen Xu > Add vector assembler feature transformer >

[jira] [Created] (SPARK-45397) Add vector assembler feature transformer

2023-10-03 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-45397: -- Summary: Add vector assembler feature transformer Key: SPARK-45397 URL: https://issues.apache.org/jira/browse/SPARK-45397 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-45396) Add doc entry for `pyspark.ml.connect` module

2023-10-03 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-45396: -- Summary: Add doc entry for `pyspark.ml.connect` module Key: SPARK-45396 URL: https://issues.apache.org/jira/browse/SPARK-45396 Project: Spark Issue Type:

[jira] [Created] (SPARK-45130) Avoid Spark connect ML model to change input pandas dataframe

2023-09-12 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-45130: -- Summary: Avoid Spark connect ML model to change input pandas dataframe Key: SPARK-45130 URL: https://issues.apache.org/jira/browse/SPARK-45130 Project: Spark

[jira] [Updated] (SPARK-45130) Avoid Spark connect ML model to change input pandas dataframe

2023-09-12 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-45130: --- Description: Currently,  > Avoid Spark connect ML model to change input pandas dataframe >

[jira] [Created] (SPARK-45129) Add pyspark "ml-connect" extras dependencies

2023-09-12 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-45129: -- Summary: Add pyspark "ml-connect" extras dependencies Key: SPARK-45129 URL: https://issues.apache.org/jira/browse/SPARK-45129 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-44908) Fix spark connect ML crossvalidator "foldCol" param

2023-08-23 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-44908. Fix Version/s: 3.5.0 4.0.0 Resolution: Fixed Issue resolved by pull

[jira] [Resolved] (SPARK-44909) Skip starting torch distributor log streaming server when it is not available

2023-08-23 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-44909. Fix Version/s: 3.5.0 4.0.0 Resolution: Fixed Issue resolved by pull

[jira] [Assigned] (SPARK-44909) Skip starting torch distributor log streaming server when it is not available

2023-08-23 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-44909: -- Assignee: Weichen Xu > Skip starting torch distributor log streaming server when it is not

[jira] [Created] (SPARK-44909) Skip starting torch distributor log streaming server when it is not available

2023-08-22 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-44909: -- Summary: Skip starting torch distributor log streaming server when it is not available Key: SPARK-44909 URL: https://issues.apache.org/jira/browse/SPARK-44909 Project:

[jira] [Created] (SPARK-44908) Fix spark connect ML crossvalidator "foldCol" param

2023-08-22 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-44908: -- Summary: Fix spark connect ML crossvalidator "foldCol" param Key: SPARK-44908 URL: https://issues.apache.org/jira/browse/SPARK-44908 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-44908) Fix spark connect ML crossvalidator "foldCol" param

2023-08-22 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-44908: -- Assignee: Weichen Xu > Fix spark connect ML crossvalidator "foldCol" param >

[jira] [Updated] (SPARK-44374) Add example code

2023-07-11 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-44374: --- Fix Version/s: 3.5.0 > Add example code > > > Key: SPARK-44374 >

[jira] [Resolved] (SPARK-44374) Add example code

2023-07-11 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-44374. Resolution: Done > Add example code > > > Key: SPARK-44374 >

[jira] [Created] (SPARK-44374) Add example code

2023-07-11 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-44374: -- Summary: Add example code Key: SPARK-44374 URL: https://issues.apache.org/jira/browse/SPARK-44374 Project: Spark Issue Type: Sub-task Components:

[jira] [Assigned] (SPARK-44374) Add example code

2023-07-11 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-44374: -- Assignee: Weichen Xu > Add example code > > > Key:

[jira] [Assigned] (SPARK-42471) Distributed ML <> spark connect

2023-07-11 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-42471: -- Assignee: Weichen Xu > Distributed ML <> spark connect > --- > >

[jira] [Resolved] (SPARK-43983) Implement cross validator estimator

2023-07-10 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-43983. Fix Version/s: 3.5.0 Resolution: Fixed Issue resolved by pull request 41881

[jira] [Resolved] (SPARK-44250) Implement classification evaluator

2023-07-04 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-44250. Resolution: Done > Implement classification evaluator > -- > >

[jira] [Created] (SPARK-44250) Implement classification evaluator

2023-06-29 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-44250: -- Summary: Implement classification evaluator Key: SPARK-44250 URL: https://issues.apache.org/jira/browse/SPARK-44250 Project: Spark Issue Type: Sub-task

[jira] [Assigned] (SPARK-44250) Implement classification evaluator

2023-06-29 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-44250: -- Assignee: Weichen Xu > Implement classification evaluator >

[jira] [Resolved] (SPARK-44100) Move namespace from `pyspark.mlv2` to `pyspark.ml.connect`

2023-06-20 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-44100. Fix Version/s: 3.5.0 Resolution: Fixed Issue resolved by pull request 41666

[jira] [Assigned] (SPARK-44100) Move namespace from `pyspark.mlv2` to `pyspark.ml.connect`

2023-06-19 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-44100: -- Assignee: Weichen Xu > Move namespace from `pyspark.mlv2` to `pyspark.ml.connect` >

[jira] [Created] (SPARK-44100) Move namespace from `pyspark.mlv2` to `pyspark.ml.connect`

2023-06-19 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-44100: -- Summary: Move namespace from `pyspark.mlv2` to `pyspark.ml.connect` Key: SPARK-44100 URL: https://issues.apache.org/jira/browse/SPARK-44100 Project: Spark Issue

[jira] [Updated] (SPARK-42501) High level design doc for Distributed ML <> spark connect

2023-06-19 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-42501: --- Description: Design doc:

[jira] [Resolved] (SPARK-42501) High level design doc for Distributed ML <> spark connect

2023-06-19 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-42501. Resolution: Done > High level design doc for Distributed ML <> spark connect >

[jira] [Resolved] (SPARK-42412) Initial prototype implementation for PySparkML

2023-06-19 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-42412. Resolution: Done > Initial prototype implementation for PySparkML >

[jira] [Resolved] (SPARK-43982) Implement pipeline estimator

2023-06-19 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-43982. Fix Version/s: 3.5.0 Resolution: Fixed Issue resolved by pull request 41479

[jira] [Resolved] (SPARK-43981) Basic saving / loading implementation

2023-06-13 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-43981. Fix Version/s: 3.5.0 Resolution: Fixed Issue resolved by pull request 41478

[jira] [Resolved] (SPARK-43790) Add API `copyLocalFileToHadoopFS`

2023-06-07 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-43790. Fix Version/s: 3.5.0 Resolution: Fixed Issue resolved by pull request 41357

[jira] [Resolved] (SPARK-43097) Implement pyspark ML logistic regression estimator on top of torch distributor

2023-06-06 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-43097. Fix Version/s: 3.5.0 Resolution: Fixed Issue resolved by pull request 41383

[jira] [Assigned] (SPARK-43982) Implement pipeline estimator

2023-06-06 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43982: -- Assignee: Weichen Xu > Implement pipeline estimator > > >

[jira] [Created] (SPARK-43982) Implement pipeline estimator

2023-06-06 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-43982: -- Summary: Implement pipeline estimator Key: SPARK-43982 URL: https://issues.apache.org/jira/browse/SPARK-43982 Project: Spark Issue Type: Sub-task

[jira] [Assigned] (SPARK-43983) Implement cross validator estimator

2023-06-06 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43983: -- Assignee: Weichen Xu > Implement cross validator estimator >

[jira] [Created] (SPARK-43983) Implement cross validator estimator

2023-06-06 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-43983: -- Summary: Implement cross validator estimator Key: SPARK-43983 URL: https://issues.apache.org/jira/browse/SPARK-43983 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-43981) Basic saving / loading implementation

2023-06-06 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43981: --- Description: Support saving/loading  for estimator / transformer / evaluator / model. We have some

[jira] [Created] (SPARK-43981) Basic saving / loading implementation

2023-06-06 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-43981: -- Summary: Basic saving / loading implementation Key: SPARK-43981 URL: https://issues.apache.org/jira/browse/SPARK-43981 Project: Spark Issue Type: Sub-task

[jira] [Assigned] (SPARK-43981) Basic saving / loading implementation

2023-06-06 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43981: -- Assignee: Weichen Xu > Basic saving / loading implementation >

[jira] [Updated] (SPARK-43981) Basic saving / loading implementation

2023-06-06 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43981: --- Component/s: Connect ML > Basic saving / loading implementation >

[jira] [Resolved] (SPARK-43715) Add spark DataFrame binary file format writer

2023-06-06 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-43715. Resolution: Won't Do > Add spark DataFrame binary file format writer >

[jira] [Assigned] (SPARK-43788) Enable SummarizerTests.test_summarize_dataframe for pandas 2.0.0.

2023-06-05 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43788: -- Assignee: Weichen Xu > Enable SummarizerTests.test_summarize_dataframe for pandas 2.0.0. >

[jira] [Resolved] (SPARK-43788) Enable SummarizerTests.test_summarize_dataframe for pandas 2.0.0.

2023-06-05 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-43788. Fix Version/s: 3.5.0 Resolution: Fixed Issue resolved by pull request 41456

[jira] [Resolved] (SPARK-43784) Enable FeatureTests.test_max_abs_scaler for pandas 2.0.0.

2023-06-05 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-43784. Fix Version/s: 3.5.0 Resolution: Fixed Issue resolved by pull request 41456

[jira] [Assigned] (SPARK-43784) Enable FeatureTests.test_max_abs_scaler for pandas 2.0.0.

2023-06-05 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43784: -- Assignee: Weichen Xu > Enable FeatureTests.test_max_abs_scaler for pandas 2.0.0. >

[jira] [Resolved] (SPARK-43783) Enable FeatureTests.test_standard_scaler for pandas 2.0.0.

2023-06-05 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-43783. Fix Version/s: 3.5.0 Resolution: Fixed Issue resolved by pull request 41456

[jira] [Assigned] (SPARK-43783) Enable FeatureTests.test_standard_scaler for pandas 2.0.0.

2023-06-05 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43783: -- Assignee: Weichen Xu > Enable FeatureTests.test_standard_scaler for pandas 2.0.0. >

[jira] [Updated] (SPARK-43790) Add API `copyLocalFileToHadoopFS`

2023-05-24 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43790: --- Description: In new distributed spark ML module (designed to support spark connect and support

[jira] [Assigned] (SPARK-43790) Add API `copyLocalFileToHadoopFS`

2023-05-24 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43790: -- Assignee: Weichen Xu > Add API `copyLocalFileToHadoopFS` > -

[jira] [Created] (SPARK-43790) Add API `copyLocalFileToHadoopFS`

2023-05-24 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-43790: -- Summary: Add API `copyLocalFileToHadoopFS` Key: SPARK-43790 URL: https://issues.apache.org/jira/browse/SPARK-43790 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-43516) Basic estimator / transformer / model / evaluator interfaces and basic transformer / evaluator implementation

2023-05-24 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43516: --- Description: * Define basic interfaces of Evaluator / Transformer / Model / Evaluator, these

[jira] [Updated] (SPARK-43516) Basic estimator / transformer / model / evaluator interfaces and basic transformer / evaluator implementation

2023-05-24 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43516: --- Summary: Basic estimator / transformer / model / evaluator interfaces and basic transformer /

[jira] [Resolved] (SPARK-43516) Basic estimator / transformer / model / evaluator interfaces

2023-05-24 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-43516. Fix Version/s: 3.5.0 Resolution: Fixed Issue resolved by pull request 41176

[jira] [Commented] (SPARK-42501) High level design doc for Distributed ML <> spark connect

2023-05-23 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725614#comment-17725614 ] Weichen Xu commented on SPARK-42501: doc is linked. > High level design doc for Distributed ML <>

[jira] [Updated] (SPARK-42501) High level design doc for Distributed ML <> spark connect

2023-05-23 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-42501: --- Summary: High level design doc for Distributed ML <> spark connect (was: High level design doc for

[jira] [Updated] (SPARK-42471) Distributed ML <> spark connect

2023-05-23 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-42471: --- Summary: Distributed ML <> spark connect (was: Feature parity: ML API in Spark Connect) >

[jira] [Updated] (SPARK-43715) Add spark DataFrame binary file format writer

2023-05-23 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43715: --- Description: In new distributed spark ML module (designed to support spark connect and support

[jira] [Updated] (SPARK-43715) Add spark DataFrame binary file format writer

2023-05-23 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43715: --- Summary: Add spark DataFrame binary file format writer (was: Add spark DataFrame binary file

[jira] [Assigned] (SPARK-43715) Add spark DataFrame binary file reader / writer

2023-05-22 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43715: -- Assignee: Weichen Xu > Add spark DataFrame binary file reader / writer >

[jira] [Updated] (SPARK-43715) Add spark DataFrame binary file reader / writer

2023-05-22 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43715: --- Description: In new distributed spark ML module (designed to support spark connect and support

[jira] [Updated] (SPARK-43715) Add spark DataFrame binary file reader / writer

2023-05-22 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43715: --- Description: In new distributed spark ML module (designed to support spark connect and support

[jira] [Updated] (SPARK-43715) Add spark DataFrame binary file reader / writer

2023-05-22 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43715: --- Description: In new distributed spark ML module (designed to support spark connect and support

[jira] [Updated] (SPARK-43715) Add spark DataFrame binary file writer

2023-05-22 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43715: --- Description: In new distributed spark ML module (designed to support spark connect and support

[jira] [Updated] (SPARK-43715) Add spark DataFrame binary file reader / writer

2023-05-22 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43715: --- Summary: Add spark DataFrame binary file reader / writer (was: Add spark DataFrame binary file

[jira] [Updated] (SPARK-43715) Add spark DataFrame binary file writer

2023-05-22 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43715: --- Description: In distributed  > Add spark DataFrame binary file writer >

[jira] [Created] (SPARK-43715) Add spark DataFrame binary file writer

2023-05-22 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-43715: -- Summary: Add spark DataFrame binary file writer Key: SPARK-43715 URL: https://issues.apache.org/jira/browse/SPARK-43715 Project: Spark Issue Type: Sub-task

[jira] [Assigned] (SPARK-43516) Basic estimator / transformer / model / evaluator interfaces

2023-05-15 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43516: -- Assignee: Weichen Xu > Basic estimator / transformer / model / evaluator interfaces >

[jira] [Created] (SPARK-43516) Basic estimator / transformer / model / evaluator interfaces

2023-05-15 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-43516: -- Summary: Basic estimator / transformer / model / evaluator interfaces Key: SPARK-43516 URL: https://issues.apache.org/jira/browse/SPARK-43516 Project: Spark

[jira] [Resolved] (SPARK-43081) Add torch distributor data loader that loads data from spark partition data

2023-04-30 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-43081. Target Version/s: 3.5.0 Resolution: Done > Add torch distributor data loader that loads

[jira] [Assigned] (SPARK-43289) PySpark UDF supports python package dependencies

2023-04-25 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43289: -- Assignee: Weichen Xu > PySpark UDF supports python package dependencies >

[jira] [Created] (SPARK-43289) PySpark UDF supports python package dependencies

2023-04-25 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-43289: -- Summary: PySpark UDF supports python package dependencies Key: SPARK-43289 URL: https://issues.apache.org/jira/browse/SPARK-43289 Project: Spark Issue Type: New

[jira] [Assigned] (SPARK-43097) Implement pyspark ML logistic regression estimator on top of torch distributor

2023-04-11 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43097: -- Assignee: Weichen Xu > Implement pyspark ML logistic regression estimator on top of torch

[jira] [Created] (SPARK-43097) Implement pyspark ML logistic regression estimator on top of torch distributor

2023-04-11 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-43097: -- Summary: Implement pyspark ML logistic regression estimator on top of torch distributor Key: SPARK-43097 URL: https://issues.apache.org/jira/browse/SPARK-43097 Project:

[jira] [Assigned] (SPARK-43081) Add torch distributor data loader that loads data from spark partition data

2023-04-10 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-43081: -- Assignee: Weichen Xu > Add torch distributor data loader that loads data from spark

[jira] [Updated] (SPARK-43081) Add torch distributor data loader that loads data from spark partition data

2023-04-10 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43081: --- Description: Add torch distributor data loader that loads data from spark partition data.   We

[jira] [Updated] (SPARK-43081) Add torch distributor data loader that loads data from spark partition data

2023-04-10 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-43081: --- Description: Add torch distributor data loader that loads data from spark partition data.   We

[jira] [Created] (SPARK-43081) Add torch distributor data loader that loads data from spark partition data

2023-04-10 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-43081: -- Summary: Add torch distributor data loader that loads data from spark partition data Key: SPARK-43081 URL: https://issues.apache.org/jira/browse/SPARK-43081 Project:

[jira] [Resolved] (SPARK-42929) make mapInPandas / mapInArrow support "is_barrier"

2023-03-27 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-42929. Fix Version/s: 3.5.0 Target Version/s: 3.5.0 Resolution: Fixed > make

[jira] [Assigned] (SPARK-42929) make mapInPandas / mapInArrow support "is_barrier"

2023-03-27 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-42929: -- Assignee: Weichen Xu > make mapInPandas / mapInArrow support "is_barrier" >

[jira] [Created] (SPARK-42929) make mapInPandas / mapInArrow support "is_barrier"

2023-03-27 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-42929: -- Summary: make mapInPandas / mapInArrow support "is_barrier" Key: SPARK-42929 URL: https://issues.apache.org/jira/browse/SPARK-42929 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-42896) Make `mapInPandas` / mapInArrow` support barrier mode execution

2023-03-26 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-42896: -- Assignee: Weichen Xu > Make `mapInPandas` / mapInArrow` support barrier mode execution >

[jira] [Resolved] (SPARK-42896) Make `mapInPandas` / mapInArrow` support barrier mode execution

2023-03-26 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-42896. Fix Version/s: 3.5.0 Resolution: Fixed Issue resolved by pull request 40520

[jira] [Created] (SPARK-42896) Make `mapInPandas` / mapInArrow` support barrier mode execution

2023-03-22 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-42896: -- Summary: Make `mapInPandas` / mapInArrow` support barrier mode execution Key: SPARK-42896 URL: https://issues.apache.org/jira/browse/SPARK-42896 Project: Spark

[jira] [Assigned] (SPARK-42732) Support spark connect session getActiveSession

2023-03-14 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-42732: -- Assignee: Weichen Xu > Support spark connect session getActiveSession >

[jira] [Resolved] (SPARK-42732) Support spark connect session getActiveSession

2023-03-14 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-42732. Fix Version/s: 3.5.0 Resolution: Fixed Issue resolved by pull request 40353

[jira] [Created] (SPARK-42732) Support spark connect session getActiveSession

2023-03-09 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-42732: -- Summary: Support spark connect session getActiveSession Key: SPARK-42732 URL: https://issues.apache.org/jira/browse/SPARK-42732 Project: Spark Issue Type: New

[jira] [Comment Edited] (SPARK-42501) High level design doc for Spark ML

2023-02-20 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17691169#comment-17691169 ] Weichen Xu edited comment on SPARK-42501 at 2/20/23 1:25 PM: - The doc is not

[jira] [Updated] (SPARK-42501) High level design doc for Spark ML

2023-02-20 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-42501: --- Description: (was: Please find the HLD doc for spark ML via spark connect

[jira] [Commented] (SPARK-42501) High level design doc for Spark ML

2023-02-20 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17691169#comment-17691169 ] Weichen Xu commented on SPARK-42501: CC [~mengxr] [~grundprinzip-db] [~podongfeng] [~srowen] Thanks!
