[jira] [Updated] (SPARK-30762) Add dtype="float32" support to vector_to_array UDF

2020-02-17 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-30762: --- Fix Version/s: 3.1.0 3.0.0 > Add dtype="float32" support to vector_to_array UDF

[jira] [Updated] (SPARK-30791) Dataframe add sameResult and sementicHash method

2020-02-17 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-30791: --- Fix Version/s: 3.1.0 > Dataframe add sameResult and sementicHash method >

[jira] [Updated] (SPARK-30791) Dataframe add sameResult and sementicHash method

2020-02-17 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-30791: --- Affects Version/s: (was: 3.0.0) 3.1.0 > Dataframe add sameResult and

[jira] [Updated] (SPARK-30791) Dataframe add sameResult and sementicHash method

2020-02-17 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-30791: --- Target Version/s: 3.1.0 (was: 3.0.0, 3.1.0) > Dataframe add sameResult and sementicHash method >

[jira] [Resolved] (SPARK-30791) Dataframe add sameResult and sementicHash method

2020-02-17 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-30791. Target Version/s: 3.0.0, 3.1.0 Resolution: Done Resolved by

[jira] [Resolved] (SPARK-30762) Add dtype="float32" support to vector_to_array UDF

2020-02-13 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-30762. Target Version/s: 3.0.0, 3.1.0 Resolution: Done Resolved by

[jira] [Updated] (SPARK-30762) Add dtype="float32" support to vector_to_array UDF

2020-02-13 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-30762: --- Fix Version/s: (was: 3.1.0) (was: 3.0.0) > Add dtype="float32" support

[jira] [Updated] (SPARK-30762) Add dtype="float32" support to vector_to_array UDF

2020-02-13 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-30762: --- Fix Version/s: 3.1.0 3.0.0 > Add dtype="float32" support to vector_to_array UDF

[jira] [Updated] (SPARK-30791) Dataframe add sameResult and sementicHash method

2020-02-12 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-30791: --- Description: Sometimes, we want to check whether two dataframes are the same. There is already an

[jira] [Updated] (SPARK-30791) Dataframe add sameResult and sementicHash method

2020-02-12 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-30791: --- Description: Sometimes, we want to check whether two dataframes are the same. There is already an

[jira] [Updated] (SPARK-30791) Dataframe add sameResult and sementicHash method

2020-02-11 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-30791: --- Description: Sometimes, we want to check whether two dataframes are the same. There is already an

[jira] [Commented] (SPARK-30791) Dataframe add sameResult and sementicHash method

2020-02-11 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034475#comment-17034475 ] Weichen Xu commented on SPARK-30791: [~liangz] will work on this. :) > Dataframe add sameResult and

[jira] [Assigned] (SPARK-30791) Dataframe add sameResult and sementicHash method

2020-02-11 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-30791: -- Assignee: Liang Zhang > Dataframe add sameResult and sementicHash method >

[jira] [Created] (SPARK-30791) Dataframe add sameResult and sementicHash method

2020-02-11 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-30791: -- Summary: Dataframe add sameResult and sementicHash method Key: SPARK-30791 URL: https://issues.apache.org/jira/browse/SPARK-30791 Project: Spark Issue Type: New

[jira] [Assigned] (SPARK-30154) PySpark UDF to convert MLlib vectors to dense arrays

2019-12-07 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-30154: -- Assignee: Weichen Xu > PySpark UDF to convert MLlib vectors to dense arrays >

[jira] [Commented] (SPARK-30154) PySpark UDF to convert MLlib vectors to dense arrays

2019-12-07 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990696#comment-16990696 ] Weichen Xu commented on SPARK-30154: I began working on this. Thanks :) > PySpark UDF to convert

[jira] [Assigned] (SPARK-29048) Query optimizer slow when using Column.isInCollection() with a large size collection

2019-09-11 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-29048: -- Assignee: Weichen Xu > Query optimizer slow when using Column.isInCollection() with a large

[jira] [Created] (SPARK-29048) Query optimizer slow when using Column.isInCollection() with a large size collection

2019-09-11 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-29048: -- Summary: Query optimizer slow when using Column.isInCollection() with a large size collection Key: SPARK-29048 URL: https://issues.apache.org/jira/browse/SPARK-29048

[jira] [Created] (SPARK-28957) Copy any "spark.hive.foo=bar" spark properties into hadoop conf as "hive.foo=bar"

2019-09-03 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-28957: -- Summary: Copy any "spark.hive.foo=bar" spark properties into hadoop conf as "hive.foo=bar" Key: SPARK-28957 URL: https://issues.apache.org/jira/browse/SPARK-28957

[jira] [Updated] (SPARK-28621) CheckCartesianProducts may throw some error which mismatch generated physical plan

2019-08-20 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-28621: --- Description: CheckCartesianProducts check logical plan which mismatch the physical plan. So when

[jira] [Updated] (SPARK-28621) may throw some mismatching error

2019-08-20 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-28621: --- Summary: may throw some mismatching error (was: CheckCartesianProducts do not work correctly) >

[jira] [Updated] (SPARK-28621) CheckCartesianProducts may throw some error which mismatch generated physical plan

2019-08-20 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-28621: --- Summary: CheckCartesianProducts may throw some error which mismatch generated physical plan (was:

[jira] [Updated] (SPARK-28621) CheckCartesianProducts do not work correctly

2019-08-20 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-28621: --- Description: CheckCartesianProducts do not work correctly. 1) CheckCartesianProducts check logical

[jira] [Updated] (SPARK-28621) CheckCartesianProducts do not work correctly

2019-08-20 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-28621: --- Description: CheckCartesianProducts check logical plan which mismatch the physical plan. So when

[jira] [Created] (SPARK-28782) explode() fails on aggregate expressions

2019-08-20 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-28782: -- Summary: explode() fails on aggregate expressions Key: SPARK-28782 URL: https://issues.apache.org/jira/browse/SPARK-28782 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-28280) Convert and port 'group-by.sql' into UDF test base

2019-08-06 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900855#comment-16900855 ] Weichen Xu commented on SPARK-28280: User 'skonto' has created a pull request for this issue:

[jira] [Updated] (SPARK-28621) CheckCartesianProducts do not work correctly

2019-08-05 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-28621: --- Description: CheckCartesianProducts do not work correctly. There're some cases: providing: {code}

[jira] [Created] (SPARK-28621) CheckCartesianProducts do not work correctly

2019-08-05 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-28621: -- Summary: CheckCartesianProducts do not work correctly Key: SPARK-28621 URL: https://issues.apache.org/jira/browse/SPARK-28621 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-28615) Add a guide line for dataframe functions to say column signature function is by default

2019-08-04 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-28615: -- Summary: Add a guide line for dataframe functions to say column signature function is by default Key: SPARK-28615 URL: https://issues.apache.org/jira/browse/SPARK-28615

[jira] [Created] (SPARK-28598) Few date time manipulation functions does not provide versions supporting Column as input through the Dataframe API

2019-08-01 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-28598: -- Summary: Few date time manipulation functions does not provide versions supporting Column as input through the Dataframe API Key: SPARK-28598 URL:

[jira] [Created] (SPARK-28582) Pyspark daemon exit failed when receive SIGTERM on py3.7

2019-07-31 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-28582: -- Summary: Pyspark daemon exit failed when receive SIGTERM on py3.7 Key: SPARK-28582 URL: https://issues.apache.org/jira/browse/SPARK-28582 Project: Spark Issue

[jira] [Commented] (SPARK-28476) Support ALTER DATABASE SET LOCATION

2019-07-27 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894612#comment-16894612 ] Weichen Xu commented on SPARK-28476: I am working on this. :) > Support ALTER DATABASE SET LOCATION

[jira] [Updated] (SPARK-28483) Canceling a spark job using barrier mode but barrier tasks do not exit

2019-07-24 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-28483: --- Summary: Canceling a spark job using barrier mode but barrier tasks do not exit (was: Canceling a

[jira] [Updated] (SPARK-28483) Canceling a spark job using barrier mode but tasks still being printing messages

2019-07-23 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-28483: --- Description: Reproduce code: {code:java} import time from pyspark import BarrierTaskContext n = 4

[jira] [Created] (SPARK-28483) Canceling a spark job using barrier mode but tasks still being printing messages

2019-07-23 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-28483: -- Summary: Canceling a spark job using barrier mode but tasks still being printing messages Key: SPARK-28483 URL: https://issues.apache.org/jira/browse/SPARK-28483

[jira] [Commented] (SPARK-25349) Support sample pushdown in Data Source V2

2019-07-22 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890623#comment-16890623 ] Weichen Xu commented on SPARK-25349: I will work on this. Thanks! > Support sample pushdown in Data

[jira] [Created] (SPARK-28452) CSV datasource writer do not support maxCharsPerColumn option

2019-07-19 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-28452: -- Summary: CSV datasource writer do not support maxCharsPerColumn option Key: SPARK-28452 URL: https://issues.apache.org/jira/browse/SPARK-28452 Project: Spark

[jira] [Created] (SPARK-28431) CSV datasource throw com.univocity.parsers.common.TextParsingException with large size message

2019-07-17 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-28431: -- Summary: CSV datasource throw com.univocity.parsers.common.TextParsingException with large size message Key: SPARK-28431 URL: https://issues.apache.org/jira/browse/SPARK-28431

[jira] [Updated] (SPARK-28366) Logging in driver when loading single large unsplittable file

2019-07-16 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-28366: --- Summary: Logging in driver when loading single large unsplittable file (was: Logging in driver

[jira] [Created] (SPARK-28366) Logging in driver when loading single large gzipped file via sc.textFile

2019-07-12 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-28366: -- Summary: Logging in driver when loading single large gzipped file via sc.textFile Key: SPARK-28366 URL: https://issues.apache.org/jira/browse/SPARK-28366 Project: Spark

[jira] [Commented] (SPARK-27889) Make development scripts under dev/ support Python 3

2019-07-09 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881663#comment-16881663 ] Weichen Xu commented on SPARK-27889: Discussed with [~mengxr] offline. I will work on this. > Make

[jira] [Commented] (SPARK-25382) Remove ImageSchema.readImages in 3.0

2019-07-09 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881662#comment-16881662 ] Weichen Xu commented on SPARK-25382: I will work on this. Thank! > Remove ImageSchema.readImages in

[jira] [Created] (SPARK-28185) Trigger pandas iterator UDF closing stuff when iterator stop early

2019-06-27 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-28185: -- Summary: Trigger pandas iterator UDF closing stuff when iterator stop early Key: SPARK-28185 URL: https://issues.apache.org/jira/browse/SPARK-28185 Project: Spark

[jira] [Created] (SPARK-27990) Provide a way to recursively load data from datasource

2019-06-10 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-27990: -- Summary: Provide a way to recursively load data from datasource Key: SPARK-27990 URL: https://issues.apache.org/jira/browse/SPARK-27990 Project: Spark Issue

[jira] [Updated] (SPARK-27870) Flush each batch for pandas UDF (for improving pandas UDFs pipeline)

2019-05-28 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-27870: --- Description: Flush each batch for pandas UDF. This could improve performance when multiple pandas

[jira] [Updated] (SPARK-27870) Flush each batch for pandas UDF (for improving pandas UDFs pipeline)

2019-05-28 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-27870: --- Summary: Flush each batch for pandas UDF (for improving pandas UDFs pipeline) (was: Flush each

[jira] [Updated] (SPARK-27870) Flush each batch for python UDF

2019-05-28 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-27870: --- Description: Flush each batch for python UDF. This could improve performance when multiple python

[jira] [Updated] (SPARK-27870) Flush each batch for python UDF

2019-05-28 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-27870: --- Summary: Flush each batch for python UDF (was: Flush each batch for pandas UDF) > Flush each

[jira] [Created] (SPARK-27870) Flush each batch for pandas UDF

2019-05-28 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-27870: -- Summary: Flush each batch for pandas UDF Key: SPARK-27870 URL: https://issues.apache.org/jira/browse/SPARK-27870 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-26412) Allow Pandas UDF to take an iterator of pd.DataFrames or Arrow batches

2019-05-09 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836480#comment-16836480 ] Weichen Xu commented on SPARK-26412: Discuss with [~mengxr] , discard proposal (2), this should be

[jira] [Comment Edited] (SPARK-26412) Allow Pandas UDF to take an iterator of pd.DataFrames or Arrow batches

2019-05-09 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836377#comment-16836377 ] Weichen Xu edited comment on SPARK-26412 at 5/9/19 3:19 PM: [~mengxr]  

[jira] [Comment Edited] (SPARK-26412) Allow Pandas UDF to take an iterator of pd.DataFrames or Arrow batches

2019-05-09 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836377#comment-16836377 ] Weichen Xu edited comment on SPARK-26412 at 5/9/19 3:18 PM: [~mengxr]  

[jira] [Commented] (SPARK-26412) Allow Pandas UDF to take an iterator of pd.DataFrames or Arrow batches

2019-05-09 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836377#comment-16836377 ] Weichen Xu commented on SPARK-26412: [~mengxr] There's one issue: There're 2 proposals in the

[jira] [Commented] (SPARK-27534) Do not load `content` column in binary data source if it is not selected

2019-04-23 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824554#comment-16824554 ] Weichen Xu commented on SPARK-27534: I am working on this. :) > Do not load `content` column in

[jira] [Updated] (SPARK-27454) Spark image datasource fail when encounter some illegal images

2019-04-12 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-27454: --- Description: Spark image datasource fail when encounter some illegal images. Such as exception

[jira] [Created] (SPARK-27454) Spark image datasource fail when encounter some illegal images

2019-04-12 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-27454: -- Summary: Spark image datasource fail when encounter some illegal images Key: SPARK-27454 URL: https://issues.apache.org/jira/browse/SPARK-27454 Project: Spark

[jira] [Commented] (SPARK-25348) Data source for binary files

2019-04-07 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812090#comment-16812090 ] Weichen Xu commented on SPARK-25348: I am working on this. :) > Data source for binary files >

[jira] [Commented] (SPARK-25793) Loading model bug in BisectingKMeans

2018-10-22 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659938#comment-16659938 ] Weichen Xu commented on SPARK-25793: [~dongjoon] Sorry. Change type to Bug and priority Major  >

[jira] [Updated] (SPARK-25793) Loading model bug in BisectingKMeans

2018-10-22 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-25793: --- Issue Type: Bug (was: Documentation) > Loading model bug in BisectingKMeans >

[jira] [Updated] (SPARK-25793) Loading model bug in BisectingKMeans

2018-10-22 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-25793: --- Priority: Major (was: Minor) > Loading model bug in BisectingKMeans >

[jira] [Created] (SPARK-25793) Loading model bug in BisectingKMeans

2018-10-21 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-25793: -- Summary: Loading model bug in BisectingKMeans Key: SPARK-25793 URL: https://issues.apache.org/jira/browse/SPARK-25793 Project: Spark Issue Type: Documentation

[jira] [Updated] (SPARK-25524) Spark datasource for image/libsvm user guide

2018-09-25 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-25524: --- Summary: Spark datasource for image/libsvm user guide (was: Spark datasource for image/libsvm

[jira] [Updated] (SPARK-25524) Spark datasource for image/libsvm user guide

2018-09-25 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-25524: --- Description: Add Spark datasource for image/libsvm user guide. (was: Add Spark datasource for

[jira] [Created] (SPARK-25524) Spark datasource for image/libsvm document

2018-09-25 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-25524: -- Summary: Spark datasource for image/libsvm document Key: SPARK-25524 URL: https://issues.apache.org/jira/browse/SPARK-25524 Project: Spark Issue Type:

[jira] [Commented] (SPARK-25319) Spark MLlib, GraphX 2.4 QA umbrella

2018-09-23 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625013#comment-16625013 ] Weichen Xu commented on SPARK-25319: Sure! > Spark MLlib, GraphX 2.4 QA umbrella >

[jira] [Commented] (SPARK-25321) ML, Graph 2.4 QA: API: New Scala APIs, docs

2018-09-20 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621862#comment-16621862 ] Weichen Xu commented on SPARK-25321: [~mengxr] mleap is NOT compatible with the tree Node breaking

[jira] [Commented] (SPARK-25321) ML, Graph 2.4 QA: API: New Scala APIs, docs

2018-09-11 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610379#comment-16610379 ] Weichen Xu commented on SPARK-25321: [~josephkb] There're 2 changes which break compatibility we

[jira] [Updated] (SPARK-25345) Deprecate public APIs from ImageSchema

2018-09-06 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-25345: --- Description: After SPARK-22666, we can deprecate the public APIs in ImageSchema (Scala/Python) and

[jira] [Updated] (SPARK-25319) Spark MLlib, GraphX 2.4 QA umbrella

2018-09-03 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-25319: --- Target Version/s: 2.4.0 (was: 2.3.0) > Spark MLlib, GraphX 2.4 QA umbrella >

[jira] [Updated] (SPARK-25319) Spark MLlib, GraphX 2.4 QA umbrella

2018-09-03 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-25319: --- Fix Version/s: (was: 2.3.0) 2.4.0 > Spark MLlib, GraphX 2.4 QA umbrella >

[jira] [Updated] (SPARK-25325) ML, Graph 2.4 QA: Update user guide for new features & APIs

2018-09-03 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-25325: --- Summary: ML, Graph 2.4 QA: Update user guide for new features & APIs (was: ML, Graph 2.3 QA:

[jira] [Updated] (SPARK-25327) Update MLlib, GraphX websites for 2.4

2018-09-03 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-25327: --- Affects Version/s: 2.4.0 Target Version/s: 2.4.0 Summary: Update MLlib, GraphX

[jira] [Updated] (SPARK-25325) ML, Graph 2.3 QA: Update user guide for new features & APIs

2018-09-03 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-25325: --- Affects Version/s: 2.4.0 Target Version/s: 2.4.0 Fix Version/s: (was: 2.3.0)

[jira] [Updated] (SPARK-25326) ML, Graph 2.4 QA: Programming guide update and migration guide

2018-09-03 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-25326: --- Affects Version/s: (was: 2.3.0) 2.4.0 Target Version/s: 2.4.0 (was:

[jira] [Updated] (SPARK-25324) ML 2.4 QA: API: Java compatibility, docs

2018-09-03 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-25324: --- Affects Version/s: 2.4.0 Target Version/s: 2.4.0 Fix Version/s: (was: 2.3.0)

[jira] [Updated] (SPARK-25321) ML, Graph 2.4 QA: API: New Scala APIs, docs

2018-09-03 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-25321: --- Affects Version/s: (was: 2.3.0) 2.4.0 Target Version/s: 2.4.0 (was:

[jira] [Updated] (SPARK-25323) ML 2.4 QA: API: Python API coverage

2018-09-03 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-25323: --- Target Version/s: 2.4.0 Summary: ML 2.4 QA: API: Python API coverage (was: CLONE - ML

[jira] [Updated] (SPARK-25323) CLONE - ML 2.3 QA: API: Python API coverage

2018-09-03 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-25323: --- Affects Version/s: (was: 2.3.0) 2.4.0 Target Version/s: (was:

[jira] [Updated] (SPARK-25320) ML, Graph 2.4 QA: API: Binary incompatible changes

2018-09-03 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-25320: --- Affects Version/s: (was: 2.3.0) 2.4.0 Target Version/s: 2.4.0 (was:

[jira] [Updated] (SPARK-25322) ML, Graph 2.4 QA: API: Experimental, DeveloperApi, final, sealed audit

2018-09-03 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-25322: --- Affects Version/s: 2.4.0 Fix Version/s: (was: 2.3.0) Summary: ML, Graph

[jira] [Updated] (SPARK-25321) ML, Graph 2.4 QA: API: New Scala APIs, docs

2018-09-03 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-25321: --- Description: Audit new public Scala APIs added to MLlib & GraphX. Take note of: * Protected/public

[jira] [Updated] (SPARK-25319) Spark MLlib, GraphX 2.4 QA umbrella

2018-09-03 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-25319: --- Description: This JIRA lists tasks for the next Spark release's QA period for MLlib and GraphX.

[jira] [Created] (SPARK-25326) CLONE - ML, Graph 2.3 QA: Programming guide update and migration guide

2018-09-03 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-25326: -- Summary: CLONE - ML, Graph 2.3 QA: Programming guide update and migration guide Key: SPARK-25326 URL: https://issues.apache.org/jira/browse/SPARK-25326 Project: Spark

[jira] [Created] (SPARK-25324) CLONE - ML 2.3 QA: API: Java compatibility, docs

2018-09-03 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-25324: -- Summary: CLONE - ML 2.3 QA: API: Java compatibility, docs Key: SPARK-25324 URL: https://issues.apache.org/jira/browse/SPARK-25324 Project: Spark Issue Type:

[jira] [Created] (SPARK-25323) CLONE - ML 2.3 QA: API: Python API coverage

2018-09-03 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-25323: -- Summary: CLONE - ML 2.3 QA: API: Python API coverage Key: SPARK-25323 URL: https://issues.apache.org/jira/browse/SPARK-25323 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-25320) ML, Graph 2.4 QA: API: Binary incompatible changes

2018-09-03 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-25320: --- Summary: ML, Graph 2.4 QA: API: Binary incompatible changes (was: CLONE - ML, Graph 2.3 QA: API:

[jira] [Created] (SPARK-25320) CLONE - ML, Graph 2.3 QA: API: Binary incompatible changes

2018-09-03 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-25320: -- Summary: CLONE - ML, Graph 2.3 QA: API: Binary incompatible changes Key: SPARK-25320 URL: https://issues.apache.org/jira/browse/SPARK-25320 Project: Spark Issue

[jira] [Created] (SPARK-25322) CLONE - ML, Graph 2.3 QA: API: Experimental, DeveloperApi, final, sealed audit

2018-09-03 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-25322: -- Summary: CLONE - ML, Graph 2.3 QA: API: Experimental, DeveloperApi, final, sealed audit Key: SPARK-25322 URL: https://issues.apache.org/jira/browse/SPARK-25322 Project:

[jira] [Created] (SPARK-25327) CLONE - Update MLlib, GraphX websites for 2.3

2018-09-03 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-25327: -- Summary: CLONE - Update MLlib, GraphX websites for 2.3 Key: SPARK-25327 URL: https://issues.apache.org/jira/browse/SPARK-25327 Project: Spark Issue Type:

[jira] [Created] (SPARK-25321) CLONE - ML, Graph 2.3 QA: API: New Scala APIs, docs

2018-09-03 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-25321: -- Summary: CLONE - ML, Graph 2.3 QA: API: New Scala APIs, docs Key: SPARK-25321 URL: https://issues.apache.org/jira/browse/SPARK-25321 Project: Spark Issue Type:

[jira] [Created] (SPARK-25325) CLONE - ML, Graph 2.3 QA: Update user guide for new features & APIs

2018-09-03 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-25325: -- Summary: CLONE - ML, Graph 2.3 QA: Update user guide for new features & APIs Key: SPARK-25325 URL: https://issues.apache.org/jira/browse/SPARK-25325 Project: Spark

[jira] [Created] (SPARK-25319) Spark MLlib, GraphX 2.4 QA umbrella

2018-09-03 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-25319: -- Summary: Spark MLlib, GraphX 2.4 QA umbrella Key: SPARK-25319 URL: https://issues.apache.org/jira/browse/SPARK-25319 Project: Spark Issue Type: Umbrella

[jira] [Commented] (SPARK-15784) Add Power Iteration Clustering to spark.ml

2018-06-04 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16501020#comment-16501020 ] Weichen Xu commented on SPARK-15784: [~wm624] Thanks for your enthusiasm, but we need this to be

[jira] [Updated] (SPARK-24231) Python API: Provide evaluateEachIteration method or equivalent for spark.ml GBTs

2018-05-09 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-24231: --- Summary: Python API: Provide evaluateEachIteration method or equivalent for spark.ml GBTs (was:

[jira] [Created] (SPARK-24231) Provide evaluateEachIteration method or equivalent for spark.ml GBTs: Python API

2018-05-09 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-24231: -- Summary: Provide evaluateEachIteration method or equivalent for spark.ml GBTs: Python API Key: SPARK-24231 URL: https://issues.apache.org/jira/browse/SPARK-24231

[jira] [Updated] (SPARK-20114) spark.ml parity for sequential pattern mining - PrefixSpan

2018-05-07 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-20114: --- Component/s: (was: PySpark) > spark.ml parity for sequential pattern mining - PrefixSpan >

[jira] [Updated] (SPARK-24146) spark.ml parity for sequential pattern mining - PrefixSpan: Python API

2018-05-07 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-24146: --- Component/s: PySpark > spark.ml parity for sequential pattern mining - PrefixSpan: Python API >

[jira] [Updated] (SPARK-20114) spark.ml parity for sequential pattern mining - PrefixSpan

2018-05-07 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-20114: --- Component/s: PySpark > spark.ml parity for sequential pattern mining - PrefixSpan >

[jira] [Updated] (SPARK-24146) spark.ml parity for sequential pattern mining - PrefixSpan: Python API

2018-05-02 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-24146: --- Issue Type: Sub-task (was: New Feature) Parent: SPARK-14501 > spark.ml parity for

[jira] [Commented] (SPARK-24146) spark.ml parity for sequential pattern mining - PrefixSpan: Python API

2018-05-02 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16460804#comment-16460804 ] Weichen Xu commented on SPARK-24146: I will create PR soon. :) > spark.ml parity for sequential
