[jira] [Updated] (SPARK-21523) Fix bug of strong wolfe linesearch `init` parameter lose effectiveness

2017-07-24 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-21523: --- Priority: Minor (was: Major) > Fix bug of strong wolfe linesearch `init` parameter lose effectivenes

[jira] [Created] (SPARK-21523) Fix bug of strong wolfe linesearch `init` parameter lose effectiveness

2017-07-24 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-21523: -- Summary: Fix bug of strong wolfe linesearch `init` parameter lose effectiveness Key: SPARK-21523 URL: https://issues.apache.org/jira/browse/SPARK-21523 Project: Spark

[jira] [Commented] (SPARK-21523) Fix bug of strong wolfe linesearch `init` parameter lose effectiveness

2017-07-24 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16099225#comment-16099225 ] Weichen Xu commented on SPARK-21523: I will work on this once the breeze cut a new ve

[jira] [Commented] (SPARK-17025) Cannot persist PySpark ML Pipeline model that includes custom Transformer

2017-07-26 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16101994#comment-16101994 ] Weichen Xu commented on SPARK-17025: Because currently, scala calling python will be

[jira] [Commented] (SPARK-21087) CrossValidator, TrainValidationSplit should preserve all models after fitting: Scala

2017-07-26 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16102048#comment-16102048 ] Weichen Xu commented on SPARK-21087: I will work on it. > CrossValidator, TrainValid

[jira] [Commented] (SPARK-11215) Add multiple columns support to StringIndexer

2017-07-26 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16102274#comment-16102274 ] Weichen Xu commented on SPARK-11215: I will take over this feature and create a PR so

[jira] [Commented] (SPARK-20418) multi-label classification support

2017-07-26 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16102280#comment-16102280 ] Weichen Xu commented on SPARK-20418: I will work on this. > multi-label classificati

[jira] [Created] (SPARK-16345) Extract graphx programming guide example snippets from source files instead of hard code them

2016-07-01 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-16345: -- Summary: Extract graphx programming guide example snippets from source files instead of hard code them Key: SPARK-16345 URL: https://issues.apache.org/jira/browse/SPARK-16345

[jira] [Updated] (SPARK-16345) Extract graphx programming guide example snippets from source files instead of hard code them

2016-07-01 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-16345: --- Description: Currently, all example snippets in the graphx programming guide are hard-coded, which c

[jira] [Commented] (SPARK-16377) Spark MLlib: MultilayerPerceptronClassifier - error while training

2016-07-05 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15362419#comment-15362419 ] Weichen Xu commented on SPARK-16377: hi, the exception: java.lang.ArrayIndexOutOfBou

[jira] [Commented] (SPARK-16377) Spark MLlib: MultilayerPerceptronClassifier - error while training

2016-07-05 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15362426#comment-15362426 ] Weichen Xu commented on SPARK-16377: This exception still exists on master code. ERRO

[jira] [Commented] (SPARK-16377) Spark MLlib: MultilayerPerceptronClassifier - error while training

2016-07-05 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15362429#comment-15362429 ] Weichen Xu commented on SPARK-16377: And I test on master version, also encounter the

[jira] [Comment Edited] (SPARK-16377) Spark MLlib: MultilayerPerceptronClassifier - error while training

2016-07-05 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15362429#comment-15362429 ] Weichen Xu edited comment on SPARK-16377 at 7/5/16 12:49 PM: -

[jira] [Created] (SPARK-16470) ml.regression.LinearRegression training data do not check whether the result actually reach convergence

2016-07-09 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-16470: -- Summary: ml.regression.LinearRegression training data do not check whether the result actually reach convergence Key: SPARK-16470 URL: https://issues.apache.org/jira/browse/SPARK-1647

[jira] [Updated] (SPARK-16470) ml.regression.LinearRegression training data do not check whether the result actually reach convergence

2016-07-09 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-16470: --- Description: In `ml.regression.LinearRegression`, it use breeze `LBFGS` and `OWLQN` optimizer to do

[jira] [Created] (SPARK-16499) Improve applyInPlace function for matrix in ANN code

2016-07-12 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-16499: -- Summary: Improve applyInPlace function for matrix in ANN code Key: SPARK-16499 URL: https://issues.apache.org/jira/browse/SPARK-16499 Project: Spark Issue Type:

[jira] [Created] (SPARK-16500) Add LBFG training not convergence warning for all ML algorithm

2016-07-12 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-16500: -- Summary: Add LBFG training not convergence warning for all ML algorithm Key: SPARK-16500 URL: https://issues.apache.org/jira/browse/SPARK-16500 Project: Spark I

[jira] [Commented] (SPARK-16500) Add LBFG training not convergence warning for all ML algorithm

2016-07-12 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15373062#comment-15373062 ] Weichen Xu commented on SPARK-16500: OK. I'll keep it in mind in future task. Thanks!

[jira] [Updated] (SPARK-16500) Add LBFG training not convergence warning for all ML algorithm

2016-07-12 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-16500: --- Component/s: Optimizer > Add LBFG training not convergence warning for all ML algorithm > ---

[jira] [Created] (SPARK-16546) Dataframe.drop supported multi-columns in spark api and should make python api also support it.

2016-07-14 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-16546: -- Summary: Dataframe.drop supported multi-columns in spark api and should make python api also support it. Key: SPARK-16546 URL: https://issues.apache.org/jira/browse/SPARK-16546

[jira] [Updated] (SPARK-16546) Dataframe.drop supported multi-columns in spark api and should make python api also support it.

2016-07-14 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-16546: --- Component/s: SQL PySpark > Dataframe.drop supported multi-columns in spark api and s

[jira] [Created] (SPARK-16561) Potential numerial problem in MultivariateOnlineSummarizer min/max

2016-07-14 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-16561: -- Summary: Potential numerial problem in MultivariateOnlineSummarizer min/max Key: SPARK-16561 URL: https://issues.apache.org/jira/browse/SPARK-16561 Project: Spark

[jira] [Updated] (SPARK-16561) Potential numerial problem in MultivariateOnlineSummarizer min/max

2016-07-14 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-16561: --- Description: In `MultivariateOnlineSummarizer` min/max method, use judgement nnz(i) < weightSum, it

[jira] [Updated] (SPARK-16561) Potential numerial problem in MultivariateOnlineSummarizer min/max

2016-07-14 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-16561: --- Description: In `MultivariateOnlineSummarizer` min/max method, use judgement "nnz(i) < weightSum", i

[jira] [Updated] (SPARK-16561) Potential numerial problem in MultivariateOnlineSummarizer min/max

2016-07-14 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-16561: --- Description: In `MultivariateOnlineSummarizer` min/max method, use judgement "nnz(i) < weightSum", i

[jira] [Created] (SPARK-16568) update sql programing guide refreshTable API

2016-07-15 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-16568: -- Summary: update sql programing guide refreshTable API Key: SPARK-16568 URL: https://issues.apache.org/jira/browse/SPARK-16568 Project: Spark Issue Type: Improvem

[jira] [Created] (SPARK-16600) fix latex formula syntax error in mllib

2016-07-18 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-16600: -- Summary: fix latex formula syntax error in mllib Key: SPARK-16600 URL: https://issues.apache.org/jira/browse/SPARK-16600 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-16638) The L2 regularization of LinearRegression seems wrong when standardization is false

2016-07-19 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-16638: -- Summary: The L2 regularization of LinearRegression seems wrong when standardization is false Key: SPARK-16638 URL: https://issues.apache.org/jira/browse/SPARK-16638 Proje

[jira] [Updated] (SPARK-16638) The L2 regularization of LinearRegression seems wrong when standardization is false

2016-07-19 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-16638: --- Description: The original L2 is 0.5 * effectiveL2regParam * sigma( wi^2 ) (wi is the coefficients we
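
For readers skimming this entry, the penalty quoted in the description, 0.5 * effectiveL2regParam * sigma( wi^2 ), can be written out as a small Python sketch. This is only an illustration of the quoted formula, not Spark's LinearRegression code: the function names `l2_penalty` and `l2_penalty_gradient` and the parameter name `effective_l2_reg_param` are hypothetical, and nothing here models the standardization handling the issue goes on to discuss.

```python
from typing import List, Sequence


def l2_penalty(coefficients: Sequence[float], effective_l2_reg_param: float) -> float:
    """Return 0.5 * effective_l2_reg_param * sum(w_i ** 2).

    Hypothetical helper: it mirrors the formula quoted in the issue
    description and is not taken from the Spark source.
    """
    return 0.5 * effective_l2_reg_param * sum(w * w for w in coefficients)


def l2_penalty_gradient(coefficients: Sequence[float], effective_l2_reg_param: float) -> List[float]:
    """Gradient of the penalty above: d/dw_i = effective_l2_reg_param * w_i."""
    return [effective_l2_reg_param * w for w in coefficients]


if __name__ == "__main__":
    weights = [3.0, 4.0]
    print(l2_penalty(weights, 0.1))
    print(l2_penalty_gradient(weights, 0.1))
```

The factor 0.5 is there so that the gradient with respect to each coefficient is simply effectiveL2regParam * wi, which is what the second function computes.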

[jira] [Commented] (SPARK-16638) The L2 regularization of LinearRegression seems wrong when standardization is false

2016-07-19 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385306#comment-15385306 ] Weichen Xu commented on SPARK-16638: seems i'm wrong, the intention of author may be

[jira] [Closed] (SPARK-16638) The L2 regularization of LinearRegression seems wrong when standardization is false

2016-07-19 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu closed SPARK-16638. -- Resolution: Not A Problem > The L2 regularization of LinearRegression seems wrong when standardization

[jira] [Created] (SPARK-16653) Make convergence tolerance param in ANN default value consistent with other algorithm using LBFGS

2016-07-20 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-16653: -- Summary: Make convergence tolerance param in ANN default value consistent with other algorithm using LBFGS Key: SPARK-16653 URL: https://issues.apache.org/jira/browse/SPARK-16653

[jira] [Updated] (SPARK-16653) Make convergence tolerance param in ANN default value consistent with other algorithm using LBFGS

2016-07-20 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-16653: --- Component/s: Optimizer ML > Make convergence tolerance param in ANN default value co

[jira] [Created] (SPARK-16662) The HiveContext deprecate warning in python always shown even if do not use HiveContext

2016-07-21 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-16662: -- Summary: The HiveContext deprecate warning in python always shown even if do not use HiveContext Key: SPARK-16662 URL: https://issues.apache.org/jira/browse/SPARK-16662 P

[jira] [Updated] (SPARK-16662) The HiveContext deprecate warning in python always shown even if do not use HiveContext

2016-07-21 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-16662: --- Issue Type: Bug (was: Improvement) > The HiveContext deprecate warning in python always shown even i

[jira] [Created] (SPARK-16696) unused broadcast variables should call destroy instead of unpersist

2016-07-24 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-16696: -- Summary: unused broadcast variables should call destroy instead of unpersist Key: SPARK-16696 URL: https://issues.apache.org/jira/browse/SPARK-16696 Project: Spark

[jira] [Updated] (SPARK-16696) unused broadcast variables should call destroy instead of unpersist

2016-07-24 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-16696: --- Description: Unused broadcast variables should call destroy() instead of unpersist() so that the mem

[jira] [Created] (SPARK-16697) redundant RDD computation in LDAOptimizer

2016-07-24 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-16697: -- Summary: redundant RDD computation in LDAOptimizer Key: SPARK-16697 URL: https://issues.apache.org/jira/browse/SPARK-16697 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-16697) redundant RDD computation in LDAOptimizer

2016-07-24 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-16697: --- Description: In mllib.clustering.LDAOptimizer the submitMiniBatch method, the stats: RDD do not pers

[jira] [Updated] (SPARK-16696) unused broadcast variables should call destroy instead of unpersist

2016-07-24 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-16696: --- Issue Type: Improvement (was: Bug) > unused broadcast variables should call destroy instead of unper

[jira] [Commented] (SPARK-25348) Data source for binary files

2019-04-07 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812090#comment-16812090 ] Weichen Xu commented on SPARK-25348: I am working on this. :) > Data source for bin

[jira] [Created] (SPARK-27454) Spark image datasource fail when encounter some illegal images

2019-04-12 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-27454: -- Summary: Spark image datasource fail when encounter some illegal images Key: SPARK-27454 URL: https://issues.apache.org/jira/browse/SPARK-27454 Project: Spark I

[jira] [Updated] (SPARK-27454) Spark image datasource fail when encounter some illegal images

2019-04-12 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-27454: --- Description: Spark image datasource fail when encounter some illegal images. Such as exception foll

[jira] [Commented] (SPARK-27534) Do not load `content` column in binary data source if it is not selected

2019-04-23 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16824554#comment-16824554 ] Weichen Xu commented on SPARK-27534: I am working on this. :) > Do not load `conten

[jira] [Commented] (SPARK-26412) Allow Pandas UDF to take an iterator of pd.DataFrames or Arrow batches

2019-05-09 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836377#comment-16836377 ] Weichen Xu commented on SPARK-26412: [~mengxr] There's one issue: There're 2 p

[jira] [Comment Edited] (SPARK-26412) Allow Pandas UDF to take an iterator of pd.DataFrames or Arrow batches

2019-05-09 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836377#comment-16836377 ] Weichen Xu edited comment on SPARK-26412 at 5/9/19 3:18 PM:

[jira] [Comment Edited] (SPARK-26412) Allow Pandas UDF to take an iterator of pd.DataFrames or Arrow batches

2019-05-09 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836377#comment-16836377 ] Weichen Xu edited comment on SPARK-26412 at 5/9/19 3:19 PM:

[jira] [Commented] (SPARK-26412) Allow Pandas UDF to take an iterator of pd.DataFrames or Arrow batches

2019-05-09 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836480#comment-16836480 ] Weichen Xu commented on SPARK-26412: Discuss with [~mengxr] , discard proposal (2),

[jira] [Created] (SPARK-18051) Custom PartitionCoalescer cause serialization exception

2016-10-21 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-18051: -- Summary: Custom PartitionCoalescer cause serialization exception Key: SPARK-18051 URL: https://issues.apache.org/jira/browse/SPARK-18051 Project: Spark Issue Typ

[jira] [Created] (SPARK-18078) Add option for customize zipPartition task preferred locations

2016-10-24 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-18078: -- Summary: Add option for customize zipPartition task preferred locations Key: SPARK-18078 URL: https://issues.apache.org/jira/browse/SPARK-18078 Project: Spark I

[jira] [Updated] (SPARK-18078) Add option for customize zipPartition task preferred locations

2016-10-24 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-18078: --- Description: `RDD.zipPartitions` task preferred locations strategy will use the intersection of corr

[jira] [Updated] (SPARK-18078) Add option for customize zipPartition task preferred locations

2016-10-24 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-18078: --- Description: `RDD.zipPartitions` task preferred locations strategy will use the intersection of corr

[jira] [Updated] (SPARK-18078) Add option for customize zipPartition task preferred locations

2016-10-24 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-18078: --- Priority: Minor (was: Major) > Add option for customize zipPartition task preferred locations >

[jira] [Created] (SPARK-18095) There is a display problem in spark UI storage tab when rdd was persisted in multiple replicas

2016-10-25 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-18095: -- Summary: There is a display problem in spark UI storage tab when rdd was persisted in multiple replicas Key: SPARK-18095 URL: https://issues.apache.org/jira/browse/SPARK-18095

[jira] [Updated] (SPARK-18095) There is a display problem in spark UI storage tab when rdd was persisted in multiple replicas

2016-10-25 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-18095: --- Description: There is a display problem in spark UI storage tab when rdd was persisted in multiple r

[jira] [Commented] (SPARK-18095) There is a display problem in spark UI storage tab when rdd was persisted in multiple replicas

2016-10-25 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15605712#comment-15605712 ] Weichen Xu commented on SPARK-18095: I am working on it... > There is a display prob

[jira] [Issue Comment Deleted] (SPARK-18095) There is a display problem in spark UI storage tab when rdd was persisted in multiple replicas

2016-10-25 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-18095: --- Comment: was deleted (was: I am working on it...) > There is a display problem in spark UI storage t

[jira] [Commented] (SPARK-18036) Decision Trees do not handle edge cases

2016-10-25 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15607357#comment-15607357 ] Weichen Xu commented on SPARK-18036: i am working on this... > Decision Trees do no

[jira] [Created] (SPARK-18201) add toDense and toSparse into Matrix trait, like Vector design

2016-11-01 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-18201: -- Summary: add toDense and toSparse into Matrix trait, like Vector design Key: SPARK-18201 URL: https://issues.apache.org/jira/browse/SPARK-18201 Project: Spark I

[jira] [Closed] (SPARK-18201) add toDense and toSparse into Matrix trait, like Vector design

2016-11-01 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu closed SPARK-18201. -- Resolution: Duplicate It will fix in this PR https://github.com/apache/spark/pull/15628 > add toDense

[jira] [Created] (SPARK-18218) Optimize BlockMatrix multiplication, which may cause OOM and low parallelism usage problem in several cases

2016-11-02 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-18218: -- Summary: Optimize BlockMatrix multiplication, which may cause OOM and low parallelism usage problem in several cases Key: SPARK-18218 URL: https://issues.apache.org/jira/browse/SPARK-

[jira] [Updated] (SPARK-18218) Optimize BlockMatrix multiplication, which may cause OOM and low parallelism usage problem in several cases

2016-11-02 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-18218: --- Issue Type: Improvement (was: Bug) > Optimize BlockMatrix multiplication, which may cause OOM and lo

[jira] [Commented] (SPARK-18286) Add Scala/Java/Python examples for MinHash and RandomProjection

2016-11-04 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15638806#comment-15638806 ] Weichen Xu commented on SPARK-18286: I will work on it, thanks~ > Add Scala/Java/Pyt

[jira] [Created] (SPARK-15203) The spark daemon shell script error, daemon process start successfully but script output fail message.

2016-05-07 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-15203: -- Summary: The spark daemon shell script error, daemon process start successfully but script output fail message. Key: SPARK-15203 URL: https://issues.apache.org/jira/browse/SPARK-15203

[jira] [Updated] (SPARK-15203) The spark daemon shell script error, daemon process start successfully but script output fail message.

2016-05-07 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-15203: --- Fix Version/s: (was: 2.1.0) (was: 1.6.2) (was: 1.6.1

[jira] [Updated] (SPARK-15203) The spark daemon shell script error, daemon process start successfully but script output fail message.

2016-05-07 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-15203: --- Target Version/s: (was: 2.1.0) > The spark daemon shell script error, daemon process start successf

[jira] [Updated] (SPARK-15203) The spark daemon shell script error, daemon process start successfully but script output fail message.

2016-05-07 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-15203: --- Description: When using sbin/start-master.sh to start spark master daemon, sometimes the daemon serv

[jira] [Created] (SPARK-15212) CVS file reader when read file with first line schema do not filter blank in schema column name

2016-05-08 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-15212: -- Summary: CVS file reader when read file with first line schema do not filter blank in schema column name Key: SPARK-15212 URL: https://issues.apache.org/jira/browse/SPARK-15212

[jira] [Updated] (SPARK-15212) CSV file reader when read file with first line schema do not filter blank in schema column name

2016-05-08 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-15212: --- Summary: CSV file reader when read file with first line schema do not filter blank in schema column n

[jira] [Commented] (SPARK-15212) CSV file reader when read file with first line schema do not filter blank in schema column name

2016-05-09 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276275#comment-15276275 ] Weichen Xu commented on SPARK-15212: en...but still may cause problem, for example, t

[jira] [Created] (SPARK-15226) CSV file data-line with newline at first line load error

2016-05-09 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-15226: -- Summary: CSV file data-line with newline at first line load error Key: SPARK-15226 URL: https://issues.apache.org/jira/browse/SPARK-15226 Project: Spark Issue Ty

[jira] [Created] (SPARK-15322) update deprecate accumulator usage into accumulatorV2 in mllib

2016-05-13 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-15322: -- Summary: update deprecate accumulator usage into accumulatorV2 in mllib Key: SPARK-15322 URL: https://issues.apache.org/jira/browse/SPARK-15322 Project: Spark I

[jira] [Updated] (SPARK-15322) update deprecate accumulator usage into accumulatorV2 in mllib

2016-05-13 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-15322: --- Component/s: ML > update deprecate accumulator usage into accumulatorV2 in mllib > --

[jira] [Created] (SPARK-15350) Add unit test function for LogisticRegressionWithLBFGS in JavaLogisticRegressionSuite

2016-05-16 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-15350: -- Summary: Add unit test function for LogisticRegressionWithLBFGS in JavaLogisticRegressionSuite Key: SPARK-15350 URL: https://issues.apache.org/jira/browse/SPARK-15350 Pro

[jira] [Updated] (SPARK-15350) Add unit test function for LogisticRegressionWithLBFGS in JavaLogisticRegressionSuite

2016-05-16 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-15350: --- Priority: Minor (was: Major) > Add unit test function for LogisticRegressionWithLBFGS in > JavaLogi

[jira] [Created] (SPARK-15446) catalyst using BigInteger.longValueExact that not supporting java 7 and compile error

2016-05-20 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-15446: -- Summary: catalyst using BigInteger.longValueExact that not supporting java 7 and compile error Key: SPARK-15446 URL: https://issues.apache.org/jira/browse/SPARK-15446 Pro

[jira] [Closed] (SPARK-15446) catalyst using BigInteger.longValueExact that not supporting java 7 and compile error

2016-05-20 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu closed SPARK-15446. -- Resolution: Fixed > catalyst using BigInteger.longValueExact that not supporting java 7 and > compile

[jira] [Commented] (SPARK-15446) catalyst using BigInteger.longValueExact that not supporting java 7 and compile error

2016-05-20 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15293599#comment-15293599 ] Weichen Xu commented on SPARK-15446: OK. I see. > catalyst using BigInteger.longValu

[jira] [Created] (SPARK-15461) modify python test script using default version 2.7

2016-05-20 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-15461: -- Summary: modify python test script using default version 2.7 Key: SPARK-15461 URL: https://issues.apache.org/jira/browse/SPARK-15461 Project: Spark Issue Type: B

[jira] [Updated] (SPARK-15461) modify python test script using default version 2.7

2016-05-20 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-15461: --- Component/s: Tests PySpark > modify python test script using default version 2.7 > -

[jira] [Updated] (SPARK-15461) modify python test script using default version 2.7

2016-05-20 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-15461: --- Description: To Spark 2.0, the python test script do not support python 2.6, so need to update the d

[jira] [Updated] (SPARK-15461) modify python test script using default version 2.7

2016-05-20 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-15461: --- Priority: Major (was: Critical) > modify python test script using default version 2.7 >

[jira] [Commented] (SPARK-15461) modify python test script using default version 2.7

2016-05-21 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15295057#comment-15295057 ] Weichen Xu commented on SPARK-15461: Oh..it really still support python 2.6 but need

[jira] [Commented] (SPARK-15461) modify python test script using default version 2.7

2016-05-21 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15295056#comment-15295056 ] Weichen Xu commented on SPARK-15461: Oh..it really still support python 2.6 but need

[jira] [Closed] (SPARK-15461) modify python test script using default version 2.7

2016-05-21 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu closed SPARK-15461. -- Resolution: Fixed install unittest2 then we can use python 2.6 run spark python tests. > modify python

[jira] [Created] (SPARK-15464) Replace SQLContext and SparkContext with SparkSession using builder pattern in python test code

2016-05-21 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-15464: -- Summary: Replace SQLContext and SparkContext with SparkSession using builder pattern in python test code Key: SPARK-15464 URL: https://issues.apache.org/jira/browse/SPARK-15464

[jira] [Updated] (SPARK-15464) Replace SQLContext and SparkContext with SparkSession using builder pattern in python testsuites

2016-05-21 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-15464: --- Summary: Replace SQLContext and SparkContext with SparkSession using builder pattern in python testsu

[jira] [Updated] (SPARK-15464) Replace SQLContext and SparkContext with SparkSession using builder pattern in python testsuites

2016-05-21 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-15464: --- Labels: test (was: ) > Replace SQLContext and SparkContext with SparkSession using builder pattern

[jira] [Commented] (SPARK-15446) catalyst using BigInteger.longValueExact that not supporting java 7 and compile error

2016-05-21 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15295409#comment-15295409 ] Weichen Xu commented on SPARK-15446: OK. I got it, thanks. > catalyst using BigInteg

[jira] [Created] (SPARK-15499) Add python testsuite with remote debug and single test parameter to help developer debug code easier.

2016-05-24 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-15499: -- Summary: Add python testsuite with remote debug and single test parameter to help developer debug code easier. Key: SPARK-15499 URL: https://issues.apache.org/jira/browse/SPARK-15499

[jira] [Updated] (SPARK-15499) Add python testsuite with remote debug and single test parameter to help developer debug code easier.

2016-05-24 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-15499: --- Description: To python/run-tests.py script, I add the following parameters: --single-test=SINGLE_T

[jira] [Updated] (SPARK-15499) Add python testsuite with remote debug and single test parameter to help developer debug code easier.

2016-05-24 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-15499: --- Issue Type: New Feature (was: Improvement) > Add python testsuite with remote debug and single test

[jira] [Assigned] (SPARK-34080) Add UnivariateFeatureSelector to deprecate existing selectors

2021-01-15 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu reassigned SPARK-34080: -- Assignee: Huaxin Gao > Add UnivariateFeatureSelector to deprecate existing selectors > --

[jira] [Resolved] (SPARK-34080) Add UnivariateFeatureSelector to deprecate existing selectors

2021-01-15 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-34080. Fix Version/s: 3.2.0 3.1.1 Resolution: Fixed Issue resolved by pull requ

[jira] [Created] (SPARK-34463) toPandas failed with error: buffer source array is read-only

2021-02-18 Thread Weichen Xu (Jira)
Weichen Xu created SPARK-34463: -- Summary: toPandas failed with error: buffer source array is read-only Key: SPARK-34463 URL: https://issues.apache.org/jira/browse/SPARK-34463 Project: Spark Iss

[jira] [Updated] (SPARK-34463) toPandas failed with error: buffer source array is read-only

2021-02-18 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-34463: --- Description: Environment: apache/park master pandas version > 1.0.5 Reproduce code: {code} spark.

[jira] [Updated] (SPARK-34463) toPandas failed with error: buffer source array is read-only

2021-02-18 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-34463: --- Description: Environment: apache/park master pandas version > 1.0.5 Reproduce code: {code} spark.

[jira] [Commented] (SPARK-34463) toPandas failed with error: buffer source array is read-only

2021-02-18 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286469#comment-17286469 ] Weichen Xu commented on SPARK-34463: [~bryanc] [~lidavidm] [~hyukjin.kwon] Any idea

[jira] [Comment Edited] (SPARK-34463) toPandas failed with error: buffer source array is read-only

2021-02-18 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286469#comment-17286469 ] Weichen Xu edited comment on SPARK-34463 at 2/18/21, 1:15 PM:

[jira] [Updated] (SPARK-34463) toPandas failed with error: buffer source array is read-only

2021-02-18 Thread Weichen Xu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-34463: --- Description: Environment: apache/spark master pandas version > 1.0.5 Reproduce code: {code} spark
