[jira] [Updated] (SPARK-3573) Dataset

2014-09-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3573: - Shepherd: Michael Armbrust > Dataset > --- > > Key: SPARK-3573 >

[jira] [Updated] (SPARK-3569) Add metadata field to StructField

2014-09-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3569: - Description: Want to add a metadata field to StructField that can be used by other applications l

[jira] [Updated] (SPARK-3569) Add metadata field to StructField

2014-09-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3569: - Description: Want to add a metadata field to StructField that can be used by other applications l

[jira] [Updated] (SPARK-3418) [MLlib] Additional BLAS and Local Sparse Matrix support

2014-09-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3418: - Assignee: Burak Yavuz > [MLlib] Additional BLAS and Local Sparse Matrix support >

[jira] [Commented] (SPARK-3509) Method for generating random LabeledPoints for testing

2014-09-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138041#comment-14138041 ] Xiangrui Meng commented on SPARK-3509: -- You can implement a special RandomDataGenerat

[jira] [Updated] (SPARK-2672) support compressed file in wholeFile()

2014-09-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2672: - Fix Version/s: (was: 1.2.0) > support compressed file in wholeFile() > ---

[jira] [Updated] (SPARK-3525) Gradient boosting in MLLib

2014-09-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3525: - Fix Version/s: (was: 1.2.0) > Gradient boosting in MLLib > -- > >

[jira] [Updated] (SPARK-3507) Create RegressionLearner trait and make some current code implement it

2014-09-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3507: - Fix Version/s: (was: 1.2.0) > Create RegressionLearner trait and make some current code implem

[jira] [Updated] (SPARK-2505) Weighted Regularizer

2014-09-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2505: - Fix Version/s: (was: 1.2.0) > Weighted Regularizer > > >

[jira] [Commented] (SPARK-3403) NaiveBayes crashes with blas/lapack native libraries for breeze (netlib-java)

2014-09-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138057#comment-14138057 ] Xiangrui Meng commented on SPARK-3403: -- 1) Yes, I saw this issue in Linux. 2) In Spar

[jira] [Updated] (SPARK-3270) Spark API for Application Extensions

2014-09-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3270: - Issue Type: New Feature (was: Improvement) > Spark API for Application Extensions > -

[jira] [Commented] (SPARK-3403) NaiveBayes crashes with blas/lapack native libraries for breeze (netlib-java)

2014-09-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139461#comment-14139461 ] Xiangrui Meng commented on SPARK-3403: -- Sorry, it should be netlib-java, but the real

[jira] [Updated] (SPARK-3573) Dataset

2014-09-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3573: - Assignee: Xiangrui Meng > Dataset > --- > > Key: SPARK-3573 >

[jira] [Updated] (SPARK-3573) Dataset

2014-09-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3573: - Description: This JIRA is for discussion of ML dataset, essentially a SchemaRDD with extra ML-spe

[jira] [Commented] (SPARK-3530) Pipeline and Parameters

2014-09-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139600#comment-14139600 ] Xiangrui Meng commented on SPARK-3530: -- [~eustache] The default implementation of mul

[jira] [Comment Edited] (SPARK-3530) Pipeline and Parameters

2014-09-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139600#comment-14139600 ] Xiangrui Meng edited comment on SPARK-3530 at 9/18/14 10:06 PM:

[jira] [Updated] (SPARK-3573) Dataset

2014-09-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3573: - Description: This JIRA is for discussion of ML dataset, essentially a SchemaRDD with extra ML-spe

[jira] [Resolved] (SPARK-3418) [MLlib] Additional BLAS and Local Sparse Matrix support

2014-09-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3418. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2294 [https://githu

[jira] [Created] (SPARK-3600) RandomRDDs doesn't create primitive typed RDDs

2014-09-18 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-3600: Summary: RandomRDDs doesn't create primitive typed RDDs Key: SPARK-3600 URL: https://issues.apache.org/jira/browse/SPARK-3600 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-3600) RDD[Double] doesn't use primitive arrays for caching

2014-09-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3600: - Summary: RDD[Double] doesn't use primitive arrays for caching (was: RandomRDDs doesn't create pri

[jira] [Updated] (SPARK-3600) RDD[Double] doesn't use primitive arrays for caching

2014-09-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3600: - Description: RDD's classTag is not passed in through CacheManager. So RDD[Double] uses object arra

[jira] [Updated] (SPARK-3600) RDD[Double] doesn't use primitive arrays for caching

2014-09-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3600: - Issue Type: Improvement (was: Bug) > RDD[Double] doesn't use primitive arrays for caching > -

[jira] [Updated] (SPARK-3600) RDD[Double] doesn't use primitive arrays for caching

2014-09-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3600: - Component/s: (was: MLlib) > RDD[Double] doesn't use primitive arrays for caching > ---

[jira] [Updated] (SPARK-3600) RDD[Double] doesn't use primitive arrays for caching

2014-09-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3600: - Assignee: (was: Xiangrui Meng) > RDD[Double] doesn't use primitive arrays for caching > --

[jira] [Updated] (SPARK-3600) RDD[Double] doesn't use primitive arrays for caching

2014-09-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3600: - Target Version/s: (was: 1.1.1, 1.2.0) > RDD[Double] doesn't use primitive arrays for caching > -

[jira] [Commented] (SPARK-3573) Dataset

2014-09-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14141271#comment-14141271 ] Xiangrui Meng commented on SPARK-3573: -- [~sandyr] SQL/Streaming/GraphX provide comput

[jira] [Resolved] (SPARK-3491) Use pickle to serialize the data in MLlib Python

2014-09-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3491. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2378 [https://githu

[jira] [Assigned] (SPARK-3541) Improve ALS internal storage

2014-09-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reassigned SPARK-3541: Assignee: Xiangrui Meng > Improve ALS internal storage > > >

[jira] [Resolved] (SPARK-1484) MLlib should warn if you are using an iterative algorithm on non-cached data

2014-09-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-1484. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2347 [https://githu

[jira] [Updated] (SPARK-1484) MLlib should warn if you are using an iterative algorithm on non-cached data

2014-09-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1484: - Assignee: Aaron Staple > MLlib should warn if you are using an iterative algorithm on non-cached d

[jira] [Commented] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2014-09-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148529#comment-14148529 ] Xiangrui Meng commented on SPARK-1405: -- [~Guoqiang Li] and [~pedrorodriguez], since t

[jira] [Updated] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2014-09-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1405: - Assignee: Guoqiang Li (was: Xusen Yin) > parallel Latent Dirichlet Allocation (LDA) atop of spark

[jira] [Updated] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2014-09-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1405: - Shepherd: Xiangrui Meng > parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib >

[jira] [Commented] (SPARK-1241) Support sliding in RDD

2014-09-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148549#comment-14148549 ] Xiangrui Meng commented on SPARK-1241: -- This is implemented in MLlib: https://github.co

[jira] [Commented] (SPARK-3588) Gaussian Mixture Model clustering

2014-09-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148900#comment-14148900 ] Xiangrui Meng commented on SPARK-3588: -- Please follow the instructions at https://cw

[jira] [Commented] (SPARK-3541) Improve ALS internal storage

2014-09-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148995#comment-14148995 ] Xiangrui Meng commented on SPARK-3541: -- Wrote a new implementation that gives ~5x spe

[jira] [Commented] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2014-09-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149623#comment-14149623 ] Xiangrui Meng commented on SPARK-1405: -- [~pedrorodriguez] Thanks for the update and s

[jira] [Resolved] (SPARK-3614) Filter on minimum occurrences of a term in IDF

2014-09-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3614. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2494 [https://githu

[jira] [Commented] (SPARK-2516) Bootstrapping

2014-09-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149721#comment-14149721 ] Xiangrui Meng commented on SPARK-2516: -- The plan was to implement Bag of Little Boots

[jira] [Updated] (SPARK-2516) Bootstrapping

2014-09-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2516: - Assignee: Yu Ishikawa > Bootstrapping > - > > Key: SPARK-2516 >

[jira] [Resolved] (SPARK-3525) Gradient boosting in MLLib

2014-09-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3525. -- Resolution: Duplicate [~epakhomov] Just realized that this duplicates SPARK-1547, which was assi

[jira] [Updated] (SPARK-1547) Add gradient boosting algorithm to MLlib

2014-09-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1547: - Shepherd: Joseph K. Bradley > Add gradient boosting algorithm to MLlib > -

[jira] [Updated] (SPARK-1547) Add gradient boosting algorithm to MLlib

2014-09-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1547: - Target Version/s: 1.2.0 > Add gradient boosting algorithm to MLlib > -

[jira] [Updated] (SPARK-3700) Improve the performance of JSON parser

2014-09-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3700: - Assignee: (was: Yin Huai) > Improve the performance of JSON parser > -

[jira] [Updated] (SPARK-3701) Some clean-up work after the refactoring of MLlib's SerDe for PySpark

2014-09-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3701: - Priority: Minor (was: Major) > Some clean-up work after the refactoring of MLlib's SerDe for PySp

[jira] [Created] (SPARK-3701) Some clean-up work after the refactoring of MLlib's SerDe for PySpark

2014-09-26 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-3701: Summary: Some clean-up work after the refactoring of MLlib's SerDe for PySpark Key: SPARK-3701 URL: https://issues.apache.org/jira/browse/SPARK-3701 Project: Spark

[jira] [Updated] (SPARK-3702) Standardize MLlib classes for learners, models

2014-09-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3702: - Assignee: Joseph K. Bradley > Standardize MLlib classes for learners, models > ---

[jira] [Updated] (SPARK-1545) Add Random Forest algorithm to MLlib

2014-09-28 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1545: - Assignee: Joseph K. Bradley (was: Manish Amde) > Add Random Forest algorithm to MLlib > -

[jira] [Resolved] (SPARK-1545) Add Random Forest algorithm to MLlib

2014-09-28 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-1545. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2435 [https://githu

[jira] [Updated] (SPARK-3366) Compute best splits distributively in decision tree

2014-09-29 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3366: - Assignee: Qiping Li > Compute best splits distributively in decision tree > --

[jira] [Resolved] (SPARK-2885) All-pairs similarity via DIMSUM

2014-09-29 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-2885. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 1778 [https://githu

[jira] [Created] (SPARK-3735) Sending the factor directly or AtA based on the cost in ALS

2014-09-29 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-3735: Summary: Sending the factor directly or AtA based on the cost in ALS Key: SPARK-3735 URL: https://issues.apache.org/jira/browse/SPARK-3735 Project: Spark Is

[jira] [Commented] (SPARK-3434) Distributed block matrix

2014-09-29 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152636#comment-14152636 ] Xiangrui Meng commented on SPARK-3434: -- [~shivaram] Could you post the design of the

[jira] [Updated] (SPARK-3568) Add metrics for ranking algorithms

2014-09-29 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3568: - Priority: Major (was: Minor) Target Version/s: 1.2.0 Shepherd: Xiangrui Me

[jira] [Updated] (SPARK-3366) Compute best splits distributively in decision tree

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3366: - Target Version/s: 1.2.0 > Compute best splits distributively in decision tree > --

[jira] [Updated] (SPARK-3436) Streaming SVM

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3436: - Summary: Streaming SVM (was: [MLlib]Streaming SVM ) > Streaming SVM > -- > >

[jira] [Updated] (SPARK-3486) Add PySpark support for Word2Vec

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3486: - Summary: Add PySpark support for Word2Vec (was: [MLlib]Add PySpark support for Word2Vec) > Add P

[jira] [Updated] (SPARK-3158) Avoid 1 extra aggregation for DecisionTree training

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3158: - Target Version/s: 1.2.0 > Avoid 1 extra aggregation for DecisionTree training > --

[jira] [Updated] (SPARK-3158) Avoid 1 extra aggregation for DecisionTree training

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3158: - Priority: Major (was: Minor) > Avoid 1 extra aggregation for DecisionTree training >

[jira] [Updated] (SPARK-3161) Cache example-node map for DecisionTree training

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3161: - Priority: Major (was: Minor) Target Version/s: 1.2.0 > Cache example-node map for Dec

[jira] [Commented] (SPARK-3541) Improve ALS internal storage

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153911#comment-14153911 ] Xiangrui Meng commented on SPARK-3541: -- I put the implementation at https://github.c

[jira] [Resolved] (SPARK-3701) Some clean-up work after the refactoring of MLlib's SerDe for PySpark

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3701. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2548 [https://githu

[jira] [Updated] (SPARK-3751) DecisionTreeRunner functionality improvement

2014-10-01 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3751: - Assignee: Joseph K. Bradley > DecisionTreeRunner functionality improvement > -

[jira] [Resolved] (SPARK-3751) DecisionTreeRunner functionality improvement

2014-10-01 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3751. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2604 [https://githu

[jira] [Updated] (SPARK-3572) Support register UserType in SQL

2014-10-02 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3572: - Assignee: Joseph K. Bradley > Support register UserType in SQL >

[jira] [Resolved] (SPARK-3366) Compute best splits distributively in decision tree

2014-10-03 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3366. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2595 [https://githu

[jira] [Updated] (SPARK-1655) In naive Bayes, store conditional probabilities distributively.

2014-10-03 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1655: - Assignee: Aaron Staple > In naive Bayes, store conditional probabilities distributively. > ---

[jira] [Created] (SPARK-3820) Specialize columnSimilarity() without any threshold

2014-10-06 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-3820: Summary: Specialize columnSimilarity() without any threshold Key: SPARK-3820 URL: https://issues.apache.org/jira/browse/SPARK-3820 Project: Spark Issue Type:

[jira] [Commented] (SPARK-3803) ArrayIndexOutOfBoundsException found in executing computePrincipalComponents

2014-10-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161281#comment-14161281 ] Xiangrui Meng commented on SPARK-3803: -- In `computeCovariance`, we generate a warning

[jira] [Updated] (SPARK-1006) MLlib ALS gets stack overflow with too many iterations

2014-10-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1006: - Component/s: MLlib > MLlib ALS gets stack overflow with too many iterations >

[jira] [Closed] (SPARK-3370) The simple test error

2014-10-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng closed SPARK-3370. Resolution: Duplicate This is a known issue. We can fix it by checkpointing intermediate RDDs. For

[jira] [Updated] (SPARK-3424) KMeans Plus Plus is too slow

2014-10-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3424: - Assignee: Derrick Burns > KMeans Plus Plus is too slow > > >

[jira] [Updated] (SPARK-3261) KMeans clusterer can return duplicate cluster centers

2014-10-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3261: - Assignee: Derrick Burns > KMeans clusterer can return duplicate cluster centers >

[jira] [Commented] (SPARK-3828) Spark returns inconsistent results when building with different Hadoop version

2014-10-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161540#comment-14161540 ] Xiangrui Meng commented on SPARK-3828: -- `text8` doesn't contain any line feed charact

[jira] [Commented] (SPARK-3434) Distributed block matrix

2014-10-07 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162156#comment-14162156 ] Xiangrui Meng commented on SPARK-3434: -- [~shivaram] and [~ConcreteVitamin] Any update

[jira] [Reopened] (SPARK-3828) Spark returns inconsistent results when building with different Hadoop version

2014-10-07 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reopened SPARK-3828: -- > Spark returns inconsistent results when building with different Hadoop > version > -

[jira] [Commented] (SPARK-3828) Spark returns inconsistent results when building with different Hadoop version

2014-10-07 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162439#comment-14162439 ] Xiangrui Meng commented on SPARK-3828: -- I re-opened this because it may be a serious

[jira] [Created] (SPARK-3838) Python code example for Word2Vec in user guide

2014-10-07 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-3838: Summary: Python code example for Word2Vec in user guide Key: SPARK-3838 URL: https://issues.apache.org/jira/browse/SPARK-3838 Project: Spark Issue Type: Sub-

[jira] [Resolved] (SPARK-3790) CosineSimilarity via DIMSUM example

2014-10-07 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3790. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2622 [https://githu

[jira] [Resolved] (SPARK-3486) Add PySpark support for Word2Vec

2014-10-07 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3486. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2356 [https://githu

[jira] [Updated] (SPARK-3838) Python code example for Word2Vec in user guide

2014-10-07 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3838: - Assignee: (was: Liquan Pei) > Python code example for Word2Vec in user guide > ---

[jira] [Updated] (SPARK-3832) Upgrade Breeze dependency to 0.10

2014-10-07 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3832: - Assignee: DB Tsai > Upgrade Breeze dependency to 0.10 > - > >

[jira] [Resolved] (SPARK-3832) Upgrade Breeze dependency to 0.10

2014-10-07 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3832. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2693 [https://githu

[jira] [Created] (SPARK-3844) Truncate appName in WebUI if it is too long

2014-10-08 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-3844: Summary: Truncate appName in WebUI if it is too long Key: SPARK-3844 URL: https://issues.apache.org/jira/browse/SPARK-3844 Project: Spark Issue Type: Improve

[jira] [Updated] (SPARK-3844) Truncate appName in WebUI if it is too long

2014-10-08 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3844: - Attachment: long-title.png > Truncate appName in WebUI if it is too long > ---

[jira] [Updated] (SPARK-3158) Avoid 1 extra aggregation for DecisionTree training

2014-10-08 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3158: - Assignee: Qiping Li > Avoid 1 extra aggregation for DecisionTree training > --

[jira] [Resolved] (SPARK-3841) Pretty-print Params case classes for tests

2014-10-08 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3841. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2700 [https://githu

[jira] [Created] (SPARK-3856) Clean deprecated usage after breeze 0.10 upgrade

2014-10-08 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-3856: Summary: Clean deprecated usage after breeze 0.10 upgrade Key: SPARK-3856 URL: https://issues.apache.org/jira/browse/SPARK-3856 Project: Spark Issue Type: Im

[jira] [Updated] (SPARK-3838) Python code example for Word2Vec in user guide

2014-10-08 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3838: - Assignee: Liquan Pei > Python code example for Word2Vec in user guide > --

[jira] [Updated] (SPARK-3439) Add Canopy Clustering Algorithm

2014-10-08 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3439: - Target Version/s: (was: 1.2.0) > Add Canopy Clustering Algorithm > -

[jira] [Resolved] (SPARK-3856) Clean deprecated usage after breeze 0.10 upgrade

2014-10-08 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3856. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2718 [https://githu

[jira] [Resolved] (SPARK-3158) Avoid 1 extra aggregation for DecisionTree training

2014-10-09 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3158. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2708 [https://githu

[jira] [Updated] (SPARK-1486) Support multi-model training in MLlib

2014-10-09 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1486: - Assignee: (was: Burak Yavuz) > Support multi-model training in MLlib > ---

[jira] [Updated] (SPARK-1486) Support multi-model training in MLlib

2014-10-09 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1486: - Assignee: Burak Yavuz > Support multi-model training in MLlib > --

[jira] [Assigned] (SPARK-1503) Implement Nesterov's accelerated first-order method

2014-10-09 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reassigned SPARK-1503: Assignee: Xiangrui Meng > Implement Nesterov's accelerated first-order method > ---

[jira] [Commented] (SPARK-1503) Implement Nesterov's accelerated first-order method

2014-10-09 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166465#comment-14166465 ] Xiangrui Meng commented on SPARK-1503: -- [~staple] Thanks for picking up this JIRA! TF

[jira] [Updated] (SPARK-1503) Implement Nesterov's accelerated first-order method

2014-10-09 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1503: - Assignee: Aaron Staple (was: Xiangrui Meng) > Implement Nesterov's accelerated first-order method

[jira] [Updated] (SPARK-3903) Create general data loading method for LabeledPoints

2014-10-10 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3903: - Assignee: Joseph K. Bradley > Create general data loading method for LabeledPoints > -

[jira] [Updated] (SPARK-3838) Python code example for Word2Vec in user guide

2014-10-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3838: - Assignee: Anant Daksh Asthana (was: Liquan Pei) > Python code example for Word2Vec in user guide

[jira] [Commented] (SPARK-3838) Python code example for Word2Vec in user guide

2014-10-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14168960#comment-14168960 ] Xiangrui Meng commented on SPARK-3838: -- [~slcclimber] Thanks! Please follow instructi
