[jira] [Updated] (SPARK-4588) Add API for feature attributes

2015-03-04 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4588: - Description: Feature attributes, e.g., continuous/categorical, feature names, feature dimension,

[jira] [Updated] (SPARK-6137) G-Means clustering algorithm implementation

2015-03-04 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6137: - Labels: clustering (was: ) > G-Means clustering algorithm implementation > --

[jira] [Updated] (SPARK-5692) Model import/export for Word2Vec

2015-03-04 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5692: - Assignee: ANUPAM MEDIRATTA > Model import/export for Word2Vec > >

[jira] [Commented] (SPARK-5692) Model import/export for Word2Vec

2015-03-04 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14348212#comment-14348212 ] Xiangrui Meng commented on SPARK-5692: -- Done. The Parquet data file should have two c

[jira] [Resolved] (SPARK-6090) Add BinaryClassificationMetrics in PySpark/MLlib

2015-03-05 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-6090. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 4863 [https://githu

[jira] [Created] (SPARK-6192) Enhance MLlib's Python API (GSoC 2015)

2015-03-05 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-6192: Summary: Enhance MLlib's Python API (GSoC 2015) Key: SPARK-6192 URL: https://issues.apache.org/jira/browse/SPARK-6192 Project: Spark Issue Type: Umbrella

[jira] [Updated] (SPARK-6192) Enhance MLlib's Python API (GSoC 2015)

2015-03-05 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6192: - Labels: gsoc gsoc2015 mentor (was: gsoc gsoc2015) > Enhance MLlib's Python API (GSoC 2015) >

[jira] [Updated] (SPARK-6095) Support model save/load in Python's linear models

2015-03-05 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6095: - Assignee: Yanbo Liang > Support model save/load in Python's linear models > --

[jira] [Updated] (SPARK-6095) Support model save/load in Python's linear models

2015-03-05 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6095: - Target Version/s: 1.4.0 > Support model save/load in Python's linear models >

[jira] [Commented] (SPARK-6192) Enhance MLlib's Python API (GSoC 2015)

2015-03-09 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353347#comment-14353347 ] Xiangrui Meng commented on SPARK-6192: -- [~Manglano] and [~leckie-chn] Thanks for your

[jira] [Resolved] (SPARK-4355) OnlineSummarizer doesn't merge mean correctly

2015-03-09 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-4355. -- Resolution: Fixed Target Version/s: 1.2.0, 1.1.1, 1.0.3 (was: 1.1.1, 1.2.0, 1.0.3) >

[jira] [Updated] (SPARK-4355) OnlineSummarizer doesn't merge mean correctly

2015-03-09 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4355: - Fix Version/s: 1.0.3 > OnlineSummarizer doesn't merge mean correctly > ---

[jira] [Commented] (SPARK-3278) Isotonic regression

2015-03-09 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353419#comment-14353419 ] Xiangrui Meng commented on SPARK-3278: -- I don't know any. It really depends on how ma

[jira] [Commented] (SPARK-6234) 10% Performance regression with Breeze upgrade

2015-03-09 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353780#comment-14353780 ] Xiangrui Meng commented on SPARK-6234: -- [~nravi] Which KMeans implementation did you

[jira] [Commented] (SPARK-6234) 10% Performance regression with Breeze upgrade

2015-03-09 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353909#comment-14353909 ] Xiangrui Meng commented on SPARK-6234: -- [~nravi] This seems to be a regression in Bre

[jira] [Commented] (SPARK-6244) Implement VectorSpace to easy create a complicated feature vector

2015-03-10 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355337#comment-14355337 ] Xiangrui Meng commented on SPARK-6244: -- Agree with Sean that this is not a vector spa

[jira] [Resolved] (SPARK-5986) Model import/export for KMeansModel

2015-03-11 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-5986. -- Resolution: Fixed Fix Version/s: 1.4.0 > Model import/export for KMeansModel > --

[jira] [Created] (SPARK-6278) Mention the change of step size in the migration guide

2015-03-11 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-6278: Summary: Mention the change of step size in the migration guide Key: SPARK-6278 URL: https://issues.apache.org/jira/browse/SPARK-6278 Project: Spark Issue Ty

[jira] [Created] (SPARK-6288) Pyrolite calls hashCode to cache previously serialized objects

2015-03-11 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-6288: Summary: Pyrolite calls hashCode to cache previously serialized objects Key: SPARK-6288 URL: https://issues.apache.org/jira/browse/SPARK-6288 Project: Spark

[jira] [Reopened] (SPARK-5186) Vector.equals and Vector.hashCode are very inefficient and fail on SparseVectors with large size

2015-03-11 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reopened SPARK-5186: -- Re-opened this issue for branch-1.2. > Vector.equals and Vector.hashCode are very inefficient and

[jira] [Updated] (SPARK-5186) Vector.equals and Vector.hashCode are very inefficient and fail on SparseVectors with large size

2015-03-11 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5186: - Target Version/s: 1.1.2, 1.2.2 Affects Version/s: 1.1.1 1.2.1 > Vector

[jira] [Updated] (SPARK-6294) PySpark task may hang while call take() on in Java/Scala

2015-03-11 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6294: - Target Version/s: 1.2.2, 1.4.0, 1.3.1 (was: 1.2.2, 1.3.1) > PySpark task may hang while call take

[jira] [Updated] (SPARK-6294) PySpark task may hang while call take() on in Java/Scala

2015-03-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6294: - Fix Version/s: 1.3.1 1.4.0 > PySpark task may hang while call take() on in Java

[jira] [Resolved] (SPARK-5814) Remove JBLAS from runtime dependencies

2015-03-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-5814. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 4699 [https://githu

[jira] [Resolved] (SPARK-5186) Vector.equals and Vector.hashCode are very inefficient and fail on SparseVectors with large size

2015-03-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-5186. -- Resolution: Fixed Fix Version/s: (was: 1.3.0) 1.2.2 Issue resolved

[jira] [Updated] (SPARK-4001) Add FP-growth algorithm to Spark MLlib

2015-03-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4001: - Summary: Add FP-growth algorithm to Spark MLlib (was: Add Apriori algorithm to Spark MLlib) > Ad

[jira] [Commented] (SPARK-3424) KMeans Plus Plus is too slow

2015-03-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14359241#comment-14359241 ] Xiangrui Meng commented on SPARK-3424: -- Ah, sorry! I typed your email manually in the

[jira] [Resolved] (SPARK-6268) KMeans parameter getter methods

2015-03-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-6268. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 4974 [https://githu

[jira] [Resolved] (SPARK-6294) PySpark task may hang while call take() on in Java/Scala

2015-03-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-6294. -- Resolution: Fixed Fix Version/s: (was: 1.3.1) (was: 1.4.0)

[jira] [Resolved] (SPARK-4588) Add API for feature attributes

2015-03-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-4588. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 4925 [https://githu

[jira] [Created] (SPARK-6308) VectorUDT is displayed as `vecto` in dtypes

2015-03-12 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-6308: Summary: VectorUDT is displayed as `vecto` in dtypes Key: SPARK-6308 URL: https://issues.apache.org/jira/browse/SPARK-6308 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-6309) Add MatrixUDT to support dense/sparse matrices in DataFrames

2015-03-12 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-6309: Summary: Add MatrixUDT to support dense/sparse matrices in DataFrames Key: SPARK-6309 URL: https://issues.apache.org/jira/browse/SPARK-6309 Project: Spark I

[jira] [Assigned] (SPARK-3735) Sending the factor directly or AtA based on the cost in ALS

2015-03-13 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reassigned SPARK-3735: Assignee: Xiangrui Meng > Sending the factor directly or AtA based on the cost in ALS > ---

[jira] [Commented] (SPARK-6323) Large rank matrix factorization with Nonlinear loss and constraints

2015-03-13 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360679#comment-14360679 ] Xiangrui Meng commented on SPARK-6323: -- [~debasish83] Please help me understand some

[jira] [Comment Edited] (SPARK-6323) Large rank matrix factorization with Nonlinear loss and constraints

2015-03-13 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360679#comment-14360679 ] Xiangrui Meng edited comment on SPARK-6323 at 3/13/15 5:13 PM: -

[jira] [Updated] (SPARK-6278) Mention the change of step size in the migration guide

2015-03-13 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6278: - Fix Version/s: 1.4.0 > Mention the change of step size in the migration guide > --

[jira] [Resolved] (SPARK-6278) Mention the change of step size in the migration guide

2015-03-13 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-6278. -- Resolution: Fixed Fix Version/s: 1.3.1 Issue resolved by pull request 4978 [https://githu

[jira] [Resolved] (SPARK-6252) Scala NaiveBayes should expose getLambda

2015-03-13 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-6252. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 4969 [https://githu

[jira] [Updated] (SPARK-6288) Pyrolite calls hashCode to cache previously serialized objects

2015-03-13 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6288: - Attachment: Screen Shot 2015-03-13 at 10.45.35 AM.png Attached a screenshot of YourKit profiling r

[jira] [Commented] (SPARK-6288) Pyrolite calls hashCode to cache previously serialized objects

2015-03-13 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360784#comment-14360784 ] Xiangrui Meng commented on SPARK-6288: -- [~joshrosen] The memoLookup cost is actually

[jira] [Commented] (SPARK-6288) Pyrolite calls hashCode to cache previously serialized objects

2015-03-13 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360883#comment-14360883 ] Xiangrui Meng commented on SPARK-6288: -- Yes, `memo` is a private variable. I sent a p

[jira] [Commented] (SPARK-6288) Pyrolite calls hashCode to cache previously serialized objects

2015-03-13 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14361037#comment-14361037 ] Xiangrui Meng commented on SPARK-6288: -- Test the following code: {code} from pyspark

[jira] [Updated] (SPARK-6345) Model update propagation during prediction in Streaming Regression

2015-03-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6345: - Target Version/s: 1.1.2, 1.2.2, 1.4.0, 1.3.1 (was: 1.3.1) > Model update propagation during predi

[jira] [Updated] (SPARK-6345) Model update propagation during prediction in Streaming Regression

2015-03-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6345: - Assignee: Jeremy Freeman > Model update propagation during prediction in Streaming Regression > --

[jira] [Created] (SPARK-6361) Support adding a column with metadata in DataFrames

2015-03-16 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-6361: Summary: Support adding a column with metadata in DataFrames Key: SPARK-6361 URL: https://issues.apache.org/jira/browse/SPARK-6361 Project: Spark Issue Type:

[jira] [Commented] (SPARK-6334) spark-local dir not getting cleared during ALS

2015-03-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363629#comment-14363629 ] Xiangrui Meng commented on SPARK-6334: -- https://issues.apache.org/jira/browse/SPARK-5

[jira] [Created] (SPARK-6364) hashCode and equals for Matrices

2015-03-16 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-6364: Summary: hashCode and equals for Matrices Key: SPARK-6364 URL: https://issues.apache.org/jira/browse/SPARK-6364 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-6336) LBFGS should document what convergenceTol means

2015-03-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6336: - Fix Version/s: (was: 1.3.0) 1.3.1 1.4.0 > LBFGS should d

[jira] [Resolved] (SPARK-6336) LBFGS should document what convergenceTol means

2015-03-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-6336. -- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 5033 [https://githu

[jira] [Updated] (SPARK-6336) LBFGS should document what convergenceTol means

2015-03-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6336: - Assignee: Kai Sasaki > LBFGS should document what convergenceTol means > -

[jira] [Updated] (SPARK-6336) LBFGS should document what convergenceTol means

2015-03-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6336: - Target Version/s: 1.4.0, 1.3.1 (was: 1.4.0) > LBFGS should document what convergenceTol means > -

[jira] [Resolved] (SPARK-6226) Support model save/load in Python's KMeans

2015-03-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-6226. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 5049 [https://githu

[jira] [Updated] (SPARK-6390) Add MatrixUDT in PySpark

2015-03-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6390: - Component/s: PySpark > Add MatrixUDT in PySpark > > > Key

[jira] [Created] (SPARK-6390) Add MatrixUDT in PySpark

2015-03-17 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-6390: Summary: Add MatrixUDT in PySpark Key: SPARK-6390 URL: https://issues.apache.org/jira/browse/SPARK-6390 Project: Spark Issue Type: New Feature Comp

[jira] [Commented] (SPARK-6192) Enhance MLlib's Python API (GSoC 2015)

2015-03-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365918#comment-14365918 ] Xiangrui Meng commented on SPARK-6192: -- [~MechCoder] Please be a little (but not too)

[jira] [Commented] (SPARK-6334) spark-local dir not getting cleared during ALS

2015-03-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366021#comment-14366021 ] Xiangrui Meng commented on SPARK-6334: -- Couple suggestions before SPARK-5955 is imple

[jira] [Assigned] (SPARK-5955) Add checkpointInterval to ALS

2015-03-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reassigned SPARK-5955: Assignee: Xiangrui Meng > Add checkpointInterval to ALS > - > >

[jira] [Commented] (SPARK-6096) Support model save/load in Python's naive Bayes

2015-03-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367280#comment-14367280 ] Xiangrui Meng commented on SPARK-6096: -- Done. > Support model save/load in Python's

[jira] [Updated] (SPARK-6096) Support model save/load in Python's naive Bayes

2015-03-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6096: - Assignee: Xusen Yin > Support model save/load in Python's naive Bayes > --

[jira] [Commented] (SPARK-5874) How to improve the current ML pipeline API?

2015-03-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367309#comment-14367309 ] Xiangrui Meng commented on SPARK-5874: -- [~Elie A.] Thanks for your feedback! This JIR

[jira] [Resolved] (SPARK-6374) Add getter for GeneralizedLinearAlgorithm

2015-03-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-6374. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 5058 [https://githu

[jira] [Updated] (SPARK-6374) Add getter for GeneralizedLinearAlgorithm

2015-03-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6374: - Assignee: yuhao yang > Add getter for GeneralizedLinearAlgorithm > ---

[jira] [Updated] (SPARK-6308) VectorUDT is displayed as `vecto` in dtypes

2015-03-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6308: - Assignee: (was: Xiangrui Meng) > VectorUDT is displayed as `vecto` in dtypes > ---

[jira] [Resolved] (SPARK-6095) Support model save/load in Python's linear models

2015-03-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-6095. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 5016 [https://githu

[jira] [Resolved] (SPARK-5954) Add topByKey to pair RDDs

2015-03-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-5954. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 5075 [https://githu

[jira] [Resolved] (SPARK-6096) Support model save/load in Python's naive Bayes

2015-03-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-6096. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 5090 [https://githu

[jira] [Resolved] (SPARK-5955) Add checkpointInterval to ALS

2015-03-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-5955. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 5076 [https://githu

[jira] [Resolved] (SPARK-6309) Add MatrixUDT to support dense/sparse matrices in DataFrames

2015-03-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-6309. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 5048 [https://githu

[jira] [Updated] (SPARK-6390) Add MatrixUDT in PySpark

2015-03-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6390: - Assignee: Manoj Kumar > Add MatrixUDT in PySpark > > > Ke

[jira] [Updated] (SPARK-6309) Add MatrixUDT to support dense/sparse matrices in DataFrames

2015-03-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6309: - Assignee: Manoj Kumar > Add MatrixUDT to support dense/sparse matrices in DataFrames > ---

[jira] [Updated] (SPARK-6421) _regression_train_wrapper does not test initialWeights correctly

2015-03-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6421: - Assignee: Kai Sasaki > _regression_train_wrapper does not test initialWeights correctly >

[jira] [Resolved] (SPARK-6421) _regression_train_wrapper does not test initialWeights correctly

2015-03-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-6421. -- Resolution: Fixed Fix Version/s: 1.3.1 Issue resolved by pull request 5101 [https://githu

[jira] [Updated] (SPARK-6421) _regression_train_wrapper does not test initialWeights correctly

2015-03-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6421: - Fix Version/s: 1.4.0 > _regression_train_wrapper does not test initialWeights correctly >

[jira] [Updated] (SPARK-6441) [MLLIB] Add Deflation/Schur Complement to Power Iteration Clustering for improved resilience to inter-class collisions

2015-03-21 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6441: - Issue Type: Improvement (was: Bug) > [MLLIB] Add Deflation/Schur Complement to Power Iteration Cl

[jira] [Updated] (SPARK-6441) Add Deflation/Schur Complement to Power Iteration Clustering for improved resilience to inter-class collisions

2015-03-21 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6441: - Summary: Add Deflation/Schur Complement to Power Iteration Clustering for improved resilience to i

[jira] [Commented] (SPARK-6441) Add Deflation/Schur Complement to Power Iteration Clustering for improved resilience to inter-class collisions

2015-03-21 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372761#comment-14372761 ] Xiangrui Meng commented on SPARK-6441: -- Let's test subtracting mean first before mult

[jira] [Updated] (SPARK-6441) Add Deflation/Schur Complement to Power Iteration Clustering for improved resilience to inter-class collisions

2015-03-21 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6441: - Assignee: Stephen Boesch > Add Deflation/Schur Complement to Power Iteration Clustering for improv

[jira] [Updated] (SPARK-6308) VectorUDT is displayed as `vecto` in dtypes

2015-03-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6308: - Assignee: Manoj Kumar > VectorUDT is displayed as `vecto` in dtypes >

[jira] [Commented] (SPARK-5954) Add topByKey to pair RDDs

2015-03-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376215#comment-14376215 ] Xiangrui Meng commented on SPARK-5954: -- Note: We added topByKey in mllib.rdd.MLPairRD

[jira] [Resolved] (SPARK-6308) VectorUDT is displayed as `vecto` in dtypes

2015-03-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-6308. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 5118 [https://githu

[jira] [Created] (SPARK-6475) DataFrame should support array types when creating DFs from JavaBeans.

2015-03-23 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-6475: Summary: DataFrame should support array types when creating DFs from JavaBeans. Key: SPARK-6475 URL: https://issues.apache.org/jira/browse/SPARK-6475 Project: Spark

[jira] [Assigned] (SPARK-6475) DataFrame should support array types when creating DFs from JavaBeans.

2015-03-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reassigned SPARK-6475: Assignee: Xiangrui Meng > DataFrame should support array types when creating DFs from JavaB

[jira] [Commented] (SPARK-1006) MLlib ALS gets stack overflow with too many iterations

2015-03-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377152#comment-14377152 ] Xiangrui Meng commented on SPARK-1006: -- This is fixed as part of SPARK-5955, where we

[jira] [Commented] (SPARK-6192) Enhance MLlib's Python API (GSoC 2015)

2015-03-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377197#comment-14377197 ] Xiangrui Meng commented on SPARK-6192: -- Thanks for the update! The current version lo

[jira] [Commented] (SPARK-3735) Sending the factor directly or AtA based on the cost in ALS

2015-03-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377200#comment-14377200 ] Xiangrui Meng commented on SPARK-3735: -- The proposal is actually something different.

[jira] [Resolved] (SPARK-6334) spark-local dir not getting cleared during ALS

2015-03-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-6334. -- Resolution: Duplicate SPARK-5955 was merged. So if you can use the latest master, you can set c

[jira] [Commented] (SPARK-3278) Isotonic regression

2015-03-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377205#comment-14377205 ] Xiangrui Meng commented on SPARK-3278: -- Did you try truncating the digits of x to red

[jira] [Commented] (SPARK-6100) Distributed linear algebra in PySpark/MLlib

2015-03-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377208#comment-14377208 ] Xiangrui Meng commented on SPARK-6100: -- We don't have APIs for distributed matrices i

[jira] [Created] (SPARK-6485) Add CoordinateMatrix/RowMatrix/IndexedRowMatrix in PySpark

2015-03-23 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-6485: Summary: Add CoordinateMatrix/RowMatrix/IndexedRowMatrix in PySpark Key: SPARK-6485 URL: https://issues.apache.org/jira/browse/SPARK-6485 Project: Spark Issu

[jira] [Created] (SPARK-6486) Add BlockMatrix in PySpark

2015-03-23 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-6486: Summary: Add BlockMatrix in PySpark Key: SPARK-6486 URL: https://issues.apache.org/jira/browse/SPARK-6486 Project: Spark Issue Type: Sub-task Compo

[jira] [Updated] (SPARK-6486) Add BlockMatrix in PySpark

2015-03-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6486: - Description: We should add BlockMatrix to PySpark. Internally, we can use DataFrames and MatrixUDT

[jira] [Created] (SPARK-6488) Support addition/multiplication in PySpark's BlockMatrix

2015-03-23 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-6488: Summary: Support addition/multiplication in PySpark's BlockMatrix Key: SPARK-6488 URL: https://issues.apache.org/jira/browse/SPARK-6488 Project: Spark Issue

[jira] [Commented] (SPARK-4036) Add Conditional Random Fields (CRF) algorithm to Spark MLlib

2015-03-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377223#comment-14377223 ] Xiangrui Meng commented on SPARK-4036: -- You don't have to use or change the Optimizer

[jira] [Commented] (SPARK-6487) Add sequential pattern mining algorithm to Spark MLlib

2015-03-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377234#comment-14377234 ] Xiangrui Meng commented on SPARK-6487: -- [~Zhang JiaJin] I'm not very familiar with pa

[jira] [Updated] (SPARK-5692) Model import/export for Word2Vec

2015-03-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5692: - Assignee: Manoj Kumar (was: ANUPAM MEDIRATTA) > Model import/export for Word2Vec > --

[jira] [Commented] (SPARK-5692) Model import/export for Word2Vec

2015-03-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377266#comment-14377266 ] Xiangrui Meng commented on SPARK-5692: -- [~anupamme] You should get familiar with Scal

[jira] [Resolved] (SPARK-6475) DataFrame should support array types when creating DFs from JavaBeans.

2015-03-24 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-6475. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 5146 [https://githu

[jira] [Updated] (SPARK-5955) Add checkpointInterval to ALS

2015-03-24 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5955: - Target Version/s: 1.3.1, 1.4.0 (was: 1.4.0) Affects Version/s: 1.3.0 Fix Version/s: 1

[jira] [Created] (SPARK-6512) Add contains to OpenHashMap

2015-03-24 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-6512: Summary: Add contains to OpenHashMap Key: SPARK-6512 URL: https://issues.apache.org/jira/browse/SPARK-6512 Project: Spark Issue Type: Improvement C

[jira] [Created] (SPARK-6515) OpenHashSet returns invalid position when the data size is 1

2015-03-24 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-6515: Summary: OpenHashSet returns invalid position when the data size is 1 Key: SPARK-6515 URL: https://issues.apache.org/jira/browse/SPARK-6515 Project: Spark I
