[jira] [Commented] (SPARK-20506) ML, Graph 2.2 QA: Programming guide update and migration guide

2017-05-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015362#comment-16015362 ] Nick Pentreath commented on SPARK-20506: Cool - I've added a section before the Migration Guide

[jira] [Commented] (SPARK-14174) Accelerate KMeans via Mini-Batch EM

2017-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16014047#comment-16014047 ] Nick Pentreath commented on SPARK-14174: [~podongfeng] did you manage to look into some

[jira] [Commented] (SPARK-6000) Batch K-Means clusters should support "mini-batch" updates

2017-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16014046#comment-16014046 ] Nick Pentreath commented on SPARK-6000: --- Even though SPARK-14174 is later - it seems there is more

[jira] [Commented] (SPARK-6349) Add probability estimates in SVMModel predict result

2017-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16014043#comment-16014043 ] Nick Pentreath commented on SPARK-6349: --- This is now covered by {{ml}}'s {{LinearSVC}}. Shall we

[jira] [Commented] (SPARK-6417) Add Linear Programming algorithm

2017-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16014041#comment-16014041 ] Nick Pentreath commented on SPARK-6417: --- I think it's fairly safe to say there is not much bandwidth

[jira] [Closed] (SPARK-6417) Add Linear Programming algorithm

2017-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath closed SPARK-6417. - Resolution: Won't Fix > Add Linear Programming algorithm > - > >

[jira] [Commented] (SPARK-7290) Add StringVectorizer

2017-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16014039#comment-16014039 ] Nick Pentreath commented on SPARK-7290: --- Is this still desired? Seems it perhaps doesn't add that

[jira] [Commented] (SPARK-3181) Add Robust Regression Algorithm with Huber Estimator

2017-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16014026#comment-16014026 ] Nick Pentreath commented on SPARK-3181: --- So the Breeze bug is fixed now right? Will this be revived?

[jira] [Closed] (SPARK-5328) Update PySpark MLlib NaiveBayes API to take model type parameter for Bernoulli fit

2017-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath closed SPARK-5328. - Resolution: Won't Fix > Update PySpark MLlib NaiveBayes API to take model type parameter for >

[jira] [Commented] (SPARK-5328) Update PySpark MLlib NaiveBayes API to take model type parameter for Bernoulli fit

2017-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16014005#comment-16014005 ] Nick Pentreath commented on SPARK-5328: --- This is pretty stale so I'll close it off, since it's now

[jira] [Commented] (SPARK-1503) Implement Nesterov's accelerated first-order method

2017-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013999#comment-16013999 ] Nick Pentreath commented on SPARK-1503: --- I think it's safe to say this won't go into Spark core,

[jira] [Comment Edited] (SPARK-1503) Implement Nesterov's accelerated first-order method

2017-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013999#comment-16013999 ] Nick Pentreath edited comment on SPARK-1503 at 5/17/17 1:13 PM: I think

[jira] [Commented] (SPARK-1359) SGD implementation is not efficient

2017-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013998#comment-16013998 ] Nick Pentreath commented on SPARK-1359: --- Do we care much about this now, since {{mllib}}'s SGD is in

[jira] [Closed] (SPARK-12015) Auto convert int to Double when required in pyspark.ml

2017-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath closed SPARK-12015. -- Resolution: Duplicate > Auto convert int to Double when required in pyspark.ml >

[jira] [Commented] (SPARK-12015) Auto convert int to Double when required in pyspark.ml

2017-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013996#comment-16013996 ] Nick Pentreath commented on SPARK-12015: This was fixed in SPARK-7425 - closing as duplicate. >

[jira] [Updated] (SPARK-20723) Random Forest Classifier should expose intermediateRDDStorageLevel similar to ALS

2017-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-20723: --- Target Version/s: (was: 2.3.0) > Random Forest Classifier should expose

[jira] [Updated] (SPARK-20723) Random Forest Classifier should expose intermediateRDDStorageLevel similar to ALS

2017-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-20723: --- Affects Version/s: (was: 2.3.0) 2.2.0 > Random Forest Classifier

[jira] [Commented] (SPARK-20723) Random Forest Classifier should expose intermediateRDDStorageLevel similar to ALS

2017-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013972#comment-16013972 ] Nick Pentreath commented on SPARK-20723: Please don't set Target Version by the way - committers

[jira] [Commented] (SPARK-20503) ML 2.2 QA: API: Python API coverage

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012214#comment-16012214 ] Nick Pentreath commented on SPARK-20503: Checked all above for doc & user guide consistency

[jira] [Resolved] (SPARK-20503) ML 2.2 QA: API: Python API coverage

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-20503. Resolution: Done > ML 2.2 QA: API: Python API coverage >

[jira] [Created] (SPARK-20768) PySpark FPGrowth does not expose numPartitions (expert) param

2017-05-16 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-20768: -- Summary: PySpark FPGrowth does not expose numPartitions (expert) param Key: SPARK-20768 URL: https://issues.apache.org/jira/browse/SPARK-20768 Project: Spark

[jira] [Reopened] (SPARK-20503) ML 2.2 QA: API: Python API coverage

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reopened SPARK-20503: > ML 2.2 QA: API: Python API coverage > --- > >

[jira] [Commented] (SPARK-20506) ML, Graph 2.2 QA: Programming guide update and migration guide

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012106#comment-16012106 ] Nick Pentreath commented on SPARK-20506: Sent PR for updated migration guide only. I didn't find

[jira] [Comment Edited] (SPARK-20503) ML 2.2 QA: API: Python API coverage

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012101#comment-16012101 ] Nick Pentreath edited comment on SPARK-20503 at 5/16/17 10:21 AM: -- I

[jira] [Reopened] (SPARK-20503) ML 2.2 QA: API: Python API coverage

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reopened SPARK-20503: > ML 2.2 QA: API: Python API coverage > --- > >

[jira] [Resolved] (SPARK-20503) ML 2.2 QA: API: Python API coverage

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-20503. Resolution: Done > ML 2.2 QA: API: Python API coverage >

[jira] [Resolved] (SPARK-20503) ML 2.2 QA: API: Python API coverage

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-20503. Resolution: Resolved > ML 2.2 QA: API: Python API coverage >

[jira] [Comment Edited] (SPARK-20503) ML 2.2 QA: API: Python API coverage

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012010#comment-16012010 ] Nick Pentreath edited comment on SPARK-20503 at 5/16/17 10:18 AM: --

[jira] [Commented] (SPARK-20503) ML 2.2 QA: API: Python API coverage

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012101#comment-16012101 ] Nick Pentreath commented on SPARK-20503: I think I've highlighted all the API gaps in the links

[jira] [Commented] (SPARK-20506) ML, Graph 2.2 QA: Programming guide update and migration guide

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012098#comment-16012098 ] Nick Pentreath commented on SPARK-20506: Hey [~josephkb] [~yanboliang] [~srowen] [~felixcheung]

[jira] [Updated] (SPARK-19940) FPGrowthModel.transform should skip duplicated items

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-19940: --- Description: Due to misplaced {{distinct}} {{FPGrowthModel.transform}} generates duplicated

[jira] [Comment Edited] (SPARK-20503) ML 2.2 QA: API: Python API coverage

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012010#comment-16012010 ] Nick Pentreath edited comment on SPARK-20503 at 5/16/17 9:22 AM: -

[jira] [Updated] (SPARK-20764) Fix visibility discrepancy with numInstances and degreesOfFreedom in LR and GLR - Python version

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-20764: --- Affects Version/s: (was: 2.1.1) 2.2.0 > Fix visibility

[jira] [Updated] (SPARK-20764) Fix visibility discrepancy with numInstances and degreesOfFreedom in LR and GLR - Python version

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-20764: --- Description: SPARK-20097 exposed {{degreesOfFreedom}} in {{LinearRegressionSummary}} and

[jira] [Created] (SPARK-20764) Fix visibility discrepancy with numInstances and degreesOfFreedom in LR and GLR - Python version

2017-05-16 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-20764: -- Summary: Fix visibility discrepancy with numInstances and degreesOfFreedom in LR and GLR - Python version Key: SPARK-20764 URL:

[jira] [Resolved] (SPARK-20677) Clean up ALS recommend all improvement code.

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-20677. Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17919

[jira] [Assigned] (SPARK-20553) Update ALS examples for ML to illustrate recommend all

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-20553: -- Assignee: Nick Pentreath > Update ALS examples for ML to illustrate recommend all >

[jira] [Resolved] (SPARK-20553) Update ALS examples for ML to illustrate recommend all

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-20553. Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17950

[jira] [Commented] (SPARK-20503) ML 2.2 QA: API: Python API coverage

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012010#comment-16012010 ] Nick Pentreath commented on SPARK-20503: Checked: * {{ALS}}: ** {{coldStartStrategy}} param

[jira] [Assigned] (SPARK-20503) ML 2.2 QA: API: Python API coverage

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-20503: -- Assignee: Nick Pentreath > ML 2.2 QA: API: Python API coverage >

[jira] [Commented] (SPARK-20506) ML, Graph 2.2 QA: Programming guide update and migration guide

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011985#comment-16011985 ] Nick Pentreath commented on SPARK-20506: I'm checking into any other behavior changes that need

[jira] [Commented] (SPARK-20502) ML, Graph 2.2 QA: API: Experimental, DeveloperApi, final, sealed audit

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011976#comment-16011976 ] Nick Pentreath commented on SPARK-20502: Sounds good to me. > ML, Graph 2.2 QA: API:

[jira] [Assigned] (SPARK-20506) ML, Graph 2.2 QA: Programming guide update and migration guide

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-20506: -- Assignee: Nick Pentreath > ML, Graph 2.2 QA: Programming guide update and migration

[jira] [Commented] (SPARK-20506) ML, Graph 2.2 QA: Programming guide update and migration guide

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011970#comment-16011970 ] Nick Pentreath commented on SPARK-20506: As per SPARK-20707 no deprecated methods were removed in

[jira] [Commented] (SPARK-20506) ML, Graph 2.2 QA: Programming guide update and migration guide

2017-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011950#comment-16011950 ] Nick Pentreath commented on SPARK-20506: SPARK-19787 changed the default reg param value for

[jira] [Commented] (SPARK-20711) MultivariateOnlineSummarizer incorrect min/max for identical NaN feature

2017-05-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007739#comment-16007739 ] Nick Pentreath commented on SPARK-20711: Shouldn't the stats for any column that contains at

[jira] [Closed] (SPARK-8402) Add DP means clustering to MLlib

2017-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath closed SPARK-8402. - Resolution: Won't Fix > Add DP means clustering to MLlib > > >

[jira] [Commented] (SPARK-8402) Add DP means clustering to MLlib

2017-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006060#comment-16006060 ] Nick Pentreath commented on SPARK-8402: --- I'm afraid I would say there is not sufficient demand or

[jira] [Commented] (SPARK-11669) Python interface to SparkR GLM module

2017-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006056#comment-16006056 ] Nick Pentreath commented on SPARK-11669: I think this can be closed as it's done in

[jira] [Commented] (SPARK-20503) ML 2.2 QA: API: Python API coverage

2017-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006038#comment-16006038 ] Nick Pentreath commented on SPARK-20503: If SPARK-20602 and/or SPARK-20348 are completed, Python

[jira] [Updated] (SPARK-20679) Let ML ALS recommend for a subset of users/items

2017-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-20679: --- Summary: Let ML ALS recommend for a subset of users/items (was: Let ALS recommend for a

[jira] [Commented] (SPARK-20679) Let ALS recommend for a subset of users/items

2017-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002360#comment-16002360 ] Nick Pentreath commented on SPARK-20679: I'm working on this > Let ALS recommend for a subset of

[jira] [Commented] (SPARK-10802) Let ALS recommend for subset of data

2017-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002343#comment-16002343 ] Nick Pentreath commented on SPARK-10802: Hey folks - since the {{ALSModel}} in the ML API now

[jira] [Created] (SPARK-20679) Let ALS recommend for a subset of users/items

2017-05-09 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-20679: -- Summary: Let ALS recommend for a subset of users/items Key: SPARK-20679 URL: https://issues.apache.org/jira/browse/SPARK-20679 Project: Spark Issue

[jira] [Comment Edited] (SPARK-10408) Autoencoder

2017-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002326#comment-16002326 ] Nick Pentreath edited comment on SPARK-10408 at 5/9/17 9:00 AM: What is

[jira] [Commented] (SPARK-10408) Autoencoder

2017-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002326#comment-16002326 ] Nick Pentreath commented on SPARK-10408: What is the status here? I think it's fairly safe to say

[jira] [Commented] (SPARK-6323) Large rank matrix factorization with Nonlinear loss and constraints

2017-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002310#comment-16002310 ] Nick Pentreath commented on SPARK-6323: --- I think it is safe to say this will not be feasible to

[jira] [Resolved] (SPARK-20587) Improve performance of ML ALS recommendForAll

2017-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-20587. Resolution: Fixed Fix Version/s: 2.2.1 Issue resolved by pull request 17845

[jira] [Resolved] (SPARK-11968) ALS recommend all methods spend most of time in GC

2017-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-11968. Resolution: Fixed Fix Version/s: 2.2.1 Issue resolved by pull request 17742

[jira] [Assigned] (SPARK-11968) ALS recommend all methods spend most of time in GC

2017-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-11968: -- Assignee: Peng Meng (was: Nick Pentreath) > ALS recommend all methods spend most of

[jira] [Assigned] (SPARK-20677) Clean up ALS recommend all improvement code.

2017-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-20677: -- Assignee: Nick Pentreath > Clean up ALS recommend all improvement code. >

[jira] [Updated] (SPARK-20677) Clean up ALS recommend all improvement code.

2017-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-20677: --- Description: SPARK-11968 and SPARK-20587 added performance improvements to the "recommend

[jira] [Updated] (SPARK-20677) Clean up ALS recommend all improvement code.

2017-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-20677: --- Description: SPARK-11968 and SPARK-20587 added performance improvements to the "recommend

[jira] [Created] (SPARK-20677) Clean up ALS recommend all improvement code.

2017-05-09 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-20677: -- Summary: Clean up ALS recommend all improvement code. Key: SPARK-20677 URL: https://issues.apache.org/jira/browse/SPARK-20677 Project: Spark Issue

[jira] [Updated] (SPARK-20596) Improve ALS recommend all test cases

2017-05-08 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-20596: --- Fix Version/s: (was: 2.2.0) 2.2.1 > Improve ALS recommend all test

[jira] [Resolved] (SPARK-20596) Improve ALS recommend all test cases

2017-05-08 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-20596. Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17860

[jira] [Commented] (SPARK-20503) ML 2.2 QA: API: Python API coverage

2017-05-04 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15996676#comment-15996676 ] Nick Pentreath commented on SPARK-20503: cc [~holdenk] [~bryanc] [~zero323]? I can take it if

[jira] [Commented] (SPARK-20501) ML, Graph 2.2 QA: API: New Scala APIs, docs

2017-05-04 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15996675#comment-15996675 ] Nick Pentreath commented on SPARK-20501: Things that would need to be checked include: #

[jira] [Updated] (SPARK-20499) Spark MLlib, GraphX 2.2 QA umbrella

2017-05-04 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-20499: --- Description: This JIRA lists tasks for the next Spark release's QA period for MLlib and

[jira] [Updated] (SPARK-20596) Improve ALS recommend all test cases

2017-05-04 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-20596: --- Component/s: Tests > Improve ALS recommend all test cases >

[jira] [Created] (SPARK-20596) Improve ALS recommend all test cases

2017-05-04 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-20596: -- Summary: Improve ALS recommend all test cases Key: SPARK-20596 URL: https://issues.apache.org/jira/browse/SPARK-20596 Project: Spark Issue Type: Test

[jira] [Updated] (SPARK-20596) Improve ALS recommend all test cases

2017-05-04 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-20596: --- Affects Version/s: (was: 2.1.0) 2.2.0 > Improve ALS recommend all

[jira] [Created] (SPARK-20587) Improve performance of ML ALS recommendForAll

2017-05-03 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-20587: -- Summary: Improve performance of ML ALS recommendForAll Key: SPARK-20587 URL: https://issues.apache.org/jira/browse/SPARK-20587 Project: Spark Issue

[jira] [Resolved] (SPARK-6227) PCA and SVD for PySpark

2017-05-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-6227. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17621

[jira] [Created] (SPARK-20553) Update ALS examples for ML to illustrate recommend all

2017-05-02 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-20553: -- Summary: Update ALS examples for ML to illustrate recommend all Key: SPARK-20553 URL: https://issues.apache.org/jira/browse/SPARK-20553 Project: Spark

[jira] [Resolved] (SPARK-20300) Python API for ALSModel.recommendForAllUsers,Items

2017-05-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-20300. Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17622

[jira] [Commented] (SPARK-20443) The blockSize of MLLIB ALS should be setting by the User

2017-05-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15992497#comment-15992497 ] Nick Pentreath commented on SPARK-20443: Interesting - though it appears to me that {{2048}} is

[jira] [Commented] (SPARK-20551) ImportError adding custom class from jar in pyspark

2017-05-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15992491#comment-15992491 ] Nick Pentreath commented on SPARK-20551: Yes I agree that it appears you're trying to import Java

[jira] [Closed] (SPARK-20551) ImportError adding custom class from jar in pyspark

2017-05-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath closed SPARK-20551. -- Resolution: Not A Problem > ImportError adding custom class from jar in pyspark >

[jira] [Commented] (SPARK-20443) The blockSize of MLLIB ALS should be setting by the User

2017-05-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15992475#comment-15992475 ] Nick Pentreath commented on SPARK-20443: Were these tests against existing master? Because

[jira] [Assigned] (SPARK-20300) Python API for ALSModel.recommendForAllUsers,Items

2017-04-30 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-20300: -- Assignee: Nick Pentreath > Python API for ALSModel.recommendForAllUsers,Items >

[jira] [Commented] (SPARK-20469) Add a method to display DataFrame schema in PipelineStage

2017-04-27 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15987219#comment-15987219 ] Nick Pentreath commented on SPARK-20469: Pipeline stages themselves have no concept of schema.

[jira] [Commented] (SPARK-11968) ALS recommend all methods spend most of time in GC

2017-04-26 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984533#comment-15984533 ] Nick Pentreath commented on SPARK-11968: Thanks - in the meantime I will take a look at the PR.

[jira] [Commented] (SPARK-20443) The blockSize of MLLIB ALS should be setting by the User

2017-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983050#comment-15983050 ] Nick Pentreath commented on SPARK-20443: Your PR for SPARK-20446 / SPARK-11968 should largely

[jira] [Commented] (SPARK-11968) ALS recommend all methods spend most of time in GC

2017-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983048#comment-15983048 ] Nick Pentreath commented on SPARK-11968: [~peng.m...@intel.com] would you mind posting your

[jira] [Closed] (SPARK-20446) Optimize the process of MLLIB ALS recommendForAll

2017-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath closed SPARK-20446. -- Resolution: Duplicate > Optimize the process of MLLIB ALS recommendForAll >

[jira] [Commented] (SPARK-11968) ALS recommend all methods spend most of time in GC

2017-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983026#comment-15983026 ] Nick Pentreath commented on SPARK-11968: Note, there is a solution proposed in SPARK-20446. I've

[jira] [Commented] (SPARK-20446) Optimize the process of MLLIB ALS recommendForAll

2017-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983018#comment-15983018 ] Nick Pentreath commented on SPARK-20446: By the way when I say it is a duplicate I mean for the

[jira] [Commented] (SPARK-13857) Feature parity for ALS ML with MLLIB

2017-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982695#comment-15982695 ] Nick Pentreath commented on SPARK-13857: I'm going to close this as superseded by SPARK-19535.

[jira] [Closed] (SPARK-13857) Feature parity for ALS ML with MLLIB

2017-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath closed SPARK-13857. -- Resolution: Duplicate > Feature parity for ALS ML with MLLIB >

[jira] [Commented] (SPARK-20446) Optimize the process of MLLIB ALS recommendForAll

2017-04-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981648#comment-15981648 ] Nick Pentreath commented on SPARK-20446: By "compare to DataFrame implementation" I mean the

[jira] [Commented] (SPARK-20446) Optimize the process of MLLIB ALS recommendForAll

2017-04-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981134#comment-15981134 ] Nick Pentreath commented on SPARK-20446: Also would be good to compare to the new {{DataFrame}}

[jira] [Commented] (SPARK-20446) Optimize the process of MLLIB ALS recommendForAll

2017-04-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981129#comment-15981129 ] Nick Pentreath commented on SPARK-20446: Anyway I'd like to compare the approaches and see which

[jira] [Commented] (SPARK-20446) Optimize the process of MLLIB ALS recommendForAll

2017-04-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981122#comment-15981122 ] Nick Pentreath commented on SPARK-20446: The GC would come from the temp result array in the

[jira] [Commented] (SPARK-20446) Optimize the process of MLLIB ALS recommendForAll

2017-04-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981066#comment-15981066 ] Nick Pentreath commented on SPARK-20446: This is really a duplicate of

[jira] [Commented] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-21 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15978157#comment-15978157 ] Nick Pentreath commented on SPARK-20392: cc [~viirya] > Slow performance when calling fit on ML

[jira] [Resolved] (SPARK-20097) Fix visibility discrepancy with numInstances and degreesOfFreedom in LR and GLR

2017-04-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-20097. Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17431

[jira] [Commented] (SPARK-4038) Outlier Detection Algorithm for MLlib

2017-04-07 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15960537#comment-15960537 ] Nick Pentreath commented on SPARK-4038: --- I don't think there can be a reasonable expectation of

[jira] [Commented] (SPARK-17716) Hidden Markov Model (HMM)

2017-04-07 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15960534#comment-15960534 ] Nick Pentreath commented on SPARK-17716: I don't think we can expect sufficient committer

[jira] [Commented] (SPARK-3903) Create general data loading method for LabeledPoints

2017-04-07 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15960531#comment-15960531 ] Nick Pentreath commented on SPARK-3903: --- I think given the move to DataFrames, and that we can load
