[jira] [Comment Edited] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-06-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293316#comment-15293316 ] Nick Pentreath edited comment on SPARK-14810 at 6/20/16 8:17 AM: - List

[jira] [Comment Edited] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-06-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293316#comment-15293316 ] Nick Pentreath edited comment on SPARK-14810 at 6/20/16 8:16 AM: - List

[jira] [Created] (SPARK-16063) Add getStorageLevel to Dataset

2016-06-20 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-16063: -- Summary: Add getStorageLevel to Dataset Key: SPARK-16063 URL: https://issues.apache.org/jira/browse/SPARK-16063 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-15501) ML 2.0 QA: Scala APIs audit for recommendation

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335961#comment-15335961 ] Nick Pentreath commented on SPARK-15501: It's done - resolved it. > ML 2.0 QA: Scala APIs au

[jira] [Resolved] (SPARK-15501) ML 2.0 QA: Scala APIs audit for recommendation

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15501. Resolution: Fixed Fix Version/s: 2.0.0 > ML 2.0 QA: Scala APIs au

[jira] [Resolved] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15447. Resolution: Fixed Fix Version/s: 2.0.0 > Performance test for ALS in Spark

[jira] [Commented] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335956#comment-15335956 ] Nick Pentreath commented on SPARK-15447: Finalized results in the linked Google sheet. Also

[jira] [Updated] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15447: --- Description: We made several changes to ALS in 2.0. It is necessary to run some tests

[jira] [Commented] (SPARK-15995) Gradient Boosted Trees - handling of Categorical Inputs

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335801#comment-15335801 ] Nick Pentreath commented on SPARK-15995: cc [~sethah] > Gradient Boosted Trees - handl

[jira] [Updated] (SPARK-16008) ML Logistic Regression aggregator serializes unnecessary data

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-16008: --- Assignee: Seth Hendrickson > ML Logistic Regression aggregator serializes unnecessary d

[jira] [Updated] (SPARK-15997) Audit ml.feature Update documentation for ml feature transformers

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15997: --- Assignee: Gayathri Murali > Audit ml.feature Update documentation for ml feat

[jira] [Commented] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1530#comment-1530 ] Nick Pentreath commented on SPARK-15447: Almost there - I'll be able to close this off by Friday

[jira] [Commented] (SPARK-15746) SchemaUtils.checkColumnType with VectorUDT prints instance details in error message

2016-06-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327237#comment-15327237 ] Nick Pentreath commented on SPARK-15746: I think you can go ahead now - I also vote

[jira] [Commented] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327220#comment-15327220 ] Nick Pentreath commented on SPARK-15904: Could you explain why you're using K>3000 when y

[jira] [Commented] (SPARK-15790) Audit @Since annotations in ML

2016-06-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327193#comment-15327193 ] Nick Pentreath commented on SPARK-15790: Yes, I've just looked at things in the concrete classes

[jira] [Commented] (SPARK-15790) Audit @Since annotations in ML

2016-06-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327028#comment-15327028 ] Nick Pentreath commented on SPARK-15790: Ah thanks - missed that umbrella. It's actually really

[jira] [Resolved] (SPARK-15788) PySpark IDFModel missing "idf" property

2016-06-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15788. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13540 [https

[jira] [Created] (SPARK-15790) Audit @Since annotations in ML

2016-06-06 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-15790: -- Summary: Audit @Since annotations in ML Key: SPARK-15790 URL: https://issues.apache.org/jira/browse/SPARK-15790 Project: Spark Issue Type: Documentation

[jira] [Created] (SPARK-15788) PySpark IDFModel missing "idf" property

2016-06-06 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-15788: -- Summary: PySpark IDFModel missing "idf" property Key: SPARK-15788 URL: https://issues.apache.org/jira/browse/SPARK-15788 Project: Spark

Re: Welcoming Yanbo Liang as a committer

2016-06-04 Thread Nick Pentreath
Congratulations Yanbo and welcome On Sat, 4 Jun 2016 at 10:17, Hortonworks wrote: > Congratulations, Yanbo > > Zhan Zhang > > Sent from my iPhone > > > On Jun 3, 2016, at 8:39 PM, Dongjoon Hyun wrote: > > > > Congratulations > > -- > CONFIDENTIALITY

[jira] [Updated] (SPARK-15761) pyspark shell should load if PYSPARK_DRIVER_PYTHON is ipython an Python3

2016-06-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15761: --- Assignee: Manoj Kumar > pyspark shell should load if PYSPARK_DRIVER_PYTHON is ipyt

[jira] [Resolved] (SPARK-15168) Add missing params to Python's MultilayerPerceptronClassifier

2016-06-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15168. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12943 [https

[jira] [Updated] (SPARK-15168) Add missing params to Python's MultilayerPerceptronClassifier

2016-06-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15168: --- Assignee: holdenk > Add missing params to Python's MultilayerPerceptronClassif

[jira] [Commented] (SPARK-15746) SchemaUtils.checkColumnType with VectorUDT prints instance details in error message

2016-06-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15314627#comment-15314627 ] Nick Pentreath commented on SPARK-15746: I'd say hold off on working on it until we decide which

[jira] [Commented] (SPARK-14811) ML, Graph 2.0 QA: API: New Scala APIs, docs

2016-06-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15314489#comment-15314489 ] Nick Pentreath commented on SPARK-14811: Yes, that does make sense. I will take a pass through

[jira] [Comment Edited] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15314441#comment-15314441 ] Nick Pentreath edited comment on SPARK-15447 at 6/3/16 5:22 PM: Added

[jira] [Commented] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15314441#comment-15314441 ] Nick Pentreath commented on SPARK-15447: Added a second tab to the sheet for testing DF-based API

[jira] [Updated] (SPARK-15746) SchemaUtils.checkColumnType with VectorUDT prints instance details in error message

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15746: --- Summary: SchemaUtils.checkColumnType with VectorUDT prints instance details in error message

[jira] [Created] (SPARK-15746) SchemaUtils.checkColumnType with VectorUDT prints instance details

2016-06-02 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-15746: -- Summary: SchemaUtils.checkColumnType with VectorUDT prints instance details Key: SPARK-15746 URL: https://issues.apache.org/jira/browse/SPARK-15746 Project

[jira] [Resolved] (SPARK-15668) ml.feature: update check schema to avoid confusion when user use MLlib.vector as input type

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15668. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13411 [https

[jira] [Updated] (SPARK-15139) PySpark TreeEnsemble missing methods

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15139: --- Assignee: holdenk > PySpark TreeEnsemble missing meth

Re: Classpath hell and Elasticsearch 2.3.2...

2016-06-02 Thread Nick Pentreath
.. thanks Nick. Figured that out since your last email... I deleted > the 2.10 by accident but then put 2+2 together. > > Got it working now. > > Still sticking to my story that it's somewhat complicated to setup :) > > Kevin > > On Thu, Jun 2, 2016 at 3:59 PM, Nick Pentreat

Re: Classpath hell and Elasticsearch 2.3.2...

2016-06-02 Thread Nick Pentreath
voke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731) > at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181

[jira] [Resolved] (SPARK-15139) PySpark TreeEnsemble missing methods

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15139. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12919 [https

[jira] [Resolved] (SPARK-15092) toDebugString missing from ML DecisionTreeClassifier

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15092. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12919 [https

Re: Classpath hell and Elasticsearch 2.3.2...

2016-06-02 Thread Nick Pentreath
Hey there When I used es-hadoop, I just pulled in the dependency into my pom.xml, with spark as a "provided" dependency, and built a fat jar with assembly. Then with spark-submit use the --jars option to include your assembly jar (IIRC I sometimes also needed to use --driver-classpath too, but
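
The setup described in the message above (a fat assembly jar passed to spark-submit) can be sketched as a small helper that assembles the command line. This is only an illustration under stated assumptions: the jar path and main class name are hypothetical placeholders, and the helper is not part of Spark or es-hadoop. Note that spark-submit spells the driver classpath flag `--driver-class-path`.

```python
import shlex


def build_spark_submit(assembly_jar, main_class, driver_class_path=None):
    """Return the spark-submit argument list for an application packaged as a fat jar.

    All argument values are supplied by the caller; nothing here is read from Spark.
    """
    cmd = ["spark-submit", "--class", main_class, "--jars", assembly_jar]
    if driver_class_path:
        # Only needed when the driver cannot otherwise see the bundled classes.
        cmd += ["--driver-class-path", driver_class_path]
    # The application jar itself is the final positional argument.
    cmd.append(assembly_jar)
    return cmd


# Hypothetical jar path and class name, for illustration only.
example = build_spark_submit(
    "target/my-app-assembly.jar",
    "com.example.EsJob",
    driver_class_path="target/my-app-assembly.jar",
)
print(" ".join(shlex.quote(part) for part in example))
```

Building the command as a list rather than a single string keeps each argument intact if a path contains spaces, and the list can be handed directly to `subprocess.run`.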

[jira] [Comment Edited] (SPARK-14811) ML, Graph 2.0 QA: API: New Scala APIs, docs

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15313208#comment-15313208 ] Nick Pentreath edited comment on SPARK-14811 at 6/2/16 10:31 PM

[jira] [Commented] (SPARK-14811) ML, Graph 2.0 QA: API: New Scala APIs, docs

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15313208#comment-15313208 ] Nick Pentreath commented on SPARK-14811: Question on this - we seem to be inconsistent

[jira] [Updated] (SPARK-15668) ml.feature: update check schema to avoid confusion when user use MLlib.vector as input type

2016-06-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15668: --- Assignee: yuhao yang > ml.feature: update check schema to avoid confusion when user

[jira] [Updated] (SPARK-15164) Mark classification algorithms as experimental where marked so in scala

2016-06-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15164: --- Assignee: holdenk > Mark classification algorithms as experimental where marked so in sc

[jira] [Updated] (SPARK-15162) Update PySpark LogisticRegression threshold PyDoc to be as complete as Scaladoc

2016-06-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15162: --- Assignee: holdenk > Update PySpark LogisticRegression threshold PyDoc to be as compl

[jira] [Comment Edited] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-06-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293316#comment-15293316 ] Nick Pentreath edited comment on SPARK-14810 at 6/1/16 5:56 PM: List

[jira] [Updated] (SPARK-15587) ML 2.0 QA: Scala APIs audit for feature

2016-06-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15587: --- Assignee: Yanbo Liang > ML 2.0 QA: Scala APIs audit for feat

[jira] [Resolved] (SPARK-15587) ML 2.0 QA: Scala APIs audit for feature

2016-06-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15587. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13410 [https

[jira] [Commented] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-05-31 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308797#comment-15308797 ] Nick Pentreath commented on SPARK-15447: Created a Google sheet with initial results: https

[jira] [Commented] (SPARK-15575) Remove breeze from dependencies?

2016-05-27 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304546#comment-15304546 ] Nick Pentreath commented on SPARK-15575: What specifically are the "performance i

[jira] [Resolved] (SPARK-15492) Binarization scala example copy & paste to spark-shell error

2016-05-26 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15492. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13266 [https

[jira] [Resolved] (SPARK-15500) Remove defaults in storage level param doc in ALS

2016-05-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15500. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13277 [https

Re: Cannot build master with sbt

2016-05-25 Thread Nick Pentreath
I've filed https://issues.apache.org/jira/browse/SPARK-15525 For now, you would have to check out sbt-antlr4 at https://github.com/ihji/sbt-antlr4/commit/23eab68b392681a7a09f6766850785afe8dfa53d (since I don't see any branches or tags in the github repo for different versions), and sbt

[jira] [Created] (SPARK-15525) Clean sbt build fails to resolve sbt-antlr4 plugin

2016-05-25 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-15525: -- Summary: Clean sbt build fails to resolve sbt-antlr4 plugin Key: SPARK-15525 URL: https://issues.apache.org/jira/browse/SPARK-15525 Project: Spark Issue

[jira] [Resolved] (SPARK-15504) Could MatrixFactorizationModel support recommend for some users only ?

2016-05-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15504. Resolution: Duplicate Please see SPARK-10802 which already exists. For the old RDD-based

[jira] [Updated] (SPARK-15501) ML 2.0 QA: Scala APIs audit for recommendation

2016-05-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15501: --- Component/s: ML Documentation > ML 2.0 QA: Scala APIs au

[jira] [Updated] (SPARK-15500) Remove defaults in storage level param doc in ALS

2016-05-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15500: --- Component/s: PySpark ML Documentation > Remove defau

[jira] [Assigned] (SPARK-15502) Add note in ML ALS docs that user / item column only supports Int

2016-05-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-15502: -- Assignee: Nick Pentreath > Add note in ML ALS docs that user / item column o

[jira] [Created] (SPARK-15502) Add note in ML ALS docs that user / item column only supports Int

2016-05-24 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-15502: -- Summary: Add note in ML ALS docs that user / item column only supports Int Key: SPARK-15502 URL: https://issues.apache.org/jira/browse/SPARK-15502 Project: Spark

[jira] [Created] (SPARK-15501) ML 2.0 QA: Scala APIs audit for recommendation

2016-05-24 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-15501: -- Summary: ML 2.0 QA: Scala APIs audit for recommendation Key: SPARK-15501 URL: https://issues.apache.org/jira/browse/SPARK-15501 Project: Spark Issue

[jira] [Created] (SPARK-15500) Remove defaults in storage level param doc in ALS

2016-05-24 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-15500: -- Summary: Remove defaults in storage level param doc in ALS Key: SPARK-15500 URL: https://issues.apache.org/jira/browse/SPARK-15500 Project: Spark Issue

[jira] [Updated] (SPARK-15254) Improve ML pipeline Cross Validation Scaladoc & PyDoc

2016-05-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15254: --- Description: The ML pipeline Cross Validation Scaladoc & PyDoc is very sparse - we sh

[jira] [Commented] (SPARK-15254) Improve ML pipeline Cross Validation Scaladoc & PyDoc

2016-05-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297871#comment-15297871 ] Nick Pentreath commented on SPARK-15254: Please go ahead! > Improve ML pipeline Cross Validat

[jira] [Resolved] (SPARK-15442) PySpark QuantileDiscretizer missing "relativeError" param

2016-05-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15442. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13228 [https

[jira] [Updated] (SPARK-15492) Binarization scala example copy & paste to spark-shell error

2016-05-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15492: --- Assignee: Miao Wang > Binarization scala example copy & paste to spark-shel

Re: [DISCUSS] PredictionIO incubation proposal

2016-05-24 Thread Nick Pentreath
Hi everyone I just want to make it clear that my suggestion was in no way some sort of attempt to hijack the project or push a corporate agenda. For me personally, I have not been directly involved in PredictionIO, that is true. I have however spent the past 3 years prior to joining IBM building

[jira] [Assigned] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-05-23 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-15447: -- Assignee: Nick Pentreath > Performance test for ALS in Spark

Re: [VOTE] Removing module maintainer process

2016-05-23 Thread Nick Pentreath
+1 (binding) On Mon, 23 May 2016 at 04:19, Matei Zaharia wrote: > Correction, let's run this for 72 hours, so until 9 PM EST May 25th. > > > On May 22, 2016, at 8:34 PM, Matei Zaharia > wrote: > > > > It looks like the discussion thread on this

[jira] [Commented] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294116#comment-15294116 ] Nick Pentreath commented on SPARK-15447: [~mengxr] yes will aim to run some tests during early

[jira] [Assigned] (SPARK-15442) PySpark QuantileDiscretizer missing "relativeError" param

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-15442: -- Assignee: Nick Pentreath > PySpark QuantileDiscretizer missing "relativeErro

[jira] [Comment Edited] (SPARK-15442) PySpark QuantileDiscretizer missing "relativeError" param

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293753#comment-15293753 ] Nick Pentreath edited comment on SPARK-15442 at 5/20/16 5:18 PM: - When do

[jira] [Commented] (SPARK-15442) PySpark QuantileDiscretizer missing "relativeError" param

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293753#comment-15293753 ] Nick Pentreath commented on SPARK-15442: When do you plan to submit a PR? I'm just about

[jira] [Comment Edited] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293318#comment-15293318 ] Nick Pentreath edited comment on SPARK-14810 at 5/20/16 1:04 PM: - Yeah

[jira] [Comment Edited] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280592#comment-15280592 ] Nick Pentreath edited comment on SPARK-14810 at 5/20/16 1:00 PM

[jira] [Commented] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293318#comment-15293318 ] Nick Pentreath commented on SPARK-14810: Yeah makes sense - I've moved the listing

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Commented] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293316#comment-15293316 ] Nick Pentreath commented on SPARK-14810: List of changes since {{1.6.0}} audited

[jira] [Updated] (SPARK-15412) Improve linear & isotonic regression methods PyDocs

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15412: --- Assignee: holdenk > Improve linear & isotonic regression methods

[jira] [Updated] (SPARK-15444) Default value mismatch of param linkPredictionCol for GeneralizedLinearRegression

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15444: --- Assignee: Liang-Chi Hsieh > Default value mismatch of param linkPrediction

[jira] [Resolved] (SPARK-15444) Default value mismatch of param linkPredictionCol for GeneralizedLinearRegression

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15444. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13220 [https

[jira] [Commented] (SPARK-15100) Audit: ml.feature

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292899#comment-15292899 ] Nick Pentreath commented on SPARK-15100: I created SPARK-15442 for #1 > Audit: ml.feat

[jira] [Commented] (SPARK-15100) Audit: ml.feature

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292895#comment-15292895 ] Nick Pentreath commented on SPARK-15100: I'm not sure we need to set each and every possible

[jira] [Created] (SPARK-15442) PySpark QuantileDiscretizer missing "relativeError" param

2016-05-20 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-15442: -- Summary: PySpark QuantileDiscretizer missing "relativeError" param Key: SPARK-15442 URL: https://issues.apache.org/jira/browse/SPARK-15442 Proj

[jira] [Resolved] (SPARK-15316) PySpark GeneralizedLinearRegression missing linkPredictionCol param

2016-05-19 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15316. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13106 [https

[jira] [Resolved] (SPARK-14891) ALS in ML never validates input schema

2016-05-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14891. Resolution: Fixed Fix Version/s: 2.0.0 > ALS in ML never validates input sch

[jira] [Commented] (SPARK-14978) PySpark TrainValidationSplitModel should support validationMetrics

2016-05-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15288827#comment-15288827 ] Nick Pentreath commented on SPARK-14978: thanks! > PySpark TrainValidationSplitModel sho

[jira] [Commented] (SPARK-14978) PySpark TrainValidationSplitModel should support validationMetrics

2016-05-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15288790#comment-15288790 ] Nick Pentreath commented on SPARK-14978: [~srowen] how do I add JIRA username {{taku-k

[jira] [Comment Edited] (SPARK-15378) Unable to load NLTK in spark RDD pipeline

2016-05-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15288665#comment-15288665 ] Nick Pentreath edited comment on SPARK-15378 at 5/18/16 9:17 AM: - If you

[jira] [Commented] (SPARK-15378) Unable to load NLTK in spark RDD pipeline

2016-05-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15288665#comment-15288665 ] Nick Pentreath commented on SPARK-15378: If you are trying to run on a cluster, then either

[jira] [Resolved] (SPARK-14978) PySpark TrainValidationSplitModel should support validationMetrics

2016-05-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14978. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12767 [https

Re: [DISCUSS] PredictionIO incubation proposal

2016-05-17 Thread Nick Pentreath
Hi there I'm glad to see the proposal to incubate PredictionIO. In my previous life as a startup co-founder, I kept a close eye on the project, and it would be fantastic to see it become an Apache incubating project! The folks working on Apache Spark and Apache SystemML (incubating) here at IBM

[jira] [Resolved] (SPARK-15182) Copy MLlib doc to ML: ml.feature

2016-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15182. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12957 [https

[jira] [Updated] (SPARK-15182) Copy MLlib doc to ML: ml.feature

2016-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15182: --- Assignee: yuhao yang > Copy MLlib doc to ML: ml.feat

[jira] [Updated] (SPARK-14434) User guide doc and examples for GaussianMixture in spark.ml

2016-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14434: --- Assignee: Miao Wang > User guide doc and examples for GaussianMixture in spark

[jira] [Resolved] (SPARK-14434) User guide doc and examples for GaussianMixture in spark.ml

2016-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14434. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12788 [https

[jira] [Commented] (SPARK-14709) spark.ml API for linear SVM

2016-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15284304#comment-15284304 ] Nick Pentreath commented on SPARK-14709: It would be great to get the list of references

[jira] [Resolved] (SPARK-14979) Add examples for GeneralizedLinearRegression

2016-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14979. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12754 [https

[jira] [Updated] (SPARK-15316) PySpark GeneralizedLinearRegression missing linkPredictionCol param

2016-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15316: --- Assignee: holdenk > PySpark GeneralizedLinearRegression missing linkPredictionCol pa

[jira] [Updated] (SPARK-15305) spark.ml document Bisectiong k-means has the incorrect format

2016-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15305: --- Assignee: Miao Wang > spark.ml document Bisectiong k-means has the incorrect for

[jira] [Resolved] (SPARK-15305) spark.ml document Bisectiong k-means has the incorrect format

2016-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15305. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13083 [https

[jira] [Updated] (SPARK-15186) Add user guide for Generalized Linear Regression.

2016-05-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15186: --- Assignee: Seth Hendrickson > Add user guide for Generalized Linear Regress

[jira] [Updated] (SPARK-14979) Add examples for GeneralizedLinearRegression

2016-05-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14979: --- Assignee: Yanbo Liang > Add examples for GeneralizedLinearRegress
