[jira] [Resolved] (SPARK-13961) spark.ml ChiSqSelector and RFormula should support other numeric types for label

2016-05-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-13961. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12467 [https

[jira] [Resolved] (SPARK-15181) Python API for Generalized Linear Regression Summary

2016-05-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15181. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12961 [https

[jira] [Resolved] (SPARK-15188) PySpark NaiveBayes is missing Thresholds param

2016-05-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15188. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12963 [https

[jira] [Resolved] (SPARK-15281) PySpark ML GBTRegressor lacks impurity param

2016-05-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15281. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13071 [https

[jira] [Updated] (SPARK-15281) PySpark ML GBTRegressor lacks impurity param

2016-05-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15281: --- Assignee: holdenk > PySpark ML GBTRegressor lacks impurity pa

[jira] [Commented] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280592#comment-15280592 ] Nick Pentreath commented on SPARK-14810: [~josephkb] [~mengxr] [~srowen] I've made a pass through

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Comment Edited] (SPARK-13959) Audit MiMa excludes added in SPARK-13948 to make sure none are unintended incompatibilities

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280013#comment-15280013 ] Nick Pentreath edited comment on SPARK-13959 at 5/11/16 11:55 AM: -- All

[jira] [Comment Edited] (SPARK-13959) Audit MiMa excludes added in SPARK-13948 to make sure none are unintended incompatibilities

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280013#comment-15280013 ] Nick Pentreath edited comment on SPARK-13959 at 5/11/16 11:56 AM: -- All

[jira] [Commented] (SPARK-13959) Audit MiMa excludes added in SPARK-13948 to make sure none are unintended incompatibilities

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280013#comment-15280013 ] Nick Pentreath commented on SPARK-13959: All {{ML}} potential errors are related to {{DataFrame

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Updated] (SPARK-15243) Binarizer.explainParam(u"...") raises ValueError

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15243: --- Assignee: Seth Hendrickson > Binarizer.explainParam(u"...") rais

[jira] [Resolved] (SPARK-15150) Add python example for LDA

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15150. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12927 [https

[jira] [Updated] (SPARK-15150) Add python example for LDA

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15150: --- Assignee: zhengruifeng > Add python example for

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Commented] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15279877#comment-15279877 ] Nick Pentreath commented on SPARK-14810: Ran {{./dev/mima}} on {{branch-2.0}}. Result for {{mllib

[jira] [Comment Edited] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15279877#comment-15279877 ] Nick Pentreath edited comment on SPARK-14810 at 5/11/16 9:53 AM: - Ran

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Resolved] (SPARK-15189) ml.Evaluation pydoc issues

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15189. Resolution: Fixed Fix Version/s: 2.0.0 > ml.Evaluation pydoc iss

[jira] [Resolved] (SPARK-15149) Include ml.kmeans python example

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15149. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12925 [https

[jira] [Updated] (SPARK-15149) Include ml.kmeans python example

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15149: --- Assignee: zhengruifeng > Include ml.kmeans python exam

[jira] [Updated] (SPARK-14340) Add Scala Example and User DOC for ml.BisectingKMeans

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14340: --- Assignee: zhengruifeng > Add Scala Example and User DOC for ml.BisectingKMe

[jira] [Resolved] (SPARK-14340) Add Scala Example and User DOC for ml.BisectingKMeans

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14340. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11844 [https

[jira] [Updated] (SPARK-15141) Add python example for OneVsRest

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15141: --- Assignee: zhengruifeng > Add python example for OneVsR

[jira] [Resolved] (SPARK-15141) Add python example for OneVsRest

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15141. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12920 [https

[jira] [Closed] (SPARK-14982) Extend Python GeneralizedLinearRegressionSummary to have same functions as Scala & Java equivalent

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath closed SPARK-14982. -- Resolution: Duplicate > Extend Python GeneralizedLinearRegressionSummary to have s

[jira] [Commented] (SPARK-15181) Python API for Generalized Linear Regression Summary

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15279727#comment-15279727 ] Nick Pentreath commented on SPARK-15181: It is - I will actually just close SPARK-14982

[jira] [Updated] (SPARK-15181) Python API for Generalized Linear Regression Summary

2016-05-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15181: --- Assignee: Seth Hendrickson > Python API for Generalized Linear Regression Summ

[jira] [Resolved] (SPARK-15195) Improve PyDoc for ml.tuning

2016-05-10 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15195. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12967 [https

[jira] [Updated] (SPARK-15189) ml.Evaluation pydoc issues

2016-05-10 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15189: --- Assignee: holdenk > ml.Evaluation pydoc iss

[jira] [Updated] (SPARK-15195) Improve PyDoc for ml.tuning

2016-05-10 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15195: --- Assignee: holdenk > Improve PyDoc for ml.tun

[jira] [Commented] (SPARK-14815) ML, Graph, R 2.0 QA: Update user guide for new features & APIs

2016-05-10 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15278193#comment-15278193 ] Nick Pentreath commented on SPARK-14815: Ok, I don't feel very strongly about removing them

[jira] [Updated] (SPARK-15188) PySpark NaiveBayes is missing Thresholds param

2016-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15188: --- Assignee: holdenk > PySpark NaiveBayes is missing Thresholds pa

[jira] [Updated] (SPARK-15188) PySpark NaiveBayes is missing Thresholds param

2016-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15188: --- Summary: PySpark NaiveBayes is missing Thresholds param (was: NaiveBayes is missing

[jira] [Commented] (SPARK-14815) ML, Graph, R 2.0 QA: Update user guide for new features & APIs

2016-05-05 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272258#comment-15272258 ] Nick Pentreath commented on SPARK-14815: Not sure if this is the best JIRA to comment

[jira] [Updated] (SPARK-15092) toDebugString missing from ML DecisionTreeClassifier

2016-05-05 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15092: --- Assignee: holdenk > toDebugString missing from ML DecisionTreeClassif

[jira] [Assigned] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-04 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-14810: -- Assignee: Nick Pentreath > ML, Graph 2.0 QA: API: Binary incompatible chan

[jira] [Commented] (SPARK-14900) spark.ml classification metrics should include accuracy

2016-05-04 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15270574#comment-15270574 ] Nick Pentreath commented on SPARK-14900: +1 for deprecating "precision" in favour of

[jira] [Resolved] (SPARK-14844) KMeansModel in spark.ml should allow to change featureCol and predictionCol

2016-05-04 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14844. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12609 [https

[jira] [Commented] (SPARK-14900) spark.ml classification metrics should include accuracy

2016-05-04 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15270403#comment-15270403 ] Nick Pentreath commented on SPARK-14900: It's ok to put this in MultiClassMetrics, but isn't

[jira] [Updated] (SPARK-15094) CodeGenerator: failed to compile - when using dataset.rdd with generic case class

2016-05-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15094: --- Component/s: SQL > CodeGenerator: failed to compile - when using dataset.rdd with gene

[jira] [Commented] (SPARK-15027) ALS.train should use DataFrame instead of RDD

2016-05-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15269367#comment-15269367 ] Nick Pentreath commented on SPARK-15027: Will take a look at repartitioning. 2.1 seems ok, I

[jira] [Created] (SPARK-15094) CodeGenerator: failed to compile - when using dataset.rdd with generic case class

2016-05-03 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-15094: -- Summary: CodeGenerator: failed to compile - when using dataset.rdd with generic case class Key: SPARK-15094 URL: https://issues.apache.org/jira/browse/SPARK-15094

[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-05-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-13448: --- Description: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can

[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-05-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-13448: --- Description: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can

[jira] [Resolved] (SPARK-14971) PySpark ML Params setter code clean up

2016-05-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14971. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12749 [https

[jira] [Updated] (SPARK-14971) PySpark ML Params setter code clean up

2016-05-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14971: --- Assignee: Yanbo Liang > PySpark ML Params setter code clean

[jira] [Commented] (SPARK-15027) ALS.train should use DataFrame instead of RDD

2016-05-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15268662#comment-15268662 ] Nick Pentreath commented on SPARK-15027: I've managed to get it working for the following

[jira] [Updated] (SPARK-14971) PySpark ML Params setter code clean up

2016-05-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14971: --- Shepherd: Nick Pentreath > PySpark ML Params setter code clean

[jira] [Commented] (SPARK-14812) ML, Graph 2.0 QA: API: Experimental, DeveloperApi, final, sealed audit

2016-05-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15268326#comment-15268326 ] Nick Pentreath commented on SPARK-14812: fair enough - any changes to ALS {{transform

Re: Cross Validator to work with K-Fold value of 1?

2016-05-02 Thread Nick Pentreath
There is a JIRA and PR around for supporting polynomial expansion with degree 1. Offhand I can't recall if it's been merged. On Mon, 2 May 2016 at 17:45, Julio Antonio Soto de Vicente wrote: > Hi, > > Same goes for the PolynomialExpansion in org.apache.spark.ml.feature. It > would

[jira] [Commented] (SPARK-15027) ALS.train should use DataFrame instead of RDD

2016-04-30 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15265232#comment-15265232 ] Nick Pentreath commented on SPARK-15027: Ok - it would make sense to have it in 2.0 if possible

[jira] [Commented] (SPARK-15027) ALS.train should use DataFrame instead of RDD

2016-04-30 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15265228#comment-15265228 ] Nick Pentreath commented on SPARK-15027: [~mengxr] are you intending this to be a more

[jira] [Updated] (SPARK-14571) Log instrumentation in ALS

2016-04-29 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14571: --- Assignee: Miao Wang > Log instrumentation in

[jira] [Resolved] (SPARK-14571) Log instrumentation in ALS

2016-04-29 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14571. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12560 [https

[jira] [Commented] (SPARK-14900) spark.ml classification metrics should include accuracy

2016-04-29 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263778#comment-15263778 ] Nick Pentreath commented on SPARK-14900: Sure, go ahead > spark.ml classification metrics sho

[jira] [Assigned] (SPARK-14891) ALS in ML never validates input schema

2016-04-29 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-14891: -- Assignee: Nick Pentreath > ALS in ML never validates input sch

[jira] [Updated] (SPARK-14886) RankingMetrics.ndcgAt throw java.lang.ArrayIndexOutOfBoundsException

2016-04-29 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14886: --- Assignee: Sean Owen > RankingMetrics.ndcgAt throw java.lang.ArrayIndexOutOfBoundsExcept

[jira] [Resolved] (SPARK-14886) RankingMetrics.ndcgAt throw java.lang.ArrayIndexOutOfBoundsException

2016-04-29 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14886. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12756 [https

Re: Adding a new column to a dataframe (based on value of existing column)

2016-04-28 Thread Nick Pentreath
This should work:
scala> val df = Seq((25.0, "foo"), (30.0, "bar")).toDF("age", "name")
scala> df.withColumn("AgeInt", when(col("age") > 29.0, 1).otherwise(0)).show
+----+----+------+
| age|name|AgeInt|
+----+----+------+
|25.0| foo|     0|
|30.0| bar|     1|
+----+----+------+
On Thu, 28 Apr 2016 at

[jira] [Comment Edited] (SPARK-8971) Support balanced class labels when splitting train/cross validation sets

2016-04-28 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15261655#comment-15261655 ] Nick Pentreath edited comment on SPARK-8971 at 4/28/16 7:18 AM: I think

[jira] [Commented] (SPARK-8971) Support balanced class labels when splitting train/cross validation sets

2016-04-28 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15261655#comment-15261655 ] Nick Pentreath commented on SPARK-8971: --- I think it would be good to have something implemented, so

[jira] [Commented] (SPARK-9656) Add missing methods to linalg.distributed

2016-04-27 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15260707#comment-15260707 ] Nick Pentreath commented on SPARK-9656: --- I can't seem to find your name in the search bar - what

[jira] [Resolved] (SPARK-9656) Add missing methods to linalg.distributed

2016-04-27 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-9656. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 9441 [https

Re: Duplicated fit into TrainValidationSplit

2016-04-27 Thread Nick Pentreath
You should find that the first set of fits are called on the training set, and the resulting models evaluated on the validation set. The final best model is then retrained on the entire dataset. This is standard practice - usually the dataset passed to the train validation split is itself further
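The flow described in this reply (fit every candidate on the training part, score each on the validation part, then refit the winner on the entire dataset) can be sketched in plain Python without Spark. This is only an illustration of the control flow, not Spark's implementation: `train_validation_split`, `fit_mean_model`, `mse`, the `shrink` parameter and the toy rows are all hypothetical names invented for the example.

```python
import random
from typing import Callable, List, Sequence, Tuple

Row = Tuple[float, float]  # (feature, label); hypothetical toy schema
Model = Callable[[float], float]


def fit_mean_model(rows: Sequence[Row], shrink: float) -> Model:
    """Toy 'estimator': predicts the mean label scaled by (1 - shrink)."""
    mean = sum(label for _, label in rows) / len(rows)
    prediction = mean * (1.0 - shrink)
    return lambda _feature: prediction


def mse(model: Model, rows: Sequence[Row]) -> float:
    """Mean squared error of the model on the given rows."""
    return sum((model(x) - y) ** 2 for x, y in rows) / len(rows)


def train_validation_split(
    rows: Sequence[Row],
    param_grid: Sequence[float],
    train_ratio: float = 0.75,
    seed: int = 42,
) -> Tuple[float, Model, List[float]]:
    # Shuffle a copy and cut it into a training part and a validation part.
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    train, validation = shuffled[:cut], shuffled[cut:]

    # First set of fits: one model per candidate, trained on the training
    # part only and scored on the held-out validation part.
    scores = [mse(fit_mean_model(train, shrink), validation) for shrink in param_grid]

    # Pick the candidate with the lowest validation error.
    best_index = min(range(len(param_grid)), key=scores.__getitem__)
    best_param = param_grid[best_index]

    # Final fit: the winning parameter is refit on the entire dataset,
    # which is the "extra" fit the question is about.
    best_model = fit_mean_model(rows, best_param)
    return best_param, best_model, scores


if __name__ == "__main__":
    data = [(float(i), 10.0) for i in range(20)]
    param, model, errors = train_validation_split(data, [0.0, 0.5])
    print(param, errors, model(3.0))
```

With a grid of n candidates this performs n fits on the training part plus one final fit on all rows, which is why a fit that looks duplicated shows up in the logs.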

[jira] [Commented] (SPARK-14891) ALS in ML never validates input schema

2016-04-26 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15257682#comment-15257682 ] Nick Pentreath commented on SPARK-14891: {{ALS.train}} is generic in the ID type, and so does

[jira] [Resolved] (SPARK-13962) spark.ml Evaluators should support other numeric types for label

2016-04-26 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-13962. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12500 [https

[jira] [Updated] (SPARK-14844) KMeansModel in spark.ml should allow to change featureCol and predictionCol

2016-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14844: --- Shepherd: Nick Pentreath (was: Dominik Jastrzębski) > KMeansModel in spark.ml should al

[jira] [Updated] (SPARK-14768) Remove expectedType arg for PySpark Param

2016-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14768: --- Assignee: Jason C Lee > Remove expectedType arg for PySpark Pa

[jira] [Resolved] (SPARK-14768) Remove expectedType arg for PySpark Param

2016-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14768. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12581 [https

[jira] [Updated] (SPARK-14409) Investigate adding a RankingEvaluator to ML

2016-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14409: --- Shepherd: Nick Pentreath > Investigate adding a RankingEvaluator to

[jira] [Commented] (SPARK-14891) ALS in ML never validates input schema

2016-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15256140#comment-15256140 ] Nick Pentreath commented on SPARK-14891: Currently the only doc is {code} /** * :: DeveloperApi

[jira] [Comment Edited] (SPARK-14891) ALS in ML never validates input schema

2016-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15256140#comment-15256140 ] Nick Pentreath edited comment on SPARK-14891 at 4/25/16 9:27 AM

[jira] [Commented] (SPARK-14891) ALS in ML never validates input schema

2016-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15256084#comment-15256084 ] Nick Pentreath commented on SPARK-14891: [~srowen] [~mengxr] [~josephkb] thoughts? Also, while

[jira] [Created] (SPARK-14891) ALS in ML never validates input schema

2016-04-25 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-14891: -- Summary: ALS in ML never validates input schema Key: SPARK-14891 URL: https://issues.apache.org/jira/browse/SPARK-14891 Project: Spark Issue Type: Bug

[jira] [Comment Edited] (SPARK-14886) RankingMetrics.ndcgAt throw java.lang.ArrayIndexOutOfBoundsException

2016-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15256065#comment-15256065 ] Nick Pentreath edited comment on SPARK-14886 at 4/25/16 8:26 AM: - Are you

[jira] [Commented] (SPARK-14886) RankingMetrics.ndcgAt throw java.lang.ArrayIndexOutOfBoundsException

2016-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15256065#comment-15256065 ] Nick Pentreath commented on SPARK-14886: Are you saying that the "maxDCG" sh

[jira] [Updated] (SPARK-6717) Clear shuffle files after checkpointing in ALS

2016-04-22 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-6717: -- Shepherd: Nick Pentreath > Clear shuffle files after checkpointing in

[jira] [Updated] (SPARK-14843) Error while encoding: java.lang.ClassCastException with LibSVMRelation

2016-04-22 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14843: --- Component/s: SQL > Error while encoding: java.lang.ClassCastException with LibSVMRelat

[jira] [Created] (SPARK-14843) Error while encoding: java.lang.ClassCastException with LibSVMRelation

2016-04-22 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-14843: -- Summary: Error while encoding: java.lang.ClassCastException with LibSVMRelation Key: SPARK-14843 URL: https://issues.apache.org/jira/browse/SPARK-14843 Project

[jira] [Commented] (SPARK-14489) RegressionEvaluator returns NaN for ALS in Spark ml

2016-04-22 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253536#comment-15253536 ] Nick Pentreath commented on SPARK-14489: Is naive sampling not an option

[jira] [Commented] (SPARK-14812) ML 2.0 QA: API: Experimental, DeveloperApi, final, sealed audit

2016-04-22 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253508#comment-15253508 ] Nick Pentreath commented on SPARK-14812: I would like to keep ALS experimental until SPARK-13857
