[jira] [Commented] (SPARK-13761) Deprecate validateParams

2016-03-08 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15186405#comment-15186405 ] yuhao yang commented on SPARK-13761: Hi [~josephkb], do you mind if I work on this? > Deprecate

[jira] [Commented] (SPARK-8884) 1-sample Anderson-Darling Goodness-of-Fit test

2016-03-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182420#comment-15182420 ] yuhao yang commented on SPARK-8884: --- Hi [~josepablocam]. Do you mind if I continue to work on this? I

[jira] [Commented] (SPARK-12566) GLM model family, link function support in SparkR:::glm

2016-03-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182384#comment-15182384 ] yuhao yang commented on SPARK-12566: Since we already have a glm in SparkR which is based on

[jira] [Commented] (SPARK-13639) Statistics.colStats(rdd).mean and variance should handle NaN in the input vectors

2016-03-03 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15179227#comment-15179227 ] yuhao yang commented on SPARK-13639: We perhaps can find some other way for SPARK-13568. I just need

[jira] [Comment Edited] (SPARK-13568) Create feature transformer to impute missing values

2016-03-02 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172423#comment-15172423 ] yuhao yang edited comment on SPARK-13568 at 3/3/16 5:48 AM: Yes, I'm working

[jira] [Created] (SPARK-13639) Statistics.colStats(rdd).mean and variance should handle NaN in the input vectors

2016-03-02 Thread yuhao yang (JIRA)
yuhao yang created SPARK-13639: -- Summary: Statistics.colStats(rdd).mean and variance should handle NaN in the input vectors Key: SPARK-13639 URL: https://issues.apache.org/jira/browse/SPARK-13639

[jira] [Commented] (SPARK-12566) GLM model family, link function support in SparkR:::glm

2016-03-01 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174022#comment-15174022 ] yuhao yang commented on SPARK-12566: Yes, I'll start on it. Thanks. > GLM model family, link

[jira] [Comment Edited] (SPARK-13568) Create feature transformer to impute missing values

2016-02-29 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172423#comment-15172423 ] yuhao yang edited comment on SPARK-13568 at 2/29/16 7:10 PM: - Yes, I'm

[jira] [Commented] (SPARK-13568) Create feature transformer to impute missing values

2016-02-29 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172423#comment-15172423 ] yuhao yang commented on SPARK-13568: Yes, I'm working on supporting numeric values too. And I agree

[jira] [Commented] (SPARK-13568) Create feature transformer to impute missing values

2016-02-29 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172216#comment-15172216 ] yuhao yang commented on SPARK-13568: Hi Nick, can I work on this since I kind of already have... I

[jira] [Comment Edited] (SPARK-4039) KMeans support sparse cluster centers

2016-02-27 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146465#comment-15146465 ] yuhao yang edited comment on SPARK-4039 at 2/27/16 7:56 PM:

[jira] [Comment Edited] (SPARK-12861) Changes to support KMeans with large feature space

2016-02-27 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146464#comment-15146464 ] yuhao yang edited comment on SPARK-12861 at 2/27/16 7:55 PM: -

[jira] [Created] (SPARK-13512) Add example and doc for ml.feature.MaxAbsScaler

2016-02-26 Thread yuhao yang (JIRA)
yuhao yang created SPARK-13512: -- Summary: Add example and doc for ml.feature.MaxAbsScaler Key: SPARK-13512 URL: https://issues.apache.org/jira/browse/SPARK-13512 Project: Spark Issue Type: Task

[jira] [Commented] (SPARK-13502) Missing ml.NaiveBayes in MLlib guide

2016-02-26 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168824#comment-15168824 ] yuhao yang commented on SPARK-13502: Hi Xusen, there's already a jira on that:

[jira] [Created] (SPARK-13345) Adding one way ANOVA to Spark ML stat

2016-02-16 Thread yuhao yang (JIRA)
yuhao yang created SPARK-13345: -- Summary: Adding one way ANOVA to Spark ML stat Key: SPARK-13345 URL: https://issues.apache.org/jira/browse/SPARK-13345 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-12861) Changes to support KMeans with large feature space

2016-02-14 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146464#comment-15146464 ] yuhao yang commented on SPARK-12861:

[jira] [Comment Edited] (SPARK-12861) Changes to support KMeans with large feature space

2016-02-14 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146464#comment-15146464 ] yuhao yang edited comment on SPARK-12861 at 2/14/16 9:42 AM: -

[jira] [Commented] (SPARK-4039) KMeans support sparse cluster centers

2016-02-14 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146465#comment-15146465 ] yuhao yang commented on SPARK-4039: ---

[jira] [Commented] (SPARK-13196) Optimize the option and flatten in Word2Vec to reduce the max memory consumption

2016-02-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15133548#comment-15133548 ] yuhao yang commented on SPARK-13196: https://github.com/apache/spark/pull/11078 > Optimize the

[jira] [Issue Comment Deleted] (SPARK-13196) Optimize the option and flatten in Word2Vec to reduce the max memory consumption

2016-02-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-13196: --- Comment: was deleted (was: https://github.com/apache/spark/pull/11078) > Optimize the option and

[jira] [Created] (SPARK-13196) Optimize the option and flatten in Word2Vec to reduce the max memory consumption

2016-02-04 Thread yuhao yang (JIRA)
yuhao yang created SPARK-13196: -- Summary: Optimize the option and flatten in Word2Vec to reduce the max memory consumption Key: SPARK-13196 URL: https://issues.apache.org/jira/browse/SPARK-13196

[jira] [Commented] (SPARK-13103) HashTF dosn't count TF correctly

2016-01-30 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124937#comment-15124937 ] yuhao yang commented on SPARK-13103: Thanks for finding this. I'm not sure what's the historical

[jira] [Commented] (SPARK-13089) spark.ml Naive Bayes user guide

2016-01-30 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124889#comment-15124889 ] yuhao yang commented on SPARK-13089: I'll start on this. > spark.ml Naive Bayes user guide >

[jira] [Created] (SPARK-13028) Add MaxAbsScaler to ML.feature as a transformer

2016-01-26 Thread yuhao yang (JIRA)
yuhao yang created SPARK-13028: -- Summary: Add MaxAbsScaler to ML.feature as a transformer Key: SPARK-13028 URL: https://issues.apache.org/jira/browse/SPARK-13028 Project: Spark Issue Type: New

[jira] [Commented] (SPARK-11560) Optimize KMeans implementation

2016-01-25 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15116519#comment-15116519 ] yuhao yang commented on SPARK-11560: Will the new version support sparse data better, or it's not a

[jira] [Created] (SPARK-12875) Add Weight of Evidence and Information value to Spark.ml as a feature transformer

2016-01-18 Thread yuhao yang (JIRA)
yuhao yang created SPARK-12875: -- Summary: Add Weight of Evidence and Information value to Spark.ml as a feature transformer Key: SPARK-12875 URL: https://issues.apache.org/jira/browse/SPARK-12875

[jira] [Commented] (SPARK-11507) Error thrown when using BlockMatrix.add

2016-01-16 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103576#comment-15103576 ] yuhao yang commented on SPARK-11507: A fix has been merged into Breeze.

[jira] [Commented] (SPARK-12685) word2vec trainWordsCount gets overflow

2016-01-11 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15093188#comment-15093188 ] yuhao yang commented on SPARK-12685: Hi [~josephkb] I suppose it will require 3 different PRs, right?

[jira] [Commented] (SPARK-12685) word2vec trainWordsCount gets overflow

2016-01-11 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15093134#comment-15093134 ] yuhao yang commented on SPARK-12685: Sorry to miss that. I'll start on it. > word2vec

[jira] [Updated] (SPARK-12685) word2vec trainWordsCount gets overflow

2016-01-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-12685: --- Description: the log of word2vec reports trainWordsCount = -785727483 during computation over a

[jira] [Updated] (SPARK-12685) word2vec trainWordsCount gets overflow

2016-01-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-12685: --- Description: the log of word2vec reports trainWordsCount = -785727483 during computation over a

[jira] [Updated] (SPARK-12685) word2vec trainWordsCount gets overflow

2016-01-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-12685: --- Priority: Minor (was: Trivial) > word2vec trainWordsCount gets overflow >

[jira] [Commented] (SPARK-12685) word2vec logingo trainWordsCount gets overflow

2016-01-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086786#comment-15086786 ] yuhao yang commented on SPARK-12685: Update the priority as it will affect the computation process.

[jira] [Updated] (SPARK-12685) word2vec trainWordsCount gets overflow

2016-01-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-12685: --- Summary: word2vec trainWordsCount gets overflow (was: word2vec logingo trainWordsCount gets

[jira] [Created] (SPARK-12685) word2vec logingo trainWordsCount gets overflow

2016-01-06 Thread yuhao yang (JIRA)
yuhao yang created SPARK-12685: -- Summary: word2vec logingo trainWordsCount gets overflow Key: SPARK-12685 URL: https://issues.apache.org/jira/browse/SPARK-12685 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-12626) MLlib 2.0 Roadmap

2016-01-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15082520#comment-15082520 ] yuhao yang commented on SPARK-12626: Great list. Maybe we can create an Umbrella jira for the feature

[jira] [Commented] (SPARK-12566) GLM model family, link function support

2016-01-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15082504#comment-15082504 ] yuhao yang commented on SPARK-12566: I'm interested. Just want to know if this is R specific, or we

[jira] [Commented] (SPARK-12375) VectorIndexer: allow unknown categories

2015-12-24 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15071370#comment-15071370 ] yuhao yang commented on SPARK-12375: PR here, https://github.com/apache/spark/pull/10466 >

[jira] [Commented] (SPARK-12488) LDA describeTopics() Generates Invalid Term IDs

2015-12-24 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15070779#comment-15070779 ] yuhao yang commented on SPARK-12488: I cannot repro the issue even with empty vector in documents.

[jira] [Commented] (SPARK-12375) VectorIndexer: allow unknown categories

2015-12-16 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061263#comment-15061263 ] yuhao yang commented on SPARK-12375: Anyone working on this? If not, I'll start to. > VectorIndexer:

[jira] [Commented] (SPARK-9578) Stemmer feature transformer

2015-12-14 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055626#comment-15055626 ] yuhao yang commented on SPARK-9578: --- PR was sent two days ago. I'm not sure why it's not linked here...

[jira] [Commented] (SPARK-9578) Stemmer feature transformer

2015-12-10 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15051915#comment-15051915 ] yuhao yang commented on SPARK-9578: --- Oh, I got a porter implementation now. I'll send it today or

[jira] [Comment Edited] (SPARK-12246) Add documentation for spark.ml.clustering.kmeans

2015-12-09 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15049715#comment-15049715 ] yuhao yang edited comment on SPARK-12246 at 12/10/15 12:26 AM: --- This can go

[jira] [Commented] (SPARK-12246) Add documentation for spark.ml.clustering.kmeans

2015-12-09 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15049715#comment-15049715 ] yuhao yang commented on SPARK-12246: This can go to the ml-clustering page. There're Scala and Java

[jira] [Commented] (SPARK-12215) User guide section for KMeans in spark.ml

2015-12-09 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15049959#comment-15049959 ] yuhao yang commented on SPARK-12215: This can go to the ml-clustering page. There're Scala and Java

[jira] [Created] (SPARK-12096) remove the old constraint in word2vec

2015-12-02 Thread yuhao yang (JIRA)
yuhao yang created SPARK-12096: -- Summary: remove the old constraint in word2vec Key: SPARK-12096 URL: https://issues.apache.org/jira/browse/SPARK-12096 Project: Spark Issue Type: Improvement

[jira] [Comment Edited] (SPARK-11605) ML 1.6 QA: API: Java compatibility, docs

2015-12-01 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1500#comment-1500 ] yuhao yang edited comment on SPARK-11605 at 12/1/15 8:37 AM: - Thanks for the

[jira] [Commented] (SPARK-11605) ML 1.6 QA: API: Java compatibility, docs

2015-12-01 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033382#comment-15033382 ] yuhao yang commented on SPARK-11605: Hi [~josephkb],

[jira] [Comment Edited] (SPARK-11605) ML 1.6 QA: API: Java compatibility, docs

2015-12-01 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1500#comment-1500 ] yuhao yang edited comment on SPARK-11605 at 12/1/15 9:06 AM: - Thanks for the

[jira] [Commented] (SPARK-11605) ML 1.6 QA: API: Java compatibility, docs

2015-12-01 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033383#comment-15033383 ] yuhao yang commented on SPARK-11605: Hi [~josephkb],

[jira] [Commented] (SPARK-11605) ML 1.6 QA: API: Java compatibility, docs

2015-12-01 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1500#comment-1500 ] yuhao yang commented on SPARK-11605: Thanks for the feedback. So for now the only problems are the

[jira] [Commented] (SPARK-12000) `sbt publishLocal` hits a Scala compiler bug caused by `Since` annotation

2015-11-29 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030963#comment-15030963 ] yuhao yang commented on SPARK-12000: for 11605, I can use javap for now. > `sbt publishLocal` hits a

[jira] [Commented] (SPARK-11605) ML 1.6 QA: API: Java compatibility, docs

2015-11-28 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030503#comment-15030503 ] yuhao yang commented on SPARK-11605: The API with parameter of type DStream in

[jira] [Commented] (SPARK-11605) ML 1.6 QA: API: Java compatibility, docs

2015-11-28 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030485#comment-15030485 ] yuhao yang commented on SPARK-11605: Still getting " no-symbol does not have an owner" from jekyll

[jira] [Commented] (SPARK-11605) ML 1.6 QA: API: Java compatibility, docs

2015-11-28 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030494#comment-15030494 ] yuhao yang commented on SPARK-11605: return type of org.apache.spark.ml.clustering.LDA.getOldDataset

[jira] [Comment Edited] (SPARK-11605) ML 1.6 QA: API: Java compatibility, docs

2015-11-28 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030494#comment-15030494 ] yuhao yang edited comment on SPARK-11605 at 11/28/15 1:16 PM: -- Return type

[jira] [Comment Edited] (SPARK-11605) ML 1.6 QA: API: Java compatibility, docs

2015-11-28 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030542#comment-15030542 ] yuhao yang edited comment on SPARK-11605 at 11/28/15 3:04 PM: -- return type

[jira] [Commented] (SPARK-11605) ML 1.6 QA: API: Java compatibility, docs

2015-11-28 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030571#comment-15030571 ] yuhao yang commented on SPARK-11605: org.apache.spark.mllib.tree.model.GradientBoostedTreesModel

[jira] [Comment Edited] (SPARK-11605) ML 1.6 QA: API: Java compatibility, docs

2015-11-28 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030494#comment-15030494 ] yuhao yang edited comment on SPARK-11605 at 11/28/15 2:51 PM: -- Return type

[jira] [Comment Edited] (SPARK-11605) ML 1.6 QA: API: Java compatibility, docs

2015-11-28 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030554#comment-15030554 ] yuhao yang edited comment on SPARK-11605 at 11/28/15 3:33 PM: --

[jira] [Commented] (SPARK-11605) ML 1.6 QA: API: Java compatibility, docs

2015-11-28 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030567#comment-15030567 ] yuhao yang commented on SPARK-11605: org.apache.spark.mllib.clustering.KMeansModel public

[jira] [Commented] (SPARK-11605) ML 1.6 QA: API: Java compatibility, docs

2015-11-28 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030554#comment-15030554 ] yuhao yang commented on SPARK-11605: org.apache.spark.mllib.classification.LogisticRegressionModel

[jira] [Commented] (SPARK-11605) ML 1.6 QA: API: Java compatibility, docs

2015-11-28 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030563#comment-15030563 ] yuhao yang commented on SPARK-11605: org.apache.spark.mllib.clustering.DistributedLDAModel public

[jira] [Commented] (SPARK-11605) ML 1.6 QA: API: Java compatibility, docs

2015-11-28 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030542#comment-15030542 ] yuhao yang commented on SPARK-11605: return type of method index in

[jira] [Commented] (SPARK-11605) ML 1.6 QA: API: Java compatibility, docs

2015-11-28 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030575#comment-15030575 ] yuhao yang commented on SPARK-11605: These should be all from me. > ML 1.6 QA: API: Java

[jira] [Commented] (SPARK-11602) ML 1.6 QA: API: New Scala APIs, docs

2015-11-27 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030372#comment-15030372 ] yuhao yang commented on SPARK-11602: I've finished reviewing scala docs and code change. If Xiangrui

[jira] [Commented] (SPARK-12000) `sbt publishLocal` hits a Scala compiler bug caused by `Since` annotation

2015-11-26 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15029429#comment-15029429 ] yuhao yang commented on SPARK-12000: Scala API can be generated. > `sbt publishLocal` hits a Scala

[jira] [Commented] (SPARK-12000) `sbt publishLocal` hits a Scala compiler bug caused by `Since` annotation

2015-11-26 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15029073#comment-15029073 ] yuhao yang commented on SPARK-12000: Met that with "./build/sbt unidoc" no-symbol does not have an

[jira] [Comment Edited] (SPARK-12000) `sbt publishLocal` hits a Scala compiler bug caused by `Since` annotation

2015-11-26 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15029073#comment-15029073 ] yuhao yang edited comment on SPARK-12000 at 11/26/15 4:43 PM: -- Met it with

[jira] [Comment Edited] (SPARK-12000) `sbt publishLocal` hits a Scala compiler bug caused by `Since` annotation

2015-11-26 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15029073#comment-15029073 ] yuhao yang edited comment on SPARK-12000 at 11/26/15 4:43 PM: -- Met that with

[jira] [Commented] (SPARK-11605) ML 1.6 QA: API: Java compatibility, docs

2015-11-25 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027184#comment-15027184 ] yuhao yang commented on SPARK-11605: I plan to finish it this week. I think Xiangrui will make

[jira] [Commented] (SPARK-11602) ML 1.6 QA: API: New Scala APIs, docs

2015-11-23 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023694#comment-15023694 ] yuhao yang commented on SPARK-11602: It's started. I'll finish the API audit today and start checking

[jira] [Created] (SPARK-11898) Use broadcast for the global tables in Word2Vec

2015-11-20 Thread yuhao yang (JIRA)
yuhao yang created SPARK-11898: -- Summary: Use broadcast for the global tables in Word2Vec Key: SPARK-11898 URL: https://issues.apache.org/jira/browse/SPARK-11898 Project: Spark Issue Type:

[jira] [Created] (SPARK-11813) Avoid serialization of vocab in Word2Vec

2015-11-18 Thread yuhao yang (JIRA)
yuhao yang created SPARK-11813: -- Summary: Avoid serialization of vocab in Word2Vec Key: SPARK-11813 URL: https://issues.apache.org/jira/browse/SPARK-11813 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-11816) fix some style issue in ML/MLlib examples

2015-11-18 Thread yuhao yang (JIRA)
yuhao yang created SPARK-11816: -- Summary: fix some style issue in ML/MLlib examples Key: SPARK-11816 URL: https://issues.apache.org/jira/browse/SPARK-11816 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-11847) Model export/import for spark.ml: LDA

2015-11-18 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15012776#comment-15012776 ] yuhao yang commented on SPARK-11847: Sure, I can take it. > Model export/import for spark.ml: LDA >

[jira] [Comment Edited] (SPARK-9273) Add Convolutional Neural network to Spark MLlib

2015-11-16 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15003468#comment-15003468 ] yuhao yang edited comment on SPARK-9273 at 11/17/15 5:36 AM: - Reopen the jira

[jira] [Reopened] (SPARK-9273) Add Convolutional Neural network to Spark MLlib

2015-11-12 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang reopened SPARK-9273: --- Reopen the jira since it's an independent feature and under active development. > Add Convolutional

[jira] [Commented] (SPARK-9273) Add Convolutional Neural network to Spark MLlib

2015-11-12 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15003466#comment-15003466 ] yuhao yang commented on SPARK-9273: --- I'd like to hear more opinions on if we should make Pooling Layer a

[jira] [Commented] (SPARK-11502) Word2VecSuite needs appropriate checks

2015-11-09 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14996720#comment-14996720 ] yuhao yang commented on SPARK-11502: There happens to be a customer requesting the similar feature.

[jira] [Commented] (SPARK-9273) Add Convolutional Neural network to Spark MLlib

2015-11-08 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14995685#comment-14995685 ] yuhao yang commented on SPARK-9273: --- Thanks so much for the helpful suggestions. I have an optimized

[jira] [Created] (SPARK-11579) Method SGDOptimizer and LBFGSOptimizer in FeedForwardTrainer should not create new optimizer every time they got invoked

2015-11-08 Thread yuhao yang (JIRA)
yuhao yang created SPARK-11579: -- Summary: Method SGDOptimizer and LBFGSOptimizer in FeedForwardTrainer should not create new optimizer every time they got invoked Key: SPARK-11579 URL:

[jira] [Commented] (SPARK-11507) Error thrown when using BlockMatrix.add

2015-11-05 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14992978#comment-14992978 ] yuhao yang commented on SPARK-11507: Seems Breeze will add an extra 0 to value in CSCMatrix when adding

[jira] [Commented] (SPARK-10809) Single-document topicDistributions method for LocalLDAModel

2015-11-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990974#comment-14990974 ] yuhao yang commented on SPARK-10809: working on this. > Single-document topicDistributions method

[jira] [Commented] (SPARK-9273) Add Convolutional Neural network to Spark MLlib

2015-11-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14991018#comment-14991018 ] yuhao yang commented on SPARK-9273: --- Hi [~avulanov]. I've refactored the CNN in

[jira] [Commented] (SPARK-11507) Error thrown when using BlockMatrix.add

2015-11-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14991261#comment-14991261 ] yuhao yang commented on SPARK-11507: Looking into it. Should be a bug. Breeze may remove the extra

[jira] [Comment Edited] (SPARK-11507) Error thrown when using BlockMatrix.add

2015-11-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14991261#comment-14991261 ] yuhao yang edited comment on SPARK-11507 at 11/5/15 7:21 AM: - Looking into

[jira] [Commented] (SPARK-11069) Add RegexTokenizer option to convert to lowercase

2015-10-12 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14954185#comment-14954185 ] yuhao yang commented on SPARK-11069: I'll try to do it and test with several cases. Updates will be

[jira] [Comment Edited] (SPARK-11069) Add RegexTokenizer option to convert to lowercase

2015-10-12 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14954185#comment-14954185 ] yuhao yang edited comment on SPARK-11069 at 10/13/15 5:11 AM: -- I'll try to

[jira] [Comment Edited] (SPARK-11069) Add RegexTokenizer option to convert to lowercase

2015-10-12 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14954185#comment-14954185 ] yuhao yang edited comment on SPARK-11069 at 10/13/15 5:11 AM: -- I'll try to

[jira] [Commented] (SPARK-11029) Add computeCost to KMeansModel in spark.ml

2015-10-11 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14952672#comment-14952672 ] yuhao yang commented on SPARK-11029: I can work on this if no one has started. Seems just a

[jira] [Commented] (SPARK-10670) Link to each language's API in codetabs in ML docs: spark.ml

2015-09-18 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14875675#comment-14875675 ] yuhao yang commented on SPARK-10670: I can work on this. > Link to each language's API in codetabs

[jira] [Commented] (SPARK-9578) Stemmer feature transformer

2015-09-09 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14736695#comment-14736695 ] yuhao yang commented on SPARK-9578: --- A better choice for LDA seems to be lemmatization. Yet that

[jira] [Commented] (SPARK-9273) Add Convolutional Neural network to Spark MLlib

2015-09-09 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737963#comment-14737963 ] yuhao yang commented on SPARK-9273: --- Thanks a lot for your attention, [~avulanov]. I do hope we can join

[jira] [Commented] (SPARK-10491) move RowMatrix.dspr to BLAS

2015-09-08 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14735921#comment-14735921 ] yuhao yang commented on SPARK-10491: If you don't mind, I'll start working on this. > move

[jira] [Created] (SPARK-10482) Add Python interface for CountVectorizer

2015-09-07 Thread yuhao yang (JIRA)
yuhao yang created SPARK-10482: -- Summary: Add Python interface for CountVectorizer Key: SPARK-10482 URL: https://issues.apache.org/jira/browse/SPARK-10482 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-9666) ML 1.5 QA: model save/load audit

2015-09-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14732235#comment-14732235 ] yuhao yang commented on SPARK-9666: --- Sure. Thanks. > ML 1.5 QA: model save/load audit >

[jira] [Commented] (SPARK-8696) StreamingLDA

2015-09-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14732240#comment-14732240 ] yuhao yang commented on SPARK-8696: --- Hi [~josephkb], I got a prototype on this. Is this a desirable

[jira] [Commented] (SPARK-10249) Add Python Code Example to StopWordsRemover User Guide

2015-09-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14732220#comment-14732220 ] yuhao yang commented on SPARK-10249: Thanks [~fliang] for creating the jira. > Add Python Code

[jira] [Created] (SPARK-10393) use ML pipeline in LDA example

2015-09-01 Thread yuhao yang (JIRA)
yuhao yang created SPARK-10393: -- Summary: use ML pipeline in LDA example Key: SPARK-10393 URL: https://issues.apache.org/jira/browse/SPARK-10393 Project: Spark Issue Type: Improvement
