[jira] [Commented] (SPARK-15784) Add Power Iteration Clustering to spark.ml

2018-06-05 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16502733#comment-16502733 ] Miao Wang commented on SPARK-15784: --- [~WeichenXu123] Thank you very much!  > Add Power Iteration

[jira] [Commented] (SPARK-15784) Add Power Iteration Clustering to spark.ml

2018-06-02 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16499143#comment-16499143 ] Miao Wang commented on SPARK-15784: --- [~josephkb] Just saw your comments. Let me try to fix it. I am on

[jira] [Commented] (SPARK-23996) Implement the optimal KLL algorithms for quantiles in streams

2018-04-23 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448639#comment-16448639 ] Miao Wang commented on SPARK-23996: --- Thanks for your reply! I am learning the core part of the

[jira] [Comment Edited] (SPARK-23996) Implement the optimal KLL algorithms for quantiles in streams

2018-04-19 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444909#comment-16444909 ] Miao Wang edited comment on SPARK-23996 at 4/19/18 10:34 PM: - [~timhunter]

[jira] [Commented] (SPARK-23996) Implement the optimal KLL algorithms for quantiles in streams

2018-04-19 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444909#comment-16444909 ] Miao Wang commented on SPARK-23996: --- [~timhunter] There is a Java implementation:

[jira] [Commented] (SPARK-19827) spark.ml R API for PIC

2018-04-19 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1679#comment-1679 ] Miao Wang commented on SPARK-19827: --- [~felixcheung] The Scala code was just merged. I can work on this now.

[jira] [Commented] (SPARK-15784) Add Power Iteration Clustering to spark.ml

2018-04-17 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441353#comment-16441353 ] Miao Wang commented on SPARK-15784: --- [~josephkb] You can start the new PR now. :) > Add Power

[jira] [Commented] (SPARK-20307) SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2018-02-12 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361631#comment-16361631 ] Miao Wang commented on SPARK-20307: --- [~felixcheung] I will do it during the Lunar New Year vacation. I

[jira] [Commented] (SPARK-18131) Support returning Vector/Dense Vector from backend

2017-10-05 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193891#comment-16193891 ] Miao Wang commented on SPARK-18131: --- [~felixcheung] We got stuck on the data type definitions. There

[jira] [Commented] (SPARK-20307) SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2017-07-17 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090213#comment-16090213 ] Miao Wang commented on SPARK-20307: --- [~yanboliang] Have you added the support on the Python side? I think

[jira] [Created] (SPARK-21381) SparkR: pass on setHandleInvalid for classification algorithms

2017-07-11 Thread Miao Wang (JIRA)
Miao Wang created SPARK-21381: - Summary: SparkR: pass on setHandleInvalid for classification algorithms Key: SPARK-21381 URL: https://issues.apache.org/jira/browse/SPARK-21381 Project: Spark

[jira] [Commented] (SPARK-20307) SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2017-06-30 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070988#comment-16070988 ] Miao Wang commented on SPARK-20307: --- [~Monday0927!] "Have you also tried to load the model trained in

[jira] [Commented] (SPARK-20307) SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2017-06-30 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070647#comment-16070647 ] Miao Wang commented on SPARK-20307: --- Update: Manual test works. I will submit PR soon. > SparkR: pass

[jira] [Commented] (SPARK-20307) SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2017-06-23 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16061525#comment-16061525 ] Miao Wang commented on SPARK-20307: --- [~Monday0927!] I am working on it now. Thanks! Miao cc

[jira] [Created] (SPARK-20906) Constrained Logistic Regression for SparkR

2017-05-27 Thread Miao Wang (JIRA)
Miao Wang created SPARK-20906: - Summary: Constrained Logistic Regression for SparkR Key: SPARK-20906 URL: https://issues.apache.org/jira/browse/SPARK-20906 Project: Spark Issue Type: Bug

[jira] [Comment Edited] (SPARK-20307) SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2017-05-23 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022049#comment-16022049 ] Miao Wang edited comment on SPARK-20307 at 5/23/17 11:16 PM: - [~felixcheung]

[jira] [Commented] (SPARK-20307) SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2017-05-23 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022049#comment-16022049 ] Miao Wang commented on SPARK-20307: --- [~felixcheung] I will wait for [~wayen.zh...@263.net] to reply

[jira] [Commented] (SPARK-20307) SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2017-05-22 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020491#comment-16020491 ] Miao Wang commented on SPARK-20307: --- I think I get your point: In scala code, /** @group setParam */

[jira] [Commented] (SPARK-20307) SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2017-05-22 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020482#comment-16020482 ] Miao Wang commented on SPARK-20307: --- Do you mean adding a new method `setHandleInvalid` in the backend

[jira] [Commented] (SPARK-20803) KernelDensity.estimate in pyspark.mllib.stat.KernelDensity throws net.razorvine.pickle.PickleException when input data is normally distributed (no error when data is n

2017-05-22 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020474#comment-16020474 ] Miao Wang commented on SPARK-20803: --- Can you post the data set `colVec` so we can easily reproduce this

[jira] [Updated] (SPARK-20533) SparkR Wrappers Model should be private and value should be lazy

2017-04-29 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miao Wang updated SPARK-20533: -- Priority: Minor (was: Major) > SparkR Wrappers Model should be private and value should be lazy >

[jira] [Created] (SPARK-20533) SparkR Wrappers Model should be private and value should be lazy

2017-04-29 Thread Miao Wang (JIRA)
Miao Wang created SPARK-20533: - Summary: SparkR Wrappers Model should be private and value should be lazy Key: SPARK-20533 URL: https://issues.apache.org/jira/browse/SPARK-20533 Project: Spark

[jira] [Commented] (SPARK-20478) Document LinearSVC in R programming guide

2017-04-27 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15987903#comment-15987903 ] Miao Wang commented on SPARK-20478: --- OK. I will do it. Thanks for pointing me to the place. > Document

[jira] [Commented] (SPARK-20478) Document LinearSVC in R programming guide

2017-04-27 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15986101#comment-15986101 ] Miao Wang commented on SPARK-20478: --- The R programming guide and vignettes have linear SVM documentation. What

[jira] [Commented] (SPARK-20477) Document R bisecting k-means in R programming guide

2017-04-27 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15986072#comment-15986072 ] Miao Wang commented on SPARK-20477: --- OK. I will add it. > Document R bisecting k-means in R

[jira] [Commented] (SPARK-20478) Document LinearSVC in R programming guide

2017-04-27 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15986071#comment-15986071 ] Miao Wang commented on SPARK-20478: --- OK. I will add it. > Document LinearSVC in R programming guide >

[jira] [Commented] (SPARK-19634) Feature parity for descriptive statistics in MLlib

2017-03-21 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935304#comment-15935304 ] Miao Wang commented on SPARK-19634: --- Comments never reach my email inbox. [~timhunter] I can continue with

[jira] [Commented] (SPARK-19288) Failure (at test_sparkSQL.R#1300): date functions on a DataFrame in R/run-tests.sh

2017-03-16 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929433#comment-15929433 ] Miao Wang commented on SPARK-19288: --- I think it only happens in local builds. I had another similar

[jira] [Comment Edited] (SPARK-19827) spark.ml R API for PIC

2017-03-16 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929429#comment-15929429 ] Miao Wang edited comment on SPARK-19827 at 3/17/17 5:15 AM: Please hold on.

[jira] [Commented] (SPARK-19827) spark.ml R API for PIC

2017-03-16 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929429#comment-15929429 ] Miao Wang commented on SPARK-19827: --- Please hold on. We need to add the wrapper to ML instead of MLlib. The

[jira] [Commented] (SPARK-19635) Feature parity for Chi-square hypothesis testing in MLlib

2017-02-21 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876979#comment-15876979 ] Miao Wang commented on SPARK-19635: --- https://github.com/apache/spark/pull/13440 [~timhunter] This one

[jira] [Commented] (SPARK-19635) Feature parity for Chi-square hypothesis testing in MLlib

2017-02-21 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876977#comment-15876977 ] Miao Wang commented on SPARK-19635: --- I think there is a related PR opened for quite a while. Let me

[jira] [Commented] (SPARK-19639) Add spark.svmLinear example and update vignettes

2017-02-21 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876936#comment-15876936 ] Miao Wang commented on SPARK-19639: --- This JIRA should be closed as the PR is merged. Thanks! cc

[jira] [Created] (SPARK-19639) Add spark.svmLinear example and update vignettes

2017-02-16 Thread Miao Wang (JIRA)
Miao Wang created SPARK-19639: - Summary: Add spark.svmLinear example and update vignettes Key: SPARK-19639 URL: https://issues.apache.org/jira/browse/SPARK-19639 Project: Spark Issue Type:

[jira] [Commented] (SPARK-19634) Feature parity for descriptive statistics in MLlib

2017-02-16 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870825#comment-15870825 ] Miao Wang commented on SPARK-19634: --- I can give it a try. Thanks! Miao > Feature parity for descriptive

[jira] [Commented] (SPARK-19460) Update dataset used in R documentation, examples to reduce warning noise and confusions

2017-02-15 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868914#comment-15868914 ] Miao Wang commented on SPARK-19460: --- By the way, I remember that you had a discussion about fixing the

[jira] [Commented] (SPARK-19460) Update dataset used in R documentation, examples to reduce warning noise and confusions

2017-02-15 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868911#comment-15868911 ] Miao Wang commented on SPARK-19460: --- Seems like a lot of work. :) I can give it a try. > Update dataset used

[jira] [Created] (SPARK-19616) weightCol and aggregationDepth should be improved for some SparkR APIs

2017-02-15 Thread Miao Wang (JIRA)
Miao Wang created SPARK-19616: - Summary: weightCol and aggregationDepth should be improved for some SparkR APIs Key: SPARK-19616 URL: https://issues.apache.org/jira/browse/SPARK-19616 Project: Spark

[jira] [Commented] (SPARK-14894) Python GaussianMixture summary

2017-02-14 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866999#comment-15866999 ] Miao Wang commented on SPARK-14894: --- This is a dup of JIRA-18282. Should be closed. > Python

[jira] [Created] (SPARK-19456) Add LinearSVC R API

2017-02-03 Thread Miao Wang (JIRA)
Miao Wang created SPARK-19456: - Summary: Add LinearSVC R API Key: SPARK-19456 URL: https://issues.apache.org/jira/browse/SPARK-19456 Project: Spark Issue Type: New Feature Components:

[jira] [Commented] (SPARK-19382) Test sparse vectors in LinearSVCSuite

2017-02-02 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15850369#comment-15850369 ] Miao Wang commented on SPARK-19382: --- In addition, def merge(other: MultivariateOnlineSummarizer):

[jira] [Commented] (SPARK-19382) Test sparse vectors in LinearSVCSuite

2017-01-31 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847683#comment-15847683 ] Miao Wang commented on SPARK-19382: --- [~josephkb] If I understand correctly, I think we have to create

[jira] (SPARK-19382) Test sparse vectors in LinearSVCSuite

2017-01-31 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847391#comment-15847391 ] Miao Wang commented on SPARK-19382: --- I can try to submit a PR today or tomorrow. Thanks! > Test sparse

[jira] (SPARK-18131) Support returning Vector/Dense Vector from backend

2017-01-31 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847390#comment-15847390 ] Miao Wang commented on SPARK-18131: --- [~felixcheung][~yanboliang][~shivaram] I am trying to add the

[jira] [Commented] (SPARK-19336) LinearSVC Python API

2017-01-24 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836523#comment-15836523 ] Miao Wang commented on SPARK-19336: --- [~mlnick] Thanks! > LinearSVC Python API > >

[jira] [Created] (SPARK-19319) SparkR Kmeans summary returns error when the cluster size doesn't equal to k

2017-01-20 Thread Miao Wang (JIRA)
Miao Wang created SPARK-19319: - Summary: SparkR Kmeans summary returns error when the cluster size doesn't equal to k Key: SPARK-19319 URL: https://issues.apache.org/jira/browse/SPARK-19319 Project:

[jira] [Commented] (SPARK-18011) SparkR serialize "NA" throws exception

2017-01-18 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828426#comment-15828426 ] Miao Wang commented on SPARK-18011: --- OS and R information: R version 3.3.0 (2016-05-03) -- "Supposedly

[jira] [Comment Edited] (SPARK-18011) SparkR serialize "NA" throws exception

2017-01-17 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827032#comment-15827032 ] Miao Wang edited comment on SPARK-18011 at 1/17/17 11:27 PM: - [~felixcheung]

[jira] [Commented] (SPARK-18011) SparkR serialize "NA" throws exception

2017-01-17 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827032#comment-15827032 ] Miao Wang commented on SPARK-18011: --- [~felixcheung] I did intensive debugging on both the R side and the Scala

[jira] [Created] (SPARK-19142) spark.kmeans should take seed, initSteps, and tol as parameters

2017-01-09 Thread Miao Wang (JIRA)
Miao Wang created SPARK-19142: - Summary: spark.kmeans should take seed, initSteps, and tol as parameters Key: SPARK-19142 URL: https://issues.apache.org/jira/browse/SPARK-19142 Project: Spark

[jira] [Commented] (SPARK-18011) SparkR serialize "NA" throws exception

2017-01-09 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812306#comment-15812306 ] Miao Wang commented on SPARK-18011: --- The problem is that for some R versions (e.g., the version on my Mac),

[jira] [Created] (SPARK-19110) DistributedLDAModel returns different logPrior for original and loaded model

2017-01-06 Thread Miao Wang (JIRA)
Miao Wang created SPARK-19110: - Summary: DistributedLDAModel returns different logPrior for original and loaded model Key: SPARK-19110 URL: https://issues.apache.org/jira/browse/SPARK-19110 Project:

[jira] [Updated] (SPARK-19066) SparkR LDA doesn't set optimizer correctly

2017-01-03 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miao Wang updated SPARK-19066: -- Description: spark.lda passes the optimizer "em" or "online" to the backend. However, LDAWrapper

[jira] [Created] (SPARK-19066) SparkR LDA doesn't set optimizer correctly

2017-01-03 Thread Miao Wang (JIRA)
Miao Wang created SPARK-19066: - Summary: SparkR LDA doesn't set optimizer correctly Key: SPARK-19066 URL: https://issues.apache.org/jira/browse/SPARK-19066 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-18821) Bisecting k-means wrapper in SparkR

2017-01-02 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15794322#comment-15794322 ] Miao Wang commented on SPARK-18821: --- Start it now. ETA within one week. > Bisecting k-means wrapper in

[jira] [Commented] (SPARK-18821) Bisecting k-means wrapper in SparkR

2016-12-20 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15766299#comment-15766299 ] Miao Wang commented on SPARK-18821: --- I can work on this one, if it is not urgent. Thanks! > Bisecting

[jira] [Created] (SPARK-18865) SparkR vignettes MLP and LDA updates

2016-12-14 Thread Miao Wang (JIRA)
Miao Wang created SPARK-18865: - Summary: SparkR vignettes MLP and LDA updates Key: SPARK-18865 URL: https://issues.apache.org/jira/browse/SPARK-18865 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-18795) SparkR vignette update: ksTest

2016-12-14 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15749701#comment-15749701 ] Miao Wang commented on SPARK-18795: --- [~josephkb] Sorry for the late response. I was in something in last

[jira] [Issue Comment Deleted] (SPARK-18332) SparkR 2.1 QA: Programming guide, migration guide, vignettes updates

2016-12-08 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miao Wang updated SPARK-18332: -- Comment: was deleted (was: Update spark.logit is part of the QA work.) > SparkR 2.1 QA: Programming

[jira] [Commented] (SPARK-18332) SparkR 2.1 QA: Programming guide, migration guide, vignettes updates

2016-12-08 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733785#comment-15733785 ] Miao Wang commented on SPARK-18332: --- [~josephkb] https://github.com/apache/spark/pull/16222 This PR

[jira] [Commented] (SPARK-18795) SparkR vignette update: ksTest

2016-12-08 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733777#comment-15733777 ] Miao Wang commented on SPARK-18795: --- I will work on this one too. Thanks! Miao > SparkR vignette

[jira] [Commented] (SPARK-18792) SparkR vignette update: logit

2016-12-08 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733775#comment-15733775 ] Miao Wang commented on SPARK-18792: --- I have submitted PR for JIRA-18797, which is the same as this one.

[jira] [Commented] (SPARK-18332) SparkR 2.1 QA: Programming guide, migration guide, vignettes updates

2016-12-08 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733771#comment-15733771 ] Miao Wang commented on SPARK-18332: --- Update spark.logit is part of the QA work. > SparkR 2.1 QA:

[jira] [Created] (SPARK-18797) Update spark.logit in sparkr-vignettes

2016-12-08 Thread Miao Wang (JIRA)
Miao Wang created SPARK-18797: - Summary: Update spark.logit in sparkr-vignettes Key: SPARK-18797 URL: https://issues.apache.org/jira/browse/SPARK-18797 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-18332) SparkR 2.1 QA: Programming guide, migration guide, vignettes updates

2016-12-08 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15732975#comment-15732975 ] Miao Wang commented on SPARK-18332: --- [~felixcheung] Let me add spark.logit by today. Then, we can

[jira] [Commented] (SPARK-18349) Update R API documentation on ml model summary

2016-12-05 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722932#comment-15722932 ] Miao Wang commented on SPARK-18349: --- I went through the "summary" methods in mllib.R and have the

[jira] [Commented] (SPARK-18349) Update R API documentation on ml model summary

2016-12-04 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15721360#comment-15721360 ] Miao Wang commented on SPARK-18349: --- Will submit PR by tomorrow. Thanks! > Update R API documentation

[jira] [Commented] (SPARK-18349) Update R API documentation on ml model summary

2016-12-02 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15716699#comment-15716699 ] Miao Wang commented on SPARK-18349: --- [~felixcheung] I can help with this task. I am helping Yanbo on

[jira] [Commented] (SPARK-18476) SparkR Logistic Regression should should support output original label.

2016-12-01 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713515#comment-15713515 ] Miao Wang commented on SPARK-18476: --- spark.logit predict should output original label instead of a

[jira] [Commented] (SPARK-18131) Support returning Vector/Dense Vector from backend

2016-12-01 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713430#comment-15713430 ] Miao Wang commented on SPARK-18131: --- I can try to follow this discussion for an initial PR. > Support

[jira] [Created] (SPARK-18633) Add multiclass logistic regression summary python example and document

2016-11-29 Thread Miao Wang (JIRA)
Miao Wang created SPARK-18633: - Summary: Add multiclass logistic regression summary python example and document Key: SPARK-18633 URL: https://issues.apache.org/jira/browse/SPARK-18633 Project: Spark

[jira] [Commented] (SPARK-18332) SparkR 2.1 QA: Programming guide, migration guide, vignettes updates

2016-11-28 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703455#comment-15703455 ] Miao Wang commented on SPARK-18332: --- For some reason, I didn't receive the `CC` notification for this

[jira] [Commented] (SPARK-18558) spark-csv: infer data type for mixed integer/null columns causes exception

2016-11-28 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703424#comment-15703424 ] Miao Wang commented on SPARK-18558: --- scala> val df = spark.read.option("header",

[jira] [Created] (SPARK-18476) SparkR Logistic Regression should should support output original label.

2016-11-16 Thread Miao Wang (JIRA)
Miao Wang created SPARK-18476: - Summary: SparkR Logistic Regression should should support output original label. Key: SPARK-18476 URL: https://issues.apache.org/jira/browse/SPARK-18476 Project: Spark

[jira] [Commented] (SPARK-18266) Update R vignettes and programming guide for 2.1.0 release

2016-11-04 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637204#comment-15637204 ] Miao Wang commented on SPARK-18266: --- [~felixcheung] Is this an umbrella JIRA? > Update R vignettes and

[jira] [Commented] (SPARK-15784) Add Power Iteration Clustering to spark.ml

2016-11-04 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637189#comment-15637189 ] Miao Wang commented on SPARK-15784: --- I created a new PR to implement PIC as a Transformer. > Add Power

[jira] [Comment Edited] (SPARK-15784) Add Power Iteration Clustering to spark.ml

2016-11-01 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15627045#comment-15627045 ] Miao Wang edited comment on SPARK-15784 at 11/1/16 11:29 PM: - [~josephkb] I

[jira] [Commented] (SPARK-15784) Add Power Iteration Clustering to spark.ml

2016-11-01 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15627045#comment-15627045 ] Miao Wang commented on SPARK-15784: --- [~josephkb] I am good for the Transformer approach too. I will

[jira] [Commented] (SPARK-15784) Add Power Iteration Clustering to spark.ml

2016-11-01 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15626169#comment-15626169 ] Miao Wang commented on SPARK-15784: --- Just closed the PR. Let us continue the design here and I will

[jira] [Commented] (SPARK-18133) Python ML Pipeline Example has syntax errors

2016-10-27 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15610968#comment-15610968 ] Miao Wang commented on SPARK-18133: --- Use Pyspark: >>> training = spark.createDataFrame([ ...

[jira] [Commented] (SPARK-18133) Python ML Pipeline Example has syntax errors

2016-10-27 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15610961#comment-15610961 ] Miao Wang commented on SPARK-18133: --- Python 2.7.11 |Anaconda 2.4.0 (x86_64)| (default, Dec 6 2015,

[jira] [Created] (SPARK-18131) Support returning Vector/Dense Vector from backend

2016-10-26 Thread Miao Wang (JIRA)
Miao Wang created SPARK-18131: - Summary: Support returning Vector/Dense Vector from backend Key: SPARK-18131 URL: https://issues.apache.org/jira/browse/SPARK-18131 Project: Spark Issue Type: New

[jira] [Updated] (SPARK-18126) getIteratorZipWithIndex accepts negative value as index.

2016-10-26 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miao Wang updated SPARK-18126: -- Summary: getIteratorZipWithIndex accepts negative value as index. (was: zipWithIndex accepts negative

[jira] [Created] (SPARK-18126) zipWithIndex accepts negative value as index.

2016-10-26 Thread Miao Wang (JIRA)
Miao Wang created SPARK-18126: - Summary: zipWithIndex accepts negative value as index. Key: SPARK-18126 URL: https://issues.apache.org/jira/browse/SPARK-18126 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-18011) SparkR serialize "NA" throws exception

2016-10-19 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15589331#comment-15589331 ] Miao Wang commented on SPARK-18011: --- We have detailed discussions on PR

[jira] [Created] (SPARK-18011) SparkR serialize "NA" throws exception

2016-10-19 Thread Miao Wang (JIRA)
Miao Wang created SPARK-18011: - Summary: SparkR serialize "NA" throws exception Key: SPARK-18011 URL: https://issues.apache.org/jira/browse/SPARK-18011 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-17811) SparkR cannot parallelize data.frame with NA or NULL in Date columns

2016-10-11 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566163#comment-15566163 ] Miao Wang commented on SPARK-17811: --- :) I was about to submit a PR and found that you already have a fix. Good to

[jira] [Comment Edited] (SPARK-17811) SparkR cannot parallelize data.frame with NA or NULL in Date columns

2016-10-07 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1996#comment-1996 ] Miao Wang edited comment on SPARK-17811 at 10/7/16 7:10 PM: > df <-

[jira] [Commented] (SPARK-17811) SparkR cannot parallelize data.frame with NA or NULL in Date columns

2016-10-07 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1996#comment-1996 ] Miao Wang commented on SPARK-17811: --- > df <- data.frame(Date = as.POSIXlt(as.Date(c(rep("2016-01-10",

[jira] [Commented] (SPARK-17811) SparkR cannot parallelize data.frame with NA or NULL in Date columns

2016-10-06 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15554187#comment-15554187 ] Miao Wang commented on SPARK-17811: --- I am looking at this issue. Thanks! > SparkR cannot parallelize

[jira] [Commented] (SPARK-17602) PySpark - Performance Optimization Large Size of Broadcast Variable

2016-09-20 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15508065#comment-15508065 ] Miao Wang commented on SPARK-17602: --- Does this change also benefit/impact Windows OS? > PySpark -

[jira] [Commented] (SPARK-17608) Long type has incorrect serialization/deserialization

2016-09-20 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15507257#comment-15507257 ] Miao Wang commented on SPARK-17608: ---

[jira] [Commented] (SPARK-17608) Long type has incorrect serialization/deserialization

2016-09-20 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15507227#comment-15507227 ] Miao Wang commented on SPARK-17608: --- Let me take a look. Thanks! > Long type has incorrect

[jira] [Commented] (SPARK-17498) StringIndexer.setHandleInvalid should have another option 'new'

2016-09-12 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15483383#comment-15483383 ] Miao Wang commented on SPARK-17498: --- Can you give a concrete example? > StringIndexer.setHandleInvalid

[jira] [Commented] (SPARK-17469) mapWithState causes block lock warning

2016-09-09 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15478339#comment-15478339 ] Miao Wang commented on SPARK-17469: --- Can you give command for reproduction? > mapWithState causes

[jira] [Commented] (SPARK-17110) Pyspark with locality ANY throw java.io.StreamCorruptedException

2016-08-29 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15445029#comment-15445029 ] Miao Wang commented on SPARK-17110: --- Can you post a sample configuration? It could be simpler to

[jira] [Commented] (SPARK-17110) Pyspark with locality ANY throw java.io.StreamCorruptedException

2016-08-26 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15440786#comment-15440786 ] Miao Wang commented on SPARK-17110: --- I set up a two-node cluster, one master, one worker, 48 cores. 1G

[jira] [Commented] (SPARK-17156) Add multiclass logistic regression Scala Example

2016-08-24 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436004#comment-15436004 ] Miao Wang commented on SPARK-17156: --- Two quick comments: 1). Add some comments like in the

[jira] [Commented] (SPARK-17157) Add multiclass logistic regression SparkR Wrapper

2016-08-24 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435996#comment-15435996 ] Miao Wang commented on SPARK-17157: --- Start working on it now. > Add multiclass logistic regression

[jira] [Updated] (SPARK-17157) Add multiclass logistic regression SparkR Wrapper

2016-08-19 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miao Wang updated SPARK-17157: -- Component/s: SparkR > Add multiclass logistic regression SparkR Wrapper >
