[jira] [Commented] (SPARK-22974) CountVectorModel does not attach attributes to output column

2019-05-03 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832691#comment-16832691 ] yuhao yang commented on SPARK-22974: On a business trip from April 29th to May 3rd. Please expect

[jira] [Commented] (SPARK-20082) Incremental update of LDA model, by adding initialModel as start point

2019-03-13 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16791938#comment-16791938 ] yuhao yang commented on SPARK-20082: Yuhao is taking family bonding leave from March 7th to Apr 19th

[jira] [Updated] (SPARK-25011) Add PrefixSpan to __all__ in fpm.py

2018-08-03 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-25011: --- Summary: Add PrefixSpan to __all__ in fpm.py (was: Add PrefixSpan to __all__) > Add PrefixSpan to

[jira] [Created] (SPARK-25011) Add PrefixSpan to __all__

2018-08-02 Thread yuhao yang (JIRA)
yuhao yang created SPARK-25011: -- Summary: Add PrefixSpan to __all__ Key: SPARK-25011 URL: https://issues.apache.org/jira/browse/SPARK-25011 Project: Spark Issue Type: Bug Components:

[jira] [Commented] (SPARK-23742) Filter out redundant AssociationRules

2018-08-01 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566326#comment-16566326 ] yuhao yang commented on SPARK-23742: [~maropu] Can you be more specific about the suggestion? E.g.

[jira] [Commented] (SPARK-23742) Filter out redundant AssociationRules

2018-08-01 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564858#comment-16564858 ] yuhao yang commented on SPARK-23742: The redundant rule may have different confidence and support.

[jira] [Commented] (SPARK-15064) Locale support in StopWordsRemover

2018-06-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16502929#comment-16502929 ] yuhao yang commented on SPARK-15064: Yuhao will be OOF from May 29th to June 6th (annual leave and

[jira] [Commented] (SPARK-22943) OneHotEncoder supports manual specification of categorySizes

2018-01-16 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16328310#comment-16328310 ] yuhao yang commented on SPARK-22943: Thanks for the reply, yet I cannot see how a user can specify the

[jira] [Commented] (SPARK-22943) OneHotEncoder supports manual specification of categorySizes

2018-01-05 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16314412#comment-16314412 ] yuhao yang commented on SPARK-22943: Feel free to work on this, but I would suggest getting the green light

[jira] [Created] (SPARK-22943) OneHotEncoder supports manual specification of categorySizes

2018-01-02 Thread yuhao yang (JIRA)
yuhao yang created SPARK-22943: -- Summary: OneHotEncoder supports manual specification of categorySizes Key: SPARK-22943 URL: https://issues.apache.org/jira/browse/SPARK-22943 Project: Spark

[jira] [Commented] (SPARK-19053) Supporting multiple evaluation metrics in DataFrame-based API: discussion

2017-12-19 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297887#comment-16297887 ] yuhao yang commented on SPARK-19053: Plan for further development: 1. Initial API and function

[jira] [Commented] (SPARK-8418) Add single- and multi-value support to ML Transformers

2017-12-02 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16275723#comment-16275723 ] yuhao yang commented on SPARK-8418: --- second Nick's comments. > Add single- and multi-value support to

[jira] [Commented] (SPARK-22331) Make MLlib string params case-insensitive

2017-11-28 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16269169#comment-16269169 ] yuhao yang commented on SPARK-22331: Thanks for the interest [~smurakozi]. I tried to support this

[jira] [Commented] (SPARK-22427) StackOverFlowError when using FPGrowth

2017-11-20 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259587#comment-16259587 ] yuhao yang commented on SPARK-22427: I tried with larger scale data but did not repro the issue.

[jira] [Commented] (SPARK-22427) StackOverFlowError when using FPGrowth

2017-11-12 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16249017#comment-16249017 ] yuhao yang commented on SPARK-22427: Hi [~lyt] does increasing stack size resolve your issue? If not

[jira] [Created] (SPARK-22502) OnlineLDAOptimizer variationalTopicInference might be able to handle empty documents

2017-11-12 Thread yuhao yang (JIRA)
yuhao yang created SPARK-22502: -- Summary: OnlineLDAOptimizer variationalTopicInference might be able to handle empty documents Key: SPARK-22502 URL: https://issues.apache.org/jira/browse/SPARK-22502

[jira] [Commented] (SPARK-18755) Add Randomized Grid Search to Spark ML

2017-11-10 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247870#comment-16247870 ] yuhao yang commented on SPARK-18755: Thanks for all the interest. For anyone who wants to

[jira] [Commented] (SPARK-22427) StackOverFlowError when using FPGrowth

2017-11-03 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237174#comment-16237174 ] yuhao yang commented on SPARK-22427: Could you please try to increase the stack size, E.g. with

[jira] [Commented] (SPARK-13030) Change OneHotEncoder to Estimator

2017-10-31 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227094#comment-16227094 ] yuhao yang commented on SPARK-13030: I see. Thanks for the response [~mlnick]. The Estimator is

[jira] [Commented] (SPARK-13030) Change OneHotEncoder to Estimator

2017-10-31 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226307#comment-16226307 ] yuhao yang commented on SPARK-13030: Sorry for jumping in so late. I can see there's been a lot of

[jira] [Created] (SPARK-22381) Add StringParam that supports valid options

2017-10-28 Thread yuhao yang (JIRA)
yuhao yang created SPARK-22381: -- Summary: Add StringParam that supports valid options Key: SPARK-22381 URL: https://issues.apache.org/jira/browse/SPARK-22381 Project: Spark Issue Type: New

[jira] [Commented] (SPARK-18755) Add Randomized Grid Search to Spark ML

2017-10-27 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221800#comment-16221800 ] yuhao yang commented on SPARK-18755: Thanks for sending the update here. Feel free to send a PR as

[jira] [Commented] (SPARK-22331) Make MLlib string params case-insensitive

2017-10-23 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215489#comment-16215489 ] yuhao yang commented on SPARK-22331: Yes, I don't see the change will break any existing code. >

[jira] [Commented] (SPARK-22331) Strength consistency for supporting string params: case-insensitive or not

2017-10-22 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214667#comment-16214667 ] yuhao yang commented on SPARK-22331: cc [~WeichenXu123] > Strength consistency for supporting string

[jira] [Created] (SPARK-22331) Strength consistency for supporting string params: case-insensitive or not

2017-10-22 Thread yuhao yang (JIRA)
yuhao yang created SPARK-22331: -- Summary: Strength consistency for supporting string params: case-insensitive or not Key: SPARK-22331 URL: https://issues.apache.org/jira/browse/SPARK-22331 Project:

[jira] [Commented] (SPARK-22289) Cannot save LogisticRegressionClassificationModel with bounds on coefficients

2017-10-17 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16208614#comment-16208614 ] yuhao yang commented on SPARK-22289: Thanks for the reply. I'll start composing a PR. > Cannot save

[jira] [Commented] (SPARK-22289) Cannot save LogisticRegressionClassificationModel with bounds on coefficients

2017-10-17 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207115#comment-16207115 ] yuhao yang commented on SPARK-22289: cc [~yanboliang] [~dbtsai] > Cannot save

[jira] [Comment Edited] (SPARK-22289) Cannot save LogisticRegressionClassificationModel with bounds on coefficients

2017-10-17 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207063#comment-16207063 ] yuhao yang edited comment on SPARK-22289 at 10/17/17 6:43 AM: -- Thanks for

[jira] [Comment Edited] (SPARK-22289) Cannot save LogisticRegressionClassificationModel with bounds on coefficients

2017-10-17 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207063#comment-16207063 ] yuhao yang edited comment on SPARK-22289 at 10/17/17 6:28 AM: -- Thanks for

[jira] [Commented] (SPARK-22289) Cannot save LogisticRegressionClassificationModel with bounds on coefficients

2017-10-17 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207063#comment-16207063 ] yuhao yang commented on SPARK-22289: Thanks for reporting the issue. Should be a straightforward

[jira] [Comment Edited] (SPARK-22195) Add cosine similarity to org.apache.spark.ml.linalg.Vectors

2017-10-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193844#comment-16193844 ] yuhao yang edited comment on SPARK-22195 at 10/6/17 7:33 AM: - Thanks for the

[jira] [Commented] (SPARK-22195) Add cosine similarity to org.apache.spark.ml.linalg.Vectors

2017-10-05 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193844#comment-16193844 ] yuhao yang commented on SPARK-22195: Thanks for the feedback. I don't see the existing

[jira] [Created] (SPARK-22210) Online LDA variationalTopicInference should use random seed to have stable behavior

2017-10-05 Thread yuhao yang (JIRA)
yuhao yang created SPARK-22210: -- Summary: Online LDA variationalTopicInference should use random seed to have stable behavior Key: SPARK-22210 URL: https://issues.apache.org/jira/browse/SPARK-22210

[jira] [Commented] (SPARK-3181) Add Robust Regression Algorithm with Huber Estimator

2017-10-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192217#comment-16192217 ] yuhao yang commented on SPARK-3181: --- Regarding whether to separate Huber loss as an independent

[jira] [Commented] (SPARK-22195) Add cosine similarity to org.apache.spark.ml.linalg.Vectors

2017-10-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16190884#comment-16190884 ] yuhao yang commented on SPARK-22195: Exactly, the implementation is straightforward, but I guess not

[jira] [Created] (SPARK-22195) Add cosine similarity to org.apache.spark.ml.linalg.Vectors

2017-10-03 Thread yuhao yang (JIRA)
yuhao yang created SPARK-22195: -- Summary: Add cosine similarity to org.apache.spark.ml.linalg.Vectors Key: SPARK-22195 URL: https://issues.apache.org/jira/browse/SPARK-22195 Project: Spark

[jira] [Commented] (SPARK-21866) SPIP: Image support in Spark

2017-10-03 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16190239#comment-16190239 ] yuhao yang commented on SPARK-21866: My two cents, 1. In most scenarios, deep learning applications

[jira] [Commented] (SPARK-21535) Reduce memory requirement for CrossValidator and TrainValidationSplit

2017-08-23 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139023#comment-16139023 ] yuhao yang commented on SPARK-21535: Thanks for the comments. > Reduce memory requirement for

[jira] [Resolved] (SPARK-21535) Reduce memory requirement for CrossValidator and TrainValidationSplit

2017-08-10 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang resolved SPARK-21535. Resolution: Not A Problem The new implementation will load the evaluation dataset when training

[jira] [Commented] (SPARK-21535) Reduce memory requirement for CrossValidator and TrainValidationSplit

2017-07-27 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16103547#comment-16103547 ] yuhao yang commented on SPARK-21535: It's not in my opinion.

[jira] [Comment Edited] (SPARK-21535) Reduce memory requirement for CrossValidator and TrainValidationSplit

2017-07-26 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100860#comment-16100860 ] yuhao yang edited comment on SPARK-21535 at 7/26/17 6:30 PM: -

[jira] [Commented] (SPARK-21535) Reduce memory requirement for CrossValidator and TrainValidationSplit

2017-07-26 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16101870#comment-16101870 ] yuhao yang commented on SPARK-21535: The basic idea is that we should release the driver memory as

[jira] [Commented] (SPARK-21535) Reduce memory requirement for CrossValidator and TrainValidationSplit

2017-07-25 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100860#comment-16100860 ] yuhao yang commented on SPARK-21535: https://github.com/apache/spark/pulls > Reduce memory

[jira] [Updated] (SPARK-21535) Reduce memory requirement for CrossValidator and TrainValidationSplit

2017-07-25 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-21535: --- Description: CrossValidator and TrainValidationSplit both use {code}models =

[jira] [Created] (SPARK-21535) Reduce memory requirement for CrossValidator and TrainValidationSplit

2017-07-25 Thread yuhao yang (JIRA)
yuhao yang created SPARK-21535: -- Summary: Reduce memory requirement for CrossValidator and TrainValidationSplit Key: SPARK-21535 URL: https://issues.apache.org/jira/browse/SPARK-21535 Project: Spark

[jira] [Commented] (SPARK-21087) CrossValidator, TrainValidationSplit should preserve all models after fitting: Scala

2017-07-25 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100447#comment-16100447 ] yuhao yang commented on SPARK-21087: Withdrawing my PR; anyone interested, please go ahead and

[jira] [Commented] (SPARK-21524) ValidatorParamsSuiteHelpers generates wrong temp files

2017-07-24 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16099313#comment-16099313 ] yuhao yang commented on SPARK-21524: https://github.com/apache/spark/pull/18728 >

[jira] [Created] (SPARK-21524) ValidatorParamsSuiteHelpers generates wrong temp files

2017-07-24 Thread yuhao yang (JIRA)
yuhao yang created SPARK-21524: -- Summary: ValidatorParamsSuiteHelpers generates wrong temp files Key: SPARK-21524 URL: https://issues.apache.org/jira/browse/SPARK-21524 Project: Spark Issue

[jira] [Commented] (SPARK-14239) Add load for LDAModel that supports both local and distributedModel

2017-07-24 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16098948#comment-16098948 ] yuhao yang commented on SPARK-14239: Close overlooked stale jira. > Add load for LDAModel that

[jira] [Resolved] (SPARK-14239) Add load for LDAModel that supports both local and distributedModel

2017-07-24 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang resolved SPARK-14239. Resolution: Won't Do > Add load for LDAModel that supports both local and distributedModel >

[jira] [Commented] (SPARK-12875) Add Weight of Evidence and Information value to Spark.ml as a feature transformer

2017-07-24 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16098946#comment-16098946 ] yuhao yang commented on SPARK-12875: Close stale jira. > Add Weight of Evidence and Information

[jira] [Resolved] (SPARK-12875) Add Weight of Evidence and Information value to Spark.ml as a feature transformer

2017-07-24 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang resolved SPARK-12875. Resolution: Won't Do > Add Weight of Evidence and Information value to Spark.ml as a feature >

[jira] [Comment Edited] (SPARK-14760) Feature transformers should always invoke transformSchema in transform or fit

2017-07-24 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16098940#comment-16098940 ] yuhao yang edited comment on SPARK-14760 at 7/24/17 6:23 PM: - Close stale

[jira] [Commented] (SPARK-14760) Feature transformers should always invoke transformSchema in transform or fit

2017-07-24 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16098940#comment-16098940 ] yuhao yang commented on SPARK-14760: Close it since it's been overlooked for some time. Thanks for

[jira] [Resolved] (SPARK-13223) Add stratified sampling to ML feature engineering

2017-07-24 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang resolved SPARK-13223. Resolution: Not A Problem > Add stratified sampling to ML feature engineering >

[jira] [Commented] (SPARK-13223) Add stratified sampling to ML feature engineering

2017-07-24 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16098933#comment-16098933 ] yuhao yang commented on SPARK-13223: Close it since it's been overlooked for some time and can be

[jira] [Commented] (SPARK-21086) CrossValidator, TrainValidationSplit should preserve all models after fitting

2017-07-21 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16097062#comment-16097062 ] yuhao yang commented on SPARK-21086: sure, indices sounds fine. For the driver memory, especially

[jira] [Updated] (SPARK-18724) Add TuningSummary for TrainValidationSplit and CountVectorizer

2017-07-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-18724: --- Summary: Add TuningSummary for TrainValidationSplit and CountVectorizer (was: Add TuningSummary for

[jira] [Comment Edited] (SPARK-11069) Add RegexTokenizer option to convert to lowercase

2017-07-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073987#comment-16073987 ] yuhao yang edited comment on SPARK-11069 at 7/4/17 6:32 PM:

[jira] [Comment Edited] (SPARK-11069) Add RegexTokenizer option to convert to lowercase

2017-07-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073987#comment-16073987 ] yuhao yang edited comment on SPARK-11069 at 7/4/17 6:31 PM:

[jira] [Commented] (SPARK-11069) Add RegexTokenizer option to convert to lowercase

2017-07-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073987#comment-16073987 ] yuhao yang commented on SPARK-11069: [~levente.torok.ge] use val regexTokenizer = new

[jira] [Commented] (SPARK-20082) Incremental update of LDA model, by adding initialModel as start point

2017-06-30 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070883#comment-16070883 ] yuhao yang commented on SPARK-20082: I'm OK with only supporting initialModel for Online LDA now. For

[jira] [Commented] (SPARK-19053) Supporting multiple evaluation metrics in DataFrame-based API: discussion

2017-06-30 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070849#comment-16070849 ] yuhao yang commented on SPARK-19053: Not sure if this is still wanted. cc [~josephkb] And I'd like to

[jira] [Commented] (SPARK-18441) Add Smote in spark mlib and ml

2017-06-28 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16067494#comment-16067494 ] yuhao yang commented on SPARK-18441: Move the Smote code to

[jira] [Commented] (SPARK-21152) Use level 3 BLAS operations in LogisticAggregator

2017-06-27 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065694#comment-16065694 ] yuhao yang commented on SPARK-21152: This is something that we should investigate anyway. By GEMM,

[jira] [Created] (SPARK-21108) convert LinearSVC to aggregator framework

2017-06-15 Thread yuhao yang (JIRA)
yuhao yang created SPARK-21108: -- Summary: convert LinearSVC to aggregator framework Key: SPARK-21108 URL: https://issues.apache.org/jira/browse/SPARK-21108 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-21087) CrossValidator, TrainValidationSplit should preserve all models after fitting: Scala

2017-06-13 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048723#comment-16048723 ] yuhao yang commented on SPARK-21087: I'd like to work on this if my

[jira] [Comment Edited] (SPARK-21086) CrossValidator, TrainValidationSplit should preserve all models after fitting

2017-06-13 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048647#comment-16048647 ] yuhao yang edited comment on SPARK-21086 at 6/14/17 5:22 AM: - Sounds good.

[jira] [Comment Edited] (SPARK-21086) CrossValidator, TrainValidationSplit should preserve all models after fitting

2017-06-13 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048647#comment-16048647 ] yuhao yang edited comment on SPARK-21086 at 6/14/17 5:12 AM: - Sounds good.

[jira] [Commented] (SPARK-20988) Convert logistic regression to new aggregator framework

2017-06-13 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048698#comment-16048698 ] yuhao yang commented on SPARK-20988: Eh.. I was trying to add the squared_hinge loss to LinearSVC and

[jira] [Resolved] (SPARK-20348) Support squared hinge loss (L2 loss) for LinearSVC

2017-06-13 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang resolved SPARK-20348. Resolution: Duplicate Combine it with SPARK-20602 and resolve this as duplicate. > Support

[jira] [Commented] (SPARK-20602) Adding LBFGS optimizer and Squared_hinge loss for LinearSVC

2017-06-13 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048663#comment-16048663 ] yuhao yang commented on SPARK-20602: Combining this with SPARK-20348. Support squared hinge loss (L2

[jira] [Updated] (SPARK-20602) Adding LBFGS optimizer and Squared_hinge loss for LinearSVC

2017-06-13 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-20602: --- Summary: Adding LBFGS optimizer and Squared_hinge loss for LinearSVC (was: Adding LBFGS as

[jira] [Commented] (SPARK-21086) CrossValidator, TrainValidationSplit should preserve all models after fitting

2017-06-13 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048647#comment-16048647 ] yuhao yang commented on SPARK-21086: Sounds good. About the default path for saving different models,

[jira] [Updated] (SPARK-20602) Adding LBFGS as optimizer for LinearSVC

2017-06-13 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-20602: --- Description: Currently LinearSVC in Spark only supports OWLQN as the optimizer ( check

[jira] [Commented] (SPARK-20082) Incremental update of LDA model, by adding initialModel as start point

2017-05-24 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022379#comment-16022379 ] yuhao yang commented on SPARK-20082: refer to https://issues.apache.org/jira/browse/SPARK-20767 for

[jira] [Commented] (SPARK-20767) The training continuation for saved LDA model

2017-05-24 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022375#comment-16022375 ] yuhao yang commented on SPARK-20767: Note there's already an issue about setInitialModel in

[jira] [Commented] (SPARK-20864) I tried to run spark mllib PIC algorithm, but got error

2017-05-23 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022345#comment-16022345 ] yuhao yang commented on SPARK-20864: [~yuanjie] Could you please provide more code to help the

[jira] [Commented] (SPARK-20768) PySpark FPGrowth does not expose numPartitions (expert) param

2017-05-18 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016116#comment-16016116 ] yuhao yang commented on SPARK-20768: Thanks for the ping. [~mlnick] We should just treat it as an

[jira] [Commented] (SPARK-20797) mllib lda's LocalLDAModel's save: out of memory.

2017-05-18 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016061#comment-16016061 ] yuhao yang commented on SPARK-20797: [~d0evi1] Thanks for reporting the issue and proposal for the

[jira] [Created] (SPARK-20670) Simplify FPGrowth transform

2017-05-08 Thread yuhao yang (JIRA)
yuhao yang created SPARK-20670: -- Summary: Simplify FPGrowth transform Key: SPARK-20670 URL: https://issues.apache.org/jira/browse/SPARK-20670 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-20602) Adding LBFGS as optimizer for LinearSVC

2017-05-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15997314#comment-15997314 ] yuhao yang commented on SPARK-20602: cc [~josephkb] > Adding LBFGS as optimizer for LinearSVC >

[jira] [Created] (SPARK-20602) Adding LBFGS as optimizer for LinearSVC

2017-05-04 Thread yuhao yang (JIRA)
yuhao yang created SPARK-20602: -- Summary: Adding LBFGS as optimizer for LinearSVC Key: SPARK-20602 URL: https://issues.apache.org/jira/browse/SPARK-20602 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-20526) Load doesn't work in PCAModel

2017-04-28 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989591#comment-15989591 ] yuhao yang commented on SPARK-20526: Can you please provide more context? like which version of Spark

[jira] [Commented] (SPARK-20502) ML, Graph 2.2 QA: API: Experimental, DeveloperApi, final, sealed audit

2017-04-28 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989317#comment-15989317 ] yuhao yang commented on SPARK-20502: Check here https://issues.apache.org/jira/browse/SPARK-18319 for

[jira] [Created] (SPARK-20351) Add trait hasTrainingSummary to replace the duplicate code

2017-04-16 Thread yuhao yang (JIRA)
yuhao yang created SPARK-20351: -- Summary: Add trait hasTrainingSummary to replace the duplicate code Key: SPARK-20351 URL: https://issues.apache.org/jira/browse/SPARK-20351 Project: Spark Issue

[jira] [Created] (SPARK-20348) Support squared hinge loss (L2 loss) for LinearSVC

2017-04-15 Thread yuhao yang (JIRA)
yuhao yang created SPARK-20348: -- Summary: Support squared hinge loss (L2 loss) for LinearSVC Key: SPARK-20348 URL: https://issues.apache.org/jira/browse/SPARK-20348 Project: Spark Issue Type:

[jira] [Commented] (SPARK-7128) Add generic bagging algorithm to spark.ml

2017-04-11 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15965121#comment-15965121 ] yuhao yang commented on SPARK-7128: --- I would vote for adding this now. This is quite helpful in

[jira] [Updated] (SPARK-20271) Add FuncTransformer to simplify custom transformer creation

2017-04-09 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-20271: --- Description: Just to share some code I implemented to help easily create a custom Transformer in

[jira] [Created] (SPARK-20271) Add FuncTransformer to simplify custom transformer creation

2017-04-09 Thread yuhao yang (JIRA)
yuhao yang created SPARK-20271: -- Summary: Add FuncTransformer to simplify custom transformer creation Key: SPARK-20271 URL: https://issues.apache.org/jira/browse/SPARK-20271 Project: Spark

[jira] [Commented] (SPARK-20082) Incremental update of LDA model, by adding initialModel as start point

2017-04-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15959368#comment-15959368 ] yuhao yang commented on SPARK-20082: Sorry I'm occupied by some internal project this week. I'll find

[jira] [Commented] (SPARK-20203) Change default maxPatternLength value to Int.MaxValue in PrefixSpan

2017-04-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955705#comment-15955705 ] yuhao yang commented on SPARK-20203: [~Syrux] Since you got some experiences using the PrefixSpan,

[jira] [Comment Edited] (SPARK-20180) Unlimited max pattern length in Prefix span

2017-04-01 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15952377#comment-15952377 ] yuhao yang edited comment on SPARK-20180 at 4/1/17 8:14 PM: I assume user can

[jira] [Commented] (SPARK-20180) Unlimited max pattern length in Prefix span

2017-04-01 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15952377#comment-15952377 ] yuhao yang commented on SPARK-20180: I assume user can achieve the same effect by setting

[jira] [Comment Edited] (SPARK-20114) spark.ml parity for sequential pattern mining - PrefixSpan

2017-03-27 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15944239#comment-15944239 ] yuhao yang edited comment on SPARK-20114 at 3/27/17 11:42 PM: -- Currently I

[jira] [Commented] (SPARK-20114) spark.ml parity for sequential pattern mining - PrefixSpan

2017-03-27 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15944239#comment-15944239 ] yuhao yang commented on SPARK-20114: Currently I prefer to implement the dummy PrefixSpanModel as the

[jira] [Updated] (SPARK-20114) spark.ml parity for sequential pattern mining - PrefixSpan

2017-03-27 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-20114: --- Description: Creating this jira to track the feature parity for PrefixSpan and sequential pattern

[jira] [Updated] (SPARK-20114) spark.ml parity for sequential pattern mining - PrefixSpan

2017-03-27 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-20114: --- Description: Creating this jira to track the feature parity for PrefixSpan and sequential pattern

[jira] [Created] (SPARK-20114) spark.ml parity for sequential pattern mining - PrefixSpan

2017-03-27 Thread yuhao yang (JIRA)
yuhao yang created SPARK-20114: -- Summary: spark.ml parity for sequential pattern mining - PrefixSpan Key: SPARK-20114 URL: https://issues.apache.org/jira/browse/SPARK-20114 Project: Spark Issue

[jira] [Commented] (SPARK-20083) Change matrix toArray to not create a new array when matrix is already column major

2017-03-27 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15943857#comment-15943857 ] yuhao yang commented on SPARK-20083: So the result array will allow users to manipulate the matrix
