[jira] [Commented] (SPARK-15100) Audit: ml.feature
[ https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342475#comment-15342475 ]

yuhao yang commented on SPARK-15100:

[~josephkb] Yes, I made a pass on ml.feature, and all the potential issues are listed in the table above. Right now https://issues.apache.org/jira/browse/SPARK-15997 is not finished, but I think it can be pushed to 2.1 if time is critical.

> Audit: ml.feature
> -----------------
>
> Key: SPARK-15100
> URL: https://issues.apache.org/jira/browse/SPARK-15100
> Project: Spark
> Issue Type: Documentation
> Components: Documentation, ML
> Reporter: Joseph K. Bradley
> Priority: Blocker
>
> Audit this sub-package for new algorithms which do not have corresponding
> sections & examples in the user guide.
> See parent issue for more details.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-15100) Audit: ml.feature
[ https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334346#comment-15334346 ]

Joseph K. Bradley commented on SPARK-15100:

[~yuhaoyan] Is it correct that you finished the audit of ml.feature? Also, can you please make sure that there are subtasks for each of the issues identified during the audit & that they are linked here? Then we can close this issue. Thanks!
[jira] [Commented] (SPARK-15100) Audit: ml.feature
[ https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15305308#comment-15305308 ]

Apache Spark commented on SPARK-15100:

User 'hhbyyh' has created a pull request for this issue: https://github.com/apache/spark/pull/13375
[jira] [Commented] (SPARK-15100) Audit: ml.feature
[ https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292899#comment-15292899 ]

Nick Pentreath commented on SPARK-15100:

I created SPARK-15442 for #1
[jira] [Commented] (SPARK-15100) Audit: ml.feature
[ https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292895#comment-15292895 ]

Nick Pentreath commented on SPARK-15100:

I'm not sure we need to set each and every possible parameter in each example, especially things that have sane defaults (like relativeError) or are expert params.
[jira] [Commented] (SPARK-15100) Audit: ml.feature
[ https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289606#comment-15289606 ]

Apache Spark commented on SPARK-15100:

User 'GayathriMurali' has created a pull request for this issue: https://github.com/apache/spark/pull/13176
[jira] [Commented] (SPARK-15100) Audit: ml.feature
[ https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289513#comment-15289513 ]

Gayathri Murali commented on SPARK-15100:

While making changes to CountVectorizer, HashingTF and QuantileDiscretizer, I found the following issues:
1. relativeError is not available in Python for QuantileDiscretizer.
2. The built-in Python examples in feature.py do not include the newly added parameters such as binary or relativeError.
3. I am making changes to the example source code to include these parameters in model building. I hope this is expected.

For 1 and 2, should I send out a separate PR?
[jira] [Commented] (SPARK-15100) Audit: ml.feature
[ https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289402#comment-15289402 ]

Bryan Cutler commented on SPARK-15100:

Sure, I hadn't started on those yet.
[jira] [Commented] (SPARK-15100) Audit: ml.feature
[ https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289394#comment-15289394 ]

Gayathri Murali commented on SPARK-15100:

[~bryanc] I have the PR ready for CountVectorizer, HashingTF and QuantileDiscretizer. Do you mind if I send it? Can you please help review it?
[jira] [Commented] (SPARK-15100) Audit: ml.feature
[ https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289350#comment-15289350 ]

Bryan Cutler commented on SPARK-15100:

I did a quick pass through the Scala and Python APIs and just found some docstring discrepancies, which I put in PR [#13159|https://github.com/apache/spark/pull/13159].
[jira] [Commented] (SPARK-15100) Audit: ml.feature
[ https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15285366#comment-15285366 ]

Bryan Cutler commented on SPARK-15100:

I can do a PR to update CountVectorizer and HashingTF
[jira] [Commented] (SPARK-15100) Audit: ml.feature
[ https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277767#comment-15277767 ]

yuhao yang commented on SPARK-15100:

Improvements or new features in spark.ml.feature:

||feature||improvements||user guide updated||
|Binarizer|add support for vector input type|no|
|CountVectorizer|add binary parameter|no|
|HashingTF|add binary parameter|no|
|MaxAbsScaler|new feature|yes|
|QuantileDiscretizer|add relative error parameter|no|
|StopWordsRemover|add locale support|no|

I'll send a PR with updates for Binarizer and StopWordsRemover.
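For readers unfamiliar with the binary parameter listed for CountVectorizer and HashingTF in the table: the general idea is that every nonzero term count is clamped to 1, so the output records presence or absence rather than frequency. The following is a minimal plain-Python sketch of that idea only. It is not Spark code, and `term_counts` is a hypothetical helper introduced here for illustration.

```python
from collections import Counter


def term_counts(tokens, binary=False):
    """Count tokens in one document.

    With binary=True, every term that occurs at least once gets the
    value 1, mirroring the idea behind a 'binary' term-frequency flag.
    Hypothetical helper for illustration; not part of Spark.
    """
    counts = Counter(tokens)
    if binary:
        # Clamp every nonzero count to 1 (presence/absence only).
        return {term: 1 for term in counts}
    return dict(counts)


doc = ["a", "b", "a", "c", "a"]
print(term_counts(doc))               # raw term frequencies
print(term_counts(doc, binary=True))  # presence/absence only
```

Binary counts of this kind are typically useful for models that work with discrete presence events rather than integer frequencies, which is why a user guide example may want to mention the flag without necessarily setting it.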