[jira] [Commented] (SPARK-15100) Audit: ml.feature

2016-06-21 Thread yuhao yang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342475#comment-15342475
 ] 

yuhao yang commented on SPARK-15100:


[~josephkb] Yes, I made a pass on ml.feature, and all the potential issues are 
listed in the table above. 
https://issues.apache.org/jira/browse/SPARK-15997 is not finished yet, 
but I think it can be pushed to 2.1 if time is critical.

> Audit: ml.feature
> -----------------
>
> Key: SPARK-15100
> URL: https://issues.apache.org/jira/browse/SPARK-15100
> Project: Spark
> Issue Type: Documentation
> Components: Documentation, ML
> Reporter: Joseph K. Bradley
> Priority: Blocker
>
> Audit this sub-package for new algorithms which do not have corresponding 
> sections & examples in the user guide.
> See parent issue for more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-15100) Audit: ml.feature

2016-06-16 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334346#comment-15334346
 ] 

Joseph K. Bradley commented on SPARK-15100:
---

[~yuhaoyan]  Is it correct that you finished the audit of ml.feature?  Also, 
can you please make sure that there are subtasks for each of the issues 
identified during the audit & that they are linked here?  Then we can close 
this issue.  Thanks!




[jira] [Commented] (SPARK-15100) Audit: ml.feature

2016-05-28 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15305308#comment-15305308
 ] 

Apache Spark commented on SPARK-15100:
--

User 'hhbyyh' has created a pull request for this issue:
https://github.com/apache/spark/pull/13375




[jira] [Commented] (SPARK-15100) Audit: ml.feature

2016-05-20 Thread Nick Pentreath (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292899#comment-15292899
 ] 

Nick Pentreath commented on SPARK-15100:


I created SPARK-15442 for #1




[jira] [Commented] (SPARK-15100) Audit: ml.feature

2016-05-20 Thread Nick Pentreath (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292895#comment-15292895
 ] 

Nick Pentreath commented on SPARK-15100:


I'm not sure we need to set each and every possible parameter in each example, 
especially things that have sane defaults (like relativeError) or are expert 
params.




[jira] [Commented] (SPARK-15100) Audit: ml.feature

2016-05-18 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289606#comment-15289606
 ] 

Apache Spark commented on SPARK-15100:
--

User 'GayathriMurali' has created a pull request for this issue:
https://github.com/apache/spark/pull/13176




[jira] [Commented] (SPARK-15100) Audit: ml.feature

2016-05-18 Thread Gayathri Murali (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289513#comment-15289513
 ] 

Gayathri Murali commented on SPARK-15100:
-

While making changes to CountVectorizer, HashingTF and QuantileDiscretizer, I 
found the following issues:

1. relativeError is not available in Python for QuantileDiscretizer.
2. The built-in Python examples in feature.py do not include the newly added 
parameters such as binary or relativeError.
3. I am making changes to the example source code to include these parameters 
in model building. I hope this is expected.

For 1 and 2, should I send out a separate PR?




[jira] [Commented] (SPARK-15100) Audit: ml.feature

2016-05-18 Thread Bryan Cutler (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289402#comment-15289402
 ] 

Bryan Cutler commented on SPARK-15100:
--

sure, I hadn't started on those yet




[jira] [Commented] (SPARK-15100) Audit: ml.feature

2016-05-18 Thread Gayathri Murali (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289394#comment-15289394
 ] 

Gayathri Murali commented on SPARK-15100:
-

[~bryanc] I have the PR ready for CountVectorizer, HashingTF and 
QuantileDiscretizer. Do you mind if I send it? Can you please help review it?




[jira] [Commented] (SPARK-15100) Audit: ml.feature

2016-05-18 Thread Bryan Cutler (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289350#comment-15289350
 ] 

Bryan Cutler commented on SPARK-15100:
--

I did a quick pass through Scala and Python APIs, just found some docstring 
discrepancies which I put in PR 
[#13159|https://github.com/apache/spark/pull/13159]




[jira] [Commented] (SPARK-15100) Audit: ml.feature

2016-05-16 Thread Bryan Cutler (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15285366#comment-15285366
 ] 

Bryan Cutler commented on SPARK-15100:
--

I can do a PR to update CountVectorizer and HashingTF




[jira] [Commented] (SPARK-15100) Audit: ml.feature

2016-05-10 Thread yuhao yang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277767#comment-15277767
 ] 

yuhao yang commented on SPARK-15100:


Improvements or new features in spark.ml.feature:

||feature||improvements||user guide updated||
|Binarizer|add support for vector input type|no|
|CountVectorizer|add binary parameter|no|
|HashingTF|add binary parameter|no|
|MaxAbsScaler|new feature|yes|
|QuantileDiscretizer|add relative error parameter|no|
|StopWordsRemover|add locale support|no|

I'll send a PR to update the user guide for Binarizer and StopWordsRemover.
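
For readers unfamiliar with the "binary parameter" listed for CountVectorizer 
and HashingTF, here is a minimal plain-Python sketch of the idea: with the 
toggle on, every term that occurs at all is recorded as 1.0 instead of its 
count. This is not Spark code. The function name term_counts and its 
signature are hypothetical and chosen only for illustration.

```python
from collections import Counter


def term_counts(tokens, binary=False):
    """Count how often each token occurs in one document.

    Hypothetical helper, not part of Spark. With binary=True, every token
    that occurs at all gets the value 1.0, so the result records presence
    rather than frequency.
    """
    counts = Counter(tokens)
    if binary:
        return {term: 1.0 for term in counts}
    return {term: float(n) for term, n in counts.items()}


doc = ["a", "b", "a", "c", "a"]
print(term_counts(doc))               # {'a': 3.0, 'b': 1.0, 'c': 1.0}
print(term_counts(doc, binary=True))  # {'a': 1.0, 'b': 1.0, 'c': 1.0}
```

Presence-only values like these are typically what a model wants when it 
treats features as binary events rather than as counts.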
