[ https://issues.apache.org/jira/browse/SPARK-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14554629#comment-14554629 ]

Xiangrui Meng edited comment on SPARK-7535 at 5/21/15 4:56 PM:
---------------------------------------------------------------

Some notes:

1. Estimator/Transformer don’t need to extend Params since PipelineStage 
already does.
2. @varargs to setDefault (SPARK-7498)
3. Move Evaluator to ml.evaluation.
4. Mention larger metrics are better.
5. PipelineModel doc. “compiled” -> “fitted”
6. Remove Params.validateParams(paramMap)?
7. UnresolvedAttribute (Java compatibility?)
8. Missing RegressionEvaluator
9. ml.feature missing package doc
10. param and getParam should be final
11. Hide PolynomialExpansion.expand
12. Update RegexTokenizer default setting.
13. Mention `RegexTokenizer` in `Tokenizer`.
14. Hide VectorAssembler.
15. Word2Vec.minCount -> @param
16. ParamValidators -> DeveloperApi
17. Params -> @DeveloperApi
18. ALS -> use DataFrames to store user/item factors? Then we can hide 
ALS.Rating.
19. ALSModel -> remove training parameters?
20. Hide MetadataUtils/SchemaUtils.
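
Items 1 and 10 can be illustrated with a minimal, self-contained sketch. This is not the Spark source: the classes below are simplified stand-ins that only borrow the names Params, PipelineStage and Estimator from the notes, and `ToyEstimator`, the string-based `fit`, and a `getParam` that returns a stored value are assumptions made purely for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PipelineSketch {

    /** Simplified stand-in for Params: owns parameter storage and lookup. */
    public abstract static class Params {
        private final Map<String, Object> values = new LinkedHashMap<>();

        // Item 10: lookup is final, so subclasses cannot change how params resolve.
        public final Object getParam(String name) {
            if (!values.containsKey(name)) {
                throw new IllegalArgumentException("Unknown param: " + name);
            }
            return values.get(name);
        }

        protected final void setDefault(String name, Object value) {
            values.put(name, value);
        }
    }

    /** Stand-in for PipelineStage: the only class that extends Params directly. */
    public abstract static class PipelineStage extends Params {
    }

    // Item 1: Estimator is already a Params through PipelineStage,
    // so it does not need to declare that relationship again.
    public abstract static class Estimator extends PipelineStage {
        public abstract String fit(String dataset);
    }

    /** Hypothetical estimator used only to exercise the hierarchy. */
    public static final class ToyEstimator extends Estimator {
        public ToyEstimator() {
            setDefault("maxIter", 10);
        }

        @Override
        public String fit(String dataset) {
            return "fitted(" + dataset + ", maxIter=" + getParam("maxIter") + ")";
        }
    }

    public static void main(String[] args) {
        Estimator est = new ToyEstimator();
        Params asParams = est; // compiles: Estimator inherits Params via PipelineStage
        System.out.println(est.fit("train")); // prints "fitted(train, maxIter=10)"
        System.out.println(asParams.getParam("maxIter")); // prints "10"
    }
}
```

Because `getParam` is final in the sketch, an attempt to override it in `ToyEstimator` is rejected by the compiler, which is the kind of guarantee item 10 asks for.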


was (Author: mengxr):
Some notes:

1. Estimator/Transformer don’t need to extend Params since PipelineStage 
already does.
2. @varargs to fit / set / setDefault
3. Move Evaluator to ml.evaluation.
4. Mention larger metrics are better.
5. PipelineModel doc. “compiled” -> “fitted”
6. Remove Params.validateParams(paramMap)?
7. UnresolvedAttribute (Java compatibility?)
8. Missing RegressionEvaluator
9. ml.feature missing package doc
10. param and getParam should be final
11. Hide PolynomialExpansion.expand
12. Update RegexTokenizer default setting.
13. Mention `RegexTokenizer` in `Tokenizer`.
14. Hide VectorAssembler.
15. Word2Vec.minCount -> @param
16. ParamValidators -> @DeveloperApi
17. Params -> @DeveloperApi
18. ALS -> use DataFrames to store user/item factors? Then we can hide 
ALS.Rating.
19. ALSModel -> remove training parameters?
20. Hide MetadataUtils/SchemaUtils.

> Audit Pipeline APIs for 1.4
> ---------------------------
>
>                 Key: SPARK-7535
>                 URL: https://issues.apache.org/jira/browse/SPARK-7535
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML, PySpark
>            Reporter: Joseph K. Bradley
>            Assignee: Xiangrui Meng
>
> This is an umbrella for auditing the Pipeline (spark.ml) APIs.  Items to 
> check:
> * Public/protected/private access
> * Consistency across spark.ml
> * Classes, methods, and parameters in spark.mllib but missing in spark.ml
> ** We should create JIRAs for each of these (under an umbrella) as to-do 
> items for future releases.
> For each algorithm or API component, create a subtask under this umbrella.  
> Some major new items:
> * new feature transformers
> * tree models
> * elastic-net
> * ML attributes
> * developer APIs (Predictor, Classifier, Regressor)


