[jira] [Commented] (SPARK-20502) ML, Graph 2.2 QA: API: Experimental, DeveloperApi, final, sealed audit

2017-05-16 Thread Nick Pentreath (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011976#comment-16011976
 ] 

Nick Pentreath commented on SPARK-20502:


Sounds good to me.

> ML, Graph 2.2 QA: API: Experimental, DeveloperApi, final, sealed audit
> --
>
> Key: SPARK-20502
> URL: https://issues.apache.org/jira/browse/SPARK-20502
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, GraphX, ML, MLlib
>Reporter: Joseph K. Bradley
>Assignee: yuhao yang
>Priority: Blocker
> Fix For: 2.2.0
>
>
> We should make a pass through the items marked as Experimental or 
> DeveloperApi and see if any are stable enough to be unmarked.
> We should also check for items marked final or sealed to see if they are 
> stable enough to be opened up as APIs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-20502) ML, Graph 2.2 QA: API: Experimental, DeveloperApi, final, sealed audit

2017-05-15 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011768#comment-16011768
 ] 

Joseph K. Bradley commented on SPARK-20502:
---

Thanks for doing this audit!  Note that the list includes some items which are 
actually package private.  I largely agree with you about not making changes 
right now.  We could arguably make some more things non-Experimental, but 
really, I'd prefer to leave them as-is for this release.  Some of the main 
items:
* Summaries: I wonder if we should keep these Experimental b/c of future 
extensions, for which we'd want to make these final.
* Keep Evaluators Experimental b/c of ongoing discussions about supporting 
multiple metrics, etc.
* Keep RFormula Experimental b/c of existing differences from R which may 
require behavior changes to fix

I'll mark this as done, but others can comment if they disagree.  Thanks 
[~yuhaoyan]!




[jira] [Commented] (SPARK-20502) ML, Graph 2.2 QA: API: Experimental, DeveloperApi, final, sealed audit

2017-04-28 Thread yuhao yang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15989317#comment-15989317
 ] 

yuhao yang commented on SPARK-20502:


See https://issues.apache.org/jira/browse/SPARK-18319 for the previous 
discussion. I updated the list to reflect the changes we made in the last 
release. So far I don't think we need to change any of the sealed or 
Experimental APIs, but I have listed some final classes in ml which may be 
ready to be unmarked. 

sealed: 
org.apache.spark.ml.attribute.Attribute
org.apache.spark.ml.attribute.AttributeType
org.apache.spark.ml.classification.LogisticRegressionTrainingSummary
org.apache.spark.ml.classification.LogisticRegressionSummary
org.apache.spark.ml.feature.Term
org.apache.spark.ml.feature.InteractableTerm
org.apache.spark.ml.optim.WeightedLeastSquares.Solver
org.apache.spark.ml.optim.NormalEquationSolver
org.apache.spark.ml.tree.Node
org.apache.spark.ml.tree.Split
org.apache.spark.ml.util.BaseReadWrite
org.apache.spark.ml.linalg.Matrix
org.apache.spark.ml.linalg.Vector
org.apache.spark.mllib.stat.test.StreamingTestMethod
org.apache.spark.mllib.tree.model.TreeEnsembleModel

Experimental:
org.apache.spark.ml.classification.LinearSVC
org.apache.spark.ml.classification.LinearSVCModel
org.apache.spark.ml.classification.BinaryLogisticRegressionTrainingSummary
org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
org.apache.spark.ml.clustering.ClusteringSummary
org.apache.spark.ml.clustering.BisectingKMeansSummary
org.apache.spark.ml.clustering.GaussianMixtureSummary
org.apache.spark.ml.clustering.KMeansSummary
org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
org.apache.spark.ml.evaluation.RegressionEvaluator
org.apache.spark.ml.feature.BucketedRandomProjectionLSH(Model)
org.apache.spark.ml.feature.Imputer(Model)
org.apache.spark.ml.feature.MinHash(Model)
org.apache.spark.ml.feature.RFormula(Model)
org.apache.spark.ml.fpm.FPGrowth(Model)
org.apache.spark.ml.regression.AFTSurvivalRegression(Model)
org.apache.spark.ml.regression.GeneralizedLinearRegression(Model) and summary
org.apache.spark.ml.regression.LinearRegressionTrainingSummary
org.apache.spark.ml.stat.ChiSquareTest

Developer API:
Most developer APIs are basic components of the ML pipeline, such as 
Transformer, Estimator, PipelineStage, Params and Attributes, which I don't 
think need to change.

final class:
org.apache.spark.ml.classification.OneVsRest
org.apache.spark.ml.evaluation.RegressionEvaluator
org.apache.spark.ml.feature.Binarizer
org.apache.spark.ml.feature.Bucketizer
org.apache.spark.ml.feature.ChiSqSelector
org.apache.spark.ml.feature.IDF
org.apache.spark.ml.feature.QuantileDiscretizer
org.apache.spark.ml.feature.VectorSlicer
org.apache.spark.ml.feature.Word2Vec
org.apache.spark.ml.param.ParamMap

Most of the final classes here should be ready to be unmarked. I also checked 
final methods and fields (mostly params), which can stay as they are for now.
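
A pass like this could be partly automated. The following is only a rough 
sketch, assuming Scala sources in which the markers appear as @Experimental or 
@DeveloperApi annotations on their own lines and sealed / final as modifiers on 
the declaration line. The function name audit_markers, the regular expressions 
and the sample snippet are all hypothetical; none of this is existing Spark 
tooling, and it will miss declarations formatted differently.

```python
import re
from collections import defaultdict

# Hypothetical pattern: a class/trait/object declaration with optional modifiers.
DECL = re.compile(
    r"^\s*(?P<mods>(?:(?:private(?:\[\w+\])?|protected|sealed|final|abstract|case)\s+)*)"
    r"(?:class|trait|object)\s+(?P<name>\w+)"
)
# Only the two stability annotations named in the issue are tracked.
ANNOTATION = re.compile(r"^\s*@(Experimental|DeveloperApi)\b")


def audit_markers(source: str) -> dict:
    """Group declared type names by the stability marker attached to them."""
    found = defaultdict(list)
    pending = []  # stability annotations seen since the last declaration
    for line in source.splitlines():
        ann = ANNOTATION.match(line)
        if ann:
            pending.append(ann.group(1))
            continue
        decl = DECL.match(line)
        if not decl:
            # Any other code line (e.g. a def) ends the annotation run;
            # blank lines and other annotations such as @Since do not.
            if line.strip() and not line.lstrip().startswith("@"):
                pending = []
            continue
        mods = decl.group("mods").split()
        name = decl.group("name")
        for marker in pending + [m for m in mods if m in ("sealed", "final")]:
            found[marker].append(name)
        # Package-private types are not public API, so record them separately.
        if any(m.startswith("private") for m in mods):
            found["private"].append(name)
        pending = []
    return dict(found)


# Made-up snippet for illustration; not copied from the Spark code base.
SAMPLE = '''
@Experimental
@Since("2.2.0")
class LinearSVC extends Classifier

sealed trait Vector extends Serializable

final class Binarizer extends Transformer

private[ml] final class Helper
'''

print(audit_markers(SAMPLE))
```

Names that show up under both "final" and "private" are package private and 
would not need to be considered for opening up as public API.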




