[ https://issues.apache.org/jira/browse/SPARK-23109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16332698#comment-16332698 ]
Bryan Cutler edited comment on SPARK-23109 at 1/20/18 3:17 AM:
---------------------------------------------------------------

I did the following: generated the HTML docs and checked them for consistency with Scala, did not see any API-breaking changes, checked for missing items (see list below), and checked that default param values match.

No blocking or major issues found.

Items requiring follow-up; I will create (related) JIRAs to fix them:

classification:
  GBTClassifier - missing featureSubsetStrategy, should be moved to TreeEnsembleParams
  GBTClassificationModel - missing numClasses, should inherit from JavaClassificationModel
  for both of the above https://issues.apache.org/jira/browse/SPARK-23161

clustering:
  GaussianMixtureModel - missing gaussians, need to serialize Array[MultivariateGaussian]?
  LDAModel - missing topicsMatrix - can send Matrix through Py4J?

evaluation:
  ClusteringEvaluator - DOC describe silhouette like scaladoc

feature:
  Bucketizer - multiple input/output cols, splitsArray - https://issues.apache.org/jira/browse/SPARK-22797
  ChiSqSelector - DOC selectorType desc missing new types
  QuantileDiscretizer - multiple input/output cols - https://issues.apache.org/jira/browse/SPARK-22796

fpm:
  DOC associationRules should say return "DataFrame"

image:
  missing columnSchema, get*, scala missing toNDArray

regression:
  LinearRegressionSummary - missing r2adj - https://issues.apache.org/jira/browse/SPARK-23162

stat:
  missing Summarizer class - https://issues.apache.org/jira/browse/SPARK-21741

tuning:
  missing subModels, hasSubModels - https://issues.apache.org/jira/browse/SPARK-22005

for the above DOC issues https://issues.apache.org/jira/browse/SPARK-23163


> ML 2.3 QA: API: Python API coverage
> -----------------------------------
>
>                 Key: SPARK-23109
>                 URL: https://issues.apache.org/jira/browse/SPARK-23109
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Documentation, ML, PySpark
>            Reporter: Joseph K. Bradley
>            Priority: Blocker
>
> For new public APIs added to MLlib ({{spark.ml}} only), we need to check the
> generated HTML doc and compare the Scala & Python versions.
> * *GOAL*: Audit and create JIRAs to fix in the next release.
> * *NON-GOAL*: This JIRA is _not_ for fixing the API parity issues.
> We need to track:
> * Inconsistency: Do class/method/parameter names match?
> * Docs: Is the Python doc missing or just a stub? We want the Python doc to
> be as complete as the Scala doc.
> * API breaking changes: These should be very rare but are occasionally either
> necessary (intentional) or accidental. These must be recorded and added in
> the Migration Guide for this release.
> ** Note: If the API change is for an Alpha/Experimental/DeveloperApi
> component, please note that as well.
> * Missing classes/methods/parameters: We should create to-do JIRAs for
> functionality missing from Python, to be added in the next release cycle.
> *Please use a _separate_ JIRA (linked below as "requires") for this list of
> to-do items.*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org