Joseph K. Bradley created SPARK-23110:
-----------------------------------------

             Summary: CLONE - ML 2.2 QA: API: Java compatibility, docs
                 Key: SPARK-23110
                 URL: https://issues.apache.org/jira/browse/SPARK-23110
             Project: Spark
          Issue Type: Sub-task
          Components: Documentation, Java API, ML, MLlib
            Reporter: Joseph K. Bradley
            Assignee: Weichen Xu
             Fix For: 2.2.0


Check Java compatibility for this release:
* APIs in {{spark.ml}}
* New APIs in {{spark.mllib}} (There should be few, if any.)

Checking compatibility means:
* Checking for differences in how Scala and Java handle types. Some items to 
look out for are:
** Check for generic "Object" types where Java cannot understand complex Scala 
types.
*** *Note*: The Java docs do not always match the bytecode. If you find a 
problem, please verify it using {{javap}}.
** Check Scala objects (especially with nesting!) carefully.  These may not be 
understood in Java, or they may be accessible only via the weirdly named Java 
types (with "$" or "#") which are generated by the Scala compiler.
** Check for uses of Scala and Java enumerations, which can show up oddly in 
the other language's doc.  (In {{spark.ml}}, we have largely tried to avoid 
using enumerations, and have instead favored plain strings.)
* Check for differences in generated Scala vs Java docs.  E.g., one past issue 
was that Javadocs did not respect Scala's package-private modifier.
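
As a rough illustration of the first two checks above, the following Java sketch uses reflection to flag public methods that expose a bare {{Object}} and classes whose binary name contains "$". It is not existing Spark tooling: the class {{SignatureAudit}} and its nested {{Sample}} class are hypothetical stand-ins, and in practice the audit would be pointed at classes loaded from the MLlib JAR. Anything it flags should still be confirmed with {{javap}}.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

public class SignatureAudit {

    /** Hypothetical stand-in for a compiled class whose types look opaque from Java. */
    public static class Sample {
        public Object erased() { return null; }        // bare Object return type
        public List<String> typed() { return null; }   // properly typed, not flagged
        public void accept(Object value) { }           // bare Object parameter
    }

    /** Returns one finding per suspicious item found directly on cls. */
    public static List<String> audit(Class<?> cls) {
        List<String> findings = new ArrayList<>();
        // Nested classes and Scala objects both surface with '$' in the binary name.
        if (cls.getName().contains("$")) {
            findings.add("'$' in class name: " + cls.getName());
        }
        for (Method m : cls.getDeclaredMethods()) {
            if (!Modifier.isPublic(m.getModifiers()) || m.isSynthetic()) {
                continue;
            }
            boolean bare = m.getGenericReturnType() == Object.class;
            for (Type p : m.getGenericParameterTypes()) {
                bare |= (p == Object.class);
            }
            if (bare) {
                findings.add("bare Object in: " + m.toGenericString());
            }
        }
        return findings;
    }

    public static void main(String[] args) {
        audit(Sample.class).forEach(System.out::println);
    }
}
```

A bare {{Object}} is not always a bug (for example, {{equals(Object)}} is legitimate), so the output is a list of candidates to inspect by hand, not a list of defects.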

If you find issues, please comment here, or for larger items, create separate 
JIRAs and link here as "requires".
* Remember that we should not break APIs from previous releases.  If you find a 
problem, check whether it was introduced in this Spark release (in which case 
we can fix it) or in a previous one (in which case we can create a 
Java-friendly version of the API).
* If needed for complex issues, create small Java unit tests which execute each 
method.  (Algorithmic correctness can be checked in Scala.)
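
One way to tell whether a problematic signature is new in this release or inherited from a previous one is to compare signature listings from the two builds. The sketch below assumes you already have the signatures as sets of strings (for instance, lines saved from {{javap}} output); the class {{ApiDiff}} and the sample signature strings are made up for illustration.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

public class ApiDiff {

    /** Signatures present in before but missing from after, in sorted order. */
    public static Set<String> removed(Set<String> before, Set<String> after) {
        Set<String> result = new TreeSet<>(before);
        result.removeAll(after);
        return result;
    }

    /** Signatures that only exist in after, i.e. introduced in the newer build. */
    public static Set<String> added(Set<String> before, Set<String> after) {
        return removed(after, before);
    }

    public static void main(String[] args) {
        // Hypothetical signature lines for a previous and a current build.
        Set<String> previous = new TreeSet<>(Arrays.asList(
            "public int numFeatures()",
            "public double predict(Vector)"));
        Set<String> current = new TreeSet<>(Arrays.asList(
            "public int numFeatures()",
            "public Object summary()"));

        System.out.println("removed: " + removed(previous, current));
        System.out.println("added:   " + added(previous, current));
    }
}
```

Entries reported as added are the ones that can still be changed freely; entries reported as removed point to a possible break of a previously released API.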

Recommendations for how to complete this task:
* There are no great tools for this.  In the past, this task has been done by:
** Generating API docs
** Building JAR and outputting the Java class signatures for MLlib
** Manually inspecting and searching the docs and class signatures for issues
* If you do have ideas for better tooling, please say so, so that we can make 
this task easier in the future!
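
As one possible starting point for the "outputting the Java class signatures" step, here is a small Java sketch that prints the public method signatures of the classes named on the command line, as Java sees them through reflection. It is only an approximation of what {{javap}} reports, and the class name {{SignatureDump}} is hypothetical. It assumes the JAR under inspection is on the classpath; {{java.lang.Runnable}} is used as a default only so the sketch runs on its own.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SignatureDump {

    /** Sorted generic signatures of the public methods declared directly on cls. */
    public static List<String> signatures(Class<?> cls) {
        List<String> out = new ArrayList<>();
        for (Method m : cls.getDeclaredMethods()) {
            if (Modifier.isPublic(m.getModifiers()) && !m.isSynthetic()) {
                out.add(m.toGenericString());
            }
        }
        // Sorting keeps the listing stable, so two runs can be compared line by line.
        Collections.sort(out);
        return out;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Class names come from the command line; the default is only a placeholder.
        String[] names = args.length > 0 ? args : new String[] {"java.lang.Runnable"};
        for (String name : names) {
            System.out.println("== " + name);
            for (String signature : signatures(Class.forName(name))) {
                System.out.println("  " + signature);
            }
        }
    }
}
```

Saving the output per release gives a plain-text listing that can be searched or diffed, which covers part of the manual inspection described above.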



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

