Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19876#discussion_r159018510
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala ---
    @@ -85,12 +87,55 @@ private[util] sealed trait BaseReadWrite {
       protected final def sc: SparkContext = sparkSession.sparkContext
     }
     
    +/**
    + * ML export formats for should implement this trait so that users can specify a shortname rather
    + * than the fully qualified class name of the exporter.
    + *
    + * A new instance of this class will be instantiated each time a DDL call is made.
    + *
    + * @since 2.3.0
    + */
    +@InterfaceStability.Evolving
    +trait MLFormatRegister {
    +  /**
    +   * The string that represents the format that this data source provider uses. This is
    +   * overridden by children to provide a nice alias for the data source. For example:
    +   *
    +   * {{{
    +   *   override def shortName(): String =
    +   *       "pmml+org.apache.spark.ml.regression.LinearRegressionModel"
    +   * }}}
    +   * Indicates that this format is capable of saving Spark's own LinearRegressionModel in pmml.
    +   *
    +   * Format discovery is done using a ServiceLoader so make sure to list your format in
    +   * META-INF/services.
    +   * @since 2.3.0
    +   */
    +  def shortName(): String
    +}
    +
    +/**
    + * Implemented by objects that provide ML exportability.
    + *
    + * A new instance of this class will be instantiated each time a DDL call is made.
    + *
    + * @since 2.3.0
    + */
    +@InterfaceStability.Evolving
    +trait MLWriterFormat {
    +  /**
    +   * Function write the provided pipeline stage out.
    +   */
    +  def write(path: String, session: SparkSession, optionMap: mutable.Map[String, String],
    +    stage: PipelineStage): Unit
    +}
    +
     /**
      * Abstract class for utility classes that can save ML instances.
      */
    +@deprecated("Use GeneralMLWriter instead. Will be removed in Spark 3.0.0", "2.3.0")
    --- End diff --
    
    I'm debating whether this should be deprecated in 2.4 instead, and just have the new writer as an additional option in 2.3. What do you think, @sethah / @MLnick?
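
    For context on what the new option would look like to implementers, here is a minimal sketch of a format that mixes in both traits from the diff above. It rests on several assumptions: the two traits are redeclared as simplified stand-ins so the snippet compiles without Spark (the real `write` also takes a `SparkSession` and a `PipelineStage`), and `ExamplePMMLWriter` and `ShortNameLookup` are hypothetical names that are not part of this PR. The lookup only mimics matching on a "format+class" short name; the actual discovery goes through a ServiceLoader as the doc comment says.

```scala
import scala.collection.mutable

// Simplified stand-ins for the traits in the diff (assumption: Spark types
// are dropped so the sketch is self-contained).
trait MLFormatRegister {
  def shortName(): String
}

trait MLWriterFormat {
  def write(path: String, optionMap: mutable.Map[String, String], stage: AnyRef): Unit
}

// Hypothetical exporter; it only reports what it would do.
class ExamplePMMLWriter extends MLWriterFormat with MLFormatRegister {
  override def shortName(): String =
    "pmml+org.apache.spark.ml.regression.LinearRegressionModel"

  override def write(
      path: String,
      optionMap: mutable.Map[String, String],
      stage: AnyRef): Unit = {
    println(s"would write ${stage.getClass.getName} to $path with options $optionMap")
  }
}

// Hypothetical helper: picks a format by "<format>+<stage class name>".
object ShortNameLookup {
  def find(
      formats: Seq[MLFormatRegister],
      format: String,
      stageClass: String): Option[MLFormatRegister] = {
    val wanted = s"$format+$stageClass"
    formats.find(_.shortName().equalsIgnoreCase(wanted))
  }
}
```

    With this shape, existing writers keep working unchanged in 2.3, and the deprecation annotation could be added one release later once the new path has seen some use.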

