[ 
https://issues.apache.org/jira/browse/SPARK-26247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16707907#comment-16707907
 ] 

Anne Holler commented on SPARK-26247:
-------------------------------------

Hi, [~skonto],

My basic take on model representation is that any format other than the one
Spark MLlib itself produces in training and consumes in serving introduces
additional maintenance toil and a potential risk of model serving mismatch.
In that sense, the Spark MLlib format is a de facto standard.


Unless PMML were to completely replace the Spark MLlib representation as the
first-class model representation in Spark (a switchover that doesn't seem to
have a clear ROI), the team I am on would not choose to move to it. We do not
want to take the risk that a model trained and evaluated with respect to the
Spark MLlib native representation behaves differently when served, in batch
or online mode, from a PMML representation.

Best regards, Anne

> SPIP - ML Model Extension for no-Spark MLLib Online Serving
> -----------------------------------------------------------
>
>                 Key: SPARK-26247
>                 URL: https://issues.apache.org/jira/browse/SPARK-26247
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 2.1.0
>            Reporter: Anne Holler
>            Priority: Major
>              Labels: SPIP
>         Attachments: SPIPMlModelExtensionForOnlineServing.pdf
>
>
> This ticket tracks an SPIP to improve model load time and model serving 
> interfaces for online serving of Spark MLlib models.  The SPIP is here
> [https://docs.google.com/a/uber.com/document/d/e/2PACX-1vRttVNNMBt4pBU2oBWKoiK3-7PW6RDwvHNgSMqO67ilxTX_WUStJ2ysUdAk5Im08eyHvlpcfq1g-DLF/pub]
>  
> The improvement opportunity exists in all versions of Spark.  We developed 
> our set of changes against version 2.1.0 and can port them forward to other 
> versions (e.g., we have ported them forward to 2.3.2).



