[ https://issues.apache.org/jira/browse/SPARK-26247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16782863#comment-16782863 ]

Anne Holler commented on SPARK-26247:
-------------------------------------

Hi, [~skonto] and [~srowen],

Thank you for your comments!  We hear your points: 1) that treating the 
Spark pipeline model representation as the one that 'rules them all' (as 
PMML and PFA, e.g., aspire to be) can be viewed as Spark lock-in, and 2) that 
for many folks MLeap adequately solves the problem of serving Spark pipeline 
models online.

That said, we think our small patch to the Spark codebase has general value to 
the Spark community: it reduces Spark pipeline model load time and supports 
low-latency scoring, so that online serving can be performed directly from the 
Spark model representation, if desired.  We believe that updating Spark while 
retaining its on-disk format (rather than depending on an external codebase 
with an alternative on-disk format, as is the case with MLeap) makes it simpler 
to keep the online and offline serving code paths consistent and lessens the 
risk of model serving mismatch.

> SPIP - ML Model Extension for no-Spark MLLib Online Serving
> -----------------------------------------------------------
>
>                 Key: SPARK-26247
>                 URL: https://issues.apache.org/jira/browse/SPARK-26247
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 2.1.0
>            Reporter: Anne Holler
>            Priority: Major
>              Labels: SPIP
>         Attachments: SPIPMlModelExtensionForOnlineServing.pdf
>
>
> This ticket tracks an SPIP to improve model load time and model serving 
> interfaces for online serving of Spark MLlib models.  The SPIP is here:
> [https://docs.google.com/a/uber.com/document/d/e/2PACX-1vRttVNNMBt4pBU2oBWKoiK3-7PW6RDwvHNgSMqO67ilxTX_WUStJ2ysUdAk5Im08eyHvlpcfq1g-DLF/pub]
>  
> The improvement opportunity exists in all versions of Spark.  We developed 
> our set of changes against version 2.1.0 and can port them forward to other 
> versions (e.g., we have already ported them to 2.3.2).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
