[ https://issues.apache.org/jira/browse/SPARK-26247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16782863#comment-16782863 ]
Anne Holler commented on SPARK-26247:
-------------------------------------

Hi, [~skonto] and [~srowen],

Thank you for your comments! We hear your points:
1) that the idea that the Spark pipeline model representation can serve as the representation that 'rules them all' (as is the aspiration of, e.g., PMML and PFA) can be viewed as a Spark lock-in, and
2) that for many folks MLeap adequately solves the problem of online serving Spark pipeline models.

That said, our opinion is that there is general value to the Spark community in our small patch to the Spark codebase to reduce Spark pipeline model load time and to support low-latency scoring, so that online serving can be performed directly from the Spark model representation, if desired. We believe that updating Spark while retaining its on-disk format (rather than depending on an external codebase with an alternative on-disk format, as is the case with MLeap) simplifies keeping the online and offline serving code paths consistent and lessens the risk of model serving mismatch.

> SPIP - ML Model Extension for no-Spark MLLib Online Serving
> -----------------------------------------------------------
>
>                 Key: SPARK-26247
>                 URL: https://issues.apache.org/jira/browse/SPARK-26247
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 2.1.0
>            Reporter: Anne Holler
>            Priority: Major
>              Labels: SPIP
>         Attachments: SPIPMlModelExtensionForOnlineServing.pdf
>
>
> This ticket tracks an SPIP to improve model load time and model serving
> interfaces for online serving of Spark MLlib models. The SPIP is here
> [https://docs.google.com/a/uber.com/document/d/e/2PACX-1vRttVNNMBt4pBU2oBWKoiK3-7PW6RDwvHNgSMqO67ilxTX_WUStJ2ysUdAk5Im08eyHvlpcfq1g-DLF/pub]
>
> The improvement opportunity exists in all versions of spark. We developed
> our set of changes wrt version 2.1.0 and can port them forward to other
> versions (e.g., we have ported them forward to 2.3.2).