[ https://issues.apache.org/jira/browse/SPARK-38648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515623#comment-17515623 ]
Lee Yang commented on SPARK-38648:
----------------------------------

We've seen a need to help Spark users who are struggling with DL inference and running into issues such as incorrectly copied boilerplate code, model serialization failures, and difficulty connecting Spark DataFrames to the tensor representations of the various underlying DL frameworks. Many existing users aren't using MLflow to train or track their DL models, so MLflow isn't a specific solution or goal here. The hope is to make inference with third-party DL models a first-class citizen in the Spark community by adding this to Spark itself, and to gather Spark+DL experts who can contribute to making life easier for Spark+DL novices.

> SPIP: Simplified API for DL Inferencing
> ---------------------------------------
>
> Key: SPARK-38648
> URL: https://issues.apache.org/jira/browse/SPARK-38648
> Project: Spark
> Issue Type: Improvement
> Components: ML
> Affects Versions: 3.0.0
> Reporter: Lee Yang
> Priority: Minor
>
> h1. Background and Motivation
> The deployment of deep learning (DL) models to Spark clusters can be a point
> of friction today. DL practitioners often aren't well-versed in Spark, and
> Spark experts often aren't well-versed in the fast-changing DL frameworks.
> Currently, the deployment of trained DL models is done in a fairly ad-hoc
> manner, with each model integration usually requiring significant effort.
> To simplify this process, we propose adding an integration layer for each
> major DL framework that can introspect their respective saved models to
> more easily integrate these models into Spark applications. You can find a
> detailed proposal
> [here|https://docs.google.com/document/d/1n7QPHVZfmQknvebZEXxzndHPV2T71aBsDnP4COQa_v0]
> h1. Goals
> - Simplify the deployment of trained single-node DL models to Spark inference
> applications.
> - Follow pandas_udf for simple inference use-cases.
> - Follow Spark ML Pipelines APIs for transfer-learning use-cases.
> - Enable integrations with popular third-party DL frameworks like TensorFlow,
> PyTorch, and Hugging Face.
> - Focus on PySpark, since most of the DL frameworks use Python.
> - Take advantage of built-in Spark features like GPU scheduling and Arrow
> integration.
> - Enable inference on both CPU and GPU.
> h1. Non-goals
> - DL model training.
> - Inference with distributed models, i.e., "model parallel" inference.
> h1. Target Personas
> - Data scientists who need to deploy DL models on Spark.
> - Developers who need to deploy DL models on Spark.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)