[jira] [Commented] (SPARK-38648) SPIP: Simplified API for DL Inferencing
[ https://issues.apache.org/jira/browse/SPARK-38648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17597397#comment-17597397 ] Lee Yang commented on SPARK-38648:
---

Agreed, closing.

> SPIP: Simplified API for DL Inferencing
> ---
>
> Key: SPARK-38648
> URL: https://issues.apache.org/jira/browse/SPARK-38648
> Project: Spark
> Issue Type: Improvement
> Components: ML
> Affects Versions: 3.0.0
> Reporter: Lee Yang
> Priority: Minor
>
> h1. Background and Motivation
> The deployment of deep learning (DL) models to Spark clusters can be a point of friction today. DL practitioners often aren't well-versed with Spark, and Spark experts often aren't well-versed with the fast-changing DL frameworks. Currently, the deployment of trained DL models is done in a fairly ad-hoc manner, with each model integration usually requiring significant effort.
> To simplify this process, we propose adding an integration layer for each major DL framework that can introspect their respective saved models to more easily integrate these models into Spark applications. You can find a detailed proposal here: [https://docs.google.com/document/d/1n7QPHVZfmQknvebZEXxzndHPV2T71aBsDnP4COQa_v0]
> h1. Goals
> - Simplify the deployment of pre-trained single-node DL models to Spark inference applications.
> - Follow pandas_udf for simple inference use-cases.
> - Follow Spark ML Pipelines APIs for transfer-learning use-cases.
> - Enable integrations with popular third-party DL frameworks like TensorFlow, PyTorch, and Huggingface.
> - Focus on PySpark, since most of the DL frameworks use Python.
> - Take advantage of built-in Spark features like GPU scheduling and Arrow integration.
> - Enable inference on both CPU and GPU.
> h1. Non-goals
> - DL model training.
> - Inference w/ distributed models, i.e. "model parallel" inference.
> h1. Target Personas
> - Data scientists who need to deploy DL models on Spark.
> - Developers who need to deploy DL models on Spark.

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[ https://issues.apache.org/jira/browse/SPARK-38648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17582071#comment-17582071 ] Xiangrui Meng commented on SPARK-38648:
---

I had an offline discussion with [~leewyang]. Summary: We might not need to introduce a new package in Spark with dependencies on DL frameworks. Instead, we can provide abstractions in pyspark.ml to implement the common data operations needed by DL inference, e.g., batching, tensor conversion, and pipelining. For example, we could define the following API (just to illustrate the idea, not proposing the final API):

{code:python}
def dl_model_udf(
    predict_fn: Callable[[pd.DataFrame], pd.DataFrame],  # need to discuss the data format
    batch_size: int,
    input_tensor_shapes: Dict[str, List[int]],
    output_data_type,
    preprocess_fn,
    ...
) -> PandasUDF
{code}

Users only need to supply predict_fn, which could return a (wrapped) TensorFlow model, a PyTorch model, or an MLflow model. Users are responsible for package dependency management and model loading logic. This doesn't cover everything proposed in the original SPIP, but it does save users the boilerplate code for creating batches over Iterator[DataFrame], converting 1-d arrays to tensors, and pipelining preprocessing (CPU) with prediction (GPU). If we go with this direction, I don't feel the change needs an SPIP, because it introduces neither a new Spark package nor new dependencies.
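To make the idea concrete, here is a rough sketch (our own, purely illustrative, not a proposed Spark API) of the batching boilerplate such a dl_model_udf could generate around a user-supplied predict_fn. In real use the outer function would be registered as an iterator-of-batches pandas_udf; here only the re-chunking logic is shown.

```python
from typing import Callable, Iterator

import pandas as pd


def batched_predict(
    batch_iter: Iterator[pd.DataFrame],
    predict_fn: Callable[[pd.DataFrame], pd.DataFrame],
    batch_size: int,
) -> Iterator[pd.DataFrame]:
    """Re-chunk incoming Arrow batches to the model's preferred batch size,
    then run predict_fn on each chunk (the boilerplate users write today)."""
    pending = []        # accumulated DataFrames awaiting a full batch
    n_pending = 0
    for df in batch_iter:
        pending.append(df)
        n_pending += len(df)
        if n_pending >= batch_size:
            full = pd.concat(pending, ignore_index=True)
            cutoff = (len(full) // batch_size) * batch_size
            for start in range(0, cutoff, batch_size):
                yield predict_fn(full.iloc[start:start + batch_size])
            rest = full.iloc[cutoff:]
            pending, n_pending = ([rest], len(rest)) if len(rest) else ([], 0)
    if n_pending:
        # flush the final partial batch
        yield predict_fn(pd.concat(pending, ignore_index=True))
```

Inside a real pandas_udf over Iterator[pd.DataFrame], each partition's Arrow batches would flow through this generator, so predict_fn always sees exactly batch_size rows except for the final flush.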
[ https://issues.apache.org/jira/browse/SPARK-38648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17580548#comment-17580548 ] Yikun Jiang commented on SPARK-38648:
---

By the way, just curious: is this SPIP expected to be a feature in version 3.4?
[ https://issues.apache.org/jira/browse/SPARK-38648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17580547#comment-17580547 ] Yikun Jiang commented on SPARK-38648:
---

If we want to run ONNX models directly, we might want to support onnxruntime as one of the DL frameworks, e.g. sparkext.onnxruntime.Model(url). For the other frameworks, users can first convert their framework-specific model to ONNX format [1], and then call sparkext.onnxruntime.Model(converted_url); I don't think that's too difficult. So I personally think the model format should not be unified; ONNX is just one of the supported formats.

[1] https://pytorch.org/docs/stable/onnx.html#torch-onnx
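For reference, the glue around onnxruntime that such a wrapper would hide is fairly small. The sketch below is our own (the helper name is made up, and nothing here is part of any proposal); it builds a pandas-in/pandas-out predict function from anything exposing onnxruntime's InferenceSession interface (get_inputs() and run()):

```python
import numpy as np
import pandas as pd


def make_onnx_predict_fn(session):
    """Wrap an ONNX Runtime-style session (get_inputs()/run()) as a
    pandas DataFrame -> pandas Series predict function."""
    # Real ONNX Runtime sessions report their named inputs this way
    input_name = session.get_inputs()[0].name

    def predict(batch: pd.DataFrame) -> pd.Series:
        # onnxruntime expects a dense float32 array keyed by input name
        feats = batch.to_numpy().astype(np.float32)
        outputs = session.run(None, {input_name: feats})
        return pd.Series(list(outputs[0]))

    return predict
```

A real session would come from onnxruntime.InferenceSession(model_path), and the resulting predict function is exactly the shape of predict_fn discussed earlier in this thread.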
[ https://issues.apache.org/jira/browse/SPARK-38648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17580153#comment-17580153 ] Varun Kruthiventi commented on SPARK-38648:
---

Support for integration with DL frameworks like TensorFlow and PyTorch will be very helpful. Can we use a common model specification like ONNX for porting the models to Spark?
[ https://issues.apache.org/jira/browse/SPARK-38648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17542107#comment-17542107 ] Lee Yang commented on SPARK-38648:
---

[~mengxr] I think that could work. FWIW, I looked into how the projects in the "connector" (formerly "external") folder are built/published. It looks like they're all currently Scala projects that are built as part of the main [Build and test|https://github.com/apache/spark/blob/master/.github/workflows/build_and_test.yml#L684] GitHub Actions workflow and [released/versioned along with the core Spark releases|https://github.com/apache/spark/pull/35879/files?file-filters%5B%5D=.xml&show-viewed-files=true]. We could presumably do something similar with this SPIP (with some modifications to [release-build.sh|https://github.com/apache/spark/blob/master/dev/create-release/release-build.sh#L116-L128] to publish a separate artifact to PyPI).
[ https://issues.apache.org/jira/browse/SPARK-38648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17528862#comment-17528862 ] Xiangrui Meng commented on SPARK-38648:
---

I think it is beneficial to both Spark and the DL frameworks if Spark has state-of-the-art DL capabilities. We did some work in the past to make Spark work better with DL frameworks, e.g., the iterator Scalar Pandas UDF, barrier mode, and GPU scheduling. But most of those are low-level APIs for developers, not end users, and our Spark user guide contains little about DL and AI. The dependency on DL frameworks might create issues. One idea is to develop in the Spark repo and Spark namespace but publish to PyPI independently. For example, in order to use the DL features, users would need to explicitly install `pyspark-dl` and then use the features under the `pyspark.dl` namespace. Putting development inside Spark and publishing under the Spark namespace would help drive both development and adoption.
[ https://issues.apache.org/jira/browse/SPARK-38648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17519186#comment-17519186 ] Lee Yang commented on SPARK-38648:
---

FYI, here is some proof-of-concept code (with example notebooks): https://github.com/leewyang/sparkext
[ https://issues.apache.org/jira/browse/SPARK-38648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515623#comment-17515623 ] Lee Yang commented on SPARK-38648:
---

We've seen a need to help Spark users who are struggling with DL inference, with issues like incorrectly copy/pasting boilerplate code, model serialization failures, and difficulty connecting Spark DataFrames to the various underlying DL framework tensor representations. Many existing users aren't using MLflow to train/track their DL models, so that isn't a specific solution or goal here. The hope is to make inference with third-party DL models a first-class citizen in the Spark community by adding this to Spark itself, gathering Spark+DL experts who can contribute to making life easier for Spark+DL novices.
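As one concrete instance of the DataFrame-to-tensor friction described in this thread: Spark passes array columns to Python as flat per-row arrays, and users must stack and reshape them before a DL framework will accept the batch. A minimal helper (our own, purely illustrative) might look like:

```python
import numpy as np
import pandas as pd


def column_to_tensor(col: pd.Series, shape: tuple) -> np.ndarray:
    """Stack a column of flat per-row arrays into a (batch, *shape)
    float32 tensor suitable for feeding a DL framework."""
    flat = np.stack(col.tolist()).astype(np.float32)  # (batch, prod(shape))
    return flat.reshape((-1, *shape))
```

This is the kind of per-model boilerplate that an input_tensor_shapes parameter, as discussed earlier in the thread, would absorb.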
[ https://issues.apache.org/jira/browse/SPARK-38648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515531#comment-17515531 ] Sean R. Owen commented on SPARK-38648:
---

Hm, how much do we need a custom layer in Spark? This seems like something relatively easy to do directly in Spark, or with existing tools like MLflow.