Which Python version will run that stored procedure?

Any Python version that PySpark supports.

How to manage external dependencies?

The existing mechanism applies; see
https://spark.apache.org/docs/latest/api/python/user_guide/python_packaging.html.
In fact, this will use the external dependencies installed in your Python
interpreter, so you can keep using any existing conda environment or venv.
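
As a rough illustration of the conda-pack approach described in that guide,
here is a minimal sketch. It assumes an archive named
pyspark_conda_env.tar.gz was built beforehand with conda-pack; the archive
name, the "environment" alias, and both helper function names are
placeholders made up for this example, not anything from the SPIP.

```python
import os


def packaging_conf(archive="pyspark_conda_env.tar.gz", alias="environment"):
    """Return (spark_conf, env_vars) for shipping a packed conda env.

    Both default values are hypothetical placeholders.
    """
    # spark.archives ships the archive and unpacks it under `alias`
    # (on YARN the guide mentions spark.yarn.dist.archives instead).
    conf = {"spark.archives": f"{archive}#{alias}"}
    # Point PySpark at the interpreter inside the unpacked environment.
    env = {"PYSPARK_PYTHON": f"./{alias}/bin/python"}
    return conf, env


def build_session(archive="pyspark_conda_env.tar.gz", alias="environment"):
    """Create a SparkSession that uses the packed environment."""
    # Imported lazily so packaging_conf() works without PySpark installed.
    from pyspark.sql import SparkSession

    conf, env = packaging_conf(archive, alias)
    os.environ.update(env)
    builder = SparkSession.builder
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Calling build_session() on a machine with PySpark installed and the archive
present would start a session using the interpreter from the packed
environment; nothing here is specific to stored procedures.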

How to test it via a common CI process?

The same way as the existing PySpark unit tests; see
https://github.com/apache/spark/tree/master/python/pyspark/tests

How to manage versions and do upgrades? Migrations?

This is a new feature, so no migration is needed. We will maintain
compatibility according to the semver policy we follow.

Current Python UDF solution handles these problems in a good way since they
delegate them to project level.

The current UDF solution cannot handle stored procedures because UDFs run on
the worker side, whereas stored procedures run on the driver side.

In my opinion, the concerns raised here look orthogonal to the Stored
Procedure feature itself.
Let me know if this does not address your concern.

On Thu, 31 Aug 2023 at 12:49, Alexander Shorin <kxe...@apache.org> wrote:

> -1
>
> Great idea to ignore the experience of others and copy bad practices back
> for nothing.
>
> If you are familiar with Python ecosystem then you should answer the
> questions:
> 1. Which Python version will run that stored procedure?
> 2. How to manage external dependencies?
> 3. How to test it via a common CI process?
> 4. How to manage versions and do upgrades? Migrations?
>
> Current Python UDF solution handles these problems in a good way since
> they delegate them to project level.
>
> --
> ,,,^..^,,,
>
>
> On Thu, Aug 31, 2023 at 1:29 AM Allison Wang
> <allison.w...@databricks.com.invalid> wrote:
>
>> Hi all,
>>
>> I would like to start a discussion on "Python Stored Procedures".
>>
>> This proposal aims to extend Spark SQL by introducing support for stored
>> procedures, starting with Python as the procedural language. This will
>> enable users to run complex logic using Python within their SQL workflows
>> and save these routines in catalogs like HMS for future use.
>>
>> *SPIP*:
>> https://docs.google.com/document/d/1ce2EZrf2BxHu7TjfGn4TgToK3TBYYzRkmsIVcfmkNzE/edit?usp=sharing
>> *JIRA*: https://issues.apache.org/jira/browse/SPARK-45023
>>
>> Looking forward to your feedback!
>>
>> Thanks,
>> Allison
>>
>>
