These are my initial thoughts:

As usual, your mileage may vary depending on the use case. Introducing
support for stored procedures (SP) in Spark SQL with Python as the
procedural language has the following pros and cons.

*Pros*

   - Can potentially provide more flexibility and capabilities in SQL
   workflows. We can seamlessly integrate Python code with SQL workflows,
   enabling a wider range of tasks to be performed directly within
   Spark SQL.
   - SPs, as usual, enable more modular and reusable code. Users can
   build their own libraries of stored procedures, which are compiled
   once and reused thereafter.
   - With SPs, one can potentially perform advanced analytics in Spark SQL
   through Python packages.
   - Restricted access and enhanced security: sensitive code is hidden
   inside SPs and is only accessible through them.
   - Users can build their own catalog of procedures and enhance it.
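
To make the "modular and reusable" and "build your own catalog" points a
bit more concrete, here is a toy sketch in plain Python. It is not Spark's
API and not the design in the SPIP; the ProcedureCatalog class, its
register and call methods, and the normalize_amounts procedure are all
hypothetical names invented for illustration.

```python
from typing import Any, Callable, Dict, List


class ProcedureCatalog:
    """Hypothetical in-memory catalog: register a procedure once, call it by name."""

    def __init__(self) -> None:
        self._procs: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        """Decorator that stores a function under the given procedure name."""
        def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
            if name in self._procs:
                raise ValueError(f"procedure {name!r} already exists")
            self._procs[name] = func
            return func
        return decorator

    def call(self, name: str, *args: Any, **kwargs: Any) -> Any:
        """Look up a procedure by name and run it with the given arguments."""
        if name not in self._procs:
            raise KeyError(f"no such procedure: {name!r}")
        return self._procs[name](*args, **kwargs)


catalog = ProcedureCatalog()


@catalog.register("normalize_amounts")
def normalize_amounts(rows: List[dict], factor: float) -> List[dict]:
    # Return new rows with the "amount" field scaled; the input is not mutated.
    return [{**row, "amount": row["amount"] * factor} for row in rows]


result = catalog.call("normalize_amounts", [{"id": 1, "amount": 2.0}], factor=0.5)
print(result)
```

Callers only need the procedure name and its arguments, which is the kind
of encapsulation the access and security point above relies on.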

*Cons*

   - Performance implications due to the need to serialize and deserialize
   data between Spark and Python, especially for large datasets
   - Additional resource utilisation
   - Error handling will require more thought
   - Compatibility across different versions of Spark and Python libraries
   - Compatibility between client-side and server-side Python versions
   - If the underlying table schema changes, the SP code will often be
   invalidated and have to be recompiled
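
On the first point, a rough way to get a feel for the serialization cost
is to time a round trip of rows through Python's pickle module. This is
only a stand-in: Spark uses its own serializers between the JVM and the
Python workers, so the numbers here indicate the shape of the cost, not
actual Spark behaviour. The helper names make_rows and round_trip are
made up for this sketch.

```python
import pickle
import time
from typing import List, Tuple

Row = Tuple[int, str, float]


def make_rows(n: int) -> List[Row]:
    """Build n small synthetic rows (id, name, value)."""
    return [(i, f"name_{i}", i * 1.5) for i in range(n)]


def round_trip(rows: List[Row]) -> Tuple[List[Row], int, float]:
    """Serialize rows to bytes and back; return (rows, payload bytes, seconds)."""
    start = time.perf_counter()
    payload = pickle.dumps(rows)
    restored = pickle.loads(payload)
    elapsed = time.perf_counter() - start
    return restored, len(payload), elapsed


for n in (1_000, 50_000):
    _, size, secs = round_trip(make_rows(n))
    print(f"{n} rows: {size} bytes, {secs:.4f} s round trip")
```

The payload size and elapsed time both grow with the number of rows, which
is why this matters most for large datasets.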

HTH

Mich Talebzadeh,
Distinguished Technologist, Solutions Architect & Engineer
London
United Kingdom


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh

 Disclaimer: Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.







On Thu, 31 Aug 2023 at 09:45, Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Thanks Allison!
>
> On Thu, 31 Aug 2023 at 01:26, Allison Wang <allison.w...@databricks.com>
> wrote:
>
>> Hi Mich,
>>
>> I've updated the permissions on the document. Please feel free to leave
>> comments.
>> Thanks,
>> Allison
>>
>> On Wed, Aug 30, 2023 at 3:44 PM Mich Talebzadeh <
>> mich.talebza...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Great. Please allow edit access on SPIP or ability to comment.
>>>
>>> Thanks
>>>
>>> On Wed, 30 Aug 2023 at 23:29, Allison Wang
>>> <allison.w...@databricks.com.invalid> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I would like to start a discussion on "Python Stored Procedures".
>>>>
>>>> This proposal aims to extend Spark SQL by introducing support for
>>>> stored procedures, starting with Python as the procedural language. This
>>>> will enable users to run complex logic using Python within their SQL
>>>> workflows and save these routines in catalogs like HMS for future use.
>>>>
>>>> *SPIP*:
>>>> https://docs.google.com/document/d/1ce2EZrf2BxHu7TjfGn4TgToK3TBYYzRkmsIVcfmkNzE/edit?usp=sharing
>>>> *JIRA*: https://issues.apache.org/jira/browse/SPARK-45023
>>>>
>>>> Looking forward to your feedback!
>>>>
>>>> Thanks,
>>>> Allison
>>>>
>>>>
