These are my initial thoughts. As usual, your mileage may vary. Depending on the use case, introducing support for stored procedures (SPs) in Spark SQL, with Python as the procedural language, has the following pros and cons.

*Pros*
- It can potentially provide more flexibility and capability in SQL workflows. Python code can be integrated seamlessly with SQL, enabling a wider range of tasks to be performed directly within Spark SQL.
- SPs, as usual, enable more modular and reusable code. Users can build their own libraries of stored procedures; remember that these are compiled once and reused thereafter.
- With SPs, one can potentially perform advanced analytics in Spark SQL through Python packages.
- Restricted access and enhanced security: sensitive code is hidden inside SPs and is only accessible through them.
- Build your own catalog and enhance it.

*Cons*
- Performance implications from the need to serialize and deserialize data between Spark and Python, especially for large datasets.
- Additional resource utilisation.
- Error handling will require more thought.
- Compatibility with different versions of Spark and Python libraries.
- Client-side and server-side Python compatibility.
- If the underlying table schema changes, the SP code will often be invalidated and will have to be recompiled.

HTH

Mich Talebzadeh,
Distinguished Technologist, Solutions Architect & Engineer
London
United Kingdom

view my Linkedin profile <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

https://en.everybodywiki.com/Mich_Talebzadeh

*Disclaimer:* Use it at your own risk.
Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

On Thu, 31 Aug 2023 at 09:45, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Thanks Allison!
>
> On Thu, 31 Aug 2023 at 01:26, Allison Wang <allison.w...@databricks.com> wrote:
>
>> Hi Mich,
>>
>> I've updated the permissions on the document. Please feel free to leave
>> comments.
>>
>> Thanks,
>> Allison
>>
>> On Wed, Aug 30, 2023 at 3:44 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Great. Please allow edit access on SPIP or ability to comment.
>>>
>>> Thanks
>>>
>>> On Wed, 30 Aug 2023 at 23:29, Allison Wang <allison.w...@databricks.com.invalid> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I would like to start a discussion on "Python Stored Procedures".
>>>>
>>>> This proposal aims to extend Spark SQL by introducing support for
>>>> stored procedures, starting with Python as the procedural language. This
>>>> will enable users to run complex logic using Python within their SQL
>>>> workflows and save these routines in catalogs like HMS for future use.
>>>>
>>>> *SPIP*:
>>>> https://docs.google.com/document/d/1ce2EZrf2BxHu7TjfGn4TgToK3TBYYzRkmsIVcfmkNzE/edit?usp=sharing
>>>> *JIRA*: https://issues.apache.org/jira/browse/SPARK-45023
>>>>
>>>> Looking forward to your feedback!
>>>>
>>>> Thanks,
>>>> Allison