[VOTE] Release Apache Spark 3.5.0 (RC4)

2023-09-05 Thread Yuanjian Li
Please vote on releasing the following candidate (RC4) as Apache Spark version 3.5.0. The vote is open until 11:59pm Pacific time Sep 8th and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache Spark 3.5.0 [ ] -1 Do not release this pack

Re: [DISCUSS] SPIP: Python Stored Procedures

2023-09-05 Thread Allison Wang
Hi Mich, Thank you for your comments! I've left some comments on the SPIP, but let's continue the discussion here. You've highlighted the potential advantages of Python stored procedures, and I'd like to emphasize two important aspects: 1. *Versatility*: Integrating Python into SQL provides r

Re: Feature to restart Spark job from previous failure point

2023-09-05 Thread Mich Talebzadeh
Hi Dipayan, You ought to maintain data source consistency, minimising changes upstream. Spark is not a Swiss Army knife :) Anyhow, we already do this in Spark Structured Streaming with the concept of checkpointing. You can do so by implementing - Checkpointing - Stateful processing in Spar