Re: Which version of spark version supports parquet version 2 ?

2024-04-17 Thread Mich Talebzadeh
Hi Prem, Regarding your question about writing Parquet v2 with Spark 3.2.0: Spark 3.2.0 Limitations: Spark 3.2.0 doesn't have a built-in way to explicitly force Parquet v2 encoding. As we saw previously, even Spark 3.4 created a file with parquet-mr version, indicating v1 encoding. Dremio v2 Support: As

Re: Which version of spark version supports parquet version 2 ?

2024-04-17 Thread Prem Sahoo
Hello Ryan, May I know how to write Parquet V2 encoding from Spark 3.2.0? As far as I know, Dremio creates and reads Parquet V2. "Apache Parquet-MR Writer version PARQUET_2_0, which is widely adopted by engines that write Parquet data, supports delta encodings. However, these

Re: Which version of spark version supports parquet version 2 ?

2024-04-17 Thread Ryan Blue
Prem, as I said earlier, v2 is not a finalized spec so you should not use it. That's why it is not the default. You can get Spark to write v2 files, but it isn't recommended by the Parquet community. On Wed, Apr 17, 2024 at 11:05 AM Prem Sahoo wrote: > Hello Community, > Could anyone shed more
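
Ryan's reply above says Spark can be made to write v2 files, but the snippet does not say how. A minimal sketch of one possible route follows, assuming that the parquet-mr property `parquet.writer.version` (with the value `v2`) is passed through by Spark's Parquet data source when given as a write option. The helper names `parquet_v2_write_options` and `write_parquet_v2` are hypothetical and exist only for illustration. As the reply notes, v2 is not the default and is not recommended by the Parquet community.

```python
# Sketch only. Assumption: Spark forwards this parquet-mr writer property
# to the underlying Parquet writer when it is supplied as a write option.
PARQUET_WRITER_VERSION_KEY = "parquet.writer.version"


def parquet_v2_write_options():
    """Return write options that request the parquet-mr v2 writer."""
    return {PARQUET_WRITER_VERSION_KEY: "v2"}


def write_parquet_v2(df, path):
    """Write `df` to `path` as Parquet, requesting the v2 writer.

    `df` is expected to be a pyspark.sql.DataFrame. No pyspark import is
    needed here because only the DataFrame's own writer API is used.
    """
    df.write.options(**parquet_v2_write_options()).parquet(path)
```

With an existing DataFrame, usage would look like `write_parquet_v2(df, "/tmp/example_v2")`, where the path is a placeholder. Whether older readers can open the resulting files depends on their support for the v2 encodings, so checking the consumers first seems prudent.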

Re: Which version of spark version supports parquet version 2 ?

2024-04-17 Thread Prem Sahoo
Hello Community, Could anyone shed more light on this (Spark Supporting Parquet V2)? On Tue, Apr 16, 2024 at 3:42 PM Mich Talebzadeh wrote: > Hi Prem, > > Regrettably this is not my area of speciality. I trust another colleague > will have a more informed idea. Alternatively you may raise an

Re: [DISCUSS] Spark 4.0.0 release

2024-04-17 Thread Wenchen Fan
Thank you all for the replies! To @Nicholas Chammas : Thanks for cleaning up the error terminology and documentation! I've merged the first PR; let's finish the others before the 4.0 release. To @Dongjoon Hyun : Thanks for driving the ANSI on by default effort! Now that the vote has passed, let's

[VOTE][RESULT] SPARK-44444: Use ANSI SQL mode by default

2024-04-17 Thread Dongjoon Hyun
The vote passes with 24 +1s (13 binding +1s). Thanks to all who helped with the vote! (* = binding) +1: - Dongjoon Hyun * - Gengliang Wang * - Chao Sun * - Hyukjin Kwon * - Liang-Chi Hsieh * - Holden Karau * - Huaxin Gao * - Denny Lee - Xiao Li * - Mich Talebzadeh - Christiano Anderson - Yang Jie

Re: [VOTE] SPARK-44444: Use ANSI SQL mode by default

2024-04-17 Thread Dongjoon Hyun
Thank you all. The vote passed. I'll conclude this vote. Dongjoon. On 2024/04/16 04:58:39 Arun Dakua wrote: > +1 > > On Tue, Apr 16, 2024 at 12:50 AM Josh Rosen wrote: > > > +1 > > > > On Mon, Apr 15, 2024 at 11:26 AM Maciej wrote: > > > >> +1 > >> > >> Best regards, > >> Maciej