Spark will read data written with v2 encodings just fine. You just don't
need to worry about making Spark produce v2. And you should probably also
not produce v2 encodings from other systems.

On Mon, Apr 15, 2024 at 4:37 PM Prem Sahoo <prem.re...@gmail.com> wrote:

> Oops, so Spark does not support Parquet v2 at the moment? We have a use
> case that requires Parquet v2, because one of our components uses it.
>
> On Mon, Apr 15, 2024 at 7:09 PM Ryan Blue <b...@tabular.io> wrote:
>
>> Hi Prem,
>>
>> Parquet v1 is the default because v2 has not been finalized and adopted
>> by the community. I highly recommend not using v2 encodings at this time.
>>
>> Ryan
>>
>> On Mon, Apr 15, 2024 at 3:05 PM Prem Sahoo <prem.re...@gmail.com> wrote:
>>
>>> I am using Spark 3.2.0, but my Spark package comes with parquet-mr
>>> 1.2.1, which writes Parquet version 1, not version 2 :(. So I was
>>> looking for a way to write Parquet version 2.
>>>
>>> On Mon, Apr 15, 2024 at 5:05 PM Mich Talebzadeh <
>>> mich.talebza...@gmail.com> wrote:
>>>
>>>> Sorry, you have a point there. It was released in version 3.0. What
>>>> version of Spark are you using?
>>>>
>>>> Technologist | Solutions Architect | Data Engineer  | Generative AI
>>>> London
>>>> United Kingdom
>>>>
>>>>
>>>>    view my Linkedin profile
>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>
>>>>
>>>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>>>
>>>>
>>>>
>>>> *Disclaimer:* The information provided is correct to the best of my
>>>> knowledge but of course cannot be guaranteed. It is essential to note
>>>> that, as with any advice, "one test result is worth one-thousand
>>>> expert opinions" (Wernher von Braun
>>>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>).
>>>>
>>>>
>>>> On Mon, 15 Apr 2024 at 21:33, Prem Sahoo <prem.re...@gmail.com> wrote:
>>>>
>>>>> Thank you so much for the info! But are there any release notes that
>>>>> say Spark 2.4.0 onwards supports Parquet version 2? I was under the
>>>>> impression that support started with Spark 3.0.
>>>>>
>>>>> On Mon, Apr 15, 2024 at 4:28 PM Mich Talebzadeh <
>>>>> mich.talebza...@gmail.com> wrote:
>>>>>
>>>>>> Well if I am correct, Parquet version 2 support was introduced in
>>>>>> Spark version 2.4.0. Therefore, any version of Spark starting from 2.4.0
>>>>>> supports Parquet version 2. Assuming that you are using Spark version
>>>>>> 2.4.0 or later, you should be able to take advantage of Parquet version 2
>>>>>> features.
>>>>>>
>>>>>> HTH
>>>>>>
>>>>>> Mich Talebzadeh,
>>>>>>
>>>>>> On Mon, 15 Apr 2024 at 20:53, Prem Sahoo <prem.re...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thank you for the information!
>>>>>>> I can use any version of parquet-mr to produce a Parquet file.
>>>>>>>
>>>>>>> Regarding my second question:
>>>>>>> Which version of Spark supports Parquet version 2?
>>>>>>> May I get the release notes where Parquet versions are mentioned?
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Apr 15, 2024 at 2:34 PM Mich Talebzadeh <
>>>>>>> mich.talebza...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Parquet-mr is a Java library that provides functionality for
>>>>>>>> working with Parquet files with Hadoop. It is therefore more geared
>>>>>>>> towards working with Parquet files within the Hadoop ecosystem,
>>>>>>>> particularly using MapReduce jobs. There is no definitive way to check
>>>>>>>> exact compatible versions within the library itself. However, you can
>>>>>>>> have a look at this:
>>>>>>>>
>>>>>>>> https://github.com/apache/parquet-mr/blob/master/CHANGES.md
>>>>>>>>
>>>>>>>> HTH
>>>>>>>>
>>>>>>>> Mich Talebzadeh,
>>>>>>>>
>>>>>>>> On Mon, 15 Apr 2024 at 18:59, Prem Sahoo <prem.re...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hello Team,
>>>>>>>>> May I know how to check which version of Parquet is supported by
>>>>>>>>> parquet-mr 1.2.1?
>>>>>>>>>
>>>>>>>>> Which version of parquet-mr supports Parquet version 2 (v2)?
>>>>>>>>>
>>>>>>>>> Which version of Spark supports Parquet version 2?
>>>>>>>>> May I get the release notes where Parquet versions are mentioned?
>>>>>>>>>
>>>>>>>>
>>
>> --
>> Ryan Blue
>> Tabular
>>
>

-- 
Ryan Blue
Tabular
