Re: Parquet without Hadoop dependencies

Atour Mousavi Gourabi Fri, 09 Jun 2023 00:18:31 -0700

Hi Gang,

Backward compatibility does indeed seem challenging here, especially as I'd
rather see the writers/readers moved out of parquet-hadoop once they've been
decoupled. What are your thoughts on this?


Best regards,
Atour
________________________________
From: Gang Wu <ust...@gmail.com>
Sent: Friday, June 9, 2023 3:32 AM
To: dev@parquet.apache.org <dev@parquet.apache.org>
Subject: Re: Parquet without Hadoop dependencies

Hi Atour,

Thanks for bringing this up!

From what I observed in PARQUET-1822, I think it is a valid use
case to support Parquet reading/writing without Hadoop installed.
The challenge is backward compatibility. It would be great if you could
work on it.

Best,
Gang

On Fri, Jun 9, 2023 at 12:24 AM Atour Mousavi Gourabi <at...@live.com>
wrote:

> Dear all,
>
> The Java implementations of the Parquet readers and writers seem pretty
> tightly coupled to Hadoop (see: PARQUET-1822). For some projects, this can
> cause issues, as it's a large and unnecessary dependency when you might just
> need to write to disk. Is there any appetite here for separating the Hadoop
> code and supporting more convenient ways to write to disk out of the box? I
> am willing to work on these changes but would like some pointers on whether
> such patches would be reviewed and accepted, as PARQUET-1822 has been open
> for over three years now.
>
> Best regards,
> Atour Mousavi Gourabi
>
