Hi Gang,

Backward compatibility does indeed seem challenging here, especially as I'd rather see the readers/writers moved out of parquet-hadoop once they've been decoupled. What are your thoughts on this?

Best regards,
Atour

________________________________
From: Gang Wu <ust...@gmail.com>
Sent: Friday, June 9, 2023 3:32 AM
To: dev@parquet.apache.org <dev@parquet.apache.org>
Subject: Re: Parquet without Hadoop dependencies

Hi Atour,

Thanks for bringing this up!

From what I observed from PARQUET-1822, I think it is a valid use case to support parquet reading/writing without hadoop installed. The challenge is backward compatibility. It would be great if you can work on it.

Best,
Gang

On Fri, Jun 9, 2023 at 12:24 AM Atour Mousavi Gourabi <at...@live.com> wrote:

> Dear all,
>
> The Java implementations of the Parquet readers and writers seem pretty
> tightly coupled to Hadoop (see: PARQUET-1822). For some projects, this can
> cause issues as it's an unnecessary and big dependency when you might just
> need to write to disk. Is there any appetite here for separating the Hadoop
> code and supporting more convenient ways to write to disk out of the box? I
> am willing to work on these changes but would like some pointers on whether
> such patches would be reviewed and accepted as PARQUET-1822 has been open
> for over three years now.
>
> Best regards,
> Atour Mousavi Gourabi
>