Hi Fokko,

Thanks for looking into this! I generally agree that we should probably
retire parquet-thrift. The only concern is that we need to find out what is
still using it, which is hard to do given the large user base of parquet-mr.
What we did earlier was to mark the module as deprecated first and then
remove it officially one release later. But I don't know whether that
process would block you for too long.

Xinli

On Thu, Sep 28, 2023 at 2:20 AM Fokko Driesprong <[email protected]> wrote:

> Hey Gang,
>
> It is also used in some of the code:
>
>    - org.apache.parquet.hadoop.thrift.AbstractThriftWriteSupport
>    - org.apache.parquet.thrift.AbstractThriftWriteSupport
>    - org.apache.parquet.thrift.ThriftSchemaConverter
>    - org.apache.parquet.thrift.TupleToThriftWriteSupport
>
> Yesterday I tried to factor it out, but I ended up removing most of the
> codebase. I'm not aware of any alternative to Elephantbird. I tried to ping
> the original author
> <https://github.com/apache/parquet-mr/pull/1068#issuecomment-1729434254>,
> but the GitHub account seems to be abandoned.
>
> Kind regards,
> Fokko
>
> On Thu, Sep 28, 2023 at 11:13 AM Gang Wu <[email protected]> wrote:
>
> > Hi Fokko,
> >
> > Is there any alternative to Elephantbird? Since it is only used in the
> > tests, could we rewrite those test cases using the alternative, if there
> > is one? The effort may be huge, though.
> >
> > Best,
> > Gang
> >
> > On Thu, Sep 28, 2023 at 5:03 PM Fokko Driesprong <[email protected]> wrote:
> >
> > > Hi everyone,
> > >
> > > I was in the process of updating to the latest version of Thrift
> > > <https://github.com/apache/parquet-mr/pull/1138> (from 0.16.0 to 0.19.0),
> > > mostly because the current version has known CVEs and because the newer
> > > one makes the release process easier: you don't have to install Thrift
> > > from source (it is just available on Homebrew etc.).
> > >
> > > While working on this, I ran into an issue with Elephantbird, which is
> > > using a very old version of Thrift (0.7.0). Trying to bump this, I noticed
> > > that a lot of the classes that we use in the tests have been made private
> > > <https://github.com/apache/parquet-mr/pull/1156>. Therefore it is hard to
> > > test whether we break anything.
> > >
> > > It looks like parquet-thrift is not used by anyone anymore
> > > <https://mvnrepository.com/artifact/org.apache.parquet/parquet-thrift>.
> > > I would suggest removing the module from the repository
> > > <https://github.com/apache/parquet-mr/pull/1158> unless anyone objects.
> > >
> > > Kind regards, Fokko
> > >
> >
>


-- 
Xinli Shang
