Thanks for the response, Micah. I could implement this and contribute it to Arrow Java. To help me get started, are there any pointers on how the C++ or Rust implementations currently read Parquet into Arrow? Do they read Parquet row by row and build Arrow batches, or is there a better way to implement this?
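
To make the question concrete, here is a minimal sketch of the row-by-row approach I have in mind, written in plain Python for brevity and with no Parquet or Arrow dependency. Everything in it is hypothetical: `rows_to_batches`, the dict-based rows, and the dict-of-lists batch are stand-ins for a Parquet record reader and Arrow vectors, not real APIs. Note that the filter and column selection here run only after each row has been fully decoded, which is the cost that real predicate/column pushdown would presumably avoid.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

# Hypothetical stand-ins: a decoded Parquet record and a columnar batch.
Row = Dict[str, Any]
Batch = Dict[str, List[Any]]


def rows_to_batches(
    rows: Iterable[Row],
    columns: List[str],
    predicate: Callable[[Row], bool],
    batch_size: int,
) -> Iterator[Batch]:
    """Pivot a stream of rows into columnar batches of at most batch_size rows."""
    batch: Batch = {name: [] for name in columns}
    count = 0
    for row in rows:
        # Filter after decoding the row (not a true pushdown).
        if not predicate(row):
            continue
        # Copy only the selected columns, one value at a time.
        for name in columns:
            batch[name].append(row[name])
        count += 1
        if count == batch_size:
            yield batch
            batch = {name: [] for name in columns}
            count = 0
    # Emit the final, possibly short, batch.
    if count:
        yield batch


rows = [{"id": i, "name": f"n{i}", "score": i * 10} for i in range(5)]
batches = list(
    rows_to_batches(rows, ["id", "score"], lambda r: r["score"] >= 20, batch_size=2)
)
```

The per-value copy in the inner loop is what I suspect is inefficient, so I am wondering whether the existing implementations instead decode each column in bulk.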

On Tue, Jul 30, 2019 at 1:56 PM Micah Kornfield <emkornfi...@gmail.com> wrote:

> Hi Anoop,
> There isn't currently anything in the Arrow Java library that does this.
> It is something that I think we want to add at some point. Dremio [1] has
> some Parquet related code, but I haven't looked at it to understand how
> easy it is to use as a standalone library and whether it supports predicate
> push-down/column selection.
>
> Thanks,
> Micah
>
> [1]
> https://github.com/dremio/dremio-oss/tree/master/sabot/kernel/src/main/java/com/dremio/exec/store/parquet
>
> On Sun, Jul 28, 2019 at 2:08 PM Anoop Johnson <anoop.k.john...@gmail.com>
> wrote:
>
> > Arrow Newbie here. What is the recommended way to convert Parquet data
> > into Arrow, preferably doing predicate/column pushdown?
> >
> > One can implement this as custom code using the Parquet API, and re-encode
> > it in Arrow using the Arrow APIs, but is this supported by Arrow out of the
> > box?
> >
> > Thanks,
> > Anoop