Thanks, Ryan, for confirming my suspicions. That would certainly make a quick sample application easier to achieve from an adoption perspective.
I had just put this JIRA in. I'll leave it open for anyone to jump in on:

https://issues.apache.org/jira/browse/PARQUET-1776

Thanks,
David

On Fri, Jan 24, 2020 at 12:08 PM Ryan Blue <[email protected]> wrote:

> There's not currently a way to do this without Hadoop. We've been working
> on moving to the `InputFile` and `OutputFile` abstractions so that we can
> get rid of it, but Parquet still depends on Hadoop libraries for
> compression, and we haven't pulled out the parts of Parquet that use the
> new abstraction from the older ones that accept Hadoop Paths, so you need
> to have Hadoop on your classpath either way.
>
> To get to where you can write a file without Hadoop dependencies, I think
> we need to create a new module that parquet-hadoop will depend on, holding
> the `InputFile`/`OutputFile` implementation. Then we would refactor the
> Hadoop classes to extend those implementations so that nothing breaks.
> We'd also need to implement the compression API directly on top of
> aircompressor in this module.
>
> On Thu, Jan 23, 2020 at 4:40 PM David Mollitor <[email protected]> wrote:
>
> > I am usually a user of Parquet through Hive or Spark, but I wanted to
> > sit down and write my own small example application that uses the
> > library directly.
> >
> > Is there some quick way to write a Parquet file to the local file
> > system using java.nio.Path (i.e., with no Hadoop dependencies)?
> >
> > Thanks!
>
> --
> Ryan Blue
> Software Engineer
> Netflix
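For anyone who wants to pick up PARQUET-1776, here is a rough sketch of what a local-filesystem `OutputFile` might look like, built on the `OutputFile` and `PositionOutputStream` types in parquet-common. The class name `NioOutputFile` is just for illustration, and, per Ryan's note, Hadoop still has to be on the classpath for compression, so this by itself doesn't remove the dependency:

import org.apache.parquet.io.OutputFile;
import org.apache.parquet.io.PositionOutputStream;

import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch: an OutputFile backed by java.nio.file.Path (no Hadoop FileSystem).
public class NioOutputFile implements OutputFile {
  private final Path path;

  public NioOutputFile(Path path) {
    this.path = path;
  }

  @Override
  public PositionOutputStream create(long blockSizeHint) throws IOException {
    // Fail if the file already exists, matching "create" semantics.
    return wrap(Files.newOutputStream(path,
        StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE));
  }

  @Override
  public PositionOutputStream createOrOverwrite(long blockSizeHint) throws IOException {
    return wrap(Files.newOutputStream(path,
        StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING,
        StandardOpenOption.WRITE));
  }

  @Override
  public boolean supportsBlockSize() {
    // Local files have no HDFS-style block size.
    return false;
  }

  @Override
  public long defaultBlockSize() {
    return 0;
  }

  // Adapt a plain OutputStream to PositionOutputStream by counting bytes.
  private static PositionOutputStream wrap(OutputStream out) {
    return new PositionOutputStream() {
      private long pos = 0;

      @Override
      public long getPos() {
        return pos;
      }

      @Override
      public void write(int b) throws IOException {
        out.write(b);
        pos++;
      }

      @Override
      public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
        pos += len;
      }

      @Override
      public void flush() throws IOException {
        out.flush();
      }

      @Override
      public void close() throws IOException {
        out.close();
      }
    };
  }
}

In principle, any writer builder that accepts an OutputFile (for example, AvroParquetWriter.builder) could then write to new NioOutputFile(path) directly, though as discussed above the Hadoop jars still end up on the classpath today.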
