[ https://issues.apache.org/jira/browse/ARROW-6720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064908#comment-17064908 ]
Wes McKinney commented on ARROW-6720:
-------------------------------------

Where do things stand on this project?

> [JAVA][C++]Support Parquet Read and Write in Java
> -------------------------------------------------
>
>                 Key: ARROW-6720
>                 URL: https://issues.apache.org/jira/browse/ARROW-6720
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++, Java
>    Affects Versions: 0.15.0
>            Reporter: Chendi.Xue
>            Assignee: Chendi.Xue
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.17.0
>
>          Time Spent: 37h 20m
>  Remaining Estimate: 0h
>
> We added a new Java interface to support Parquet read and write from HDFS or
> a local file.
> The motivation is that when loading and dumping Parquet data in Java, we can
> currently only use row-based put and get methods. Since Arrow already has a
> C++ implementation for loading and dumping Parquet, we wrapped that code as
> Java APIs.
> In tests on our workload, performance improved by more than 2x compared with
> row-based load and dump, so we would like to contribute this code to Arrow.
> Since this is a completely independent change, no existing Arrow code is
> modified. We added two directories: java/adapter/parquet and
> cpp/src/jni/parquet

--
This message was sent by Atlassian Jira
(v8.3.4#803005)