I don't know about the rest of these tasks, but sharing data between Arrow Java and C++ should definitely use the C data interface.

It seems there's work in progress here; feel free to collaborate:
https://issues.apache.org/jira/browse/ARROW-12965

Regards

Antoine.


On 04/08/2021 at 17:45, Micah Kornfield wrote:
Hi Hongze,
Sorry, I started taking a look at these a while ago, but my focus has been
elsewhere with the time I have available to contribute to the project.  One
thing that could also help: if any of the PRs can be divided into smaller
standalone components, that would likely get them merged sooner (I seem to
recall at least one PR both reworked how memory management works between
C++ and Java and added more functionality for datasets; apologies if I am
misremembering).

If other people have time to review, that would be great.

Thanks,
Micah

On Wed, Aug 4, 2021 at 6:11 AM Wes McKinney <wesmck...@gmail.com> wrote:

hi Hongze — I am not sure who will be able to review these, but in the
future feel free to raise your Java PRs on the mailing list even
sooner, no need to wait for more than a month. There are far fewer
active Java developers vs. C++ or Rust, so it can help to get people's
attention on your work.

- Wes

On Tue, Aug 3, 2021 at 9:44 PM Hongze Zhang <notify...@126.com> wrote:

Hi,

I have some PRs that improve the Dataset API's Java implementation and
have not been reviewed for months. Could someone help me review
them? Thanks in advance.

1. https://github.com/apache/arrow/pull/10201
ARROW-11776: [Java][Dataset] Support writing to files within dataset
scanner via JNI
2. https://github.com/apache/arrow/pull/10333
ARROW-12607: [Website] Doc section for Dataset Java bindings
3. https://github.com/apache/arrow/pull/10114
ARROW-12480: [Java][Dataset] FileSystemDataset: Support reading from a
directory
4. https://github.com/apache/arrow/pull/10652
ARROW-13257: [Java][Dataset] Allow passing empty columns for projection

One of the most critical changes among the PRs adds write support
to the Java API (the first in the list). It also includes some work that
builds a common way to share Arrow data between C++ and Java over JNI.
This work is also pretty close to the proposal in ARROW-7272[1].

The other PRs are minor improvements, like the second one, which creates
a Java Dataset doc page on the Arrow website. It has also received some
review comments already.

Thanks,
Hongze

[1] https://issues.apache.org/jira/browse/ARROW-7272


