I'd like to simplify this discussion and start with clarity of use case. If
we're talking about a Java developer using the datasets API in a java
application, we should respect the Java direct memory size limits set via
-XX:MaxDirectMemorySize. Doing something else would violate the principle
of least surprise for most Java developers.

If we're talking about someone writing a C++ application that embeds a JVM
for some reason (e.g. UDF calculation), I don't think this is a concern.

My sense is the goal of the Java-based Dataset API is the former use case,
not the latter. In that case, we should behave like a "good" Java citizen.
If you use something like Snappy in Java (a JNI-based Java library), all
non-trivial memory allocation happens within the direct memory size setting
of the JVM, so the user doesn't have to think about how the library is
implemented.
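To illustrate what "within the direct memory size setting" means in practice, here is a small standalone sketch (standard JMX APIs only, nothing Arrow-specific): direct allocations made through the JDK show up in the "direct" buffer pool, which is the same accounting that -XX:MaxDirectMemorySize bounds.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

public class DirectMemoryVisibility {
    public static void main(String[] args) {
        // The "direct" buffer pool tracks memory that counts against
        // -XX:MaxDirectMemorySize.
        BufferPoolMXBean direct = ManagementFactory
                .getPlatformMXBeans(BufferPoolMXBean.class).stream()
                .filter(b -> "direct".equals(b.getName()))
                .findFirst().orElseThrow(IllegalStateException::new);

        long before = direct.getMemoryUsed();
        // A 1 MiB direct allocation is immediately visible to the JVM's
        // direct-memory accounting.
        ByteBuffer buf = ByteBuffer.allocateDirect(1 << 20);
        long after = direct.getMemoryUsed();

        System.out.println("direct pool grew by " + (after - before)
                + " bytes; buffer is direct: " + buf.isDirect());
    }
}
```

Buffers handed over from C++ bypass this accounting entirely unless the Java side explicitly reserves against it, which is the crux of the thread below.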

Per my other comments, I have less of an opinion about the how. In general,
if you're focused on performance, chunk-based allocation should help a lot.
For example, many allocators request 2MB chunks from the OS rather than
making a separate allocation for every 10KB you might want. I'd suggest you
hook in there to inform Java. It is more "correct" and also incurs less
overhead.
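As a rough plain-Java sketch of that idea (hypothetical names, not an existing Arrow or Netty API): the native side calls back into Java once per chunk it acquires from the OS, rather than once per small buffer.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Sketch of chunk-granularity accounting (hypothetical, for illustration):
 * the native allocator reserves one 2 MiB chunk at a time against a
 * Java-side budget, then carves small buffers out of the chunk without
 * any further JNI accounting calls.
 */
public class ChunkReservation {
    static final long CHUNK_SIZE = 2L << 20; // 2 MiB, like many native allocators

    private final long limit;
    private final AtomicLong reserved = new AtomicLong();

    public ChunkReservation(long limit) {
        this.limit = limit;
    }

    /** Called once per native chunk (e.g. via JNI), not once per buffer. */
    public boolean tryReserveChunk() {
        long prev;
        do {
            prev = reserved.get();
            if (prev + CHUNK_SIZE > limit) {
                return false; // reservation would exceed the budget
            }
        } while (!reserved.compareAndSet(prev, prev + CHUNK_SIZE));
        return true;
    }

    /** Called when the native side returns a chunk to the OS. */
    public void releaseChunk() {
        reserved.addAndGet(-CHUNK_SIZE);
    }

    public long reservedBytes() {
        return reserved.get();
    }
}
```

Small buffers carved out of an already-reserved chunk need no further accounting, so the JNI crossing cost scales with the number of chunks rather than the number of buffers.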

I agree with Micah wrt Netty: decoupling core Arrow from Netty but then
making the Dataset API use a random Netty API for memory accounting doesn't
make a lot of sense.

On Mon, Jul 20, 2020 at 3:52 AM Hongze Zhang <notify...@126.com> wrote:

> Hi,
>
> I want to follow up on the discussion[1] in the pending PR[2] for the
> Java Dataset (it's no longer "Datasets", I guess?) API.
>
>
> - Background:
>
> We are transferring C++ Arrow buffers to Java-side BufferAllocators. We
> should decide whether to use -XX:MaxDirectMemorySize as a limit on these
> buffers. If yes, what would the desired solution be?
>
> - Possible alternative solutions so far:
>
> 1. Reserve from Bits.java from Java side
>
> Pros: shares the memory counter with JVM direct byte buffers; no JNI
> overhead; less code
> Cons: more invocations (one call to Bits#reserveMemory per buffer)
>
> 2. Reserve from Bits.java from C++ side
>
> Pros: shares the memory counter with JVM direct byte buffers; fewer
> invocations (e.g. with jemalloc we can perform one call per underlying
> chunk)
> Cons: JNI overhead; more code
>
> 3. Reserve from Netty's PlatformDependent.java from Java side
>
> Pros: shares the memory counter with Netty-based buffers; no JNI
> overhead; less code
> Cons: more invocations
>
> 4. Reserve from Netty's PlatformDependent.java from C++ side
>
> Pros: shares the memory counter with Netty-based buffers; fewer invocations
> Cons: JNI overhead; more code
>
> 5. Implement none of the above; respect the BufferAllocator's limit
> only.
>
>
> So far I prefer 5, i.e. not adopting any of the other solutions. I am not
> sure "direct memory" is a good indicator for these off-heap buffers,
> because we would ultimately have to decide whether to share the counter
> with JVM direct byte buffers or with Netty-based buffers. As far as I can
> tell, a complete solution would ideally either maintain a global counter
> for all types of off-heap buffers, or give each type an individual counter.
>
> So do you have any thoughts or suggestions on this topic? It would be
> great if we could reach a conclusion soon, as the PR has been blocked for
> some time. Thanks in advance :)
>
>
> Best,
> Hongze
>
> [1] https://github.com/apache/arrow/pull/7030#issuecomment-657096664
> [2] https://github.com/apache/arrow/pull/7030
>
