I have several questions and concerns regarding DrillBuf usage, design
and implementation. There is limited documentation available on the
subject (Javadoc,
https://github.com/apache/drill/blob/master/exec/memory/base/src/main/java/org/apache/drill/exec/memory/README.md
and https://github.com/paul-rogers/drill/wiki/Memory-Management), and I
hope that a few members of the community may have more information.
What are the design goals behind DrillBuf? It seems to be intended as
Drill's access gate for direct byte buffers. How is it different
(for that goal) from UnsafeDirectLittleEndian? Both use the
wrapper/delegation pattern, with DrillBuf delegating to
UnsafeDirectLittleEndian (not always) and UnsafeDirectLittleEndian
delegating to the ByteBuf it wraps. Is it necessary to have both? Are
there any out-of-the-box netty classes that already provide the required
functionality? I guess that the answer to the last question was "no"
back when DrillBuf and UnsafeDirectLittleEndian were introduced into
Drill. Is it still "no" for the latest netty release? What extra
functionality do DrillBuf and UnsafeDirectLittleEndian provide on top of
the existing netty classes?
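
To make the layering question concrete, here is a minimal sketch of a
two-level wrapper/delegation chain. All class names (RawBuffer,
DirectRawBuffer, LittleEndianWrapper, SliceBuffer) are hypothetical, and
the responsibility given to each layer (byte order in the middle, an
offset/length window on the outside) is an assumption made for
illustration only, not a description of what DrillBuf or
UnsafeDirectLittleEndian actually do.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-in for the role played by netty's ByteBuf.
interface RawBuffer {
    int getInt(int index);
    void setInt(int index, int value);
    int capacity();
}

// Innermost layer: owns the direct memory and stores ints big-endian.
final class DirectRawBuffer implements RawBuffer {
    private final ByteBuffer memory;

    DirectRawBuffer(int capacity) {
        this.memory = ByteBuffer.allocateDirect(capacity).order(ByteOrder.BIG_ENDIAN);
    }

    @Override public int getInt(int index) { return memory.getInt(index); }
    @Override public void setInt(int index, int value) { memory.putInt(index, value); }
    @Override public int capacity() { return memory.capacity(); }
}

// Middle layer (assumed role): swaps byte order, delegates everything else.
final class LittleEndianWrapper implements RawBuffer {
    private final RawBuffer wrapped;

    LittleEndianWrapper(RawBuffer wrapped) { this.wrapped = wrapped; }

    @Override public int getInt(int index) {
        return Integer.reverseBytes(wrapped.getInt(index));
    }
    @Override public void setInt(int index, int value) {
        wrapped.setInt(index, Integer.reverseBytes(value));
    }
    @Override public int capacity() { return wrapped.capacity(); }
}

// Outer layer (assumed role): exposes an offset/length window over its delegate.
final class SliceBuffer implements RawBuffer {
    private final RawBuffer delegate;
    private final int offset;
    private final int length;

    SliceBuffer(RawBuffer delegate, int offset, int length) {
        this.delegate = delegate;
        this.offset = offset;
        this.length = length;
    }

    @Override public int getInt(int index) { return delegate.getInt(offset + index); }
    @Override public void setInt(int index, int value) { delegate.setInt(offset + index, value); }
    @Override public int capacity() { return length; }
}
```

Every access through the outer object costs two extra virtual calls
before it reaches memory, which is part of why I ask whether both layers
are needed.
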
As far as I can see from the source code, DrillBuf changes the
validation mechanism (boundary and reference count checks), making it
optional (compared to the always-enabled boundary checks inside netty)
for get/set Byte/Char/Short/Long/Float/Double. Is this the proper place
to make validation optional, or must the validation (or a portion of it)
be always on or always off? There are different opinions, see
https://issues.apache.org/jira/browse/DRILL-6004,
https://issues.apache.org/jira/browse/DRILL-6202,
https://github.com/apache/drill/pull/1060 and
https://github.com/apache/drill/pull/1144. Are there any performance
benchmarks that justify or explain such behavior? If no such benchmark
exists, are there any volunteers to do one? My experience is that the
reference count check is significantly more expensive than boundary
checking, and that boundary checking adds tens of percent to a direct
memory read when reading just a few bytes, so my vote is to keep
validation optional, with the ability to enable it for debug purposes at
run-time. Why does the same approach not apply to get/set Bytes? Those
methods are delegated to UnsafeDirectLittleEndian, which delegates them
further.
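
For reference, this is roughly what I mean by optional validation. It is
a minimal sketch, not DrillBuf code: the class name
OptionallyCheckedBuffer, its fields and the constructor flag are all
hypothetical. The buffer is a window (offset/length) over larger
storage, so with checks off an out-of-window index silently reads
neighboring bytes. Note that java.nio.ByteBuffer still performs its own
checks against the full storage; a real implementation would presumably
use an unchecked memory access instead.

```java
import java.nio.ByteBuffer;

// Hypothetical buffer with validation that can be switched on for debugging.
final class OptionallyCheckedBuffer {
    private final ByteBuffer storage;   // backing memory, possibly shared
    private final int offset;           // start of this buffer's window
    private final int length;           // size of this buffer's window
    private final boolean checksEnabled;
    private int refCnt = 1;

    OptionallyCheckedBuffer(ByteBuffer storage, int offset, int length,
                            boolean checksEnabled) {
        this.storage = storage;
        this.offset = offset;
        this.length = length;
        this.checksEnabled = checksEnabled;
    }

    long getLong(int index) {
        if (checksEnabled) {
            ensureAccessible();
            checkIndex(index, Long.BYTES);
        }
        return storage.getLong(offset + index);
    }

    void setLong(int index, long value) {
        if (checksEnabled) {
            ensureAccessible();
            checkIndex(index, Long.BYTES);
        }
        storage.putLong(offset + index, value);
    }

    void release() {
        refCnt--;
    }

    // Reference count check: the buffer must not have been released.
    private void ensureAccessible() {
        if (refCnt <= 0) {
            throw new IllegalStateException("buffer already released");
        }
    }

    // Boundary check: [index, index + width) must lie inside the window.
    private void checkIndex(int index, int width) {
        if (index < 0 || index > length - width) {
            throw new IndexOutOfBoundsException(
                "index " + index + ", width " + width + ", length " + length);
        }
    }
}
```

In this sketch the flag is a constructor argument only to keep the
example easy to exercise; whether the switch should be per buffer, a
static constant or a run-time option is exactly the open question.
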
Why does DrillBuf reverse how AbstractByteBuf calls _get from get (and
_set from set), making _get call get (and _set call set)? Why not follow
the base class design pattern?
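
The difference in call direction can be sketched as follows. The classes
BaseBuffer, ConventionalBuffer and ReversedBuffer are hypothetical
simplifications, and the sumInts method and the check counter exist only
to make one possible consequence visible: when a base-class method
validates a range once and then calls the _get hook per element, a
reversed hook runs the validation again on every call.

```java
import java.nio.ByteBuffer;

// Hypothetical base class: public get validates, then calls the unchecked hook.
abstract class BaseBuffer {
    int checkCount = 0; // counts validations, for illustration only

    public int getInt(int index) {
        checkIndex(index, Integer.BYTES);
        return _getInt(index);
    }

    // Bulk helper: validates the whole range once, then uses the hook.
    public long sumInts(int index, int count) {
        checkIndex(index, count * Integer.BYTES);
        long sum = 0;
        for (int i = 0; i < count; i++) {
            sum += _getInt(index + i * Integer.BYTES);
        }
        return sum;
    }

    protected abstract int _getInt(int index);

    public abstract int capacity();

    protected final void checkIndex(int index, int width) {
        checkCount++;
        if (index < 0 || index > capacity() - width) {
            throw new IndexOutOfBoundsException("index " + index + ", width " + width);
        }
    }
}

// Follows the base-class pattern: only the unchecked hook is implemented.
final class ConventionalBuffer extends BaseBuffer {
    private final ByteBuffer storage;

    ConventionalBuffer(ByteBuffer storage) { this.storage = storage; }

    @Override protected int _getInt(int index) { return storage.getInt(index); }
    @Override public int capacity() { return storage.capacity(); }
}

// Reverses the direction: get does the work, and the hook calls get.
final class ReversedBuffer extends BaseBuffer {
    private final ByteBuffer storage;
    private final boolean checksEnabled;

    ReversedBuffer(ByteBuffer storage, boolean checksEnabled) {
        this.storage = storage;
        this.checksEnabled = checksEnabled;
    }

    @Override public int getInt(int index) {
        if (checksEnabled) {
            checkIndex(index, Integer.BYTES);
        }
        return storage.getInt(index);
    }

    @Override protected int _getInt(int index) { return getInt(index); }
    @Override public int capacity() { return storage.capacity(); }
}
```

With four ints in the buffer, sumInts(0, 4) validates once in
ConventionalBuffer but five times in ReversedBuffer when checks are
enabled (once for the range and once per element).
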
Another question is the usage of the netty "io.netty.buffer" package for
Drill classes. Is this absolutely necessary? I don't think that netty
developers expect this or guarantee semantic version compatibility for
package-private classes/members.
Thank you,
Vlad
- [DISCUSS] DrillBuf Vlad Rozov