[ https://issues.apache.org/jira/browse/DRILL-6301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16466455#comment-16466455 ]
salim achouche commented on DRILL-6301:
---------------------------------------

*Benchmark Results*
 * Updated the Drill JMH benchmark [here|https://github.com/sachouche/drill-jmh]
 * The benchmark results and conclusions have been published to this [document|https://docs.google.com/document/d/1BSNem_ItP-Vxlr6auSP_iwwOLM9rwWZYxGwCsXi-IE8/edit#heading=h.57coyirqkop6]

*In summary, it was concluded that*
 * The current Parquet flat reader performance was negatively impacted by the DrillBuf APIs when accessing a few bytes at a time
 * Using intermediary buffers addresses such performance issues because the data access pattern becomes bulk
 * Using bulk processing (within the reader) also has the advantage of minimizing processing overhead

> Parquet Performance Analysis
> ----------------------------
>
>                 Key: DRILL-6301
>                 URL: https://issues.apache.org/jira/browse/DRILL-6301
>             Project: Apache Drill
>          Issue Type: Task
>          Components: Storage - Parquet
>            Reporter: salim achouche
>            Assignee: salim achouche
>            Priority: Major
>             Fix For: 1.14.0
>
>
> _*Description -*_
>  * DRILL-5846 is meant to improve the Flat Parquet reader performance
>  * The associated implementation resulted in a 2x - 4x performance improvement
>  * However, a few key questions arose during the review process ([pull request|https://github.com/apache/drill/pull/1060])
>
> *_Intermediary Processing via Direct Memory vs Byte Arrays_*
>  * The main reasons for using byte arrays for intermediary processing are to
> a) avoid the high cost of the DrillBuf checks (especially the reference
> counting) and b) benefit from some observed Java optimizations when accessing
> byte arrays
>  * Starting with version 1.12.0, the DrillBuf enablement checks have been
> refined so that memory access and reference counting checks can be enabled
> independently
>  * Benchmarking of Java's Direct Memory unsafe method using JMH indicates that the
> performance gap between heap and direct memory is very narrow except for a few
> use cases
>  * There are also concerns that the extra copy step (from direct memory into
> byte arrays) will have a negative effect on performance; note that this
> overhead was not observed using Intel's VTune, as the intermediary buffers were
> a) pinned to a single CPU, b) reused, and c) small enough to remain in the L1
> cache during columnar processing.
>
> _*Goal*_
>  * The Flat Parquet reader is among the few Drill columnar operators
>  * It is imperative that we agree on the optimal processing pattern so
> that the decisions we take within this Jira are applied not only to
> Parquet but to all Drill columnar operators
>
> _*Methodology*_
>  # Assess the performance impact of using intermediary byte arrays (as
> described above)
>  # Prototype a solution using Direct Memory with three DrillBuf check settings:
> all checks off, access checks on, and all checks on
>  # Make an educated decision on which processing pattern should be adopted
>  # Decide whether it is ok to use Java's unsafe API (and through what
> mechanism) on byte arrays (when the use of byte arrays is a necessity)
>
>
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
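
The two access patterns contrasted in the summary (touching direct memory a few bytes at a time versus copying chunks into a small, reused intermediary byte array) can be sketched roughly as below. This is only an illustration under stated assumptions: a plain `java.nio.ByteBuffer` stands in for DrillBuf, and the class name `BulkReadSketch` and its methods are hypothetical, not code from the Drill reader or from the benchmark repository. It shows the shape of the two loops and makes no claim about their relative speed.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: java.nio.ByteBuffer is an assumed stand-in for DrillBuf.
final class BulkReadSketch {

    // Pattern 1: read one byte at a time straight from direct memory.
    // Every get(i) call goes through the buffer API and its bounds check.
    static long sumPerByte(ByteBuffer direct) {
        long sum = 0;
        for (int i = 0; i < direct.limit(); i++) {
            sum += direct.get(i) & 0xFF;
        }
        return sum;
    }

    // Pattern 2: copy a chunk into a small, reused heap array with one bulk
    // call, then process the array in a tight loop.
    static long sumBulk(ByteBuffer direct, byte[] scratch) {
        ByteBuffer view = direct.duplicate(); // shares memory, independent position
        view.position(0);
        long sum = 0;
        while (view.hasRemaining()) {
            int n = Math.min(scratch.length, view.remaining());
            view.get(scratch, 0, n);          // one bulk copy per chunk
            for (int i = 0; i < n; i++) {
                sum += scratch[i] & 0xFF;
            }
        }
        return sum;
    }
}
```

Both methods compute the same result; only the way the buffer is accessed differs. The `scratch` array is passed in by the caller so that it can be allocated once and reused across calls, mirroring the "reused" and "small enough to remain in the L1 cache" conditions mentioned in the issue description.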