[ https://issues.apache.org/jira/browse/AVRO-1208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yin Huai updated AVRO-1208:
---------------------------
    Resolution: Won't Do
        Status: Resolved  (was: Patch Available)

> Improve Trevni's performance on row-oriented data access
> --------------------------------------------------------
>
>                 Key: AVRO-1208
>                 URL: https://issues.apache.org/jira/browse/AVRO-1208
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.7.3
>            Reporter: Yin Huai
>            Assignee: Yin Huai
>            Priority: Major
>         Attachments: AVRO-1208.1.patch, AVRO-1208.2.patch
>
>
> Trevni uses a 64KB internal buffer to store the values of a column. When
> accessing a column, it reads 64KB of data (ignoring compression and
> checksums) from the storage layer. However, when the table is accessed in
> a row-oriented fashion (an entire row needs to be handed over to the upper
> layer), in the worst case (a full table scan where all values of the table
> are the same size), every 64KB read can cause a seek.
> This issue is for discussing whether we should consider the data access
> pattern described above and, if so, how to improve the performance of Trevni.
> Row-oriented data processing engines, e.g. Hive, can benefit from this work.
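
The worst case described above can be illustrated with a small model. The following is only a sketch and does not use Trevni's API: the class RowScanSeekModel, its scan method, and the layout it assumes (columns stored back to back, fixed-size values, one 64KB buffer per column, a seek counted whenever a read does not start where the previous read ended) are all hypothetical simplifications made for illustration.

```java
class RowScanSeekModel {
    // Buffer size taken from the issue description.
    static final int BLOCK_BYTES = 64 * 1024;

    /**
     * Simulates a full row-oriented scan and returns {blockReads, seeks}.
     * valueBytes[c] is the fixed value size of column c (hypothetical input).
     */
    static long[] scan(int[] valueBytes, long rows) {
        int columns = valueBytes.length;
        long[] nextRead = new long[columns];   // next unread file offset per column
        long offset = 0;
        for (int c = 0; c < columns; c++) {
            nextRead[c] = offset;              // assumption: columns laid out back to back
            offset += valueBytes[c] * rows;
        }
        long[] buffered = new long[columns];   // values left in each column's buffer
        long position = 0;                     // where the storage layer currently points
        long blockReads = 0;
        long seeks = 0;
        for (long row = 0; row < rows; row++) {
            for (int c = 0; c < columns; c++) {
                if (buffered[c] == 0) {
                    // Refill this column's buffer with up to one block of values.
                    long perBlock = BLOCK_BYTES / valueBytes[c];
                    long count = Math.min(perBlock, rows - row);
                    if (nextRead[c] != position) {
                        seeks++;               // read does not continue the previous one
                    }
                    position = nextRead[c] + count * valueBytes[c];
                    nextRead[c] = position;
                    buffered[c] = count;
                    blockReads++;
                }
                buffered[c]--;                 // hand one value of this row to the caller
            }
        }
        return new long[] {blockReads, seeks};
    }

    public static void main(String[] args) {
        long[] result = scan(new int[] {8, 8, 8}, 32768);
        System.out.println("block reads = " + result[0] + ", seeks = " + result[1]);
    }
}
```

In this model, three columns of 8-byte values over 32768 rows need 12 block reads, and 11 of them start with a seek, because all column buffers run empty on the same row. A single-column scan over the same data needs 4 block reads and no seeks.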