----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/60289/#review179152 -----------------------------------------------------------
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java Line 2974 (original) <https://reviews.apache.org/r/60289/#comment253877> Do we need this for offheap cache? For smaller cache, we don't want metadata objects taking up 100% of cache. As long as we are storing only the serialized footers, this shouldn't be a problem. llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java Lines 614 (patched) <https://reviews.apache.org/r/60289/#comment253691> can this all be baked in OrcFileMetadata class? since stats, stripes are derived from tail, will be useful if we can just store OrcTail in OrcFileMetadata and lazily construct stats and stripes when getter is invoked. llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java Lines 715 (patched) <https://reviews.apache.org/r/60289/#comment253764> nit: remove or add to trace log? llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcMetadataCache.java Line 142 (original), 116 (patched) <https://reviews.apache.org/r/60289/#comment253878> nit: can metadata and stripeMetadata be merged to single CHM? buffer needs to implement a getKey() interface method. notifyEvicted could be merged then. llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcMetadataCache.java Lines 143 (patched) <https://reviews.apache.org/r/60289/#comment253879> same as below. get/put methods looks duplicated which could be merged if Object and OrcBatchKey can implement common interface. ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java Lines 133 (patched) <https://reviews.apache.org/r/60289/#comment253880> nit: rename to reader or file schema? ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java Lines 212 (patched) <https://reviews.apache.org/r/60289/#comment253881> get enum length instead? adding new streams to ORC will break this. ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java Lines 1685 (patched) <https://reviews.apache.org/r/60289/#comment253882> This looks complicated. It will be better if ORC can provide API that returns stripe footer and indexes as ByteBuffer which can be directly cached. Stripe footers and Indexes could be stored with medium priority. Priorities could be: Serialized file footer - HIGH (this is required to not choke NN, with config change this is already part of split) Index + Stripe footer - MEDIUM (with locality re-reading these will not be a problem) Data - LOW (same as reading index, stripe footer) Since backward seeks no longer close connections for cloud storage, reading index and stripe could be done faster. I think it would be easier, if ORC and parquet readers can provide 2 high level interfaces - Interface to read footers, index as ByteBuffers which LLAP will cache - Reader interface to accept ByteBuffer from which footers and index can be read which LLAP or file will provide ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/Reader.java Lines 52 (patched) <https://reviews.apache.org/r/60289/#comment253883> These interfaces are already in org.apache.orc.Reader - Prasanth_J On June 21, 2017, 8:31 p.m., Sergey Shelukhin wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/60289/ > ----------------------------------------------------------- > > (Updated June 21, 2017, 8:31 p.m.) > > > Review request for hive, Gopal V and Prasanth_J. > > > Repository: hive-git > > > Description > ------- > > see jira > > > Diffs > ----- > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java be38f381e6 > > llap-server/src/java/org/apache/hadoop/hive/llap/cache/EvictionDispatcher.java > c73f1a1a7d > > llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java > 53c9bae5c1 > > llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java > 2a76f5c4da > > llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java > dc053ee7cf > > llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileMetadata.java > b9d7a77d5b > > llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcMetadataCache.java > 601b622b49 > > llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcStripeMetadata.java > 4565d11988 > > llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestIncrementalObjectSizeEstimator.java > 13c7767a3b > > llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestOrcMetadataCache.java > 03a955c6f7 > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 0ef7c758d4 > ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReader.java > 7540e72b53 > ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java > d5807b77e2 > ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/Reader.java 31b0609b83 > ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/ReaderImpl.java > 4856fb3ceb > ql/src/test/results/clientpositive/llap/orc_llap_counters.q.out 8af84dce19 > ql/src/test/results/clientpositive/llap/orc_llap_counters1.q.out 4536cbbfb9 > ql/src/test/results/clientpositive/llap/orc_ppd_basic.q.out cd7a392e08 > ql/src/test/results/clientpositive/llap/orc_ppd_schema_evol_3a.q.out > b799527e30 > > > Diff: https://reviews.apache.org/r/60289/diff/1/ > > > Testing > ------- > > > Thanks, > > Sergey Shelukhin > >