[ https://issues.apache.org/jira/browse/HIVE-26699?focusedWorklogId=833903&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-833903 ]

ASF GitHub Bot logged work on HIVE-26699:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 15/Dec/22 16:46
            Start Date: 15/Dec/22 16:46
    Worklog Time Spent: 10m
      Work Description: ayushtkn commented on code in PR #3862:
URL: https://github.com/apache/hive/pull/3862#discussion_r1049893472


##########
iceberg/iceberg-shading/pom.xml:
##########
@@ -112,7 +112,11 @@
                   <include>com.google*:*</include>
                   <include>com.fasterxml*:*</include>
                   <include>com.github.ben-manes*:*</include>
+                  <include>org.apache.hive:patched-iceberg-core</include>

Review Comment:
   as part of this:
   ```
   <include>org.apache.iceberg:*</include>
   ```
   I think I should add the iceberg-core as part of exclusion as well


Issue Time Tracking
-------------------

    Worklog Id:     (was: 833903)
    Time Spent: 1h  (was: 50m)

> Iceberg: S3 fadvise can hurt JSON parsing significantly in DWX
> --------------------------------------------------------------
>
>                 Key: HIVE-26699
>                 URL: https://issues.apache.org/jira/browse/HIVE-26699
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Hive reads JSON metadata information (TableMetadataParser::read()) multiple
> times, e.g. during query compilation, AM split computation, stats computation,
> and commits.
>
> With large JSON files (due to multiple inserts), reading takes much longer
> on the S3 FS when "fs.s3a.experimental.input.fadvise" is set to "random" (e.g. on
> the order of 10x). To be on the safer side, it would be good to set this to
> "normal" mode in configs when reading Iceberg tables.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)