[jira] [Work logged] (HIVE-23553) Upgrade ORC version to 1.6.7

ASF GitHub Bot (Jira) Mon, 01 Feb 2021 14:10:25 -0800


     [ 
https://issues.apache.org/jira/browse/HIVE-23553?focusedWorklogId=545643&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-545643
 ]


ASF GitHub Bot logged work on HIVE-23553:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 01/Feb/21 22:09
            Start Date: 01/Feb/21 22:09
    Worklog Time Spent: 10m 
      Work Description: pgaref commented on a change in pull request #1823:
URL: https://github.com/apache/hive/pull/1823#discussion_r568174337



##########
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##########
@@ -4509,7 +4509,7 @@ private static void populateLlapDaemonVarsSet(Set<String> 
llapDaemonVarsSetLocal
         "Minimum allocation possible from LLAP buddy allocator. Allocations 
below that are\n" +
         "padded to minimum allocation. For ORC, should generally be the same 
as the expected\n" +
         "compression buffer size, or next lowest power of 2. Must be a power 
of 2."),
-    LLAP_ALLOCATOR_MAX_ALLOC("hive.llap.io.allocator.alloc.max", "16Mb", new 
SizeValidator(),
+    LLAP_ALLOCATOR_MAX_ALLOC("hive.llap.io.allocator.alloc.max", "4Mb", new 
SizeValidator(),

Review comment:
       LLAP_ALLOCATOR_MAX_ALLOC is used both for the LowLevelCacheImpl 
(buddyAllocator) and bufferSize on 
[WriterOptions](https://github.com/apache/hive/blob/da1aa077716a65c2a02d850828b16cdeece1f574/llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/SerDeEncodedDataReader.java#L1553)
 
   
   Please check how this propagated from 
[SerDeEncodedDataReader](https://github.com/apache/hive/blob/da1aa077716a65c2a02d850828b16cdeece1f574/llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/SerDeEncodedDataReader.java#L248)
 
   
   Llap is tightly coupled to ORC, thus it could make sense to use the same 
buffer size for serialized Buffers, and the ORC writer as we would not need to 
split/merge them  -- however I have nothing against splitting the conf or 
checking is the 8Mb limit is a hard one.
   All I am trying to say here is that this is orthogonal the ORC version bump.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 545643)
    Time Spent: 7h  (was: 6h 50m)

> Upgrade ORC version to 1.6.7
> ----------------------------
>
>                 Key: HIVE-23553
>                 URL: https://issues.apache.org/jira/browse/HIVE-23553
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Panagiotis Garefalakis
>            Assignee: Panagiotis Garefalakis
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 7h
>  Remaining Estimate: 0h
>
>  Apache Hive is currently on 1.5.X version and in order to take advantage of 
> the latest ORC improvements such as column encryption we have to bump to 
> 1.6.X.
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12343288&styleName=&projectId=12318320&Create=Create&atl_token=A5KQ-2QAV-T4JA-FDED_4ae78f19321c7fb1e7f337fba1dd90af751d8810_lin
> Even though ORC reader could work out of the box, HIVE LLAP is heavily 
> depending on internal ORC APIs e.g., to retrieve and store File Footers, 
> Tails, streams – un/compress RG data etc. As there ware many internal changes 
> from 1.5 to 1.6 (Input stream offsets, relative BufferChunks etc.) the 
> upgrade is not straightforward.
> This Umbrella Jira tracks this upgrade effort.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Work logged] (HIVE-23553) Upgrade ORC version to 1.6.7

Reply via email to