[jira] [Work logged] (HIVE-24736) Make buffer tracking in LLAP cache with BP wrapper more accurate

ASF GitHub Bot (Jira) Mon, 08 Feb 2021 03:30:06 -0800


     [ 
https://issues.apache.org/jira/browse/HIVE-24736?focusedWorklogId=549518&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-549518
 ]


ASF GitHub Bot logged work on HIVE-24736:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Feb/21 11:29
            Start Date: 08/Feb/21 11:29
    Worklog Time Spent: 10m 
      Work Description: szlta commented on a change in pull request #1951:
URL: https://github.com/apache/hive/pull/1951#discussion_r571973578



##########
File path: llap-server/src/java/org/apache/hadoop/hive/llap/cache/LowLevelLrfuCachePolicy.java
##########
@@ -836,13 +912,33 @@ public String description() {
      * @return long array with LRFU stats
      */
     public long[] getUsageStats() {
-      long dataOnHeap = 0L;   // all non-meta related buffers on min-heap
-      long dataOnList = 0L;   // all non-meta related buffers on eviction list
-      long metaOnHeap = 0L;   // meta data buffers on min-heap
-      long metaOnList = 0L;   // meta data buffers on eviction list
-      long listSize   = 0L;   // number of entries on eviction list
-      long lockedData = 0L;   // number of bytes in locked data buffers
-      long lockedMeta = 0L;   // number of bytes in locked metadata buffers
+      long dataOnHeap     = 0L;   // all non-meta related buffers on min-heap
+      long dataOnList     = 0L;   // all non-meta related buffers on eviction list
+      long metaOnHeap     = 0L;   // meta data buffers on min-heap
+      long metaOnList     = 0L;   // meta data buffers on eviction list
+      long listSize       = 0L;   // number of entries on eviction list
+      long lockedData     = 0L;   // number of bytes in locked data buffers
+      long lockedMeta     = 0L;   // number of bytes in locked metadata buffers
+      long bpWrapCount    = 0L;   // number of buffers in BP wrapper threadlocals
+      long bpWrapDistinct = 0L;   // number of distinct buffers in BP wrapper threadlocals
+      long bpWrapData     = 0L;   // number of bytes stored in BP wrapper data buffers
+      long bpWrapMeta     = 0L;   // number of bytes stored in BP wrapper metadata buffers
+
+      // Using set to produce result of distinct buffers only
+      // (same buffer may be present in multiple thread local bp wrappers, or even inside heap/list, but ultimately
+      // it uses the same cache space)
+      Set<LlapCacheableBuffer> bpWrapperBuffers = new HashSet<>();
+      for (BPWrapper bpWrapper : bpWrappers.values()) {
+        bpWrapper.lock.lock();

Review comment:
       This is only called when someone (e.g. a cluster admin) asks for memory stats
       through the web UI, so it is okay to block all structures in these cases.
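
For illustration, the pattern under discussion (lock each wrapper briefly, count every entry, and de-duplicate through a set before summing bytes) can be sketched as below. This is a minimal sketch under stated assumptions, not the code from the pull request: `Buffer`, `Wrapper`, and `collect` are hypothetical stand-ins for `LlapCacheableBuffer`, `BPWrapper`, and the BP-wrapper portion of `getUsageStats()`.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

class BpWrapperStatsSketch {
  /** Hypothetical stand-in for a cached buffer (not the real LlapCacheableBuffer). */
  static final class Buffer {
    final long bytes;
    final boolean isMeta;
    Buffer(long bytes, boolean isMeta) {
      this.bytes = bytes;
      this.isMeta = isMeta;
    }
  }

  /** Hypothetical stand-in for a per-thread wrapper guarded by its own lock. */
  static final class Wrapper {
    final ReentrantLock lock = new ReentrantLock();
    final List<Buffer> buffers = new ArrayList<>();
  }

  /** Returns {entry count, distinct buffer count, data bytes, metadata bytes}. */
  static long[] collect(Iterable<Wrapper> wrappers) {
    long count = 0L;
    // Buffer does not override equals/hashCode, so the set de-duplicates by identity:
    // a buffer staged in several wrappers is counted once for space accounting.
    Set<Buffer> distinct = new HashSet<>();
    for (Wrapper w : wrappers) {
      w.lock.lock(); // blocks the owning thread only while this wrapper is scanned
      try {
        count += w.buffers.size();
        distinct.addAll(w.buffers);
      } finally {
        w.lock.unlock(); // always release, even if the scan throws
      }
    }
    long data = 0L;
    long meta = 0L;
    for (Buffer b : distinct) {
      if (b.isMeta) {
        meta += b.bytes;
      } else {
        data += b.bytes;
      }
    }
    return new long[] {count, distinct.size(), data, meta};
  }
}
```

Because each lock is held only for the duration of one wrapper's scan, the cost to worker threads is bounded by the size of a single wrapper, which seems acceptable for an infrequent, admin-triggered call.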




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 549518)
    Time Spent: 1h 20m  (was: 1h 10m)

> Make buffer tracking in LLAP cache with BP wrapper more accurate
> ----------------------------------------------------------------
>
>                 Key: HIVE-24736
>                 URL: https://issues.apache.org/jira/browse/HIVE-24736
>             Project: Hive
>          Issue Type: Improvement
>          Components: llap
>            Reporter: Ádám Szita
>            Assignee: Ádám Szita
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> HIVE-22492 introduced thread-local buffers in which LlapCacheableBuffer 
> instances are stored before entering LRFU's heap, so that lock contention is 
> reduced.
> This is a nice performance improvement, but it comes at the cost of losing the 
> exact accounting of LLAP buffer instances. For example, if a user issues a purge 
> command, not all the cache space is freed up as one would expect, because purge 
> only considers buffers that the policy knows about. In this case we would see in 
> LLAP's iomem servlet that the LRFU policy is empty, while a table may still 
> have its full content loaded.
> Also, if we use text-based tables, a set of -OrcEncode threads that are 
> ephemeral in nature is used during cache load. Buffers attached to these 
> threads' thread-local structures are ultimately lost. In an edge case we 
> could load lots of data into the cache by reading many distinct smaller 
> text tables whose buffers never reach the LRFU policy, and the cache hit ratio 
> will suffer as a consequence (the memory manager will give up asking LRFU 
> to evict, and will free up random buffers).
> I propose we try and track the amount of data stored in the BP wrapper 
> threadlocals, and flush them into the heap as a first step of a purge 
> request. This will enhance supportability.
> We should also replace the ephemeral OrcEncode threads with a thread pool, 
> which could serve as a small performance improvement on its own by 
> saving the time and memory spent on thread lifecycle management.
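
To make the purge proposal in the description concrete, here is a minimal sketch, assuming a simplified model: buffers are represented only by their sizes, and per-thread staging areas are kept in a map keyed by a thread id so that they remain reachable even after the owning thread has exited. `PurgeFlushSketch`, `stage`, `stagedBytes`, and `purge` are hypothetical names, not Hive APIs, and the real policy's heap and eviction list are reduced to a single list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

class PurgeFlushSketch {
  /** Hypothetical per-thread staging area; holds buffer sizes only. */
  static final class Wrapper {
    final ReentrantLock lock = new ReentrantLock();
    final List<Long> staged = new ArrayList<>();
  }

  // Keyed by thread id (an assumption of this sketch) so that a wrapper
  // stays visible to purge and stats even if its thread is gone.
  private final Map<Long, Wrapper> wrappers = new ConcurrentHashMap<>();
  // Stand-in for the structures the eviction policy knows about.
  private final List<Long> policy = new ArrayList<>();

  /** Stages a buffer of the given size in the wrapper of the given thread. */
  void stage(long threadId, long bytes) {
    Wrapper w = wrappers.computeIfAbsent(threadId, id -> new Wrapper());
    w.lock.lock();
    try {
      w.staged.add(bytes);
    } finally {
      w.lock.unlock();
    }
  }

  /** Tracks how many bytes currently sit in staging areas, unseen by the policy. */
  long stagedBytes() {
    long total = 0L;
    for (Wrapper w : wrappers.values()) {
      w.lock.lock();
      try {
        for (long b : w.staged) {
          total += b;
        }
      } finally {
        w.lock.unlock();
      }
    }
    return total;
  }

  /** Flushes every staging area into the policy first, then evicts everything. */
  synchronized long purge() {
    for (Wrapper w : wrappers.values()) {
      w.lock.lock();
      try {
        policy.addAll(w.staged); // step 1: make staged buffers visible to the policy
        w.staged.clear();
      } finally {
        w.lock.unlock();
      }
    }
    long freed = 0L;
    for (long b : policy) {      // step 2: evict all buffers the policy now knows
      freed += b;
    }
    policy.clear();
    return freed;
  }
}
```

In this model, staging 100 and 50 bytes from two different threads and then calling `purge()` frees all 150 bytes; without the flush step, the purge would free nothing because the policy list would still be empty.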



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
