Ádám Szita created HIVE-24736:
---------------------------------
Summary: Make buffer tracking in LLAP cache with BP wrapper more
accurate
Key: HIVE-24736
URL: https://issues.apache.org/jira/browse/HIVE-24736
Project: Hive
Issue Type: Improvement
Components: llap
Reporter: Ádám Szita
Assignee: Ádám Szita
HIVE-22492 introduced thread-local buffers that hold LlapCachableBuffer
instances before they enter the LRFU policy's heap, in order to reduce lock
contention.
This is a nice performance improvement, but it comes at the cost of losing
exact accounting of LLAP buffer instances. For example, if a user issues a
purge command, not all of the cache space is freed as one would expect, because
purge only considers buffers that the policy knows about. In this case LLAP's
iomem servlet shows that the LRFU policy is empty, while a table may still have
its full content loaded in the cache.
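
The accounting gap can be illustrated with a minimal Java sketch. The class
StagingPolicySketch, its fields, and the flush threshold are hypothetical
stand-ins chosen for illustration; they are not Hive's actual BP wrapper or
LRFU policy code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a cache policy fronted by thread-local staging.
final class StagingPolicySketch {
    // Buffers the policy actually knows about.
    private final List<Object> heap = new ArrayList<>();
    // Per-thread staging area; buffers wait here to avoid taking the heap lock.
    private final ThreadLocal<List<Object>> staged =
        ThreadLocal.withInitial(ArrayList::new);
    private final int flushThreshold;

    StagingPolicySketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    // Stage the buffer locally; move the batch to the heap once it is full.
    void cache(Object buffer) {
        List<Object> local = staged.get();
        local.add(buffer);
        if (local.size() >= flushThreshold) {
            synchronized (heap) {
                heap.addAll(local);
            }
            local.clear();
        }
    }

    // Purge only sees the heap, so staged buffers survive it.
    int purge() {
        synchronized (heap) {
            int purged = heap.size();
            heap.clear();
            return purged;
        }
    }
}
```

With a threshold of 4, caching three buffers and then calling purge() purges
nothing in this sketch, although three buffers are still held.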
Also, for text-based tables, the cache load uses a set of -OrcEncode threads
that are ephemeral in nature. Buffers attached to these threads' thread-local
structures are ultimately lost. In an edge case we could load lots of data into
the cache by reading many distinct, smaller text tables whose buffers never
reach the LRFU policy, and the cache hit ratio would suffer as a consequence
(the memory manager gives up asking LRFU to evict and frees random buffers
instead).
I propose that we track the amount of data stored in the BP wrapper
thread-locals and flush them into the heap as the first step of a purge
request. This will improve supportability.
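
One possible shape of this proposal, as a rough sketch only: every
thread-local staging list is also registered in a shared collection, a counter
tracks the staged bytes, and purge flushes all staging lists first. The class
TrackedStagingPolicySketch and all of its members are hypothetical names, and
byte[] merely stands in for a cache buffer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: staged data is tracked and reachable by purge.
final class TrackedStagingPolicySketch {
    // Registry of every thread's staging list, so purge can reach them all.
    private final ConcurrentLinkedQueue<List<byte[]>> allStaged =
        new ConcurrentLinkedQueue<>();
    private final ThreadLocal<List<byte[]>> staged = ThreadLocal.withInitial(() -> {
        List<byte[]> list = new ArrayList<>();
        allStaged.add(list);
        return list;
    });
    private final AtomicLong stagedBytes = new AtomicLong();
    private final List<byte[]> heap = new ArrayList<>();
    private final int flushThreshold;

    TrackedStagingPolicySketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void cache(byte[] buffer) {
        List<byte[]> local = staged.get();
        synchronized (local) { // purge may touch this list from another thread
            local.add(buffer);
            stagedBytes.addAndGet(buffer.length);
            if (local.size() >= flushThreshold) {
                flush(local);
            }
        }
    }

    // Moves one staging list into the heap; caller must hold the list's lock.
    private void flush(List<byte[]> local) {
        long bytes = 0;
        for (byte[] b : local) {
            bytes += b.length;
        }
        synchronized (heap) {
            heap.addAll(local);
        }
        local.clear();
        stagedBytes.addAndGet(-bytes);
    }

    // Amount of data currently held outside the policy's heap.
    long stagedBytes() {
        return stagedBytes.get();
    }

    // Flush every staging list first, then purge the heap; returns bytes freed.
    long purge() {
        for (List<byte[]> local : allStaged) {
            synchronized (local) {
                flush(local);
            }
        }
        synchronized (heap) {
            long freed = 0;
            for (byte[] b : heap) {
                freed += b.length;
            }
            heap.clear();
            return freed;
        }
    }
}
```

The sketch keeps the staging lists of terminated threads registered forever; a
real implementation would need to clean those up, and the per-list lock adds
some cost back to the path that HIVE-22492 optimized.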
We should also replace the ephemeral OrcEncode threads with a thread pool,
which could serve as a small performance improvement on its own by saving the
time and memory spent on thread lifecycle management.
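
As a rough illustration of the thread pool idea, the sketch below runs tasks on
a fixed set of long-lived, named worker threads using the standard
java.util.concurrent API. The class EncodePoolSketch, the "OrcEncode-" name
prefix, and the pool size are assumptions made for the example, not Hive's
actual encode path.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a bounded pool of reusable encode threads.
final class EncodePoolSketch {
    private final AtomicInteger threadCounter = new AtomicInteger();
    private final ExecutorService pool;

    EncodePoolSketch(int threads) {
        // Long-lived daemon workers; thread-local state survives across tasks.
        pool = Executors.newFixedThreadPool(threads, runnable -> {
            Thread t = new Thread(runnable, "OrcEncode-" + threadCounter.getAndIncrement());
            t.setDaemon(true);
            return t;
        });
    }

    // Runs all tasks on the pool and returns the names of the threads used.
    Set<String> runAll(List<Runnable> tasks) {
        List<Future<String>> futures = new ArrayList<>();
        for (Runnable task : tasks) {
            futures.add(pool.submit(() -> {
                task.run();
                return Thread.currentThread().getName();
            }));
        }
        Set<String> threadNames = new HashSet<>();
        for (Future<String> future : futures) {
            try {
                threadNames.add(future.get());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            } catch (ExecutionException e) {
                throw new IllegalStateException(e.getCause());
            }
        }
        return threadNames;
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

Because the workers are reused, the number of distinct thread-local staging
structures is bounded by the pool size instead of growing with every load.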