Lei Sun created ORC-846:
---------------------------
Summary: Refactoring Memory-Manager for better extensibility
Key: ORC-846
URL: https://issues.apache.org/jira/browse/ORC-846
Project: ORC
Issue Type: Improvement
Reporter: Lei Sun
Assignee: Lei Sun
Hi ORC Community,
In our use case, dynamic partitioning is quite common, and our engine
([Gobblin|https://github.com/apache/gobblin]) doesn't have a shuffle stage, so
a single thread handling multiple writers (each writing to a different
partition) is a common pattern. When the number of partitions grows large, it
becomes challenging to avoid OOM given the current memory-manager
implementation:
* `rows.between.memory.check` is applied at the writer level (it gates the
expensive check that compares the estimated memory of the treeWriter against
each writer's memoryLimit), and the memory manager is not aware of it. The
memory manager only controls the "scale" of allocation, which hints each writer
to lower its memory-limit threshold (initially the stripe size).
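To make the current behaviour concrete, here is a simplified sketch of the scale-based mechanism described above. It is an illustration only, not the ORC source: the class name `ScaledLimitSketch` and all method names are hypothetical, and it assumes the scale is the ratio of the shared pool to the total requested allocation, capped at 1.

```java
/**
 * Hypothetical illustration of a scale-based memory limit.
 * None of these names come from the ORC code base.
 */
class ScaledLimitSketch {

    /** Scale is 1.0 while all writers fit in the pool, otherwise pool / requested. */
    static double allocationScale(long memoryPoolBytes, long totalRequestedBytes) {
        if (totalRequestedBytes <= memoryPoolBytes) {
            return 1.0;
        }
        return (double) memoryPoolBytes / totalRequestedBytes;
    }

    /** Each writer's limit starts at the stripe size and shrinks with the scale. */
    static long writerMemoryLimit(long stripeSizeBytes, double scale) {
        return Math.round(stripeSizeBytes * scale);
    }

    /**
     * Writer-local gate: the expensive memory estimate only runs once enough
     * rows have been added since the last check. The manager never sees this counter.
     */
    static boolean shouldCheckMemory(long rowsSinceLastCheck, long rowsBetweenChecks) {
        return rowsSinceLastCheck >= rowsBetweenChecks;
    }
}
```

With many writers that each stay below their own row threshold, no writer runs the check even though the combined buffered data may already be large, which is the gap this issue describes.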
What is proposed here is:
* Have centralized awareness, in the memoryManager, of how many rows have been
buffered across all writers. The data structure used will be thread-safe so
that the issue fixed in
https://issues.apache.org/jira/browse/ORC-361# won't resurface. No additional
synchronization is introduced beyond the intrinsic control of the concurrent
data structure that tracks how many rows are buffered in each writer. The
existing memory manager will keep interfacing with each writer in a
backward-compatible way.
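A minimal sketch of what such centralized tracking might look like follows. It is not the proposed patch: `BufferedRowTracker`, its methods, and the use of a string path as the writer key are all assumptions made for illustration. The only synchronization comes from `ConcurrentHashMap` and `AtomicLong`.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical sketch, not part of the ORC API: tracks buffered rows per
 * writer and in total, relying only on concurrent data structures.
 */
class BufferedRowTracker {
    private final ConcurrentHashMap<String, AtomicLong> rowsPerWriter = new ConcurrentHashMap<>();
    private final AtomicLong totalRows = new AtomicLong();
    private final long rowsBetweenChecks;

    BufferedRowTracker(long rowsBetweenChecks) {
        this.rowsBetweenChecks = rowsBetweenChecks;
    }

    /**
     * Records rows buffered by one writer. Returns true once the total across
     * all writers reaches the threshold, meaning a memory check is due.
     */
    boolean addRows(String writerPath, long rows) {
        rowsPerWriter.computeIfAbsent(writerPath, p -> new AtomicLong()).addAndGet(rows);
        return totalRows.addAndGet(rows) >= rowsBetweenChecks;
    }

    /** Resets a writer's count after it flushes; returns the number of rows cleared. */
    long onFlush(String writerPath) {
        AtomicLong counter = rowsPerWriter.get(writerPath);
        if (counter == null) {
            return 0L;
        }
        long flushed = counter.getAndSet(0L);
        totalRows.addAndGet(-flushed);
        return flushed;
    }

    long bufferedRows(String writerPath) {
        AtomicLong counter = rowsPerWriter.get(writerPath);
        return counter == null ? 0L : counter.get();
    }

    long totalBufferedRows() {
        return totalRows.get();
    }
}
```

In this sketch two writers that buffer 3000 and 2500 rows trip a threshold of 5000 together, even though neither would have reached it alone.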
With this, the existing memory manager can be extended to control flushes at
per-writer granularity and to treat different writers with different
priorities. For example, writers handling less-favored partitions could be
flushed more often so that overall memory pressure is reduced. All such
"prioritization" logic should be localized to the engines that use the ORC
writer.
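As one possible shape for engine-side prioritization, the sketch below picks the next writer to flush. Everything here is hypothetical (`PriorityFlushPolicy`, `WriterState`, and the convention that a lower priority value means "flush first"); an engine could plug in any other policy.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical engine-side policy, not part of ORC: chooses which writer
 * to flush first when memory pressure is high.
 */
class PriorityFlushPolicy {

    /** Immutable snapshot of one writer: lower priority value means flush earlier. */
    static final class WriterState {
        final int priority;
        final long bufferedRows;

        WriterState(int priority, long bufferedRows) {
            this.priority = priority;
            this.bufferedRows = bufferedRows;
        }
    }

    private final Map<String, WriterState> writers = new ConcurrentHashMap<>();

    /** Registers or updates the state the engine knows about a writer. */
    void update(String writerPath, int priority, long bufferedRows) {
        writers.put(writerPath, new WriterState(priority, bufferedRows));
    }

    /**
     * Returns the least-favored writer that has buffered rows; among writers
     * with equal priority, the one holding the most rows wins.
     */
    Optional<String> nextToFlush() {
        String best = null;
        WriterState bestState = null;
        for (Map.Entry<String, WriterState> entry : writers.entrySet()) {
            WriterState state = entry.getValue();
            if (state.bufferedRows == 0) {
                continue; // nothing to flush for this writer
            }
            boolean better = bestState == null
                || state.priority < bestState.priority
                || (state.priority == bestState.priority
                    && state.bufferedRows > bestState.bufferedRows);
            if (better) {
                best = entry.getKey();
                bestState = state;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

Keeping the policy outside the memory manager matches the intent above: the manager only exposes per-writer row counts, and each engine decides what "less favored" means.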
--
This message was sent by Atlassian Jira
(v8.3.4#803005)