Hi Paul, thanks for the tip. I will set it up to see if that makes a difference.
Looking at the heap dump, there are 19,971 (O(20k)) AssignmentTrackingFrame
instances, and the first AssignmentTrackingFrame has a localVariablesSet of
size 1,260. Each localVariablesSet is in turn made up of many individual
Integer objects. So in total we have roughly:

  O(20k) AssignmentTrackingFrame
  x O(100) localVariablesSet per AssignmentTrackingFrame
  x O(10) Integer objects per localVariablesSet
  = ~O(20M) objects!

Looking at the code for AssignmentTrackingFrame, this class was updated in
1.16.0 to introduce a new member variable:

  private final Deque<Set<Integer>> localVariablesSet;

The memory dump seems to indicate this new internal data structure is
consuming a lot of memory. We have been running the same queries in Drill
1.14 multiple times a day over many months without memory issues.

-- Jiang

On Tue, Sep 10, 2019 at 1:09 PM Paul Rogers <[email protected]> wrote:

> Hi Jiang,
>
> Many factors influence memory usage; the trick is to tease them apart.
>
> An obvious question is the memory use of your custom storage plugin. This
> will depend on any buffering done by the plugin, the number of threads
> (minor fragments) per node, and the number of concurrent queries. Since
> this is your code, you presumably understand these issues.
>
> In the dump, it looks like you have many objects associated with Drill's
> byte code optimization mechanism. Byte code size will be based on query
> complexity. As a rough measure of query size, about how many K in size is
> the SELECT statement you are trying to run? Very large expressions, or
> large numbers of projected columns, could drive the generated code to be
> large.
>
> If the problem is, indeed, related to the byte code rewrite, there is a
> trick you can try: you can switch to using the "Plain Java" mechanism.
> Briefly, this mechanism generates Java source code, then lets the compiler
> generate byte codes directly without the usual Drill byte code rewrite.
> This works because modern Java compilers are at least as good as Drill at
> doing scalar replacement.
>
> Here are the options:
>
> drill.exec: {
>   compile: {
>     compiler: "JDK",
>     prefer_plain_java: true
>   }
> }
>
> This forces use of the JDK compiler (instead of Janino) and bypasses the
> byte code rewrite step.
>
> No guarantee this will work, but something to try.
>
> Thanks,
>
> - Paul
>
>
> On Tuesday, September 10, 2019, 12:28:07 PM PDT, Jiang Wu
> <[email protected]> wrote:
>
> While doing testing against Apache Drill 1.16.0, we are running into this
> error: java.lang.OutOfMemoryError: GC overhead limit exceeded
>
> In our use case, Apache Drill is using a custom storage plugin and no other
> storage plugins like PostgreSQL, MySQL, etc. Some of the queries are very
> large, involving many subqueries, joins, functions, etc. And we are running
> the same set of queries that work without issue in Drill version 1.14.0.
>
> We generated a heap dump at the time of the out-of-memory exception. The
> heap dump file is about 5.8 GB. Opening the dump showed:
>
> Heap:
>   Size: 3.1 GB
>   Classes: 21.1k
>   Objects: 82.4m
>   Class Loader: 538
>
> Showing the dominator tree for the allocated heap indicates two threads,
> both with a similar ownership stack for the bulk of the memory allocated,
> e.g.
> Class Name | Shallow Heap | Retained Heap | Percentage
> --------------------------------------------------------------------------------
> java.lang.Thread @ 0x73af9b238 2288c900-b265-3988-1524-8e920a884075:frag:4:0 Thread | 120 | 1,882,336,336 | 56.51%
> |- org.apache.drill.exec.compile.bytecode.MethodAnalyzer @ 0x73cc3fe88 | 56 | 1,873,674,888 | 56.25%
> | |- org.objectweb.asm.tree.analysis.Frame[33487] @ 0x73d19b570 | 133,968 | 1,873,239,392 | 56.24%
> | | |- org.apache.drill.exec.compile.bytecode.MethodAnalyzer$AssignmentTrackingFrame @ 0x786c6c470 | 40 | 206,576 | 0.01%
> | | | |- java.util.ArrayDeque @ 0x786c6c550 | 24 | 198,120 | 0.01%
> | | | | '- java.lang.Object[2048] @ 0x786ce99f8 | 8,208 | 198,096 | 0.01%
> | | | |   |- java.util.HashSet @ 0x786c6e2d8 | 16 | 288 | 0.00%
> | | | |   |- java.util.HashSet @ 0x786c6ec68 | 16 | 288 | 0.00%
> | | | |   |- java.util.HashSet @ 0x786cd1ce8 | 16 | 288 | 0.00%
> | | | |   |- java.util.HashSet @ 0x786cd2ad8 | 16 | 288 | 0.00%
> | | | |   |- ...
> | | | |   *Total: 25 of 1,260 entries*
> | | | |
> | | | |- java.util.ArrayDeque @ 0x786cf3128 | 24 | 8,232 | 0.00%
> | | | | '- java.lang.Object[2048] @ 0x786cf3140 | 8,208 | 8,208 | 0.00%
> | | | |- org.objectweb.asm.tree.analysis.Value[42] @ 0x786c6c498 | 184 | 184 | 0.00%
> | | | *Total: 3 entries*
> | | |
> | | |- org.apache.drill.exec.compile.bytecode.MethodAnalyzer$AssignmentTrackingFrame @ 0x786cf5150 | 40 | 206,576 | 0.01%
> | | |- org.apache.drill.exec.compile.bytecode.MethodAnalyzer$AssignmentTrackingFrame @ 0x78697ee00 | 40 | 206,416 | 0.01%
> | | |- org.apache.drill.exec.compile.bytecode.MethodAnalyzer$AssignmentTrackingFrame @ 0x7869b1440 | 40 | 206,416 | 0.01%
> | | |- org.apache.drill.exec.compile.bytecode.MethodAnalyzer$AssignmentTrackingFrame @ 0x784d5a328 | 40 | 206,336 | 0.01%
> | | |- org.apache.drill.exec.compile.bytecode.MethodAnalyzer$AssignmentTrackingFrame @ 0x784d8c918 | 40 | 206,336 | 0.01%
> | | |- org.apache.drill.exec.compile.bytecode.MethodAnalyzer$AssignmentTrackingFrame @ 0x784f9cc88 | 40 | 206,336 | 0.01%
> | | |- ...
> | | *Total: 25 of 19,971 entries*
> --------------------------------------------------------------------------------
>
> Not sure whether it is normal to have this many
> org.apache.drill.exec.compile.bytecode.MethodAnalyzer$AssignmentTrackingFrame
> instances.
>
> Unreachable objects:
>   Size: 3.3 GB
>   Objects: 80k
>   Classes: 454
>
> That seems like a lot of unreachable objects. Where do I go from here to
> debug this? Is there some JVM setting to fix this issue? Thanks.
>
> -- Jiang
>
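The object-count arithmetic in the first message can be checked with a quick sketch. This is only an illustration of the estimate: the counts below are the rough orders of magnitude quoted from the heap dump (19,971 frames, ~O(100) sets per frame's Deque, ~O(10) boxed Integers per set), not exact figures, and the class name is borrowed from the dump for labeling purposes only.

```java
// Back-of-envelope check of the object-count estimate from the heap dump.
// All constants are rough orders of magnitude quoted in the thread.
public class FrameObjectEstimate {
    public static void main(String[] args) {
        long frames = 19_971L;     // AssignmentTrackingFrame instances (~O(20k))
        long setsPerFrame = 100L;  // Set<Integer> entries per frame's Deque (~O(100))
        long intsPerSet = 10L;     // boxed Integer objects per set (~O(10))

        // 19,971 x 100 x 10 = 19,971,000, i.e. on the order of 20 million
        // objects attributable to this one data structure.
        long totalObjects = frames * setsPerFrame * intsPerSet;
        System.out.println(totalObjects);  // prints 19971000
    }
}
```

Even before counting per-object header and HashSet table overhead, ~20M small objects is consistent with the ~1.87 GB retained by the MethodAnalyzer in the dominator tree above.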
