[ https://issues.apache.org/jira/browse/DRILL-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15105979#comment-15105979 ]

Jacques Nadeau commented on DRILL-4266:
---------------------------------------

Based on these metrics, the leak isn't in the RPC layer. Let me add some more metrics and we'll get a better snapshot of the memory allocation caching layer.

> Possible memory leak (fragmentation ?) in rpc layer
> ----------------------------------------------------
>
>                 Key: DRILL-4266
>                 URL: https://issues.apache.org/jira/browse/DRILL-4266
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - RPC
>    Affects Versions: 1.5.0
>            Reporter: Victoria Markman
>            Assignee: Jacques Nadeau
>         Attachments: WebUI_500_iterations.txt, drill.log.2016-01-12-16,
>                      memComsumption.txt,
>                      memComsumption_framework.output_Fri_Jan_15_width_per_node=4.log,
>                      memComsumption_framework.output_Mon_Jan_18_15_500_iterations.txt,
>                      memComsumption_framework.output_Sun_Jan_17_04_jacques_branch_drill-4131,
>                      test.tar
>
>
> I have executed 5 tests from the Advanced/mondrian test suite in a loop overnight.
> My observation is that direct memory steadily grew from 117MB to 1.8GB and
> remained at that level for 14875 iterations of the tests.
> My question is: why do 5 queries that were able to execute with 117MB of
> memory require 1.8GB of memory after 5 hours of execution ?
> Attached:
> * Memory used after each test iteration : memComsumption.txt
> * Log of the framework run: drill.log.2016-01-12-16
> * Tests: test.tar
> Setup:
> {noformat}
> Single node 32 core box.
> DRILL_MAX_DIRECT_MEMORY="4G"
> DRILL_HEAP="1G"
> 0: jdbc:drill:schema=dfs> select * from sys.options where status like '%CHANGED%';
> +-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
> |               name                |   kind   |  type   |  status  | num_val  | string_val  | bool_val  | float_val  |
> +-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
> | planner.enable_decimal_data_type  | BOOLEAN  | SYSTEM  | CHANGED  | null     | null        | true      | null       |
> +-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
> 1 row selected (1.309 seconds)
> {noformat}
> {noformat}
> Reproduction:
> * tar xvf test.tar into Functional/test directory
> * ./run.sh -s Functional/test -g regression -t 180 -n 5 -i 10000000 -m
> {noformat}
> This is very similar to the behavior Hakim and I observed a long time ago with
> window functions. Now that the new allocator is in place, we reran this test and
> we see similar behavior, and the allocator does not seem to think that we have
> a memory leak. Hence the speculation that memory is leaked in the RPC layer.
> I'm going to reduce planner.width.max_per_node and see if it has any effect
> on memory allocation (speculating again ...)
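
The per-iteration direct memory figures in the report can be cross-checked from inside a JVM. The following is a minimal Java sketch, not part of Drill or of the test framework used in this report: it reads the JVM's `direct` buffer pool through the standard `BufferPoolMXBean`. The class name `DirectMemorySampler` is hypothetical, and the sketch assumes the allocations of interest go through `ByteBuffer.allocateDirect`; memory that an allocator obtains by other means (for example through `sun.misc.Unsafe`) may not appear in this counter.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical helper for illustration only; not a Drill class.
public class DirectMemorySampler {

    /**
     * Returns the bytes currently used by the JVM's "direct" buffer pool,
     * or -1 if the JVM does not report such a pool.
     */
    static long directMemoryUsedBytes() {
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long before = directMemoryUsedBytes();

        // Allocate 1 MiB off-heap so the counter has something to show.
        ByteBuffer buffer = ByteBuffer.allocateDirect(1 << 20);

        long after = directMemoryUsedBytes();
        System.out.printf("direct pool: before=%d bytes, after=%d bytes%n",
                before, after);

        // Touch the buffer so it stays reachable until after the second sample.
        buffer.put(0, (byte) 1);
    }
}
```

Sampling this value after each test iteration and comparing it with the allocator's own accounting is one way to see whether the two diverge. It does not by itself distinguish a leak from fragmentation.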