Jacob Tolar created PIG-5418:
--------------------------------
Summary: Utils.parseSchema(String), parseConstant(String) leak memory
Key: PIG-5418
URL: https://issues.apache.org/jira/browse/PIG-5418
Project: Pig
Issue Type: Improvement
Reporter: Jacob Tolar
A minor issue: Utils.parseSchema() and parseConstant() leak memory. I noticed
this while running a unit test for a UDF many thousands of times and checking
the heap.
Links are to the latest commit as of creating this ticket:
https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/util/Utils.java#L244-L256
{{new PigContext()}} [creates a MapReduce
ExecutionEngine|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/impl/PigContext.java#L269].
This creates a
[MapReduceLauncher|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRExecutionEngine.java#L34].
This registers a [Hadoop shutdown
hook|https://github.com/apache/pig/blob/59ec4a326079c9f937a052194405415b1e3a2b06/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java#L104-L105]
which doesn't go away until the JVM dies. See:
https://hadoop.apache.org/docs/r2.8.2/hadoop-project-dist/hadoop-common/api/org/apache/hadoop/util/ShutdownHookManager.html
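To illustrate the pattern, here is a minimal stand-alone sketch. It does not use Pig or Hadoop: {{HOOKS}}, {{Launcher}} and this {{parseSchema}} are hypothetical stand-ins for the shutdown hook manager, MapReduceLauncher and Utils.parseSchema(), and the real classes may differ in detail. It only shows how a hook registered in a constructor keeps one object alive per call.

```java
import java.util.ArrayList;
import java.util.List;

public class ShutdownHookLeakSketch {
    // Hypothetical stand-in for a process-wide hook manager:
    // it holds strong references to every hook until the JVM exits.
    static final List<Runnable> HOOKS = new ArrayList<>();

    // Hypothetical stand-in for a launcher that registers a hook in its constructor.
    static class Launcher {
        final byte[] state = new byte[1024]; // some per-instance state

        Launcher() {
            // The method reference captures `this`, so the registry
            // keeps the whole Launcher reachable.
            HOOKS.add(this::cleanup);
        }

        void cleanup() {
            // nothing to do in this sketch
        }
    }

    // Hypothetical stand-in for a parse helper that builds a throwaway
    // launcher on every call without ever using it.
    static String parseSchema(String schema) {
        new Launcher();
        return schema.trim();
    }

    static int registeredHooks() {
        return HOOKS.size();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10_000; i++) {
            parseSchema("a:int");
        }
        // One hook (and one Launcher) is retained per call.
        System.out.println("hooks retained: " + registeredHooks());
    }
}
```

In this sketch nothing ever removes entries from the registry, so the retained count grows by one for every parse call, which matches what I saw on the heap.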
I will attach a proposed patch. From my reading of the code and running tests,
the existing schema parse APIs do not actually use anything from this dummy
PigContext, and with a minor tweak it can be passed in as null, avoiding the
creation of these extra resources.
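The general shape of such a change can be sketched as follows. This is not the attached patch and does not use Pig classes: {{Context}}, {{Parser}}, {{resolveImport}} and this {{parseSchema}} are hypothetical names, and the sketch assumes the schema parse path never touches the context.

```java
public class NullContextSketch {
    // Counts how many (hypothetical) expensive contexts were built.
    static int contextsCreated = 0;

    // Hypothetical stand-in for an expensive context object.
    static class Context {
        Context() {
            contextsCreated++;
        }

        String resolve(String name) {
            return name;
        }
    }

    // Hypothetical parser that accepts an optional context.
    static class Parser {
        private final Context ctx; // may be null

        Parser(Context ctx) {
            this.ctx = ctx;
        }

        // Schema parsing never reads the context in this sketch.
        String[] parseFields(String schema) {
            String[] fields = schema.split(",");
            for (int i = 0; i < fields.length; i++) {
                fields[i] = fields[i].trim();
            }
            return fields;
        }

        // Only code paths that really need the context check for it.
        String resolveImport(String name) {
            if (ctx == null) {
                throw new IllegalStateException("no context available");
            }
            return ctx.resolve(name);
        }
    }

    // The helper passes null instead of building a dummy context.
    static String[] parseSchema(String schema) {
        return new Parser(null).parseFields(schema);
    }

    public static void main(String[] args) {
        String[] fields = parseSchema("a:int, b:chararray");
        System.out.println(fields.length + " fields, contexts created: " + contextsCreated);
    }
}
```

The guard in {{resolveImport}} is the "minor tweak" in this sketch: any path that does need a context fails with a clear message instead of a NullPointerException.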