Paul Rogers created DRILL-5071:
----------------------------------

             Summary: CodeGenerator class unnecessarily keeps two copies of generated code
                 Key: DRILL-5071
                 URL: https://issues.apache.org/jira/browse/DRILL-5071
             Project: Apache Drill
          Issue Type: Improvement
    Affects Versions: 1.8.0
            Reporter: Paul Rogers
            Priority: Minor


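Drill uses a code cache to avoid recompiling the same code multiple times. The 
cache is keyed on the text of the generated code.

The generated class name carries an ever-increasing numeric suffix, of the form 
{{ProjectorGen123}}. Because that suffix would make otherwise identical code 
look different, {{CodeGenerator}} also keeps a second, "generified" copy of the 
code, produced by a search and replace on the class name, and it is this copy 
that serves as the cache key.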

The unique name would be necessary if all generated code shared a single name 
space. But, as currently implemented, each piece of generated code resides in 
its own private class loader: the code generated for one operator (say) can 
never clash with that for another.
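
The isolation argument above can be illustrated with a minimal sketch. It is 
not Drill code: {{LoaderIsolationSketch}}, {{PrivateLoader}} and the nested 
{{ProjectorGen}} stand-in are hypothetical names, and the sketch assumes the 
class file is readable as a resource (as it is when compiled with javac). It 
defines the same class name in two separate class loaders and checks that the 
resulting classes are distinct.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class LoaderIsolationSketch {

    /** Stand-in for a generated class with a fixed, suffix-free name. */
    public static class ProjectorGen {
    }

    /** Hypothetical private loader: defines a class from raw bytes in its own name space. */
    static class PrivateLoader extends ClassLoader {
        PrivateLoader() {
            super(null); // delegate only to the bootstrap loader
        }

        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    /** Reads the compiled bytes of an already loaded class from the class path. */
    static byte[] classBytes(Class<?> c) {
        String path = "/" + c.getName().replace('.', '/') + ".class";
        try (InputStream in = c.getResourceAsStream(path)) {
            if (in == null) {
                throw new IOException("class file not found: " + path);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True if two loaders hold classes with the same name that are still distinct. */
    static boolean sameNameDistinctClasses() {
        String name = ProjectorGen.class.getName();
        byte[] bytes = classBytes(ProjectorGen.class);
        Class<?> a = new PrivateLoader().define(name, bytes);
        Class<?> b = new PrivateLoader().define(name, bytes);
        return a.getName().equals(b.getName()) && a != b;
    }

    public static void main(String[] args) {
        System.out.println(sameNameDistinctClasses());
    }
}
```

Since each loader has its own name space, neither definition conflicts with the 
other, which is the property the proposal relies on.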

As a result, we can reduce the size and cost of the code cache as follows:

1. Eliminate the numeric suffix on the class name.
2. Eliminate the {{generifiedCode}} member variable in {{CodeGenerator}}.
3. Eliminate the search and replace that produces the "generified" code.
4. Use the actual generated code as the cache key instead of the "generified" 
version.
5. Rely on the distinct class loaders to keep generated class names separate.
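
The effect of steps 1 and 4 on the cache key can be sketched as follows. This 
is an illustration only, not Drill's implementation: {{CodeCacheKeySketch}}, 
{{Compiled}}, {{getOrCompile}} and {{generate}} are hypothetical names, and the 
"compile" step is a placeholder that just counts invocations.

```java
import java.util.HashMap;
import java.util.Map;

public class CodeCacheKeySketch {

    /** Hypothetical stand-in for a compiled class. */
    static final class Compiled {
        final String source;

        Compiled(String source) {
            this.source = source;
        }
    }

    // Cache keyed directly on the generated source text.
    private final Map<String, Compiled> cache = new HashMap<>();
    int compileCount = 0;

    /** Returns the cached entry for this code, "compiling" only on a miss. */
    Compiled getOrCompile(String code) {
        Compiled c = cache.get(code);
        if (c == null) {
            compileCount++; // placeholder for the real compile step
            c = new Compiled(code);
            cache.put(code, c);
        }
        return c;
    }

    /** Placeholder code generator: same body, caller-chosen class name. */
    static String generate(String className) {
        return "public class " + className + " { public void doEval() { } }";
    }

    /** With a numeric suffix, identical logic yields different keys. */
    static int compilesWithSuffix() {
        CodeCacheKeySketch c = new CodeCacheKeySketch();
        c.getOrCompile(generate("ProjectorGen123"));
        c.getOrCompile(generate("ProjectorGen124"));
        return c.compileCount;
    }

    /** With a fixed name, the generated code itself works as the key. */
    static int compilesWithFixedName() {
        CodeCacheKeySketch c = new CodeCacheKeySketch();
        c.getOrCompile(generate("ProjectorGen"));
        c.getOrCompile(generate("ProjectorGen"));
        return c.compileCount;
    }

    public static void main(String[] args) {
        System.out.println(compilesWithSuffix());
        System.out.println(compilesWithFixedName());
    }
}
```

With the suffix in place, the raw code cannot be used as the key without a 
normalized second copy; with a fixed name, a single copy serves both purposes.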

The code cache holds up to 1000 classes. Classes can range from a few K to 
hundreds of K. By eliminating the second copy of the code, and assuming an 
average of roughly 50K per class, we may reduce heap memory pressure by 
something on the order of 50K * 1000 = 50 MB.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
