Paul Rogers created DRILL-5071:
----------------------------------

             Summary: CodeGenerator class unnecessarily keeps two copies of generated code
                 Key: DRILL-5071
                 URL: https://issues.apache.org/jira/browse/DRILL-5071
             Project: Apache Drill
          Issue Type: Improvement
    Affects Versions: 1.8.0
            Reporter: Paul Rogers
            Priority: Minor

Drill uses a code cache to avoid recompiling the same code multiple times. The cache is keyed on the text of the generated code. Each generated class name carries an ever-increasing numeric suffix, of the form {{ProjectorGen123}}, so two otherwise identical pieces of generated code differ in their class names. For that reason {{CodeGenerator}} holds a second, "generified" copy of the code, produced by search and replace, alongside the actual generated code.

The unique name would be necessary if all generated code shared a single namespace. But, as currently implemented, each piece of generated code resides in its own private class loader: the code generated for one operator (say) can never clash with that for another.

As a result, we can reduce the size and cost of the code cache as follows:

1. Eliminate the numeric suffix on the class name.
2. Eliminate the {{generifiedCode}} member variable in {{CodeGenerator}}.
3. Eliminate the search and replace that produces the "generified" code.
4. Use the actual generated code as the cache key instead of the "generified" version.
5. Rely on the distinct class loaders to keep generated class names separate.

The code cache holds up to 1000 classes. Classes can range from a few KB to hundreds of KB. Assuming an average of about 50 KB per class, eliminating the second code copy may reduce heap memory pressure by something on the order of 50 KB * 1000 = 50 MB.
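
The effect of the class-name suffix on a cache keyed by source text can be illustrated with a minimal Java sketch. This is not Drill's actual code: {{CodeCacheSketch}}, {{SourceKeyedCache}}, {{CompiledCode}}, and the two {{generate...}} helpers are hypothetical names, and the "compile" step is a stub that only counts cache misses.

```java
import java.util.HashMap;
import java.util.Map;

final class CodeCacheSketch {

    /** Hypothetical stand-in for a compiled class; not a Drill type. */
    static final class CompiledCode {
        final String source;
        CompiledCode(String source) { this.source = source; }
    }

    /** Cache keyed directly on the generated source text. */
    static final class SourceKeyedCache {
        private final Map<String, CompiledCode> entries = new HashMap<>();
        private int compileCount = 0;

        CompiledCode get(String source) {
            return entries.computeIfAbsent(source, src -> {
                compileCount++;               // every cache miss costs one "compile"
                return new CompiledCode(src); // stub: no real compilation happens
            });
        }

        int compileCount() { return compileCount; }
    }

    /** Source with a numeric suffix: identical bodies still yield different keys. */
    static String generateWithSuffix(String body, int suffix) {
        return "public class ProjectorGen" + suffix + " { " + body + " }";
    }

    /** Source with a fixed class name: identical bodies yield identical keys. */
    static String generateFixedName(String body) {
        return "public class ProjectorGen { " + body + " }";
    }

    public static void main(String[] args) {
        String body = "public int eval(int x) { return x + 1; }";

        // Suffixed names: the raw source never repeats, so it cannot serve as the key.
        SourceKeyedCache suffixed = new SourceKeyedCache();
        suffixed.get(generateWithSuffix(body, 1));
        suffixed.get(generateWithSuffix(body, 2));
        System.out.println("suffixed compiles: " + suffixed.compileCount()); // 2

        // Fixed name: the raw source is its own key, no second copy required.
        SourceKeyedCache fixed = new SourceKeyedCache();
        fixed.get(generateFixedName(body));
        fixed.get(generateFixedName(body));
        System.out.println("fixed compiles: " + fixed.compileCount()); // 1
    }
}
```

With a fixed class name the generated text can be used as the key directly, which is what makes the separate "generified" copy unnecessary. The sketch does not model the per-class loaders; it assumes, as the description states, that they already prevent name clashes.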