[ https://issues.apache.org/jira/browse/DRILL-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289581#comment-16289581 ]
ASF GitHub Bot commented on DRILL-6028:
---------------------------------------

Github user priteshm commented on the issue:

    https://github.com/apache/drill/pull/1071

    @paul-rogers can you please review this?

> Allow splitting generated code in ChainedHashTable into blocks to avoid "code too large" error
> ----------------------------------------------------------------------------------------------
>
>                 Key: DRILL-6028
>                 URL: https://issues.apache.org/jira/browse/DRILL-6028
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.10.0
>            Reporter: Arina Ielchiieva
>            Assignee: Arina Ielchiieva
>             Fix For: 1.13.0
>
>
> Allow splitting generated code in ChainedHashTable into blocks to avoid "code too large" error.
> *REPRODUCE*
> File {{1200_columns.csv}}
> {noformat}
> 0,1,2,3...1200
> 0,1,2,3...1200
> {noformat}
> Query
> {noformat}
> select columns[0], columns[1]...columns[1200] from dfs.`1200_columns.csv`
> union
> select columns[0], columns[1]...columns[1200] from dfs.`1200_columns.csv`
> {noformat}
> Error
> {noformat}
> Error: SYSTEM ERROR: CompileException: File 'org.apache.drill.exec.compile.DrillJavaFileObject[HashTableGen10.java]', Line -7886, Column 24: HashTableGen10.java:57650: error: code too large
>         public boolean isKeyMatchInternalBuild(int incomingRowIdx, int htRowIdx)
>                        ^ (compiler.err.limit.code)
> {noformat}
> *ROOT CAUSE*
> DRILL-4715 added the ability to ensure that the size of a method does not exceed the 64k limit imposed by the JVM. {{BlkCreateMode.TRUE_IF_BOUND}} was added to create a new block only if the number of expressions added hits the upper bound defined by {{exec.java.compiler.exp_in_method_size}}. Once the number of expressions in a method hits the upper bound, we create an inner method and call it from the current one.
> Example:
> {noformat}
> public void doSetup(RecordBatch incomingBuild, RecordBatch incomingProbe) throws SchemaChangeException {
>   // some logic
>   doSetup0(incomingBuild, incomingProbe);
> }
> {noformat}
> During code generation, {{ChainedHashTable}} added all code in its methods in one block (using {{BlkCreateMode.FALSE}}) because the {{getHashBuild}} and {{getHashProbe}} methods contained state and thus could not be split. In these methods a hash was generated for each key expression. For the first key the seed was 0; the hash for each subsequent key was generated from the seed produced by the previous key.
> To allow splitting of these methods, the following was done:
> 1. Method signatures were changed: a new parameter {{seedValue}} was added. Previously the starting seed value was hard-coded during code generation (set to 0); now it is passed as a method parameter.
> 2. Previously the hash function calls for all keys were combined into one logical expression, which did not allow splitting. Now we create a logical expression for each key, so splitting is possible. The new {{seedValue}} parameter is used as a seed holder to pass the seed value to the next key.
> 3. {{ParameterExpression}} was added to generate a reference to a method parameter during code generation.
> Code example:
> {noformat}
>     public int getHashBuild(int incomingRowIdx, int seedValue)
>         throws SchemaChangeException
>     {
>         {
>             NullableVarCharHolder out3 = new NullableVarCharHolder();
>             {
>                 out3 .isSet = vv0 .getAccessor().isSet((incomingRowIdx));
>                 if (out3 .isSet == 1) {
>                     out3 .buffer = vv0 .getBuffer();
>                     long startEnd = vv0 .getAccessor().getStartEnd((incomingRowIdx));
>                     out3 .start = ((int) startEnd);
>                     out3 .end = ((int)(startEnd >> 32));
>                 }
>             }
>             IntHolder seedValue4 = new IntHolder();
>             seedValue4 .value = seedValue;
>             //---- start of eval portion of hash32 function. ----//
>             IntHolder out5 = new IntHolder();
>             {
>                 final IntHolder out = new IntHolder();
>                 NullableVarCharHolder in = out3;
>                 IntHolder seed = seedValue4;
>
>                 Hash32FunctionsWithSeed$NullableVarCharHash_eval: {
>                     if (in.isSet == 0) {
>                         out.value = seed.value;
>                     } else
>                     {
>                         out.value = org.apache.drill.exec.expr.fn.impl.HashHelper.hash32(in.start, in.end, in.buffer, seed.value);
>                     }
>                 }
>
>                 out5 = out;
>             }
>             //---- end of eval portion of hash32 function. ----//
>             seedValue = out5 .value;
>             return getHashBuild0((incomingRowIdx), (seedValue));
>         }
>     }
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
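
To make the seed-passing idea in the issue description concrete, here is a minimal standalone sketch (not Drill code). It assumes a made-up seeded hash, hashKey, in place of HashHelper.hash32, and the names SeedChainSketch, hashAllInOne, getHash and getHash0 are hypothetical. It only illustrates why passing the seed as a method parameter lets the per-key hashing be split across methods without changing the result.

```java
// Hypothetical sketch: none of these names come from Drill.
final class SeedChainSketch {

    // Stand-in for a seeded hash function (Drill itself uses HashHelper.hash32).
    static int hashKey(int key, int seed) {
        return 31 * seed + Integer.hashCode(key) + 17;
    }

    // Unsplit form: every key is hashed in one method and the
    // starting seed is hard-coded to 0.
    static int hashAllInOne(int[] keys) {
        int seed = 0;
        for (int key : keys) {
            seed = hashKey(key, seed);
        }
        return seed;
    }

    // Split form, first block: hashes the first two keys, then hands
    // the running seed to the inner method as a parameter.
    static int getHash(int[] keys, int seedValue) {
        seedValue = hashKey(keys[0], seedValue);
        seedValue = hashKey(keys[1], seedValue);
        return getHash0(keys, seedValue);
    }

    // Split form, inner block: continues from the seed it was given.
    static int getHash0(int[] keys, int seedValue) {
        seedValue = hashKey(keys[2], seedValue);
        seedValue = hashKey(keys[3], seedValue);
        return seedValue;
    }

    public static void main(String[] args) {
        int[] keys = {7, 11, 13, 17};
        // Both forms apply the same chain of hashKey calls, so this prints "true".
        System.out.println(hashAllInOne(keys) == getHash(keys, 0));
    }
}
```

Because the only state carried between keys is the seed, each block needs nothing from the previous one except that single int, which is what makes the split safe in this sketch.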