Re: How to generate hash code for each build side one of the hash join columns

Boaz Ben-Zvi Thu, 31 May 2018 12:23:24 -0700

 Hi Weijie,

    Another option is to totally avoid the generated code.
We were considering the idea of replacing the generated code used for computing 
hash values with “real java” code.


This idea is analogous to the usage of the copyEntry() method in the 
ValueVector interface (that Paul added last year).
See an example of using the copyEntry() (via the appendRow() in 
VectorContainer) in the new Hash-Join-Spill code.
Basically no need to generate “type specific” code, as the virtual copyEntry() 
method does the “type specific” work. 

Similarly we could have a hash64() method in ValueVector, which would perform 
the “type specific” computation.
(One difference from copyEntry() – the hash64() would also need to take the 
“seed” parameter, which is the hash value produced by the previous hash).
And similar to appendRow(), there would be evalHash() iterating over the key 
columns.
(And one difference from appendRow() – need to iterate only on the key columns; 
these are the first columns; their number can be found from the config: e.g., 
htConfig.getKeyExprsBuild().size() )

   With such implementation, that evalHash() could be used anywhere (e.g., to 
match the Bloom filters on the left side of the join).

       Thanks,

             Boaz


On 5/30/18, 7:49 PM, "weijie tong" <[email protected]> wrote:

    Hi Aman:
    
      Thanks for your tips. I have rebased the latest code from the master
    branch . Yes, the spill-to-disk feature does changed the original
    implementation. I have adjusted my implementation according to the new
    feature. But as you say, it will take some challenge to integration as I
    noticed the spill-to-disk feature will continue to tune its implementation
    performance.
    
      The BloomFilter was implemented natively in Drill , not an external
    library. It's implemented the algorithm of the paper which was mentioned by
    you.
    
    
    On Thu, May 31, 2018 at 1:56 AM Aman Sinha <[email protected]> wrote:
    
    > Hi Weijie,
    > I was hoping you could leverage the existing methods..so its good that you
    > found the ones that work for your use case.
    > One thing I want to point out (maybe you're already aware) .. the Hash 
Join
    > code has changed significantly in the master branch due to the
    > spill-to-disk feature.
    > So, this may pose some integration challenges for your run-time join
    > pushdown feature.
    > Also, one other question/clarification:  for the bloom filter itself are
    > you implementing it natively in Drill or using an external library ?
    >
    > -Aman
    >
    > On Tue, May 29, 2018 at 8:23 PM, weijie tong <[email protected]>
    > wrote:
    >
    > > I found ClassGenerator's nestEvalBlock(JBlock block) and
    > unNestEvalBlock()
    > > which has the same effect to what I change to the ClassGenerator. So I
    > give
    > > up what I change to the ClassGenerator and hope this can help someone
    > else.
    > >
    > > On Tue, May 29, 2018 at 1:53 PM weijie tong <[email protected]>
    > > wrote:
    > >
    > > > The code formatting is not nice. Put them again:
    > > >
    > > > private void setupGetBuild64Hash(ClassGenerator<HashTable> cg,
    > > MappingSet
    > > > incomingMapping, VectorAccessible batch, LogicalExpression[] keyExprs,
    > > > TypedFieldId[] buildKeyFieldIds)
    > > > throws SchemaChangeException {
    > > > cg.setMappingSet(incomingMapping);
    > > > if (keyExprs == null || keyExprs.length == 0) {
    > > >   cg.getEvalBlock()._return(JExpr.lit(0));
    > > > }
    > > > String seedValue = "seedValue";
    > > > String fieldId = "fieldId";
    > > > LogicalExpression seed =
    > > > ValueExpressions.getParameterExpression(seedValue, Types.required(
    > > > TypeProtos.MinorType.INT));
    > > >
    > > > LogicalExpression fieldIdParamExpr =
    > > > ValueExpressions.getParameterExpression(fieldId, Types.required(
    > > > TypeProtos.MinorType.INT) );
    > > > HoldingContainer fieldIdParamHolder = cg.addExpr(fieldIdParamExpr);
    > > > int i = 0;
    > > >  for (LogicalExpression expr : keyExprs) {
    > > >      TypedFieldId targetTypeFieldId = buildKeyFieldIds[i];
    > > >      ValueExpressions.IntExpression targetBuildFieldIdExp = new
    > > > ValueExpressions.IntExpression(targetTypeFieldId.getFieldIds()[0],
    > > > ExpressionPosition.UNKNOWN);
    > > >
    > > >     JFieldRef targetBuildSideFieldId =
    > cg.addExpr(targetBuildFieldIdExp,
    > > > ClassGenerator.BlkCreateMode.TRUE_IF_BOUND).getValue();
    > > >     JBlock ifBlock =
    > > > cg.getEvalBlock()._if(fieldIdParamHolder.getValue().
    > > eq(targetBuildSideFieldId))._then();
    > > >     //specify a special JBlock which is a inner one of the eval block
    > to
    > > > the ClassGenerator to substitute the returned JBlock of getEvalBlock()
    > > >     cg.setCustomizedEvalInnerBlock(ifBlock);
    > > >     LogicalExpression hashExpression =
    > > > HashPrelUtil.getHashExpression(expr, seed, incomingProbe != null);
    > > >     LogicalExpression materializedExpr =
    > > > ExpressionTreeMaterializer.materializeAndCheckErrors(hashExpression,
    > > batch,
    > > > context.getFunctionRegistry());
    > > >     HoldingContainer hash = cg.addExpr(materializedExpr,
    > > > ClassGenerator.BlkCreateMode.TRUE_IF_BOUND);
    > > >     ifBlock._return(hash.getValue());
    > > >     //reset the customized block to null ,so the getEvalBlock() return
    > > the
    > > > truly eval JBlock
    > > >     cg.setCustomizedEvalInnerBlock(null);
    > > >     i++;
    > > >  }
    > > > cg.getEvalBlock()._return(JExpr.lit(0));
    > > > }
    > > >
    > > >
    > > >
    > > >
    > > > public long getBuild64HashCodeInner(int incomingRowIdx, int seedValue,
    > > int
    > > > fieldId)
    > > > throws SchemaChangeException
    > > > {
    > > > {
    > > > IntHolder fieldId12 = new IntHolder();
    > > > fieldId12 .value = fieldId;
    > > > if (fieldId12 .value == constant14 .value) {
    > > >    IntHolder out18 = new IntHolder();
    > > >    {
    > > >      out18 .value = vv15 .getAccessor().get((incomingRowIdx));
    > > >    }
    > > >    IntHolder seedValue19 = new IntHolder();
    > > >    seedValue19 .value = seedValue;
    > > >    //---- start of eval portion of hash32AsDouble function. ----//
    > > >    IntHolder out20 = new IntHolder();
    > > >   {
    > > >       final IntHolder out = new IntHolder();
    > > >       IntHolder in = out18;
    > > >       IntHolder seed = seedValue19;
    > > >
    > > >       Hash32WithSeedAsDouble$IntHash_eval: {
    > > >       out.value =
    > > > org.apache.drill.exec.expr.fn.impl.HashHelper.hash32((double) 
in.value,
    > > > seed.value);
    > > >    }
    > > >
    > > >    out20 = out;
    > > > }
    > > > //---- end of eval portion of hash32AsDouble function. ----//
    > > > return out20 .value;
    > > > }
    > > > return 0;
    > > > }
    > > > }
    > > >
    > > >
    > > >
    > > >
    > > >
    > > >
    > > >
    > > >
    > > >
    > > >
    > > >
    > > >
    > > >
    > > >
    > > >
    > > >
    > > >
    > > > On Tue, May 29, 2018 at 1:47 PM weijie tong <[email protected]>
    > > > wrote:
    > > >
    > > >> HI Paul:
    > > >>
    > > >>  Thanks for your enthusiasm. I have managed this skill as you ever
    > > >> mentioned me at another mail thread. It's really helpful ,thanks for
    > > your
    > > >> valuable work.
    > > >>
    > > >>   Now I have solved this tough problem by adding a customized JBlock
    > > >> member field to the ClassGenerator. So once you want the
    > getEvalBlock()
    > > of
    > > >> the ClassGenerator to return a inner customized JBlock , then you set
    > > this
    > > >> member, if you want the method to return eval self JBlock , you reset
    > > this
    > > >> member to null.
    > > >>
    > > >>   Here is my changed setup method :
    > > >>
    > > >>
    > > >> private void setupGetBuild64Hash(ClassGenerator<HashTable> cg,
    > > MappingSet incomingMapping, VectorAccessible batch, LogicalExpression[]
    > > keyExprs, TypedFieldId[] buildKeyFieldIds)
    > > >>   throws SchemaChangeException {
    > > >>   cg.setMappingSet(incomingMapping);
    > > >>   if (keyExprs == null || keyExprs.length == 0) {
    > > >>     cg.getEvalBlock()._return(JExpr.lit(0));
    > > >>   }
    > > >>   String seedValue = "seedValue";
    > > >>   String fieldId = "fieldId";
    > > >>   LogicalExpression seed =
    > ValueExpressions.getParameterExpression(seedValue,
    > > Types.required(TypeProtos.MinorType.INT));
    > > >>
    > > >>   LogicalExpression fieldIdParamExpr = ValueExpressions.
    > > getParameterExpression(fieldId, Types.required(TypeProtos.MinorType.INT)
    > > );
    > > >>   HoldingContainer fieldIdParamHolder = cg.addExpr(fieldIdParamExpr);
    > > >>   int i = 0;
    > > >>   for (LogicalExpression expr : keyExprs) {
    > > >>     TypedFieldId targetTypeFieldId = buildKeyFieldIds[i];
    > > >>     ValueExpressions.IntExpression targetBuildFieldIdExp = new
    > > ValueExpressions.IntExpression(targetTypeFieldId.getFieldIds()[0],
    > > ExpressionPosition.UNKNOWN);
    > > >>
    > > >>     JFieldRef targetBuildSideFieldId =
    > cg.addExpr(targetBuildFieldIdExp,
    > > ClassGenerator.BlkCreateMode.TRUE_IF_BOUND).getValue();
    > > >>     JBlock ifBlock = cg.getEvalBlock()._if(
    > > fieldIdParamHolder.getValue().eq(targetBuildSideFieldId))._then();
    > > >>     //specify a special JBlock which is a inner one of the eval block
    > > to the ClassGenerator to substitute the returned JBlock of 
getEvalBlock()
    > > >>     cg.setCustomizedEvalInnerBlock(ifBlock);
    > > >>     LogicalExpression hashExpression =
    > HashPrelUtil.getHashExpression(expr,
    > > seed, incomingProbe != null);
    > > >>     LogicalExpression materializedExpr = ExpressionTreeMaterializer.
    > > materializeAndCheckErrors(hashExpression, batch,
    > > context.getFunctionRegistry());
    > > >>     HoldingContainer hash = cg.addExpr(materializedExpr,
    > > ClassGenerator.BlkCreateMode.TRUE_IF_BOUND);
    > > >>     ifBlock._return(hash.getValue());
    > > >>     //reset the customized block to null ,so the getEvalBlock() 
return
    > > the truly eval JBlock
    > > >>     cg.setCustomizedEvalInnerBlock(null);
    > > >>     i++;
    > > >>   }
    > > >>   cg.getEvalBlock()._return(JExpr.lit(0));
    > > >> }
    > > >>
    > > >>
    > > >>  The corresponding generated codes :
    > > >>
    > > >>     public long getBuild64HashCodeInner(int incomingRowIdx, int
    > > seedValue, int fieldId)
    > > >>         throws SchemaChangeException
    > > >>     {
    > > >>         {
    > > >>             IntHolder fieldId12 = new IntHolder();
    > > >>             fieldId12 .value = fieldId;
    > > >>             if (fieldId12 .value == constant14 .value) {
    > > >>                 IntHolder out18 = new IntHolder();
    > > >>                 {
    > > >>                     out18 .value = vv15 .getAccessor().get((
    > > incomingRowIdx));
    > > >>                 }
    > > >>                 IntHolder seedValue19 = new IntHolder();
    > > >>                 seedValue19 .value = seedValue;
    > > >>                 //---- start of eval portion of hash32AsDouble
    > > function. ----//
    > > >>                 IntHolder out20 = new IntHolder();
    > > >>                 {
    > > >>                     final IntHolder out = new IntHolder();
    > > >>                     IntHolder in = out18;
    > > >>                     IntHolder seed = seedValue19;
    > > >>
    > > >> Hash32WithSeedAsDouble$IntHash_eval: {
    > > >>     out.value =
    > org.apache.drill.exec.expr.fn.impl.HashHelper.hash32((double)
    > > in.value, seed.value);
    > > >> }
    > > >>
    > > >>                     out20 = out;
    > > >>                 }
    > > >>                 //---- end of eval portion of hash32AsDouble 
function.
    > > ----//
    > > >>                 return out20 .value;
    > > >>             }
    > > >>             return  0;
    > > >>         }
    > > >>     }
    > > >>
    > > >>
    > > >>
    > > >>   Some other explanation:
    > > >>   1st : The if checking won't hurt the performance , as I invoke this
    > > >> method column by column , so it's branch predication friendly.
    > > >>   2nd: I will use the murmur3_64 not the murmur3_32 ，since the
    > efficient
    > > >> bloom filter algorithm needs the 64 bit hash code to avoid the
    > conflict.
    > > >>
    > > >>
    > > >>
    > > >>
    > > >>
    > > >>
    > > >>
    > > >>
    > > >>
    > > >>
    > > >>
    > > >>
    > > >>
    > > >> On Tue, May 29, 2018 at 12:37 PM Paul Rogers
    > <[email protected]
    > > >
    > > >> wrote:
    > > >>
    > > >>> Hi Weijie,
    > > >>>
    > > >>> Seeing the discussion about the details of JCodeModel suggests you
    > may
    > > >>> be trying to debug your generated code at the level of the code
    > > generator.
    > > >>>
    > > >>> Some time ago we added the ability to step through the generated
    > code.
    > > >>> Look for the following line in the generator code:
    > > >>>
    > > >>>
    > > >>>     // Uncomment out this line to debug the generated code.
    > > >>>
    > > >>> //    cg.saveCodeForDebugging(true);
    > > >>>
    > > >>>
    > > >>> Uncomment the code line and Drill will save each generated file to a
    > > >>> configured location (which, if I recall correctly, is
    > > /tmp/drill/codegen,
    > > >>> though it may have changed after Tim's test directory changes.)
    > > >>>
    > > >>> Then, set a breakpoint in the template setup() method and you can
    > step
    > > >>> directly into the generated doSetup() method. Same for the eval()
    > > method.
    > > >>>
    > > >>> This way, you can not only see the generated code, you can step
    > through
    > > >>> it. I've found this to be a far easier way to understand the
    > generated
    > > code
    > > >>> than the older techniques folks have used (look at byte codes, use
    > > print
    > > >>> statements, brute force reasoning, etc.)
    > > >>>
    > > >>> Tim, Boaz and others have used this technique more recently and can
    > > >>> probably give you additional pointers.
    > > >>>
    > > >>> Thanks,
    > > >>> - Paul
    > > >>>
    > > >>>
    > > >>>
    > > >>>     On Monday, May 28, 2018, 8:52:19 PM PDT, weijie tong <
    > > >>> [email protected]> wrote:
    > > >>>
    > > >>>  @aman thanks for your reply. "For the ifBlock, do you need an
    > _else()
    > > >>> block
    > > >>> also ?"  I give a default return logic at the method, so I don't 
need
    > > the
    > > >>> _else() block.  I have noticed the IfExpression's evaluation method
    > at
    > > >>> EvaluationVisitor which also uses the JConditional . But that also
    > > >>> doesn't
    > > >>> match my requirement. I think the key point here is the
    > > >>> FunctionHolderExpression and ValueVectorReadExpression will put 
their
    > > >>> corresponding generated codes to the eval method's JBlock , not our
    > > >>> specific IfBlock which is a inner block of the eval method's JBlock 
.
    > > >>>
    > > >>> So it seems I should make some changes to the ClassGenerator to let
    > the
    > > >>> getEvalBlock return the IfBlock (maybe accurately the JConditional's
    > > then
    > > >>> block) or implement some special FunctionHolderExpression
    > > >>> 、ValueVectorReadExpression and corresponding visiting methods at the
    > > >>> EvaluationVisitor to generate the special code blocks. Hope someone
    > who
    > > >>> are
    > > >>> familiar with these part of codes to point out whether there are 
more
    > > >>> easy
    > > >>> or different choices to achieve the target.
    > > >>>
    > > >>> To make discussion more accurate, I put the generated codes of the
    > > >>> previous
    > > >>> setupGetBuild64Hash method here:
    > > >>>
    > > >>>     public long getBuild64HashCodeInner(int incomingRowIdx, int
    > > >>> seedValue, int fieldId)
    > > >>>         throws SchemaChangeException
    > > >>>     {
    > > >>>         {
    > > >>>             IntHolder fieldId16 = new IntHolder();
    > > >>>             fieldId16 .value = fieldId;
    > > >>>             if (fieldId16 .value == constant18 .value) {
    > > >>>                 return out24 .value;
    > > >>>             }
    > > >>>             IntHolder out22 = new IntHolder();
    > > >>>             {
    > > >>>                 out22 .value = vv19 .getAccessor().get((
    > > incomingRowIdx));
    > > >>>             }
    > > >>>             IntHolder seedValue23 = new IntHolder();
    > > >>>             seedValue23 .value = seedValue;
    > > >>>             //---- start of eval portion of hash32AsDouble function.
    > > >>> ----//
    > > >>>             IntHolder out24 = new IntHolder();
    > > >>>             {
    > > >>>                 final IntHolder out = new IntHolder();
    > > >>>                 IntHolder in = out22;
    > > >>>                 IntHolder seed = seedValue23;
    > > >>>
    > > >>> Hash32WithSeedAsDouble$IntHash_eval: {
    > > >>>     out.value =
    > > >>> org.apache.drill.exec.expr.fn.impl.HashHelper.hash32((double)
    > > >>> in.value, seed.value);
    > > >>> }
    > > >>>
    > > >>>                 out24 = out;
    > > >>>             }
    > > >>>             //---- end of eval portion of hash32AsDouble function.
    > > ----//
    > > >>>             if (fieldId16 .value == constant18 .value) {
    > > >>>                 return out26 .value;
    > > >>>             }
    > > >>>             IntHolder seedValue25 = new IntHolder();
    > > >>>             seedValue25 .value = seedValue;
    > > >>>             //---- start of eval portion of hash32AsDouble function.
    > > >>> ----//
    > > >>>             IntHolder out26 = new IntHolder();
    > > >>>             {
    > > >>>                 final IntHolder out = new IntHolder();
    > > >>>                 IntHolder in = out22;
    > > >>>                 IntHolder seed = seedValue25;
    > > >>>
    > > >>> Hash32WithSeedAsDouble$IntHash_eval: {
    > > >>>     out.value =
    > > >>> org.apache.drill.exec.expr.fn.impl.HashHelper.hash32((double)
    > > >>> in.value, seed.value);
    > > >>> }
    > > >>>
    > > >>>                 out26 = out;
    > > >>>             }
    > > >>>             //---- end of eval portion of hash32AsDouble function.
    > > ----//
    > > >>>             return  0;
    > > >>>         }
    > > >>>     }
    > > >>>
    > > >>>
    > > >>>
    > > >>>
    > > >>>
    > > >>> On Tue, May 29, 2018 at 10:51 AM Aman Sinha <[email protected]>
    > > >>> wrote:
    > > >>>
    > > >>> > sorry, the previous email is incomplete.
    > > >>> > For the ifBlock, do you need an _else() block also ?
    > > >>> >
    > > >>> > I have sometimes found that 'JConditional' is a good way to break
    > > down
    > > >>> the
    > > >>> > logic further.  Please see example usages of JConditional here 
[1].
    > > >>> >
    > > >>> > -Aman
    > > >>> >
    > > >>> > [1]
    > > >>> >
    > > >>> >
    > > >>> 
https://urldefense.proofpoint.com/v2/url?u=https-3A__www.programcreek.com_java-2Dapi-2Dexamples_-3Fapi-3Dcom&d=DwIFaQ&c=cskdkSMqhcnjZxdQVpwTXg&r=EqulKDxxEDCX6zbp1AZAa1-iAPQGgCioAqgDp7DE2BU&m=doaiFF3edu9-prktKvLSIoNdmzt_nV6nzCtF_ZGQRBk&s=O2Th00tVjOSHTLlOn_lFp8JiUlh_FueCbHs8giRVS3k&e=.
    > > sun.codemodel.JBlock
    > > >>> >
    > > >>> > On Mon, May 28, 2018 at 7:46 PM, Aman Sinha <[email protected]>
    > > >>> wrote:
    > > >>> >
    > > >>> > > Hi Weijie,
    > > >>> > > It would be a little cumbersome to debug such issues over email
    > > >>> since one
    > > >>> > > has to look at the generated code output and iteratively debug.
    > > >>> > > Couple of thoughts I have that might help:
    > > >>> > >
    > > >>> > > For this particular if-then block, should you also
    > > >>> > > JBlock ifBlock =
    > > >>> > > cg.getEvalBlock()._if(fieldIdParamHolder.getValue().eq(targe
    > > >>> > > tBuildSideFieldId))._then();
    > > >>> > >
    > > >>> > >
    > > >>> > >
    > > >>> > > On Mon, May 28, 2018 at 4:17 AM, weijie tong <
    > > >>> [email protected]>
    > > >>> > > wrote:
    > > >>> > >
    > > >>> > >> HI All:
    > > >>> > >>  Through implementing the JPPD feature (
    > > >>> > >> 
https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_DRILL-2D6385&d=DwIFaQ&c=cskdkSMqhcnjZxdQVpwTXg&r=EqulKDxxEDCX6zbp1AZAa1-iAPQGgCioAqgDp7DE2BU&m=doaiFF3edu9-prktKvLSIoNdmzt_nV6nzCtF_ZGQRBk&s=FIkIkgR6E_qJADP1J55y11SgJZD8NyPaNv_AeTabiaY&e=)
 , I was
    > blocked
    > > >>> by
    > > >>> > the
    > > >>> > >> problem: how to get the hash code of each build side of the 
hash
    > > >>> join
    > > >>> > >> columns through the dynamic generated java code. Hope someone
    > can
    > > >>> give
    > > >>> > >> some
    > > >>> > >> advice.
    > > >>> > >>
    > > >>> > >>    I supposed to add methods as below to the HashTableTemplate 
:
    > > >>> > >>
    > > >>> > >> public long getBuild64HashCode(int incomingRowIdx, int
    > seedValue,
    > > >>> int
    > > >>> > >> fieldId) throws SchemaChangeException{
    > > >>> > >>    return getBuild64HashCodeInner(incomingRowIdx, seedValue,
    > > >>> fieldId);
    > > >>> > >> }
    > > >>> > >>
    > > >>> > >> protected abstract long
    > > >>> > >> getBuild64HashCodeInner(@Named("incomingRowIdx") int
    > > incomingRowIdx,
    > > >>> > >> @Named("seedValue") int seedValue, @Named("fieldId") int
    > fieldId)
    > > >>> > >> throws SchemaChangeException;
    > > >>> > >>
    > > >>> > >>
    > > >>> > >>    The high level code to invoke the getBuild64HashCode method
    > is
    > > >>> at the
    > > >>> > >> HashJoinBatch's executeBuildPhase() :
    > > >>> > >>
    > > >>> > >> //create runtime filter
    > > >>> > >> if (cycleNum == 0 && enableRuntimeFilter) {
    > > >>> > >>  //create runtime filter and send out async
    > > >>> > >>  int condFieldIndex = 0;
    > > >>> > >>  for (BloomFilter bloomFilter : bloomFilters) {
    > > >>> > >>    //VV
    > > >>> > >>    for (int ind = 0; ind < currentRecordCount; ind++) {
    > > >>> > >>      long hashCode = partitions[0].getBuild64HashCode(ind,
    > > >>> > >> condFieldIndex);
    > > >>> > >>      bloomFilter.insert(hashCode);
    > > >>> > >>    }
    > > >>> > >>    condFieldIndex++;
    > > >>> > >>  }
    > > >>> > >>  //TODO sered out async
    > > >>> > >> }
    > > >>> > >>
    > > >>> > >>
    > > >>> > >>  As you know, the abstract method getBuild64HashCodeInner needs
    > to
    > > >>> > >> calculate the hash codes of each build side column by the
    > fieldId
    > > >>> input
    > > >>> > >> parameter. In order to achieve this target, I plan to have
    > > different
    > > >>> > >> solving parts corresponding to different column ValueVector ,
    > > using
    > > >>> the
    > > >>> > if
    > > >>> > >> statement to distinguish different solving parts through the id
    > of
    > > >>> the
    > > >>> > >> column.  The corresponding method to generate the dynamic codes
    > > is
    > > >>> as
    > > >>> > >> below:
    > > >>> > >>
    > > >>> > >> private void setupGetBuild64Hash(ClassGenerator<HashTable> cg,
    > > >>> > >> MappingSet incomingMapping, VectorAccessible batch,
    > > >>> > >> LogicalExpression[] keyExprs, TypedFieldId[] buildKeyFieldIds)
    > > >>> > >>  throws SchemaChangeException {
    > > >>> > >>  cg.setMappingSet(incomingMapping);
    > > >>> > >>  if (keyExprs == null || keyExprs.length == 0) {
    > > >>> > >>    cg.getEvalBlock()._return(JExpr.lit(0));
    > > >>> > >>  }
    > > >>> > >>  String seedValue = "seedValue";
    > > >>> > >>  String fieldId = "fieldId";
    > > >>> > >>  LogicalExpression seed =
    > > >>> > >> ValueExpressions.getParameterExpression(seedValue,
    > > >>> > >> Types.required(TypeProtos.MinorType.INT));
    > > >>> > >>
    > > >>> > >>  LogicalExpression fieldIdParamExpr =
    > > >>> > >> ValueExpressions.getParameterExpression(fieldId,
    > > >>> > >> Types.required(TypeProtos.MinorType.INT) );
    > > >>> > >>  HoldingContainer fieldIdParamHolder =
    > > cg.addExpr(fieldIdParamExpr);
    > > >>> > >>  int i = 0;
    > > >>> > >>  for (LogicalExpression expr : keyExprs) {
    > > >>> > >>    TypedFieldId targetTypeFieldId = buildKeyFieldIds[i];
    > > >>> > >>    ValueExpressions.IntExpression targetBuildFieldIdExp = new
    > > >>> > >> ValueExpressions.IntExpression(targetTypeFieldId.getFieldIds(
    > > )[0],
    > > >>> > >> ExpressionPosition.UNKNOWN);
    > > >>> > >>    JFieldRef targetBuildSideFieldId =
    > > >>> > >> cg.addExpr(targetBuildFieldIdExp,
    > > >>> > >> ClassGenerator.BlkCreateMode.TRUE_IF_BOUND).getValue();
    > > >>> > >>    JBlock ifBlock =
    > > >>> > >> cg.getEvalBlock()._if(fieldIdParamHolder.getValue().eq(targe
    > > >>> > >> tBuildSideFieldId))._then();
    > > >>> > >>
    > > >>> > >>    LogicalExpression hashExpression =
    > > >>> > >> HashPrelUtil.getHashExpression(expr, seed, incomingProbe !=
    > > null);
    > > >>> > >>    LogicalExpression materializedExpr =
    > > >>> > >> ExpressionTreeMaterializer.materializeAndCheckErrors(
    > > hashExpression,
    > > >>> > >> batch, context.getFunctionRegistry());
    > > >>> > >>    HoldingContainer hash = cg.addExpr(materializedExpr,
    > > >>> > >> ClassGenerator.BlkCreateMode.FALSE);
    > > >>> > >>
    > > >>> > >>
    > > >>> > >>    ifBlock._return(hash.getValue());
    > > >>> > >>    i++;
    > > >>> > >>  }
    > > >>> > >>  cg.getEvalBlock()._return(JExpr.lit(0));
    > > >>> > >>
    > > >>> > >> }
    > > >>> > >>
    > > >>> > >> But unfortunately, the generated codes are not what I expected.
    > > The
    > > >>> > codes
    > > >>> > >> to read ValueVector , calculate hash code of the read value do
    > not
    > > >>> stay
    > > >>> > in
    > > >>> > >> the if block.  So how can I let the related codes stay in the 
if
    > > >>> block ?
    > > >>> > >>
    > > >>> > >
    > > >>> > >
    > > >>> >
    > > >>
    > > >>
    > >
    >

Re: How to generate hash code for each build side one of the hash join columns

Reply via email to