HI Paul:

 Thanks for your enthusiasm. I have managed this skill as you ever
mentioned me at another mail thread. It's really helpful ,thanks for your
valuable work.

  Now I have solved this tough problem by adding a customized JBlock member
field to the ClassGenerator. So once you want the getEvalBlock() of the
ClassGenerator to return a inner customized JBlock , then you set this
member, if you want the method to return eval self JBlock , you reset this
member to null.

  Here is my changed setup method :


private void setupGetBuild64Hash(ClassGenerator<HashTable> cg,
MappingSet incomingMapping, VectorAccessible batch,
LogicalExpression[] keyExprs, TypedFieldId[] buildKeyFieldIds)
  throws SchemaChangeException {
  cg.setMappingSet(incomingMapping);
  if (keyExprs == null || keyExprs.length == 0) {
    cg.getEvalBlock()._return(JExpr.lit(0));
  }
  String seedValue = "seedValue";
  String fieldId = "fieldId";
  LogicalExpression seed =
ValueExpressions.getParameterExpression(seedValue,
Types.required(TypeProtos.MinorType.INT));

  LogicalExpression fieldIdParamExpr =
ValueExpressions.getParameterExpression(fieldId,
Types.required(TypeProtos.MinorType.INT) );
  HoldingContainer fieldIdParamHolder = cg.addExpr(fieldIdParamExpr);
  int i = 0;
  for (LogicalExpression expr : keyExprs) {
    TypedFieldId targetTypeFieldId = buildKeyFieldIds[i];
    ValueExpressions.IntExpression targetBuildFieldIdExp = new
ValueExpressions.IntExpression(targetTypeFieldId.getFieldIds()[0],
ExpressionPosition.UNKNOWN);

    JFieldRef targetBuildSideFieldId =
cg.addExpr(targetBuildFieldIdExp,
ClassGenerator.BlkCreateMode.TRUE_IF_BOUND).getValue();
    JBlock ifBlock =
cg.getEvalBlock()._if(fieldIdParamHolder.getValue().eq(targetBuildSideFieldId))._then();
    //specify a special JBlock which is a inner one of the eval block
to the ClassGenerator to substitute the returned JBlock of
getEvalBlock()
    cg.setCustomizedEvalInnerBlock(ifBlock);
    LogicalExpression hashExpression =
HashPrelUtil.getHashExpression(expr, seed, incomingProbe != null);
    LogicalExpression materializedExpr =
ExpressionTreeMaterializer.materializeAndCheckErrors(hashExpression,
batch, context.getFunctionRegistry());
    HoldingContainer hash = cg.addExpr(materializedExpr,
ClassGenerator.BlkCreateMode.TRUE_IF_BOUND);
    ifBlock._return(hash.getValue());
    //reset the customized block to null ,so the getEvalBlock() return
the truly eval JBlock
    cg.setCustomizedEvalInnerBlock(null);
    i++;
  }
  cg.getEvalBlock()._return(JExpr.lit(0));
}


 The corresponding generated codes :

    public long getBuild64HashCodeInner(int incomingRowIdx, int
seedValue, int fieldId)
        throws SchemaChangeException
    {
        {
            IntHolder fieldId12 = new IntHolder();
            fieldId12 .value = fieldId;
            if (fieldId12 .value == constant14 .value) {
                IntHolder out18 = new IntHolder();
                {
                    out18 .value = vv15 .getAccessor().get((incomingRowIdx));
                }
                IntHolder seedValue19 = new IntHolder();
                seedValue19 .value = seedValue;
                //---- start of eval portion of hash32AsDouble function. ----//
                IntHolder out20 = new IntHolder();
                {
                    final IntHolder out = new IntHolder();
                    IntHolder in = out18;
                    IntHolder seed = seedValue19;

Hash32WithSeedAsDouble$IntHash_eval: {
    out.value =
org.apache.drill.exec.expr.fn.impl.HashHelper.hash32((double)
in.value, seed.value);
}

                    out20 = out;
                }
                //---- end of eval portion of hash32AsDouble function. ----//
                return out20 .value;
            }
            return  0;
        }
    }



  Some other explanation:
  1st : The if checking won't hurt the performance , as I invoke this
method column by column , so it's branch predication friendly.
  2nd: I will use the murmur3_64 not the murmur3_32 ,since the efficient
bloom filter algorithm needs the 64 bit hash code to avoid the conflict.













On Tue, May 29, 2018 at 12:37 PM Paul Rogers <par0...@yahoo.com.invalid>
wrote:

> Hi Weijie,
>
> Seeing the discussion about the details of JCodeModel suggests you may be
> trying to debug your generated code at the level of the code generator.
>
> Some time ago we added the ability to step through the generated code.
> Look for the following line in the generator code:
>
>
>     // Uncomment out this line to debug the generated code.
>
> //    cg.saveCodeForDebugging(true);
>
>
> Uncomment the code line and Drill will save each generated file to a
> configured location (which, if I recall correctly, is /tmp/drill/codegen,
> though it may have changed after Tim's test directory changes.)
>
> Then, set a breakpoint in the template setup() method and you can step
> directly into the generated doSetup() method. Same for the eval() method.
>
> This way, you can not only see the generated code, you can step through
> it. I've found this to be a far easier way to understand the generated code
> than the older techniques folks have used (look at byte codes, use print
> statements, brute force reasoning, etc.)
>
> Tim, Boaz and others have used this technique more recently and can
> probably give you additional pointers.
>
> Thanks,
> - Paul
>
>
>
>     On Monday, May 28, 2018, 8:52:19 PM PDT, weijie tong <
> tongweijie...@gmail.com> wrote:
>
>  @aman thanks for your reply. "For the ifBlock, do you need an _else()
> block
> also ?"  I give a default return logic at the method, so I don't need the
> _else() block.  I have noticed the IfExpression's evaluation method at
> EvaluationVisitor which also uses the JConditional . But that also doesn't
> match my requirement. I think the key point here is the
> FunctionHolderExpression and ValueVectorReadExpression will put their
> corresponding generated codes to the eval method's JBlock , not our
> specific IfBlock which is a inner block of the eval method's JBlock .
>
> So it seems I should make some changes to the ClassGenerator to let the
> getEvalBlock return the IfBlock (maybe accurately the JConditional's then
> block) or implement some special FunctionHolderExpression
> 、ValueVectorReadExpression and corresponding visiting methods at the
> EvaluationVisitor to generate the special code blocks. Hope someone who are
> familiar with these part of codes to point out whether there are more easy
> or different choices to achieve the target.
>
> To make discussion more accurate, I put the generated codes of the previous
> setupGetBuild64Hash method here:
>
>     public long getBuild64HashCodeInner(int incomingRowIdx, int
> seedValue, int fieldId)
>         throws SchemaChangeException
>     {
>         {
>             IntHolder fieldId16 = new IntHolder();
>             fieldId16 .value = fieldId;
>             if (fieldId16 .value == constant18 .value) {
>                 return out24 .value;
>             }
>             IntHolder out22 = new IntHolder();
>             {
>                 out22 .value = vv19 .getAccessor().get((incomingRowIdx));
>             }
>             IntHolder seedValue23 = new IntHolder();
>             seedValue23 .value = seedValue;
>             //---- start of eval portion of hash32AsDouble function. ----//
>             IntHolder out24 = new IntHolder();
>             {
>                 final IntHolder out = new IntHolder();
>                 IntHolder in = out22;
>                 IntHolder seed = seedValue23;
>
> Hash32WithSeedAsDouble$IntHash_eval: {
>     out.value =
> org.apache.drill.exec.expr.fn.impl.HashHelper.hash32((double)
> in.value, seed.value);
> }
>
>                 out24 = out;
>             }
>             //---- end of eval portion of hash32AsDouble function. ----//
>             if (fieldId16 .value == constant18 .value) {
>                 return out26 .value;
>             }
>             IntHolder seedValue25 = new IntHolder();
>             seedValue25 .value = seedValue;
>             //---- start of eval portion of hash32AsDouble function. ----//
>             IntHolder out26 = new IntHolder();
>             {
>                 final IntHolder out = new IntHolder();
>                 IntHolder in = out22;
>                 IntHolder seed = seedValue25;
>
> Hash32WithSeedAsDouble$IntHash_eval: {
>     out.value =
> org.apache.drill.exec.expr.fn.impl.HashHelper.hash32((double)
> in.value, seed.value);
> }
>
>                 out26 = out;
>             }
>             //---- end of eval portion of hash32AsDouble function. ----//
>             return  0;
>         }
>     }
>
>
>
>
>
> On Tue, May 29, 2018 at 10:51 AM Aman Sinha <amansi...@apache.org> wrote:
>
> > sorry, the previous email is incomplete.
> > For the ifBlock, do you need an _else() block also ?
> >
> > I have sometimes found that 'JConditional' is a good way to break down
> the
> > logic further.  Please see example usages of JConditional here [1].
> >
> > -Aman
> >
> > [1]
> >
> >
> https://www.programcreek.com/java-api-examples/?api=com.sun.codemodel.JBlock
> >
> > On Mon, May 28, 2018 at 7:46 PM, Aman Sinha <amansi...@apache.org>
> wrote:
> >
> > > Hi Weijie,
> > > It would be a little cumbersome to debug such issues over email since
> one
> > > has to look at the generated code output and iteratively debug.
> > > Couple of thoughts I have that might help:
> > >
> > > For this particular if-then block, should you also
> > > JBlock ifBlock =
> > > cg.getEvalBlock()._if(fieldIdParamHolder.getValue().eq(targe
> > > tBuildSideFieldId))._then();
> > >
> > >
> > >
> > > On Mon, May 28, 2018 at 4:17 AM, weijie tong <tongweijie...@gmail.com>
> > > wrote:
> > >
> > >> HI All:
> > >>  Through implementing the JPPD feature (
> > >> https://issues.apache.org/jira/browse/DRILL-6385) , I was blocked by
> > the
> > >> problem: how to get the hash code of each build side of the hash join
> > >> columns through the dynamic generated java code. Hope someone can give
> > >> some
> > >> advice.
> > >>
> > >>    I supposed to add methods as below to the HashTableTemplate :
> > >>
> > >> public long getBuild64HashCode(int incomingRowIdx, int seedValue, int
> > >> fieldId) throws SchemaChangeException{
> > >>    return getBuild64HashCodeInner(incomingRowIdx, seedValue, fieldId);
> > >> }
> > >>
> > >> protected abstract long
> > >> getBuild64HashCodeInner(@Named("incomingRowIdx") int incomingRowIdx,
> > >> @Named("seedValue") int seedValue, @Named("fieldId") int fieldId)
> > >> throws SchemaChangeException;
> > >>
> > >>
> > >>    The high level code to invoke the getBuild64HashCode method is at
> the
> > >> HashJoinBatch's executeBuildPhase() :
> > >>
> > >> //create runtime filter
> > >> if (cycleNum == 0 && enableRuntimeFilter) {
> > >>  //create runtime filter and send out async
> > >>  int condFieldIndex = 0;
> > >>  for (BloomFilter bloomFilter : bloomFilters) {
> > >>    //VV
> > >>    for (int ind = 0; ind < currentRecordCount; ind++) {
> > >>      long hashCode = partitions[0].getBuild64HashCode(ind,
> > >> condFieldIndex);
> > >>      bloomFilter.insert(hashCode);
> > >>    }
> > >>    condFieldIndex++;
> > >>  }
> > >>  //TODO sered out async
> > >> }
> > >>
> > >>
> > >>  As you know, the abstract method getBuild64HashCodeInner needs to
> > >> calculate the hash codes of each build side column by the fieldId
> input
> > >> parameter. In order to achieve this target, I plan to have different
> > >> solving parts corresponding to different column ValueVector , using
> the
> > if
> > >> statement to distinguish different solving parts through the id of the
> > >> column.  The corresponding method to generate the dynamic codes  is as
> > >> below:
> > >>
> > >> private void setupGetBuild64Hash(ClassGenerator<HashTable> cg,
> > >> MappingSet incomingMapping, VectorAccessible batch,
> > >> LogicalExpression[] keyExprs, TypedFieldId[] buildKeyFieldIds)
> > >>  throws SchemaChangeException {
> > >>  cg.setMappingSet(incomingMapping);
> > >>  if (keyExprs == null || keyExprs.length == 0) {
> > >>    cg.getEvalBlock()._return(JExpr.lit(0));
> > >>  }
> > >>  String seedValue = "seedValue";
> > >>  String fieldId = "fieldId";
> > >>  LogicalExpression seed =
> > >> ValueExpressions.getParameterExpression(seedValue,
> > >> Types.required(TypeProtos.MinorType.INT));
> > >>
> > >>  LogicalExpression fieldIdParamExpr =
> > >> ValueExpressions.getParameterExpression(fieldId,
> > >> Types.required(TypeProtos.MinorType.INT) );
> > >>  HoldingContainer fieldIdParamHolder = cg.addExpr(fieldIdParamExpr);
> > >>  int i = 0;
> > >>  for (LogicalExpression expr : keyExprs) {
> > >>    TypedFieldId targetTypeFieldId = buildKeyFieldIds[i];
> > >>    ValueExpressions.IntExpression targetBuildFieldIdExp = new
> > >> ValueExpressions.IntExpression(targetTypeFieldId.getFieldIds()[0],
> > >> ExpressionPosition.UNKNOWN);
> > >>    JFieldRef targetBuildSideFieldId =
> > >> cg.addExpr(targetBuildFieldIdExp,
> > >> ClassGenerator.BlkCreateMode.TRUE_IF_BOUND).getValue();
> > >>    JBlock ifBlock =
> > >> cg.getEvalBlock()._if(fieldIdParamHolder.getValue().eq(targe
> > >> tBuildSideFieldId))._then();
> > >>
> > >>    LogicalExpression hashExpression =
> > >> HashPrelUtil.getHashExpression(expr, seed, incomingProbe != null);
> > >>    LogicalExpression materializedExpr =
> > >> ExpressionTreeMaterializer.materializeAndCheckErrors(hashExpression,
> > >> batch, context.getFunctionRegistry());
> > >>    HoldingContainer hash = cg.addExpr(materializedExpr,
> > >> ClassGenerator.BlkCreateMode.FALSE);
> > >>
> > >>
> > >>    ifBlock._return(hash.getValue());
> > >>    i++;
> > >>  }
> > >>  cg.getEvalBlock()._return(JExpr.lit(0));
> > >>
> > >> }
> > >>
> > >> But unfortunately, the generated codes are not what I expected. The
> > codes
> > >> to read ValueVector , calculate hash code of the read value do not
> stay
> > in
> > >> the if block.  So how can I let the related codes stay in the if
> block ?
> > >>
> > >
> > >
> >

Reply via email to