Some questions about SelectionVector2 and SelectionVector4: I want to create a SelectionVector4 or SelectionVector2 to represent the filtered ScanBatch and avoid a memory copy. But I found that ProjectBatch does not support SelectionVector4, and that SelectionVector2's record count is limited to what fits in a char (2 bytes). So why is SelectionVector4 not supported by ProjectBatch? The same question applies to FilterBatch's SelectionVector2, which also supports only a 2-byte record count.
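To make the question concrete, here is a minimal, self-contained sketch of the idea behind the two selection vectors. It is not Drill's implementation: the class and method names are made up for illustration, and the 4-byte layout (batch index in the upper 16 bits, record index in the lower 16 bits) reflects my understanding of SelectionVector4 and should be checked against the Drill source. It shows how a filtered batch can be represented by indices alone, without copying the data, and why a 2-byte index caps a batch at 65536 rows.

```java
import java.util.Arrays;
import java.util.function.IntPredicate;

public class SelectionVectorSketch {

    /**
     * SV2-style selection: keep the 2-byte row indices of the values that
     * pass the filter. The column data itself is never copied.
     */
    static char[] buildSv2(int[] column, IntPredicate keep) {
        if (column.length > Character.MAX_VALUE + 1) {
            // A char holds 0..65535, so it can address at most 65536 rows.
            throw new IllegalArgumentException("batch too large for a 2-byte index");
        }
        char[] indices = new char[column.length];
        int count = 0;
        for (int row = 0; row < column.length; row++) {
            if (keep.test(column[row])) {
                indices[count++] = (char) row;
            }
        }
        return Arrays.copyOf(indices, count);
    }

    /** SV4-style entry (assumed layout): batch index high, record index low. */
    static int sv4Entry(int batchIndex, int recordIndex) {
        return (batchIndex << 16) | (recordIndex & 0xFFFF);
    }

    static int sv4Batch(int entry) {
        return entry >>> 16;
    }

    static int sv4Record(int entry) {
        return entry & 0xFFFF;
    }

    public static void main(String[] args) {
        int[] column = {5, 12, 7, 30, 2};
        char[] sv2 = buildSv2(column, v -> v > 6);
        for (char idx : sv2) {
            System.out.println("row " + (int) idx + " -> " + column[idx]);
        }
        int entry = sv4Entry(3, 1000);
        System.out.println("batch " + sv4Batch(entry) + ", record " + sv4Record(entry));
    }
}
```

A downstream operator would read rows through the index array instead of through a compacted copy. The sketch does not answer why ProjectBatch lacks SelectionVector4 support; it only illustrates what each vector can address.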
On Fri, Jun 1, 2018 at 1:40 PM weijie tong <tongweijie...@gmail.com> wrote:

> Hi Boaz:
>
> Your proposal is valuable, though I have already implemented the dynamic
> code-generation logic. If a ``` long hash64(int index, long seed) ``` method
> is added to ValueVector, it would also help others implement filter logic
> in specific storage plugins using the pushed-down Bloom filter. For HashJoin
> and HashAggregate, the methods ```double hash32AsDouble(int index, int seed) ```
> and ```int hash32(int index, int seed)``` would also need to be added to
> ValueVector. If no one objects, I will be happy to take on this work.
>
> By the way, here are my thoughts on the scan-side filter logic using the
> BloomFilter. What I intend to do on the scan side is filter the
> materialized ValueVector, not filter while the ValueVector is being
> constructed from the original storage format. The reason is that the
> check would hurt the performance of materializing the original
> deep-storage-format data into ValueVectors.
>
> On Fri, Jun 1, 2018 at 3:22 AM Boaz Ben-Zvi <bben-...@mapr.com> wrote:
>
>> Hi Weijie,
>>
>> Another option is to totally avoid the generated code.
>> We were considering the idea of replacing the generated code used for
>> computing hash values with “real java” code.
>>
>> This idea is analogous to the usage of the copyEntry() method in the
>> ValueVector interface (that Paul added last year).
>> See an example of using the copyEntry() (via the appendRow() in
>> VectorContainer) in the new Hash-Join-Spill code.
>> Basically no need to generate “type specific” code, as the virtual
>> copyEntry() method does the “type specific” work.
>>
>> Similarly we could have a hash64() method in ValueVector, which would
>> perform the “type specific” computation.
>> (One difference from copyEntry() – the hash64() would also need to take
>> the “seed” parameter, which is the hash value produced by the previous
>> hash).
>> And similar to appendRow(), there would be evalHash() iterating over the
>> key columns.
>> (And one difference from appendRow() – need to iterate only on the key
>> columns; these are the first columns; their number can be found from the
>> config: e.g., htConfig.getKeyExprsBuild().size() )
>>
>> With such implementation, that evalHash() could be used anywhere
>> (e.g., to match the Bloom filters on the left side of the join).
>>
>> Thanks,
>>
>> Boaz
>>
>>
>> On 5/30/18, 7:49 PM, "weijie tong" <tongweijie...@gmail.com> wrote:
>>
>>     Hi Aman:
>>
>>     Thanks for your tips. I have rebased onto the latest code from the
>>     master branch. Yes, the spill-to-disk feature did change the original
>>     implementation, and I have adjusted my implementation to the new
>>     feature. But as you say, integration will be somewhat challenging,
>>     since I noticed that the spill-to-disk feature is still being tuned
>>     for performance.
>>
>>     The BloomFilter is implemented natively in Drill, not with an external
>>     library. It implements the algorithm from the paper you mentioned.
>>
>>
>>     On Thu, May 31, 2018 at 1:56 AM Aman Sinha <amansi...@apache.org> wrote:
>>
>>     > Hi Weijie,
>>     > I was hoping you could leverage the existing methods, so it's good
>>     > that you found the ones that work for your use case.
>>     > One thing I want to point out (maybe you're already aware): the Hash
>>     > Join code has changed significantly in the master branch due to the
>>     > spill-to-disk feature.
>>     > So, this may pose some integration challenges for your run-time join
>>     > pushdown feature.
>>     > Also, one other question/clarification: for the bloom filter itself,
>>     > are you implementing it natively in Drill or using an external library?
>> > >> > -Aman >> > >> > On Tue, May 29, 2018 at 8:23 PM, weijie tong < >> tongweijie...@gmail.com> >> > wrote: >> > >> > > I found ClassGenerator's nestEvalBlock(JBlock block) and >> > unNestEvalBlock() >> > > which has the same effect to what I change to the ClassGenerator. >> So I >> > give >> > > up what I change to the ClassGenerator and hope this can help >> someone >> > else. >> > > >> > > On Tue, May 29, 2018 at 1:53 PM weijie tong < >> tongweijie...@gmail.com> >> > > wrote: >> > > >> > > > The code formatting is not nice. Put them again: >> > > > >> > > > private void setupGetBuild64Hash(ClassGenerator<HashTable> cg, >> > > MappingSet >> > > > incomingMapping, VectorAccessible batch, LogicalExpression[] >> keyExprs, >> > > > TypedFieldId[] buildKeyFieldIds) >> > > > throws SchemaChangeException { >> > > > cg.setMappingSet(incomingMapping); >> > > > if (keyExprs == null || keyExprs.length == 0) { >> > > > cg.getEvalBlock()._return(JExpr.lit(0)); >> > > > } >> > > > String seedValue = "seedValue"; >> > > > String fieldId = "fieldId"; >> > > > LogicalExpression seed = >> > > > ValueExpressions.getParameterExpression(seedValue, >> Types.required( >> > > > TypeProtos.MinorType.INT)); >> > > > >> > > > LogicalExpression fieldIdParamExpr = >> > > > ValueExpressions.getParameterExpression(fieldId, Types.required( >> > > > TypeProtos.MinorType.INT) ); >> > > > HoldingContainer fieldIdParamHolder = >> cg.addExpr(fieldIdParamExpr); >> > > > int i = 0; >> > > > for (LogicalExpression expr : keyExprs) { >> > > > TypedFieldId targetTypeFieldId = buildKeyFieldIds[i]; >> > > > ValueExpressions.IntExpression targetBuildFieldIdExp = new >> > > > >> ValueExpressions.IntExpression(targetTypeFieldId.getFieldIds()[0], >> > > > ExpressionPosition.UNKNOWN); >> > > > >> > > > JFieldRef targetBuildSideFieldId = >> > cg.addExpr(targetBuildFieldIdExp, >> > > > ClassGenerator.BlkCreateMode.TRUE_IF_BOUND).getValue(); >> > > > JBlock ifBlock = >> > > > 
cg.getEvalBlock()._if(fieldIdParamHolder.getValue(). >> > > eq(targetBuildSideFieldId))._then(); >> > > > //specify a special JBlock which is a inner one of the eval >> block >> > to >> > > > the ClassGenerator to substitute the returned JBlock of >> getEvalBlock() >> > > > cg.setCustomizedEvalInnerBlock(ifBlock); >> > > > LogicalExpression hashExpression = >> > > > HashPrelUtil.getHashExpression(expr, seed, incomingProbe != >> null); >> > > > LogicalExpression materializedExpr = >> > > > >> ExpressionTreeMaterializer.materializeAndCheckErrors(hashExpression, >> > > batch, >> > > > context.getFunctionRegistry()); >> > > > HoldingContainer hash = cg.addExpr(materializedExpr, >> > > > ClassGenerator.BlkCreateMode.TRUE_IF_BOUND); >> > > > ifBlock._return(hash.getValue()); >> > > > //reset the customized block to null ,so the getEvalBlock() >> return >> > > the >> > > > truly eval JBlock >> > > > cg.setCustomizedEvalInnerBlock(null); >> > > > i++; >> > > > } >> > > > cg.getEvalBlock()._return(JExpr.lit(0)); >> > > > } >> > > > >> > > > >> > > > >> > > > >> > > > public long getBuild64HashCodeInner(int incomingRowIdx, int >> seedValue, >> > > int >> > > > fieldId) >> > > > throws SchemaChangeException >> > > > { >> > > > { >> > > > IntHolder fieldId12 = new IntHolder(); >> > > > fieldId12 .value = fieldId; >> > > > if (fieldId12 .value == constant14 .value) { >> > > > IntHolder out18 = new IntHolder(); >> > > > { >> > > > out18 .value = vv15 .getAccessor().get((incomingRowIdx)); >> > > > } >> > > > IntHolder seedValue19 = new IntHolder(); >> > > > seedValue19 .value = seedValue; >> > > > //---- start of eval portion of hash32AsDouble function. 
>> ----// >> > > > IntHolder out20 = new IntHolder(); >> > > > { >> > > > final IntHolder out = new IntHolder(); >> > > > IntHolder in = out18; >> > > > IntHolder seed = seedValue19; >> > > > >> > > > Hash32WithSeedAsDouble$IntHash_eval: { >> > > > out.value = >> > > > org.apache.drill.exec.expr.fn.impl.HashHelper.hash32((double) >> in.value, >> > > > seed.value); >> > > > } >> > > > >> > > > out20 = out; >> > > > } >> > > > //---- end of eval portion of hash32AsDouble function. ----// >> > > > return out20 .value; >> > > > } >> > > > return 0; >> > > > } >> > > > } >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > On Tue, May 29, 2018 at 1:47 PM weijie tong < >> tongweijie...@gmail.com> >> > > > wrote: >> > > > >> > > >> HI Paul: >> > > >> >> > > >> Thanks for your enthusiasm. I have managed this skill as you >> ever >> > > >> mentioned me at another mail thread. It's really helpful >> ,thanks for >> > > your >> > > >> valuable work. >> > > >> >> > > >> Now I have solved this tough problem by adding a customized >> JBlock >> > > >> member field to the ClassGenerator. So once you want the >> > getEvalBlock() >> > > of >> > > >> the ClassGenerator to return a inner customized JBlock , then >> you set >> > > this >> > > >> member, if you want the method to return eval self JBlock , >> you reset >> > > this >> > > >> member to null. 
>> > > >> >> > > >> Here is my changed setup method : >> > > >> >> > > >> >> > > >> private void setupGetBuild64Hash(ClassGenerator<HashTable> cg, >> > > MappingSet incomingMapping, VectorAccessible batch, >> LogicalExpression[] >> > > keyExprs, TypedFieldId[] buildKeyFieldIds) >> > > >> throws SchemaChangeException { >> > > >> cg.setMappingSet(incomingMapping); >> > > >> if (keyExprs == null || keyExprs.length == 0) { >> > > >> cg.getEvalBlock()._return(JExpr.lit(0)); >> > > >> } >> > > >> String seedValue = "seedValue"; >> > > >> String fieldId = "fieldId"; >> > > >> LogicalExpression seed = >> > ValueExpressions.getParameterExpression(seedValue, >> > > Types.required(TypeProtos.MinorType.INT)); >> > > >> >> > > >> LogicalExpression fieldIdParamExpr = ValueExpressions. >> > > getParameterExpression(fieldId, Types.required( >> TypeProtos.MinorType.INT) >> > > ); >> > > >> HoldingContainer fieldIdParamHolder = >> cg.addExpr(fieldIdParamExpr); >> > > >> int i = 0; >> > > >> for (LogicalExpression expr : keyExprs) { >> > > >> TypedFieldId targetTypeFieldId = buildKeyFieldIds[i]; >> > > >> ValueExpressions.IntExpression targetBuildFieldIdExp = new >> > > ValueExpressions.IntExpression(targetTypeFieldId.getFieldIds()[0], >> > > ExpressionPosition.UNKNOWN); >> > > >> >> > > >> JFieldRef targetBuildSideFieldId = >> > cg.addExpr(targetBuildFieldIdExp, >> > > ClassGenerator.BlkCreateMode.TRUE_IF_BOUND).getValue(); >> > > >> JBlock ifBlock = cg.getEvalBlock()._if( >> > > fieldIdParamHolder.getValue().eq(targetBuildSideFieldId))._then(); >> > > >> //specify a special JBlock which is a inner one of the >> eval block >> > > to the ClassGenerator to substitute the returned JBlock of >> getEvalBlock() >> > > >> cg.setCustomizedEvalInnerBlock(ifBlock); >> > > >> LogicalExpression hashExpression = >> > HashPrelUtil.getHashExpression(expr, >> > > seed, incomingProbe != null); >> > > >> LogicalExpression materializedExpr = >> ExpressionTreeMaterializer. 
>> > > materializeAndCheckErrors(hashExpression, batch, >> > > context.getFunctionRegistry()); >> > > >> HoldingContainer hash = cg.addExpr(materializedExpr, >> > > ClassGenerator.BlkCreateMode.TRUE_IF_BOUND); >> > > >> ifBlock._return(hash.getValue()); >> > > >> //reset the customized block to null ,so the >> getEvalBlock() return >> > > the truly eval JBlock >> > > >> cg.setCustomizedEvalInnerBlock(null); >> > > >> i++; >> > > >> } >> > > >> cg.getEvalBlock()._return(JExpr.lit(0)); >> > > >> } >> > > >> >> > > >> >> > > >> The corresponding generated codes : >> > > >> >> > > >> public long getBuild64HashCodeInner(int incomingRowIdx, int >> > > seedValue, int fieldId) >> > > >> throws SchemaChangeException >> > > >> { >> > > >> { >> > > >> IntHolder fieldId12 = new IntHolder(); >> > > >> fieldId12 .value = fieldId; >> > > >> if (fieldId12 .value == constant14 .value) { >> > > >> IntHolder out18 = new IntHolder(); >> > > >> { >> > > >> out18 .value = vv15 .getAccessor().get(( >> > > incomingRowIdx)); >> > > >> } >> > > >> IntHolder seedValue19 = new IntHolder(); >> > > >> seedValue19 .value = seedValue; >> > > >> //---- start of eval portion of hash32AsDouble >> > > function. ----// >> > > >> IntHolder out20 = new IntHolder(); >> > > >> { >> > > >> final IntHolder out = new IntHolder(); >> > > >> IntHolder in = out18; >> > > >> IntHolder seed = seedValue19; >> > > >> >> > > >> Hash32WithSeedAsDouble$IntHash_eval: { >> > > >> out.value = >> > org.apache.drill.exec.expr.fn.impl.HashHelper.hash32((double) >> > > in.value, seed.value); >> > > >> } >> > > >> >> > > >> out20 = out; >> > > >> } >> > > >> //---- end of eval portion of hash32AsDouble >> function. 
>> > > ----// >> > > >> return out20 .value; >> > > >> } >> > > >> return 0; >> > > >> } >> > > >> } >> > > >> >> > > >> >> > > >> >> > > >> Some other explanation: >> > > >> 1st : The if checking won't hurt the performance , as I >> invoke this >> > > >> method column by column , so it's branch predication friendly. >> > > >> 2nd: I will use the murmur3_64 not the murmur3_32 ,since the >> > efficient >> > > >> bloom filter algorithm needs the 64 bit hash code to avoid the >> > conflict. >> > > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > > >> On Tue, May 29, 2018 at 12:37 PM Paul Rogers >> > <par0...@yahoo.com.invalid >> > > > >> > > >> wrote: >> > > >> >> > > >>> Hi Weijie, >> > > >>> >> > > >>> Seeing the discussion about the details of JCodeModel >> suggests you >> > may >> > > >>> be trying to debug your generated code at the level of the >> code >> > > generator. >> > > >>> >> > > >>> Some time ago we added the ability to step through the >> generated >> > code. >> > > >>> Look for the following line in the generator code: >> > > >>> >> > > >>> >> > > >>> // Uncomment out this line to debug the generated code. >> > > >>> >> > > >>> // cg.saveCodeForDebugging(true); >> > > >>> >> > > >>> >> > > >>> Uncomment the code line and Drill will save each generated >> file to a >> > > >>> configured location (which, if I recall correctly, is >> > > /tmp/drill/codegen, >> > > >>> though it may have changed after Tim's test directory >> changes.) >> > > >>> >> > > >>> Then, set a breakpoint in the template setup() method and you >> can >> > step >> > > >>> directly into the generated doSetup() method. Same for the >> eval() >> > > method. >> > > >>> >> > > >>> This way, you can not only see the generated code, you can >> step >> > through >> > > >>> it. 
>> > > >>> I've found this to be a far easier way to understand the generated
>> > > >>> code than the older techniques folks have used (look at byte codes,
>> > > >>> use print statements, brute force reasoning, etc.)
>> > > >>>
>> > > >>> Tim, Boaz and others have used this technique more recently and can
>> > > >>> probably give you additional pointers.
>> > > >>>
>> > > >>> Thanks,
>> > > >>> - Paul
>> > > >>>
>> > > >>>
>> > > >>>
>> > > >>> On Monday, May 28, 2018, 8:52:19 PM PDT, weijie tong <
>> > > >>> tongweijie...@gmail.com> wrote:
>> > > >>>
>> > > >>> @aman thanks for your reply. "For the ifBlock, do you need an _else()
>> > > >>> block also ?" The method already has a default return, so I don't need
>> > > >>> the _else() block. I have looked at the IfExpression evaluation method
>> > > >>> in EvaluationVisitor, which also uses JConditional, but that does not
>> > > >>> match my requirement either. I think the key point is that
>> > > >>> FunctionHolderExpression and ValueVectorReadExpression put their
>> > > >>> generated code into the eval method's JBlock, not into our specific
>> > > >>> ifBlock, which is an inner block of the eval method's JBlock.
>> > > >>>
>> > > >>> So it seems I should either change the ClassGenerator so that
>> > > >>> getEvalBlock() returns the ifBlock (more precisely, the JConditional's
>> > > >>> then block), or implement special versions of FunctionHolderExpression
>> > > >>> and ValueVectorReadExpression, with corresponding visit methods in
>> > > >>> EvaluationVisitor, to generate the special code blocks. I hope someone
>> > > >>> familiar with this part of the code can point out whether there is an
>> > > >>> easier or different way to achieve this.
>> > > >>> >> > > >>> To make discussion more accurate, I put the generated codes >> of the >> > > >>> previous >> > > >>> setupGetBuild64Hash method here: >> > > >>> >> > > >>> public long getBuild64HashCodeInner(int incomingRowIdx, >> int >> > > >>> seedValue, int fieldId) >> > > >>> throws SchemaChangeException >> > > >>> { >> > > >>> { >> > > >>> IntHolder fieldId16 = new IntHolder(); >> > > >>> fieldId16 .value = fieldId; >> > > >>> if (fieldId16 .value == constant18 .value) { >> > > >>> return out24 .value; >> > > >>> } >> > > >>> IntHolder out22 = new IntHolder(); >> > > >>> { >> > > >>> out22 .value = vv19 .getAccessor().get(( >> > > incomingRowIdx)); >> > > >>> } >> > > >>> IntHolder seedValue23 = new IntHolder(); >> > > >>> seedValue23 .value = seedValue; >> > > >>> //---- start of eval portion of hash32AsDouble >> function. >> > > >>> ----// >> > > >>> IntHolder out24 = new IntHolder(); >> > > >>> { >> > > >>> final IntHolder out = new IntHolder(); >> > > >>> IntHolder in = out22; >> > > >>> IntHolder seed = seedValue23; >> > > >>> >> > > >>> Hash32WithSeedAsDouble$IntHash_eval: { >> > > >>> out.value = >> > > >>> org.apache.drill.exec.expr.fn.impl.HashHelper.hash32((double) >> > > >>> in.value, seed.value); >> > > >>> } >> > > >>> >> > > >>> out24 = out; >> > > >>> } >> > > >>> //---- end of eval portion of hash32AsDouble >> function. >> > > ----// >> > > >>> if (fieldId16 .value == constant18 .value) { >> > > >>> return out26 .value; >> > > >>> } >> > > >>> IntHolder seedValue25 = new IntHolder(); >> > > >>> seedValue25 .value = seedValue; >> > > >>> //---- start of eval portion of hash32AsDouble >> function. 
>> > > >>> ----// >> > > >>> IntHolder out26 = new IntHolder(); >> > > >>> { >> > > >>> final IntHolder out = new IntHolder(); >> > > >>> IntHolder in = out22; >> > > >>> IntHolder seed = seedValue25; >> > > >>> >> > > >>> Hash32WithSeedAsDouble$IntHash_eval: { >> > > >>> out.value = >> > > >>> org.apache.drill.exec.expr.fn.impl.HashHelper.hash32((double) >> > > >>> in.value, seed.value); >> > > >>> } >> > > >>> >> > > >>> out26 = out; >> > > >>> } >> > > >>> //---- end of eval portion of hash32AsDouble >> function. >> > > ----// >> > > >>> return 0; >> > > >>> } >> > > >>> } >> > > >>> >> > > >>> >> > > >>> >> > > >>> >> > > >>> >> > > >>> On Tue, May 29, 2018 at 10:51 AM Aman Sinha < >> amansi...@apache.org> >> > > >>> wrote: >> > > >>> >> > > >>> > sorry, the previous email is incomplete. >> > > >>> > For the ifBlock, do you need an _else() block also ? >> > > >>> > >> > > >>> > I have sometimes found that 'JConditional' is a good way to >> break >> > > down >> > > >>> the >> > > >>> > logic further. Please see example usages of JConditional >> here [1]. >> > > >>> > >> > > >>> > -Aman >> > > >>> > >> > > >>> > [1] >> > > >>> > >> > > >>> > >> > > >>> >> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.programcreek.com_java-2Dapi-2Dexamples_-3Fapi-3Dcom&d=DwIFaQ&c=cskdkSMqhcnjZxdQVpwTXg&r=EqulKDxxEDCX6zbp1AZAa1-iAPQGgCioAqgDp7DE2BU&m=doaiFF3edu9-prktKvLSIoNdmzt_nV6nzCtF_ZGQRBk&s=O2Th00tVjOSHTLlOn_lFp8JiUlh_FueCbHs8giRVS3k&e= >> . >> > > sun.codemodel.JBlock >> > > >>> > >> > > >>> > On Mon, May 28, 2018 at 7:46 PM, Aman Sinha < >> amansi...@apache.org> >> > > >>> wrote: >> > > >>> > >> > > >>> > > Hi Weijie, >> > > >>> > > It would be a little cumbersome to debug such issues over >> email >> > > >>> since one >> > > >>> > > has to look at the generated code output and iteratively >> debug. 
>> > > >>> > > Couple of thoughts I have that might help:
>> > > >>> > >
>> > > >>> > > For this particular if-then block, should you also
>> > > >>> > > JBlock ifBlock =
>> > > >>> > > cg.getEvalBlock()._if(fieldIdParamHolder.getValue().eq(targetBuildSideFieldId))._then();
>> > > >>> > >
>> > > >>> > >
>> > > >>> > >
>> > > >>> > > On Mon, May 28, 2018 at 4:17 AM, weijie tong <tongweijie...@gmail.com>
>> > > >>> > > wrote:
>> > > >>> > >
>> > > >>> > >> Hi all:
>> > > >>> > >> While implementing the JPPD feature
>> > > >>> > >> (https://issues.apache.org/jira/browse/DRILL-6385), I got blocked
>> > > >>> > >> on one problem: how to get the hash code of each build-side hash
>> > > >>> > >> join column through the dynamically generated Java code. I hope
>> > > >>> > >> someone can give some advice.
>> > > >>> > >>
>> > > >>> > >> I plan to add the following methods to HashTableTemplate:
>> > > >>> > >>
>> > > >>> > >>   public long getBuild64HashCode(int incomingRowIdx, int seedValue,
>> > > >>> > >>       int fieldId) throws SchemaChangeException {
>> > > >>> > >>     return getBuild64HashCodeInner(incomingRowIdx, seedValue, fieldId);
>> > > >>> > >>   }
>> > > >>> > >>
>> > > >>> > >>   protected abstract long getBuild64HashCodeInner(
>> > > >>> > >>       @Named("incomingRowIdx") int incomingRowIdx,
>> > > >>> > >>       @Named("seedValue") int seedValue,
>> > > >>> > >>       @Named("fieldId") int fieldId)
>> > > >>> > >>       throws SchemaChangeException;
>> > > >>> > >>
>> > > >>> > >>
>> > > >>> > >> The high-level code that invokes getBuild64HashCode is in
>> > > >>> > >> HashJoinBatch's executeBuildPhase():
>> > > >>> > >>
>> > > >>> > >>   //create runtime filter
>> > > >>> > >>   if (cycleNum == 0 && enableRuntimeFilter) {
>> > > >>> > >>     //create runtime filter and send out async
>> > > >>> > >>     int condFieldIndex = 0;
>> > > >>> > >>     for (BloomFilter bloomFilter : bloomFilters) {
>> > > >>> > >>       //VV
>> > > >>> > >>       for (int ind = 0; ind < currentRecordCount; ind++) {
>> > > >>> > >>         long hashCode = partitions[0].getBuild64HashCode(ind,
>> > > >>> > >>             condFieldIndex);
>> > > >>> > >>         bloomFilter.insert(hashCode);
>> > > >>> > >>       }
>> > > >>> > >>       condFieldIndex++;
>> > > >>> > >>     }
>> > > >>> > >>     //TODO send out async
>> > > >>> > >>   }
>> > > >>> > >>
>> > > >>> > >>
>> > > >>> > >> As you know, the abstract method getBuild64HashCodeInner needs to
>> > > >>> > >> calculate the hash code of each build-side column, selected by
>> > > >>> > >> the fieldId input parameter.
In order to achieve this target, I plan to >> have >> > > different >> > > >>> > >> solving parts corresponding to different column >> ValueVector , >> > > using >> > > >>> the >> > > >>> > if >> > > >>> > >> statement to distinguish different solving parts through >> the id >> > of >> > > >>> the >> > > >>> > >> column. The corresponding method to generate the >> dynamic codes >> > > is >> > > >>> as >> > > >>> > >> below: >> > > >>> > >> >> > > >>> > >> private void >> setupGetBuild64Hash(ClassGenerator<HashTable> cg, >> > > >>> > >> MappingSet incomingMapping, VectorAccessible batch, >> > > >>> > >> LogicalExpression[] keyExprs, TypedFieldId[] >> buildKeyFieldIds) >> > > >>> > >> throws SchemaChangeException { >> > > >>> > >> cg.setMappingSet(incomingMapping); >> > > >>> > >> if (keyExprs == null || keyExprs.length == 0) { >> > > >>> > >> cg.getEvalBlock()._return(JExpr.lit(0)); >> > > >>> > >> } >> > > >>> > >> String seedValue = "seedValue"; >> > > >>> > >> String fieldId = "fieldId"; >> > > >>> > >> LogicalExpression seed = >> > > >>> > >> ValueExpressions.getParameterExpression(seedValue, >> > > >>> > >> Types.required(TypeProtos.MinorType.INT)); >> > > >>> > >> >> > > >>> > >> LogicalExpression fieldIdParamExpr = >> > > >>> > >> ValueExpressions.getParameterExpression(fieldId, >> > > >>> > >> Types.required(TypeProtos.MinorType.INT) ); >> > > >>> > >> HoldingContainer fieldIdParamHolder = >> > > cg.addExpr(fieldIdParamExpr); >> > > >>> > >> int i = 0; >> > > >>> > >> for (LogicalExpression expr : keyExprs) { >> > > >>> > >> TypedFieldId targetTypeFieldId = buildKeyFieldIds[i]; >> > > >>> > >> ValueExpressions.IntExpression targetBuildFieldIdExp >> = new >> > > >>> > >> >> ValueExpressions.IntExpression(targetTypeFieldId.getFieldIds( >> > > )[0], >> > > >>> > >> ExpressionPosition.UNKNOWN); >> > > >>> > >> JFieldRef targetBuildSideFieldId = >> > > >>> > >> cg.addExpr(targetBuildFieldIdExp, >> > > >>> > >> 
ClassGenerator.BlkCreateMode.TRUE_IF_BOUND).getValue(); >> > > >>> > >> JBlock ifBlock = >> > > >>> > >> >> cg.getEvalBlock()._if(fieldIdParamHolder.getValue().eq(targe >> > > >>> > >> tBuildSideFieldId))._then(); >> > > >>> > >> >> > > >>> > >> LogicalExpression hashExpression = >> > > >>> > >> HashPrelUtil.getHashExpression(expr, seed, incomingProbe >> != >> > > null); >> > > >>> > >> LogicalExpression materializedExpr = >> > > >>> > >> ExpressionTreeMaterializer.materializeAndCheckErrors( >> > > hashExpression, >> > > >>> > >> batch, context.getFunctionRegistry()); >> > > >>> > >> HoldingContainer hash = cg.addExpr(materializedExpr, >> > > >>> > >> ClassGenerator.BlkCreateMode.FALSE); >> > > >>> > >> >> > > >>> > >> >> > > >>> > >> ifBlock._return(hash.getValue()); >> > > >>> > >> i++; >> > > >>> > >> } >> > > >>> > >> cg.getEvalBlock()._return(JExpr.lit(0)); >> > > >>> > >> >> > > >>> > >> } >> > > >>> > >> >> > > >>> > >> But unfortunately, the generated codes are not what I >> expected. >> > > The >> > > >>> > codes >> > > >>> > >> to read ValueVector , calculate hash code of the read >> value do >> > not >> > > >>> stay >> > > >>> > in >> > > >>> > >> the if block. So how can I let the related codes stay >> in the if >> > > >>> block ? >> > > >>> > >> >> > > >>> > > >> > > >>> > > >> > > >>> > >> > > >> >> > > >> >> > > >> > >> >> >>
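For readers following the thread, the per-column dispatch described in the original message (pick the hashing logic by fieldId, then insert each build-side value into that column's Bloom filter) can also be sketched in plain Java, without generated code, along the lines of the hash64() proposal made earlier in the thread. Everything here is hypothetical and for illustration only: KeyColumn, IntColumn, and this BloomFilter are stand-ins, not Drill classes, and the hash uses only the MurmurHash3 64-bit finalizer rather than a full murmur3_64 implementation.

```java
import java.util.BitSet;

public class RuntimeFilterSketch {

    /** Hypothetical stand-in for a per-column hash64(index, seed) method. */
    interface KeyColumn {
        long hash64(int rowIndex, long seed);
    }

    /** Hypothetical int column that mixes value and seed into 64 bits. */
    static final class IntColumn implements KeyColumn {
        private final int[] values;

        IntColumn(int[] values) {
            this.values = values;
        }

        @Override
        public long hash64(int rowIndex, long seed) {
            return mix64(values[rowIndex] ^ seed);
        }
    }

    /** MurmurHash3 64-bit finalizer (fmix64), used here as a simple mixer. */
    static long mix64(long h) {
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    /** Minimal Bloom filter fed by a 64-bit hash split into two 32-bit halves. */
    static final class BloomFilter {
        private final BitSet bits;
        private final int numBits;
        private final int numHashes;

        BloomFilter(int numBits, int numHashes) {
            this.bits = new BitSet(numBits);
            this.numBits = numBits;
            this.numHashes = numHashes;
        }

        void insert(long hash) {
            int h1 = (int) hash;
            int h2 = (int) (hash >>> 32);
            for (int i = 0; i < numHashes; i++) {
                bits.set(Math.floorMod(h1 + i * h2, numBits));
            }
        }

        boolean mightContain(long hash) {
            int h1 = (int) hash;
            int h2 = (int) (hash >>> 32);
            for (int i = 0; i < numHashes; i++) {
                if (!bits.get(Math.floorMod(h1 + i * h2, numBits))) {
                    return false;
                }
            }
            return true;
        }
    }

    /** Builds one filter per key column; the array index plays the role of fieldId. */
    static BloomFilter[] buildFilters(KeyColumn[] keyColumns, int recordCount, long seed) {
        BloomFilter[] filters = new BloomFilter[keyColumns.length];
        for (int fieldId = 0; fieldId < keyColumns.length; fieldId++) {
            filters[fieldId] = new BloomFilter(1024, 3);
            for (int row = 0; row < recordCount; row++) {
                filters[fieldId].insert(keyColumns[fieldId].hash64(row, seed));
            }
        }
        return filters;
    }

    public static void main(String[] args) {
        KeyColumn[] keys = {new IntColumn(new int[]{10, 20, 30})};
        BloomFilter[] filters = buildFilters(keys, 3, 0L);
        System.out.println(filters[0].mightContain(keys[0].hash64(1, 0L)));
    }
}
```

Because each column object carries its own type-specific hash, the dispatch on fieldId becomes an array lookup rather than an if block inside generated code. Whether this performs as well as the generated version inside Drill is not something the sketch can show.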