I have done it. Thanks a lot Weijie and all of you for your time.

*---------------------*
*Muhammad Gelbana*
http://www.linkedin.com/in/mgelbana

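The index mapping from the quoted tips (destination index -> source ScanRel : source index) can be sketched in plain Java as follows. This is only an illustration under stated assumptions: `JoinIndexMapping`, `Source`, and the labels "left scan1" / "right scan2" are hypothetical names taken from the example, not Drill or Calcite APIs. It keeps the 1-based numbering used in the example; Calcite's RexInputRef indices are zero-based, so real code would shift by one.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of Drill or Calcite. */
class JoinIndexMapping {

    /** Where a join output field comes from: the input scan and the index inside it. */
    static final class Source {
        final String scan; // illustrative label, e.g. "left scan1"
        final int index;   // 1-based index within that scan, as in the example

        Source(String scan, int index) {
            this.scan = scan;
            this.index = index;
        }

        @Override
        public String toString() {
            return scan + " : " + index;
        }
    }

    /**
     * Builds destination index -> (source scan : source index) for a join whose
     * output is the left input's fields followed by the right input's fields.
     */
    static Map<Integer, Source> build(int leftFieldCount, int rightFieldCount) {
        Map<Integer, Source> mapping = new LinkedHashMap<>();
        for (int i = 1; i <= leftFieldCount; i++) {
            mapping.put(i, new Source("left scan1", i));
        }
        for (int i = 1; i <= rightFieldCount; i++) {
            // Right-side fields come after all left-side fields in the join row.
            mapping.put(leftFieldCount + i, new Source("right scan2", i));
        }
        return mapping;
    }

    public static void main(String[] args) {
        // join (1,2,3,4,5) over left input (1,2,3) and right input (1,2)
        build(3, 2).forEach((dest, src) -> System.out.println(dest + " ==> (" + src + ")"));
    }
}
```

A Project-matching rule could then look up each projected index in this map to find the source table and column, which is what tip 3 describes.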
On Thu, Apr 6, 2017 at 3:15 PM, weijie tong <tongweijie...@gmail.com> wrote:

> some tips:
> 1. you need to know the RexInputRef index relationship between the
> JoinRel and its inputs.
>
> join (1,2,3,4,5)
>
> left input (1,2,3)    right input (1,2)
>
> 1,2,3 ===> left input (1,2,3)
>
> 4,5 ===> right input (1,2)
>
> 2. you capture the index mapping when you iterate over the JoinRelNode
> in your defined Rule (CartesianProductJoinRule), and store this index
> mapping data in your defined BGroupScan (the naming convention of my
> last example).
> this mapping struct may be: destination index -------------> (source
> ScanRel : source index).
> for the example data in tip 1, the struct will be:
> 1 ==> (left scan1 : 1)
> 2 ==> (left scan1 : 2)
> 3 ==> (left scan1 : 3)
> 4 ==> (right scan2 : 1)
> 5 ==> (right scan2 : 2)
>
> 3. you define another Rule (matching the Project RelNode) which depends
> on the index mapping data of the last step. In this rule you take each
> index of the final output project and look up its mapped index in the
> mapping struct; from that you find the final output column names and
> the related tables.
>
> On Tue, Apr 4, 2017 at 1:51 AM, Muhammad Gelbana <m.gelb...@gmail.com>
> wrote:
>
> > I've succeeded, theoretically, in what I wanted to do because I had to
> > send the selected columns manually to my datasource. Would someone
> > please tell me how can I identify the selected columns in the join ?
> > I searched a lot without success.
> >
> > *---------------------*
> > *Muhammad Gelbana*
> > http://www.linkedin.com/in/mgelbana
> >
> > On Sat, Apr 1, 2017 at 1:43 AM, Muhammad Gelbana <m.gelb...@gmail.com>
> > wrote:
> >
> > > So I intend to use this constructor for the new *RelNode*:
> > > *org.apache.drill.exec.planner.logical.DrillScanRel.DrillScanRel(RelOptCluster,
> > > RelTraitSet, RelOptTable, GroupScan, RelDataType, List<SchemaPath>)*
> > >
> > > How can I provide its parameters ?
> > >
> > > 1. *RelOptCluster*: Can I pass *DrillJoinRel.getCluster()* ?
> > >
> > > 2. *RelTraitSet*: Can I pass *DrillJoinRel.getTraitSet()* ?
> > >
> > > 3. *RelOptTable*: I assume I can use this factory method
> > > (*org.apache.calcite.prepare.RelOptTableImpl.create(RelOptSchema,
> > > RelDataType, Table, Path)*). Any hints of how I can provide these
> > > parameters too ? Should I just go ahead and manually create a new
> > > instance of each parameter ?
> > >
> > > 4. *GroupScan*: I understand I have to create a new implementation
> > > class for this one so no questions here so far.
> > >
> > > 5. *RelDataType*: This one is confusing. Because I understand that for
> > > *DrillJoinRel.transformTo(newRel)* to work, I have to provide a
> > > *newRel* instance that has a *RelDataType* instance with the same
> > > number of fields and compatible types (i.e. this is mandated by
> > > *org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelNode,
> > > RelNode, Object)*). Why couldn't I provide a *RelDataType* with
> > > a different set of fields ? How can I resolve this ?
> > >
> > > 6. *List<SchemaPath>*: I assume I can call this method and pass my
> > > column names to it, one by one (i.e.
> > > *org.apache.drill.common.expression.SchemaPath.getCompoundPath(String...)*).
> > >
> > > Thanks.
> > >
> > > *---------------------*
> > > *Muhammad Gelbana*
> > > http://www.linkedin.com/in/mgelbana
> > >
> > > On Fri, Mar 31, 2017 at 1:59 PM, weijie tong <tongweijie...@gmail.com>
> > > wrote:
> > >
> > >> your code seems right; what remains is to implement
> > >> 'call.transformTo()'. I may not be able to describe the remaining
> > >> details precisely; as @Paul Rogers mentioned, the plugin detail is
> > >> a little trivial.
> > >>
> > >> 1. drillScanRel.getGroupScan .
> > >> 2. you need to extend the AbstractGroupScan, and let it hold some
> > >>    information about your storage. This defined GroupScan, call it
> > >>    AGroupScan, corresponds to a joined scan RelNode.
> > >>    Then you can define another GroupScan called BGroupScan which
> > >>    extends AGroupScan. The BGroupScan acts as an aggregate container
> > >>    which holds the two joined AGroupScans.
> > >> 3. The new DrillScanRel has the same RowType as the JoinRel. The
> > >>    requirements and examples of transforming between two different
> > >>    RelNodes can be found in other code. This DrillScanRel's GroupScan
> > >>    is the BGroupScan. This new DrillScanRel is the one that is passed
> > >>    to `call.transformTo(xxxx)`.
> > >>
> > >> maybe the picture below may help you understand my idea:
> > >>
> > >>                                                          |---Scan (AGroupScan)
> > >> suppose the initial RelNode tree is : Project ----Join --|
> > >>                                                          |---Scan (AGroupScan)
> > >>
> > >>                        |
> > >>                       \|/
> > >>
> > >> after applying this rule, the final tree is:
> > >> Project-----Scan (BGroupScan (List(AGroupScan, AGroupScan)))
> > >>
> > >> On Thu, Mar 30, 2017 at 10:01 PM, Muhammad Gelbana <m.gelb...@gmail.com>
> > >> wrote:
> > >>
> > >> > *This is my rule class*
> > >> >
> > >> > public class CartesianProductJoinRule extends RelOptRule {
> > >> >
> > >> >     public static final CartesianProductJoinRule INSTANCE =
> > >> >             new CartesianProductJoinRule(DrillJoinRel.class);
> > >> >
> > >> >     public CartesianProductJoinRule(Class<DrillJoinRel> clazz) {
> > >> >         super(operand(clazz, operand(RelNode.class, any()),
> > >> >                 operand(RelNode.class, any())),
> > >> >                 "CartesianProductJoin");
> > >> >     }
> > >> >
> > >> >     @Override
> > >> >     public boolean matches(RelOptRuleCall call) {
> > >> >         DrillJoinRel drillJoin = call.rel(0);
> > >> >         return drillJoin.getJoinType() == JoinRelType.INNER &&
> > >> >                 drillJoin.getCondition().isAlwaysTrue();
> > >> >     }
> > >> >
> > >> >     @Override
> > >> >     public void onMatch(RelOptRuleCall call) {
> > >> >         DrillJoinRel join = call.rel(0);
> > >> >         RelNode firstRel = call.rel(1);
> > >> >         RelNode secondRel =
> > >> >                 call.rel(2);
> > >> >         HepRelVertex right = (HepRelVertex) join.getRight();
> > >> >         HepRelVertex left = (HepRelVertex) join.getLeft();
> > >> >
> > >> >         List<RelDataTypeField> firstFields = firstRel.getRowType().getFieldList();
> > >> >         List<RelDataTypeField> secondFields = secondRel.getRowType().getFieldList();
> > >> >
> > >> >         RelNode firstTable = ((HepRelVertex) firstRel.getInput(0)).getCurrentRel();
> > >> >         RelNode secondTable = ((HepRelVertex) secondRel.getInput(0)).getCurrentRel();
> > >> >
> > >> >         //call.transformTo(???);
> > >> >     }
> > >> > }
> > >> >
> > >> > *To register the rule*, I overrode the *getOptimizerRules* method in my
> > >> > storage plugin class
> > >> >
> > >> > public Set<? extends RelOptRule> getOptimizerRules(OptimizerRulesContext
> > >> >         optimizerContext, PlannerPhase phase) {
> > >> >     switch (phase) {
> > >> >     case LOGICAL_PRUNE_AND_JOIN:
> > >> >     case LOGICAL_PRUNE:
> > >> >     case LOGICAL:
> > >> >         return getLogicalOptimizerRules(optimizerContext);
> > >> >     case PHYSICAL:
> > >> >         return getPhysicalOptimizerRules(optimizerContext);
> > >> >     case PARTITION_PRUNING:
> > >> >     case JOIN_PLANNING:
> > >> > *        return ImmutableSet.of(CartesianProductJoinRule.INSTANCE);*
> > >> >     default:
> > >> >         return ImmutableSet.of();
> > >> >     }
> > >> > }
> > >> >
> > >> > The rule is firing as expected but I'm lost when it comes to the
> > >> > conversion. Earlier, you said "the new equivalent ScanRel is to have the
> > >> > joined ScanRel nodes's GroupScans", so
> > >> >
> > >> >    1. How can I obtain the left and right tables group scans ?
> > >> >    2. What exactly do you mean by joining them ? Is there a utility method
> > >> >       to do so ? Or should I manually create a new single group scan and
> > >> >       add the information I need there ?
> > >> >       Looking into other *GroupScan* implementations, I found that
> > >> >       they have references to some runtime objects such as the
> > >> >       storage plugin and the storage plugin configuration. At this
> > >> >       stage, I don't know how to obtain those !
> > >> >    3. Precisely, what kind of object should I use to represent a
> > >> >       *RelNode* that represents the whole join ? I understand that I
> > >> >       need to use an object that implements the *RelNode* interface.
> > >> >       Then I should add the created *GroupScan* to that *RelNode*
> > >> >       instance and call *call.transformTo(newRelNode)*, correct ?
> > >> >
> > >> > *---------------------*
> > >> > *Muhammad Gelbana*
> > >> > http://www.linkedin.com/in/mgelbana
> > >> >
> > >> > On Thu, Mar 30, 2017 at 2:46 AM, weijie tong <tongweijie...@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > I mean the rule you write could be placed in the
> > >> > > PlannerPhase.JOIN_PLANNING phase, which uses the HepPlanner. This
> > >> > > phase works on the logical RelNodes.
> > >> > > Hope this helps you.
> > >> > > Muhammad Gelbana <m.gelb...@gmail.com> wrote on Thu, Mar 30, 2017
> > >> > > at 12:07 AM:
> > >> > >
> > >> > > > Thanks a lot Weijie, I believe I'm very close now. I hope you don't
> > >> > > > mind a few more questions please:
> > >> > > >
> > >> > > >    1. The new rule you are mentioning is a physical rule ? So I
> > >> > > >       should implement the Prel interface ?
> > >> > > >    2. By "traversing the join to find the ScanRel"
> > >> > > >       - This sounds like I have to "search" for something.
> > >> > > >         Shouldn't I just work on transforming the left (i.e.
> > >> > > >         DrillJoinRel's getLeft() method) and right (i.e.
> > >> > > >         DrillJoinRel's getRight() method) join objects ?
> > >> > > >       - The "left" and "right" elements of the DrillJoinRel object
> > >> > > >         are of type RelSubset, not *ScanRel*, and I can't find a
> > >> > > >         type called *ScanRel*. I suppose you meant *ScanPrel*,
> > >> > > >         especially because it implements the *Prel* interface that
> > >> > > >         provides the *getPhysicalOperator* method.
> > >> > > >    3. What if multiple physical or logical rules match for a single
> > >> > > >       node, what decides which rule will be applied and which will
> > >> > > >       be rejected ? Is it the
> > >> > > >       *AbstractRelNode.computeSelfCost(RelOptPlanner)* method ?
> > >> > > >       What if more than one rule produces the same cost ?
> > >> > > >
> > >> > > > I'll go ahead and see what I can do for now before hopefully you may
> > >> > > > offer more guidance. THANKS A LOT.
> > >> > > >
> > >> > > > *---------------------*
> > >> > > > *Muhammad Gelbana*
> > >> > > > http://www.linkedin.com/in/mgelbana
> > >> > > >
> > >> > > > On Wed, Mar 29, 2017 at 4:23 AM, weijie tong <tongweijie...@gmail.com>
> > >> > > > wrote:
> > >> > > >
> > >> > > > > to avoid misunderstanding, the new equivalent ScanRel is to have
> > >> > > > > the joined ScanRel nodes' GroupScans, as the GroupScans
> > >> > > > > indirectly hold the underlying storage information.
> > >> > > > >
> > >> > > > > On Wed, Mar 29, 2017 at 10:15 AM, weijie tong
> > >> > > > > <tongweijie...@gmail.com> wrote:
> > >> > > > >
> > >> > > > > > my suggestion is you define a rule which matches the
> > >> > > > > > DrillJoinRel RelNode; then in the onMatch method, you
> > >> > > > > > traverse the join children to find the ScanRel nodes. You
> > >> > > > > > define a new ScanRel which includes the ScanRel nodes you
> > >> > > > > > found in the last step.
> > >> > > > > > Then transform the JoinRel to this equivalent new ScanRel.
> > >> > > > > > Finally, the plan tree will not have the JoinRel but the
> > >> > > > > > ScanRel. You can put your join plan rule in the
> > >> > > > > > PlannerPhase.JOIN_PLANNING phase.
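
The idea running through this thread, one scan that stands in for the join and carries the group scans of both joined inputs, can be sketched in plain Java as below. This is a minimal illustration under stated assumptions: `AGroupScanSketch` and `BGroupScanSketch` are hypothetical stand-ins, not Drill classes. A real plugin would extend Drill's AbstractGroupScan and also carry storage plugin details, and the thread suggests BGroupScan extend AGroupScan, which this sketch skips for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a per-table group scan (not a Drill class). */
class AGroupScanSketch {
    final String tableName;
    final List<String> columns;

    AGroupScanSketch(String tableName, List<String> columns) {
        this.tableName = tableName;
        this.columns = Collections.unmodifiableList(new ArrayList<>(columns));
    }
}

/** Hypothetical stand-in for the aggregate scan that replaces the join. */
class BGroupScanSketch {
    final List<AGroupScanSketch> joinedScans;

    BGroupScanSketch(AGroupScanSketch left, AGroupScanSketch right) {
        List<AGroupScanSketch> scans = new ArrayList<>();
        scans.add(left);
        scans.add(right);
        this.joinedScans = Collections.unmodifiableList(scans);
    }

    /**
     * Output columns in join order: all left columns, then all right columns.
     * This mirrors the requirement that the replacement scan keep the same
     * row type as the join it replaces.
     */
    List<String> outputColumns() {
        List<String> out = new ArrayList<>();
        for (AGroupScanSketch scan : joinedScans) {
            for (String column : scan.columns) {
                out.add(scan.tableName + "." + column);
            }
        }
        return out;
    }
}
```

Keeping the left-then-right column order is what lets a later Project rule resolve each output index back to a table and column.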