I've succeeded, theoretically, in what I wanted to do. Only theoretically, because I had to send the selected columns to my datasource manually. Could someone please tell me how I can identify the selected columns in the join? I searched a lot without success.
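
One possible direction, as a minimal sketch rather than a confirmed answer: in the plan drawn later in this thread a Project sits above the Join, and the field references in that Project index into the join's row type. For an inner join that row type is the left input's fields followed by the right input's fields, so an index can be mapped back to the input it came from. The sketch below models only that index arithmetic in plain Java. It uses no Drill or Calcite classes, and `JoinColumnResolver`, `Origin` and `resolve` are hypothetical names made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Standalone model of the index arithmetic; no Drill or Calcite classes are used. */
final class JoinColumnResolver {

    /** Which join input a selected column comes from, and its name there. */
    static final class Origin {
        final String side;   // "left" or "right"
        final String column; // column name within that input

        Origin(String side, String column) {
            this.side = side;
            this.column = column;
        }

        @Override
        public String toString() {
            return side + "." + column;
        }
    }

    /**
     * Maps field indexes of the join's output row back to the join inputs.
     * Assumption: the join row is the left fields followed by the right fields.
     */
    static List<Origin> resolve(List<String> leftFields,
                                List<String> rightFields,
                                List<Integer> selectedIndexes) {
        int leftCount = leftFields.size();
        int total = leftCount + rightFields.size();
        List<Origin> origins = new ArrayList<>();
        for (int index : selectedIndexes) {
            if (index < 0 || index >= total) {
                throw new IllegalArgumentException("No such join field: " + index);
            }
            if (index < leftCount) {
                origins.add(new Origin("left", leftFields.get(index)));
            } else {
                // Right-side fields are shifted by the number of left fields.
                origins.add(new Origin("right", rightFields.get(index - leftCount)));
            }
        }
        return origins;
    }

    public static void main(String[] args) {
        List<Origin> origins = resolve(
                Arrays.asList("id", "name"),   // left input fields
                Arrays.asList("id", "price"),  // right input fields
                Arrays.asList(1, 3));          // indexes selected above the join
        System.out.println(origins); // prints [left.name, right.price]
    }
}
```

In a real rule the two field lists would come from the row types of the join inputs and the selected indexes from whatever node references the join's output. How to obtain those inside the rule is exactly the open question here.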

*---------------------*
*Muhammad Gelbana*
http://www.linkedin.com/in/mgelbana

On Sat, Apr 1, 2017 at 1:43 AM, Muhammad Gelbana <m.gelb...@gmail.com> wrote:

> So I intend to use this constructor for the new *RelNode*:
> *org.apache.drill.exec.planner.logical.DrillScanRel.DrillScanRel(RelOptCluster,
> RelTraitSet, RelOptTable, GroupScan, RelDataType, List<SchemaPath>)*
>
> How can I provide its parameters?
>
>    1. *RelOptCluster*: Can I pass *DrillJoinRel.getCluster()*?
>
>    2. *RelTraitSet*: Can I pass *DrillJoinRel.getTraitSet()*?
>
>    3. *RelOptTable*: I assume I can use this factory method
>    (*org.apache.calcite.prepare.RelOptTableImpl.create(RelOptSchema,
>    RelDataType, Table, Path)*). Any hints on how I can provide these
>    parameters too? Should I just go ahead and manually create a new
>    instance of each parameter?
>
>    4. *GroupScan*: I understand I have to create a new implementation
>    class for this one, so no questions here so far.
>
>    5. *RelDataType*: This one is confusing, because I understand that for
>    *DrillJoinRel.transformTo(newRel)* to work, I have to provide a
>    *newRel* instance that has a *RelDataType* instance with the same
>    number of fields and compatible types (i.e. this is mandated by
>    *org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelNode,
>    RelNode, Object)*). Why couldn't I provide a *RelDataType* with
>    a different set of fields? How can I resolve this?
>
>    6. *List<SchemaPath>*: I assume I can call this method and pass my
>    column names to it, one by one (i.e.
>    *org.apache.drill.common.expression.SchemaPath.getCompoundPath(String...)*).
>
> Thanks.
>
> *---------------------*
> *Muhammad Gelbana*
> http://www.linkedin.com/in/mgelbana
>
> On Fri, Mar 31, 2017 at 1:59 PM, weijie tong <tongweijie...@gmail.com>
> wrote:
>
>> Your code seems right; what remains is to implement the
>> 'call.transformTo()' call. As for the remaining details, I may not be
>> able to express them precisely; as @Paul Rogers mentioned, the plugin
>> details are a little trivial.
>>
>> 1. drillScanRel.getGroupScan().
>> 2. You need to extend AbstractGroupScan and let it hold some
>> information about your storage. This GroupScan, call it AGroupScan,
>> corresponds to one joined scan RelNode. Then you can define another
>> GroupScan called BGroupScan, which extends AGroupScan. The BGroupScan
>> acts as an aggregate container that holds the two joined AGroupScans.
>> 3. The new DrillScanRel has the same RowType as the JoinRel. The
>> requirements for, and examples of, transforming between two different
>> RelNodes can be found in other code. This DrillScanRel's GroupScan is
>> the BGroupScan. This new DrillScanRel is the one you pass to
>> `call.transformTo(xxxx)`.
>>
>> Maybe the picture below will help you understand my idea:
>>
>> Suppose the initial RelNode tree is:
>>
>>                       ---Scan (AGroupScan)
>>   Project ----Join --|
>>                       ---Scan (AGroupScan)
>>
>>                |
>>               \|/
>>
>> After applying this rule, the final tree is:
>>
>>   Project-----Scan ( BGroupScan ( List(AGroupScan, AGroupScan) ) )
>>
>> On Thu, Mar 30, 2017 at 10:01 PM, Muhammad Gelbana <m.gelb...@gmail.com>
>> wrote:
>>
>> > *This is my rule class*
>> >
>> > public class CartesianProductJoinRule extends RelOptRule {
>> >
>> >     public static final CartesianProductJoinRule INSTANCE =
>> >         new CartesianProductJoinRule(DrillJoinRel.class);
>> >
>> >     public CartesianProductJoinRule(Class<DrillJoinRel> clazz) {
>> >         super(operand(clazz, operand(RelNode.class, any()),
>> >             operand(RelNode.class, any())),
>> >             "CartesianProductJoin");
>> >     }
>> >
>> >     @Override
>> >     public boolean matches(RelOptRuleCall call) {
>> >         DrillJoinRel drillJoin = call.rel(0);
>> >         return drillJoin.getJoinType() == JoinRelType.INNER
>> >             && drillJoin.getCondition().isAlwaysTrue();
>> >     }
>> >
>> >     @Override
>> >     public void onMatch(RelOptRuleCall call) {
>> >         DrillJoinRel join = call.rel(0);
>> >         RelNode firstRel = call.rel(1);
>> >         RelNode secondRel = call.rel(2);
>> >         HepRelVertex right = (HepRelVertex) join.getRight();
>> >         HepRelVertex left = (HepRelVertex) join.getLeft();
>> >
>> >         List<RelDataTypeField> firstFields =
>> >             firstRel.getRowType().getFieldList();
>> >         List<RelDataTypeField> secondFields =
>> >             secondRel.getRowType().getFieldList();
>> >
>> >         RelNode firstTable =
>> >             ((HepRelVertex) firstRel.getInput(0)).getCurrentRel();
>> >         RelNode secondTable =
>> >             ((HepRelVertex) secondRel.getInput(0)).getCurrentRel();
>> >
>> >         //call.transformTo(???);
>> >     }
>> > }
>> >
>> > *To register the rule*, I overrode the *getOptimizerRules* method in my
>> > storage plugin class
>> >
>> > public Set<? extends RelOptRule> getOptimizerRules(
>> >         OptimizerRulesContext optimizerContext, PlannerPhase phase) {
>> >     switch (phase) {
>> >     case LOGICAL_PRUNE_AND_JOIN:
>> >     case LOGICAL_PRUNE:
>> >     case LOGICAL:
>> >         return getLogicalOptimizerRules(optimizerContext);
>> >     case PHYSICAL:
>> >         return getPhysicalOptimizerRules(optimizerContext);
>> >     case PARTITION_PRUNING:
>> >     case JOIN_PLANNING:
>> >         // The new rule is registered here.
>> >         return ImmutableSet.of(CartesianProductJoinRule.INSTANCE);
>> >     default:
>> >         return ImmutableSet.of();
>> >     }
>> > }
>> >
>> > The rule is firing as expected but I'm lost when it comes to the
>> > conversion. Earlier, you said "the new equivalent ScanRel is to have the
>> > joined ScanRel nodes's GroupScans", so
>> >
>> >    1. How can I obtain the left and right tables' group scans?
>> >    2. What exactly do you mean by joining them? Is there a utility
>> >    method to do so? Or should I manually create a new single group scan
>> >    and add the information I need there? Looking into other *GroupScan*
>> >    implementations, I found that they have references to some runtime
>> >    objects such as the storage plugin and the storage plugin
>> >    configuration. At this stage, I don't know how to obtain those!
>> >    3. Precisely, what kind of object should I use to represent a
>> >    *RelNode* that represents the whole join? I understand that I need to
>> >    use an object that implements the *RelNode* interface. Then I should
>> >    add the created *GroupScan* to that *RelNode* instance and call
>> >    *call.transformTo(newRelNode)*, correct?
>> >
>> >
>> > *---------------------*
>> > *Muhammad Gelbana*
>> > http://www.linkedin.com/in/mgelbana
>> >
>> > On Thu, Mar 30, 2017 at 2:46 AM, weijie tong <tongweijie...@gmail.com>
>> > wrote:
>> >
>> > > I mean the rule you write could be placed in
>> > > PlannerPhase.JOIN_PLANNING, which uses the HepPlanner. This phase
>> > > handles the logical RelNodes. Hope this helps.
>> > >
>> > > On Thu, Mar 30, 2017 at 12:07 AM, Muhammad Gelbana
>> > > <m.gelb...@gmail.com> wrote:
>> > >
>> > > > Thanks a lot Weijie, I believe I'm very close now. I hope you don't
>> > > > mind a few more questions please:
>> > > >
>> > > >    1. The new rule you are mentioning is a physical rule? So I
>> > > >    should implement the Prel interface?
>> > > >    2. By "traversing the join to find the ScanRel"
>> > > >       - This sounds like I have to "search" for something.
>> > > >       Shouldn't I just work on transforming the left (i.e.
>> > > >       DrillJoinRel's getLeft() method) and right (i.e.
>> > > >       DrillJoinRel's getRight() method) join objects?
>> > > >       - The "left" and "right" elements of the DrillJoinRel object
>> > > >       are of type RelSubset, not *ScanRel*, and I can't find a type
>> > > >       called *ScanRel*. I suppose you meant *ScanPrel*, especially
>> > > >       because it implements the *Prel* interface that provides the
>> > > >       *getPhysicalOperator* method.
>> > > >    3. What if multiple physical or logical rules match for a single
>> > > >    node? What decides which rule will be applied and which will be
>> > > >    rejected? Is it the
>> > > >    *AbstractRelNode.computeSelfCost(RelOptPlanner)* method? What if
>> > > >    more than one rule produces the same cost?
>> > > >
>> > > > I'll go ahead and see what I can do for now before hopefully you may
>> > > > offer more guidance. THANKS A LOT.
>> > > >
>> > > > *---------------------*
>> > > > *Muhammad Gelbana*
>> > > > http://www.linkedin.com/in/mgelbana
>> > > >
>> > > > On Wed, Mar 29, 2017 at 4:23 AM, weijie tong
>> > > > <tongweijie...@gmail.com> wrote:
>> > > >
>> > > > > To avoid misunderstanding, the new equivalent ScanRel is to have
>> > > > > the joined ScanRel nodes's GroupScans, as the GroupScans
>> > > > > indirectly hold the underlying storage information.
>> > > > >
>> > > > > On Wed, Mar 29, 2017 at 10:15 AM, weijie tong
>> > > > > <tongweijie...@gmail.com> wrote:
>> > > > >
>> > > > > > My suggestion is that you define a rule which matches the
>> > > > > > DrillJoinRel RelNode; then, in the onMatch method, you traverse
>> > > > > > the join's children to find the ScanRel nodes. You define a new
>> > > > > > ScanRel which includes the ScanRel nodes you found in the last
>> > > > > > step. Then transform the JoinRel to this equivalent new
>> > > > > > ScanRel. Finally, the plan tree will not have the JoinRel but
>> > > > > > the ScanRel. You can put your join plan rule in
>> > > > > > PlannerPhase.JOIN_PLANNING.
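
The container idea discussed in this thread (a BGroupScan holding the two joined AGroupScans, with the replacement scan exposing the same row type as the join) can be modelled roughly as follows. This is a standalone sketch under stated assumptions, not Drill's `AbstractGroupScan` API: `TableScan` and `JoinedScan` are hypothetical stand-ins for AGroupScan and BGroupScan, plain strings stand in for `RelDataType` fields, and composition is used instead of the inheritance suggested in the thread to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Standalone model of the AGroupScan / BGroupScan idea; not Drill's GroupScan API. */
final class CompositeScanSketch {

    /** Stand-in for one table scan ("AGroupScan"): a table name and its columns. */
    static final class TableScan {
        final String table;
        final List<String> columns;

        TableScan(String table, List<String> columns) {
            this.table = table;
            this.columns = Collections.unmodifiableList(new ArrayList<>(columns));
        }
    }

    /** Stand-in for "BGroupScan": a container for the scans that were joined. */
    static final class JoinedScan {
        final List<TableScan> inputs;

        JoinedScan(TableScan left, TableScan right) {
            this.inputs = Collections.unmodifiableList(Arrays.asList(left, right));
        }

        /** Output fields: left columns followed by right columns, like the join's row. */
        List<String> rowType() {
            List<String> fields = new ArrayList<>();
            for (TableScan scan : inputs) {
                for (String column : scan.columns) {
                    fields.add(scan.table + "." + column);
                }
            }
            return fields;
        }

        /** Mirrors the constraint that the replacement must keep the join's row type. */
        boolean canReplace(List<String> joinRowType) {
            return rowType().equals(joinRowType);
        }
    }

    public static void main(String[] args) {
        TableScan orders = new TableScan("orders", Arrays.asList("id", "total"));
        TableScan customers = new TableScan("customers", Arrays.asList("id"));
        JoinedScan joined = new JoinedScan(orders, customers);
        System.out.println(joined.rowType());
    }
}
```

The `canReplace` check is only a toy version of the type-equivalence requirement raised earlier in the thread: a replacement whose fields differ from the join's row type would be rejected.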