I'm focusing on JOINs now, especially a query such as this: *SELECT * FROM TABLE1, TABLE2*. Drill plans to transform this into 2 separate full-scan queries and then perform the cartesian product join on its own. I'm trying to make Drill send the query as it is in a single scan (group scan?).
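To make the goal concrete, here is a toy sketch in plain Python of the plan rewrite I have in mind. It is not Drill or Calcite code: Scan, Join, PushedScan and push_down_join are made-up names for illustration, and a real rule would have to be written against Drill's own planner classes.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Scan:
    """Full scan of one table on a named data source (hypothetical node)."""
    source: str
    table: str


@dataclass(frozen=True)
class Join:
    """Join of two plan nodes; condition=None means a cartesian product."""
    left: "Node"
    right: "Node"
    condition: Optional[str] = None


@dataclass(frozen=True)
class PushedScan:
    """A single scan that carries the whole query for the source to run."""
    source: str
    sql: str


Node = Union[Scan, Join, PushedScan]


def push_down_join(node: Node) -> Node:
    """Rewrite Join(Scan, Scan) on the same source into one PushedScan."""
    if not isinstance(node, Join):
        return node
    # Rewrite the inputs first so nested joins are handled bottom-up.
    left = push_down_join(node.left)
    right = push_down_join(node.right)
    if (isinstance(left, Scan) and isinstance(right, Scan)
            and left.source == right.source):
        sql = f"SELECT * FROM {left.table}, {right.table}"
        if node.condition is not None:
            sql += f" WHERE {node.condition}"
        return PushedScan(left.source, sql)
    # Different sources (or inputs already rewritten): leave the join in place.
    return Join(left, right, node.condition)


plan = Join(Scan("mydb", "TABLE1"), Scan("mydb", "TABLE2"))
print(push_down_join(plan))
```

The rewrite only fires when both inputs are plain scans on the same source. Anything else stays as a join for the engine to execute itself.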

@weijie I've found that if I opt out of the JDBC plugin's JdbcDrelConverterRule rule (i.e. JdbcStoragePlugin.DrillJdbcConvention.DrillJdbcConvention), an exception is thrown because Drill refuses to plan cartesian product joins. Are you saying that I need to keep that rule and let Drill plan the query as 2 different group scans, and that I should then change this plan to merge these 2 group scans into one? Is there a way to make Drill accept planning cartesian product joins?

*---------------------*
*Muhammad Gelbana*
http://www.linkedin.com/in/mgelbana

On Sun, Mar 26, 2017 at 1:33 AM, Muhammad Gelbana <m.gelb...@gmail.com> wrote:

> Priceless information! Thank you all.
>
> I managed to debug Drill in Eclipse hoping to get a better understanding,
> but I can't get my head around some stuff:
>
>    - What is the purpose of these classes\interfaces:
>       - ConverterRule
>       - DrillRel
>       - Prel
>       - JdbcStoragePlugin.JdbcPrule
>       - JdbcIntermediatePrel
>    - What do the words *Prel* and *Prule* stand for? *Prel*iminary and
>    *P*reliminary *Rule*?
>    - What is a calling convention? (It is mentioned in *ConverterRule*'s
>    documentation.)
>
> Is there a way to configure the costing model for the JDBC plugin without
> having to customize it through code? After all, my ultimate goal is to
> push down filters and joins.
>
> I'll continue debugging\browsing the code and come back with more
> questions, or hopefully an achievement!
>
> Thanks again, your help is very much appreciated.
>
> *---------------------*
> *Muhammad Gelbana*
> http://www.linkedin.com/in/mgelbana
>
> On Fri, Mar 24, 2017 at 1:29 AM, weijie tong <tongweijie...@gmail.com>
> wrote:
>
>> I am working on pushing down joins to the Druid storage plugin. In my
>> experience, you first need to write a rule that checks, using your storage
>> plugin's metadata, whether the joins can be pushed down. If they can, you
>> transform the join node into a scan node that carries the relevant query
>> information. The key point is to apply this rule in the HepPlanner.
>>
>> On Fri, Mar 24, 2017 at 5:15 AM, Zelaine Fong <zf...@mapr.com> wrote:
>>
>> > The JDBC storage plugin does attempt to do pushdowns of joins. However,
>> > the Drill optimizer will evaluate different query plans. In doing so, it
>> > may choose an alternative plan that does not do a full pushdown if it
>> > believes that’s a less costly plan than a full pushdown. There are a
>> > number of open bugs with the JDBC storage plugin, including DRILL-4696.
>> > For that particular issue, I believe that when it was investigated, it
>> > was determined that the costing model for the JDBC storage plugin needed
>> > more work. Hence Drill wasn’t picking the more optimal full pushdown plan.
>> >
>> > -- Zelaine
>> >
>> > On 3/23/17, 1:53 PM, "Paul Rogers" <prog...@mapr.com> wrote:
>> >
>> >     Hi Muhammad,
>> >
>> >     It seems that the goal for filters should be possible; I’m not
>> >     familiar enough with the code to know if joins are currently
>> >     supported, or if this is where you’d have to make some contributions
>> >     to Drill.
>> >
>> >     The storage plugin is called at various places in the planning
>> >     process, and can insert planning rules. We have plugins that push
>> >     down filters, so this seems possible. For example, check Parquet and
>> >     JDBC for hints. See my answer to a previous question for hints on how
>> >     to get started with storage plugins.
>> >
>> >     Joins may be a bit more complex. You’d have to insert planner rules;
>> >     such code *may* be available, or may require extensions to Drill.
>> >     Drill should certainly do this, so if the code is not there, we’d
>> >     welcome your contribution.
>> >
>> >     You’d have to create a rule that creates a new scan operator that
>> >     includes the information you wish to push down. For example, if you
>> >     push a filter, the scan definition (AKA group scan and scan entry)
>> >     would need to hold the information needed to implement the push-down.
>> >     Again, you can probably find examples for filters; you’d have to be
>> >     creative to push joins.
>> >
>> >     Assembling the pieces: your plugin would add planner rules that
>> >     determine when joins can be pushed. Those rules would cause your
>> >     plugin to create a semantic node (group scan) that holds the required
>> >     information. The planner then converts group scan nodes to specific
>> >     plans passed to the execution engine. On the execution side, your
>> >     plugin provides a “Record Reader” for your format, and that reader
>> >     does the actual work to push the filter or join down to your data
>> >     source.
>> >
>> >     Your best bet is to mine existing plugins for ideas, and then
>> >     experiment. Start simply and gradually add functionality. And, ask
>> >     questions back on this list.
>> >
>> >     Thanks,
>> >
>> >     - Paul
>> >
>> > > On Mar 22, 2017, at 8:20 AM, Muhammad Gelbana <m.gelb...@gmail.com>
>> > > wrote:
>> > >
>> > > I'm trying to use Drill with a proprietary datasource that is very
>> > > fast in applying data joins (i.e. SQL joins) and query filters (i.e.
>> > > SQL where conditions).
>> > >
>> > > To connect to that datasource, I first have to write a storage
>> > > plugin, but I'm not sure if my main goal is achievable.
>> > >
>> > > My main goal is to configure Drill to let the datasource perform
>> > > JOINs and filters and only return the data. Then Drill can perform
>> > > further processing based on the original SQL query sent to Drill.
>> > >
>> > > Is this possible by developing a storage plugin? Where exactly
>> > > should I be looking?
>> > >
>> > > I've been going through this wiki
>> > > <https://github.com/paul-rogers/drill/wiki> and I don't think I
>> > > understood every concept. So if there is another source of
>> > > information about storage plugin development, please point it out.
>> > >
>> > > *---------------------*
>> > > *Muhammad Gelbana*
>> > > http://www.linkedin.com/in/mgelbana