The JDBC storage plugin does attempt to push down joins. However, the Drill optimizer evaluates several candidate query plans, and it may choose a plan that does not do a full pushdown if it estimates that plan to be cheaper than the full pushdown. There are a number of open bugs with the JDBC storage plugin, including DRILL-4696. For that particular issue, I believe the investigation determined that the costing model for the JDBC storage plugin needed more work, which is why Drill wasn’t picking the better full-pushdown plan.
-- Zelaine

On 3/23/17, 1:53 PM, "Paul Rogers" <prog...@mapr.com> wrote:

Hi Muhammad,

It seems that the goal for filters should be possible; I’m not familiar enough with the code to know if joins are currently supported, or if this is where you’d have to make some contributions to Drill.

The storage plugin is called at various places in the planning process, and can insert planning rules. We have plugins that push down filters, so this seems possible. For example, check Parquet and JDBC for hints. See my answer to a previous question for hints on how to get started with storage plugins.

Joins may be a bit more complex. You’d have to insert planner rules; such code *may* be available, or may require extensions to Drill. Drill should certainly do this, so if the code is not there, we’d welcome your contribution.

You’d have to create a rule that creates a new scan operator that includes the information you wish to push down. For example, if you push a filter, the scan definition (AKA group scan and scan entry) would need to hold the information needed to implement the push-down. Again, you can probably find examples for filters; you’d have to be creative to push joins.

Assembling the pieces: your plugin would add planner rules that determine when joins can be pushed. Those rules would cause your plugin to create a semantic node (group scan) that holds the required information. The planner then converts group scan nodes to specific plans passed to the execution engine. On the execution side, your plugin provides a “Record Reader” for your format, and that reader does the actual work to push the filter or join down to your data source.

Your best bet is to mine existing plugins for ideas, and then experiment. Start simply and gradually add functionality. And, ask questions back on this list.

Thanks,

- Paul

> On Mar 22, 2017, at 8:20 AM, Muhammad Gelbana <m.gelb...@gmail.com> wrote:
>
> I'm trying to use Drill with a proprietary datasource that is very fast in
> applying data joins (i.e. SQL joins) and query filters (i.e. SQL where
> conditions).
>
> To connect to that datasource, I first have to write a storage plugin, but
> I'm not sure if my main goal is applicable.
>
> My main goal is to configure Drill to let the datasource perform JOINS and
> filters and only return the data. Then Drill can perform further processing
> based on the original SQL query sent to Drill.
>
> Is this possible by developing a storage plugin? Where exactly should I be
> looking?
>
> I've been going through this wiki
> <https://github.com/paul-rogers/drill/wiki> and I don't think I understood
> every concept. So if there is another source of information about storage
> plugins development, please point it out.
>
> *---------------------*
> *Muhammad Gelbana*
> http://www.linkedin.com/in/mgelbana
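
For readers following the thread, the flow Paul outlines (planner rules fold filters and joins into a group scan, and the reader then turns that scan into one request to the data source) can be modelled with a small toy in Python. This is only a conceptual sketch: `GroupScan`, `Filter`, `Join`, `rewrite`, and `to_source_query` are hypothetical names invented for illustration, not Drill or Calcite classes, and the sketch assumes inner joins, a left-deep join tree, and a data source that accepts SQL text.

```python
from dataclasses import dataclass, replace

# Hypothetical stand-ins for logical plan nodes. These are NOT Drill classes.

@dataclass(frozen=True)
class GroupScan:
    tables: tuple          # tables the data source should read
    joins: tuple = ()      # join conditions pushed into the source
    filters: tuple = ()    # filter conditions pushed into the source

@dataclass(frozen=True)
class Filter:
    condition: str
    child: object

@dataclass(frozen=True)
class Join:
    condition: str
    left: object
    right: object

def rewrite(node):
    """Bottom-up rewrite that plays the role of the push-down planner rules."""
    if isinstance(node, Filter):
        child = rewrite(node.child)
        if isinstance(child, GroupScan):
            # Filter rule: fold the condition into the scan definition.
            return replace(child, filters=child.filters + (node.condition,))
        return Filter(node.condition, child)
    if isinstance(node, Join):
        left, right = rewrite(node.left), rewrite(node.right)
        # Join rule: only fires for left-deep trees (right side is one table),
        # which keeps the generated JOIN clauses in a valid order.
        if (isinstance(left, GroupScan) and isinstance(right, GroupScan)
                and len(right.tables) == 1):
            return GroupScan(
                tables=left.tables + right.tables,
                joins=left.joins + (node.condition,),
                filters=left.filters + right.filters,
            )
        return Join(node.condition, left, right)
    return node  # plain scans are left unchanged

def to_source_query(scan):
    """What a reader might send to the source (assumes the source speaks SQL)."""
    sql = "SELECT * FROM " + scan.tables[0]
    for table, condition in zip(scan.tables[1:], scan.joins):
        sql += f" JOIN {table} ON {condition}"
    if scan.filters:
        sql += " WHERE " + " AND ".join(scan.filters)
    return sql

plan = Filter(
    "o.total > 100",
    Join("o.cust_id = c.id", GroupScan(("orders o",)), GroupScan(("customers c",))),
)
print(to_source_query(rewrite(plan)))
# prints: SELECT * FROM orders o JOIN customers c ON o.cust_id = c.id WHERE o.total > 100
```

In a real plugin the rewrite step would be expressed as planner rules registered by the storage plugin, and the final step would live in the plugin's record reader; the existing JDBC and Parquet plugins are the place to look for the actual interfaces.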