[
https://issues.apache.org/jira/browse/HIVE-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246244#comment-14246244
]
Ashutosh Chauhan commented on HIVE-7796:
----------------------------------------
Does this work as follows?
* The Phoenix JDBC handler implements {{HiveStorageSubQueryHandler}}.
* Using the source AST, the TokenRewriteStream, and QBParseInfo, it tries to
recreate the SQL text.
* The Phoenix JDBC handler then sends this query to Phoenix, which parses and
plans this SQL.
* The Phoenix JDBC handler then constructs Hive's {{TableScanOperator}}, which it
returns via this interface.
* This TSOp is hooked into the Hive pipeline.
* All the data from HBase flows through the Phoenix client to Hive.
Am I even remotely close on the design? :) It would help immensely to write up
a design doc for this, stating the goal you are trying to achieve.
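To make the flow I am guessing at concrete, here is a minimal Java sketch. Every name in it ({{PushdownSketch}}, {{RemoteEngine}}, {{SimpleScan}}, {{pushDown}}) is hypothetical and is not taken from the patch or from Hive. It only illustrates the idea that the whole subquery runs in the storage engine and Hive sees a plain scan over the result rows.

```java
import java.util.Iterator;
import java.util.List;

final class PushdownSketch {
    /** Hypothetical stand-in for the storage engine's client (for example, a JDBC connection). */
    interface RemoteEngine {
        List<Object[]> execute(String sql);
    }

    /** Hypothetical stand-in for the scan operator handed back to Hive. */
    static final class SimpleScan implements Iterable<Object[]> {
        final String pushedSql;
        private final List<Object[]> rows;

        SimpleScan(String pushedSql, List<Object[]> rows) {
            this.pushedSql = pushedSql;
            this.rows = rows;
        }

        @Override
        public Iterator<Object[]> iterator() {
            return rows.iterator();
        }
    }

    /**
     * Hypothetical handler entry point: takes the SQL text recreated from the
     * subquery, runs it remotely, and wraps the result as a simple scan.
     */
    static SimpleScan pushDown(String subQuerySql, RemoteEngine engine) {
        // Filtering and aggregation happen inside the remote engine;
        // the caller only iterates over the rows that come back.
        return new SimpleScan(subQuerySql, engine.execute(subQuerySql));
    }
}
```

Note that in this sketch all rows come back through the single {{RemoteEngine}} client, which is the part of the inferred design I ask about below.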
I am interested in this work, so I want to understand it better. If the design
inferred above is remotely close to what you have implemented, then one area of
concern is the last bullet. This design makes the Phoenix client a bottleneck.
It would be much more scalable if we could pull data directly from the
RegionServers instead of going through the Phoenix client.
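As a rough illustration of the scalability concern, the Java sketch below contrasts a single split that funnels everything through one client with one split per region, so that each task could read its own key range in parallel. The class and method names ({{SplitSketch}}, {{Split}}, {{viaClient}}, {{perRegion}}) are hypothetical and are not Hive, HBase, or Phoenix APIs. The only borrowed convention is HBase's use of an empty key to mean "start of table" or "end of table".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SplitSketch {
    /** Hypothetical key range read by one task; an empty key means "unbounded". */
    static final class Split {
        final String startKey;
        final String endKey;

        Split(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /** Single-client design: one split covers the whole key space, so one reader does all the work. */
    static List<Split> viaClient() {
        return Collections.singletonList(new Split("", ""));
    }

    /**
     * Region-aligned design: one split per region. Takes the sorted region
     * start keys and uses the next region's start key as each split's end key.
     */
    static List<Split> perRegion(List<String> regionStartKeys) {
        List<Split> splits = new ArrayList<>();
        for (int i = 0; i < regionStartKeys.size(); i++) {
            String start = regionStartKeys.get(i);
            String end = i + 1 < regionStartKeys.size() ? regionStartKeys.get(i + 1) : "";
            splits.add(new Split(start, end));
        }
        return splits;
    }
}
```

With three regions starting at "", "g", and "p", {{perRegion}} yields three independent ranges, whereas {{viaClient}} always yields one.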
> Provide subquery pushdown facility for storage handlers
> -------------------------------------------------------
>
> Key: HIVE-7796
> URL: https://issues.apache.org/jira/browse/HIVE-7796
> Project: Hive
> Issue Type: Improvement
> Components: StorageHandler
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-7796.1.patch.txt
>
>
> If the underlying storage can handle basic filtering or aggregation, Hive can
> delegate execution of the whole subquery to the storage and handle it as a simple
> scanning operation.
> Experimentally implemented on a JDBC / Phoenix handler, and it seems to work well.
> Hopefully I can open the code for those too, but I am not allowed to yet.