Hi Rakesh, this is an interesting question,
Did you look at the cube and rollup possibilities? https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation,+Cube,+Grouping+and+Rollup I believe that you could use this feature to merge your table 1 and table 2 in one single table with a GROUP BY business_unit, balance, account WITH ROLLUP partition the result table with GROUPING__ID and then your select query would have to be select business_unit, balance from new_table where business_unit='mybu' and GROUPING_ID = 1 If you want to get something more automatic after that, I guess there must be tools out there capable of providing a GUI on top of a cube, automatically generating the SQL and running it via JDBC connectors to Hive, Spark or Impala... After googling a little, I believe that Apache Lens matches the description. https://lens.apache.org/user/olap-cube.html I also found this old presentation related to it : http://events.linuxfoundation.org/sites/events/files/slides/ApacheCon-Datacubes_in_Apache_Hive_0.pdf Hope this helps, Furcy On Thu, Aug 25, 2016 at 1:27 PM, Rakesh Kant <rk...@rkant.com> wrote: > I have multiple aggregate tables created at different levels of > aggregation grouped by smaller subset of keys. As an example- > Table 1 : business_unit, account, balance > Table 2 : business_unit, balance > > When a query is written as select business_unit, balance from table1 where > business_unit='mybu'; it would be faster to use table 2 instead but how do > I substitute table 2 instead on table 1 in the query? > > Is there a way to add custom handlers either in the driver, optimizer or > other points ? > > Any help or pointers will be appreciated. > > RK > > Get Outlook for iOS <http://aka.ms/o0ukef> > >