Hi Rakesh,

this is an interesting question,

Did you look at the cube and rollup possibilities?

https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation,+Cube,+Grouping+and+Rollup

I believe that you could use this feature to merge your table 1 and table 2
in one single table with a
GROUP BY business_unit, balance, account WITH ROLLUP

partition the result table with GROUPING__ID

and then your select query would have to be

select business_unit, balance from new_table where business_unit='mybu' and
GROUPING_ID = 1

If you want to get something more automatic after that, I guess there must
be tools
out there capable of providing a GUI on top of a cube, automatically
generating the SQL
and running it via JDBC connectors to Hive, Spark or Impala...

After googling a little, I believe that Apache Lens matches the description.

https://lens.apache.org/user/olap-cube.html

I also found this old presentation related to it :
http://events.linuxfoundation.org/sites/events/files/slides/ApacheCon-Datacubes_in_Apache_Hive_0.pdf

Hope this helps,

Furcy





On Thu, Aug 25, 2016 at 1:27 PM, Rakesh Kant <rk...@rkant.com> wrote:

> I have multiple aggregate tables created at different levels of
> aggregation grouped by smaller subset of keys. As an example-
> Table 1 : business_unit, account, balance
> Table 2 : business_unit, balance
>
> When a query is written as select business_unit, balance from table1 where
> business_unit='mybu'; it would be faster to use table 2 instead but how do
> I substitute table 2 instead on table 1 in the query?
>
> Is there a way to add custom handlers either in the driver, optimizer or
> other points ?
>
> Any help or pointers will be appreciated.
>
> RK
>
> Get Outlook for iOS <http://aka.ms/o0ukef>
>
>

Reply via email to