Hi,
I have been developing a solution that works with Hive. Here is a simple e.g. from <select county, tract, arealand from geo_header_sf1 where sumlev = 140> partition by county order by county, arealand desc with rank() as r, sum(arealand) over rows between unbounded preceding and current row as cum_area select county, tract, arealand, r, cum_area where <r <= 3> into path='/tmp/wout' We have loaded the Census data into our Hive instance. The Query is calculating the Top 3 Tracts(based on land area) by County. Windowing Queries can be expressed on any Hive Query or Table or on an HDFS file. Windowing processing works in MR mode or in Hive mode. MR mode generates a MR job to evaluate the windowing functions. In Hive Mode, the Windowing process is invoked from the Hive Script Operator. Support for 16 functions divided into: Ranking, Aggregation and navigation. Support for both value and range boundaries for Windows. Easy to extend through groovy integration. Code is available at https://github.com/hbutani/SQLWindowing. A writeup is available at https://github.com/hbutani/SQLWindowing/blob/master/docs/Windowing.pdf?raw=true Please send me an email if you are interested in using this. regards, Harish Butani