Re: UDAF and group by

2011-09-04 Thread Huan Li
Setting hive.map.aggr to false will reduce the chance of terminatePartial() and merge() being called, though I don't think it will eliminate the possibility. If your data is large, it's still possible that a group of data is processed by multiple reducers, in which case those two methods are needed. If you need

Re: UDAF and group by

2011-09-04 Thread Koert Kuipers
Hey, my question wasn't very clear. I have a UDAF that I apply per group. The UDAF does not support terminatePartial() and merge(). So to do this I run: set hive.map.aggr=false; select myUdf(col1, col2) from table group by col3; Now this seems to work. But are my assumptions correct that this wil
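
For anyone following the thread, here is a rough sketch of why terminatePartial() and merge() matter. It is plain Java with no Hive dependency: the method names mirror the lifecycle of Hive's old-style UDAFEvaluator (init, iterate, terminatePartial, merge, terminate), but the classes MeanEvaluatorSketch, PartialMean and UdafLifecycleDemo are hypothetical names made up for illustration, and a running mean stands in for whatever myUdf actually computes. The point is only that an aggregate that can be split into partial states gives the same answer whether or not partial aggregation happens.

```java
// Plain-Java sketch, NOT Hive code: no Hive classes are imported or extended.
// Method names follow the UDAFEvaluator lifecycle; everything else is assumed.
class MeanEvaluatorSketch {

    // Hypothetical holder for the partial state passed between stages.
    static class PartialMean {
        long count;
        double sum;
    }

    private PartialMean state;

    MeanEvaluatorSketch() {
        init();
    }

    // Reset the evaluator before it is used for a new group.
    void init() {
        state = new PartialMean();
    }

    // Called once per input row of the group; null inputs are skipped.
    boolean iterate(Double value) {
        if (value != null) {
            state.count++;
            state.sum += value;
        }
        return true;
    }

    // Hand off a copy of the partial state (null if no rows were seen).
    PartialMean terminatePartial() {
        if (state.count == 0) {
            return null;
        }
        PartialMean copy = new PartialMean();
        copy.count = state.count;
        copy.sum = state.sum;
        return copy;
    }

    // Fold another evaluator's partial state into this one.
    boolean merge(PartialMean other) {
        if (other != null) {
            state.count += other.count;
            state.sum += other.sum;
        }
        return true;
    }

    // Produce the final value for the group (null for an empty group).
    Double terminate() {
        return state.count == 0 ? null : state.sum / state.count;
    }
}

class UdafLifecycleDemo {
    public static void main(String[] args) {
        // Path 1: every row of the group reaches a single evaluator,
        // so only iterate() and terminate() are exercised.
        MeanEvaluatorSketch single = new MeanEvaluatorSketch();
        for (double v : new double[] {1.0, 2.0, 3.0, 6.0}) {
            single.iterate(v);
        }

        // Path 2: the same rows are split across two evaluators whose
        // partial states are merged before the final terminate().
        MeanEvaluatorSketch left = new MeanEvaluatorSketch();
        left.iterate(1.0);
        left.iterate(2.0);
        MeanEvaluatorSketch right = new MeanEvaluatorSketch();
        right.iterate(3.0);
        right.iterate(6.0);

        MeanEvaluatorSketch merged = new MeanEvaluatorSketch();
        merged.merge(left.terminatePartial());
        merged.merge(right.terminatePartial());

        System.out.println(single.terminate() + " " + merged.terminate());
    }
}
```

A UDAF whose state cannot be split and recombined like this only ever gives correct results on the first path, which is why the question of whether Hive can still take the second path with hive.map.aggr=false matters.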

Re: Best practices for storing data on Hive

2011-09-04 Thread wd
Hive supports more than one partition; have you tried that? Maybe you can create two partitions named date and user. Hive 0.7 also supports indexes, so maybe you can give that a try. On Sat, Sep 3, 2011 at 1:18 AM, Mark Grover wrote: > Hello folks, > I am fairly new to Hive and am wondering if you could shar