Your rank() is being evaluated map side. Put your distribute by and sort by in an inner query, and then evaluate your rank() in an outer query.
Phil. On Jul 19, 2012 9:00 PM, "comptech geeky" <comptechge...@gmail.com> wrote: > This is the below data in my Table1 > > > BID PID TIME > --------------+-------------------------+-------------------------------- > 1345653 330760137950 2012-07-09 21:42:29 > 1345653 330760137950 2012-07-09 21:43:29 > 1345653 330760137950 2012-07-09 21:40:29 > 1345653 330760137950 2012-07-09 21:41:29 > 1345653 110909316904 2012-07-09 21:29:06 > 1345653 221065796761 2012-07-09 19:31:48 > > So If I need to clarify the above scenario- I have data in above table > like this- > For USER *`1345653` *I have this PID `*330760137950` *four times but with > different timestamps in red color. So I need the output something like this- > > Output that I need:- > > *1345653 330760137950 2012-07-09 21:43:29 * > *1345653 330760137950 2012-07-09 21:42:29 * > *1345653 330760137950 2012-07-09 21:41:29* > 1345653 110909316904 2012-07-09 21:29:06 > 1345653 221065796761 2012-07-09 19:31:48 > > So Basically If BID and PID are same but with different timestamps, then I > need TOP 3 sorted with TIME in descending order > > And for this I created rank UDF (User Defined Function). And I wrote the > below query but its not working for me. Can anyone help me on this? > * > * > * > * > *SELECT buyer_id, item_id, created_time* > * FROM table1* > * where to_date(from_unixtime(cast(UNIX_TIMESTAMP(created_time) as int))) > = '2012-07-09' AND rank(buyer_id) < 3* > * DISTRIBUTE BY buyer_id* > * SORT BY buyer_id, created_time desc* > >