RE: need help on writing hive query

2012-11-03 Thread java8964 java8964
ax on top of it. Since everything is larger than NULL in max, you will get the cde time stamp.
3) If each user could have more than one 'abc' or 'cde', then you need to decide which one you want returned.
Yong
Date: Fri, 2 Nov 2012 22:35:42 -0500
Subject: need help on wri

need help on writing hive query

2012-11-02 Thread qiaoresearcher
The table format is something like:
user_id  visiting_time  visiting_web_page
user1    time11         page_string_11
user1    time12         page_string_12 with keyword 'abc'
user1    time13         page_string_13
user1    time14         page_strin

RE: need help on writing hive query

2012-10-31 Thread java8964 java8964
Subject: Re: need help on writing hive query
> From: matthewt...@gmail.com
> Date: Wed, 31 Oct 2012 17:53:06 -0400
> To: user@hive.apache.org
>
> I did a similar query a few months ago. In short, I left-padded the page
> name with the time stamp, grouped with collect_set, and th

Re: need help on writing hive query

2012-10-31 Thread Matt Tucker
I did a similar query a few months ago. In short, I left-padded the page name with the time stamp, grouped with collect_set, and then used sort_array(). There was some other cleanup work and converting back to string to remove the time stamps, but it remained in order. If there's an easier wa
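The approach described in this message (prefix each page name with its time stamp, collect per user, sort, then strip the prefix) can be sketched outside Hive. What follows is a minimal Python illustration of that logic, not the original HiveQL. The function name `ordered_paths`, the sample rows, the integer time stamps, the 10-digit zero padding, and the `>` separator are all assumptions made for the example.

```python
from collections import defaultdict

PAD_WIDTH = 10  # assumed width; must cover the longest time stamp


def ordered_paths(rows, sep=">"):
    """Return {user_id: pages joined in visiting order}.

    rows: iterable of (user_id, visiting_time, page) with integer time stamps.
    """
    # A set mimics collect_set(): unordered and deduplicated.
    collected = defaultdict(set)
    for user_id, visiting_time, page in rows:
        # Prefix the page with a fixed-width time stamp so that a plain
        # string sort is also a chronological sort.
        collected[user_id].add(f"{visiting_time:0{PAD_WIDTH}d}|{page}")

    paths = {}
    for user_id, tagged in collected.items():
        ordered = sorted(tagged)  # plays the role of sort_array()
        # Cleanup step: drop the time stamp prefix, keeping the sorted order.
        paths[user_id] = sep.join(item[PAD_WIDTH + 1:] for item in ordered)
    return paths


# Hypothetical sample data, deliberately out of order.
rows = [
    ("user1", 30, "page_c"),
    ("user1", 10, "page_a"),
    ("user2", 5, "page_x"),
    ("user1", 20, "page_b"),
]
print(ordered_paths(rows))
```

The fixed-width padding is what makes the string sort safe: without it, a time stamp of 9 would sort after 10. Because a set is used, two identical visits (same time stamp and page) collapse into one, which matches the deduplicating behavior of collect_set().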

Re: need help on writing hive query

2012-10-31 Thread Tom Brown
It wouldn't retrieve the user's path in a single string, but you could simply select the user id and current page, ordered by the timestamp. It would require a second step to turn it into the single string path, so that might be a deal-breaker.

--Tom

On Wed, Oct 31, 2012 at 3:32 PM, Philip Troma

Re: need help on writing hive query

2012-10-31 Thread Philip Tromans
You could use collect_set() and GROUP BY. That wouldn't preserve order though.

Phil.

On Oct 31, 2012 9:18 PM, "qiaoresearcher" wrote:
> Hi all,
>
> here is the question. Assume we have a table like:
>
> -

Re: need help on writing hive query

2012-10-31 Thread Mark Grover
You should look into Hive's cluster by/distribute by functionality.

https://cwiki.apache.org/Hive/languagemanual-sortby.html#LanguageManualSortBy-SyntaxofClusterByandDistributeBy
https://cwiki.apache.org/Hive/languagemanual-transform.html

On Wed, Oct 31, 2012 at 2:18 PM, qiaoresearcher wrote:
>

need help on writing hive query

2012-10-31 Thread qiaoresearcher
Hi all,

here is the question. Assume we have a table like:

--
user_id || user_visiting_time || user_current_web_page || user_previous_web_page
user 1