ax on top of it. Since everything is
larger than NULL in max, you will get the cde time stamp.
3) If each user could have more than one 'abc' or 'cde', then you need to
decide which one you want returned.
Yong
Date: Fri, 2 Nov 2012 22:35:42 -0500
Subject: need help on wri
The table format is something like:
user_id visiting_time visiting_web_page
user1 time11 page_string_11
user1 time12 page_string_12 with keyword 'abc'
user1 time13 page_string_13
user1 time14 page_strin
> Subject: Re: need help on writing hive query
> From: matthewt...@gmail.com
> Date: Wed, 31 Oct 2012 17:53:06 -0400
> To: user@hive.apache.org
>
> I did a similar query a few months ago. In short, I left-padded the page
> name with the time stamp, grouped with collect_set, and th
I did a similar query a few months ago. In short, I left-padded the page name
with the time stamp, grouped with collect_set, and then used sort_array().
There was some other cleanup work and converting back to string to remove the
time stamps, but it remained in order.
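The idea above can be sketched outside Hive to show why it keeps the order. This is only a plain-Python illustration of the logic, not the original query: the function name ordered_paths, the integer time stamps, the padding width, and the " > " path separator are all assumptions made for the example. The set stands in for collect_set() and sorted() for sort_array().

```python
from collections import defaultdict

def ordered_paths(rows, width=10, sep="|"):
    """rows: iterable of (user_id, visit_time, page).

    Assumes visit_time is a non-negative integer with at most `width` digits.
    """
    tagged = defaultdict(set)  # stands in for GROUP BY user_id + collect_set()
    for user_id, visit_time, page in rows:
        # Zero-padding makes string order agree with numeric order.
        tagged[user_id].add(str(visit_time).zfill(width) + sep + page)

    paths = {}
    for user_id, entries in tagged.items():
        # sorted() stands in for sort_array(); the slice strips the time prefix.
        pages = [entry[width + len(sep):] for entry in sorted(entries)]
        paths[user_id] = " > ".join(pages)
    return paths

# Hypothetical sample rows, deliberately out of order.
rows = [
    ("user1", 13, "page_c"),
    ("user1", 11, "page_a"),
    ("user2", 5, "page_x"),
    ("user1", 12, "page_b"),
]
paths = ordered_paths(rows)
print(paths["user1"])  # page_a > page_b > page_c
```

The padding matters: without it, a time stamp of 10 would sort before 9 as a string.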
If there's an easier wa
It wouldn't retrieve the user's path in a single string, but you could
simply select the user id and current page, ordered by the timestamp.
It would require a second step to turn it into the single string path,
so that might be a deal-breaker.
--Tom
On Wed, Oct 31, 2012 at 3:32 PM, Philip Troma
You could use collect_set() and GROUP BY. That wouldn't preserve order
though.
Phil.
On Oct 31, 2012 9:18 PM, "qiaoresearcher" wrote:
> Hi all,
>
> here is the question. Assume we have a table like:
>
> -
You should look into Hive's CLUSTER BY / DISTRIBUTE BY functionality.
https://cwiki.apache.org/Hive/languagemanual-sortby.html#LanguageManualSortBy-SyntaxofClusterByandDistributeBy
https://cwiki.apache.org/Hive/languagemanual-transform.html
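As a rough sketch of how the two pieces could fit together: if the query distributes rows by user_id and sorts them by user_id and visiting time, a transform script sees each user's rows together and in time order, as tab-separated lines. The script below is an assumption-laden example, not a tested Hive job: the function name build_paths, the column order (user_id, visiting_time, page), and the " > " separator are all made up for illustration.

```python
from itertools import groupby

def build_paths(lines, sep=" > "):
    """Yield 'user_id<TAB>path' for each user.

    Assumes the input is tab-separated (user_id, visiting_time, page) and is
    already grouped by user and sorted by time, which is what DISTRIBUTE BY
    plus SORT BY would be expected to provide.
    """
    parsed = (line.rstrip("\n").split("\t") for line in lines if line.strip())
    for user_id, group in groupby(parsed, key=lambda fields: fields[0]):
        yield user_id + "\t" + sep.join(fields[2] for fields in group)

# Hypothetical sample input; a real transform script would pass sys.stdin
# to build_paths() and print each result line.
sample = [
    "user1\ttime11\tpage_a\n",
    "user1\ttime12\tpage_b\n",
    "user2\ttime21\tpage_x\n",
]
result = list(build_paths(sample))
for line in result:
    print(line)
```

Because groupby() only merges adjacent rows, the script relies entirely on the query delivering each user's rows contiguously and in order.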
On Wed, Oct 31, 2012 at 2:18 PM, qiaoresearcher wrote:
>
Hi all,
here is the question. Assume we have a table like:
--
user_id || user_visiting_time || user_current_web_page || user_previous_web_page
user 1