Here is my stab at it. I have not tested it, but it should get you started.

The following points are important:

1. I added a WHERE clause in the subquery to limit the data set to whatever 
partition you may have.
2. You have to write a collect UDF to use it. See the class GenericUDAFCollect 
in Chapter 13, "Functions", of Wampler/Capriolo's book.

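SELECT
     pcw.page_url,
     pcw.token,
     -- collect() is the custom UDAF mentioned in point 2; weight is cast
     -- because concat_ws expects string arguments
     collect(concat_ws('|', pcw.original_category, CAST(pcw.weight AS STRING)))
FROM
     (SELECT
          page_url,
          token,
          original_category,
          weight
      FROM
          media_visit_info
      WHERE
          partition_column = 'partition_col_val'
      -- every selected column must appear in the GROUP BY
      GROUP BY
          page_url,
          token,
          original_category,
          weight
     ) pcw
-- one output row per page_url/token pair
GROUP BY
     pcw.page_url,
     pcw.token
LIMIT 10
;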
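
The grouping logic can be sanity-checked without a Hive cluster. What follows is a minimal sketch, not Hive code: it assumes SQLite's built-in group_concat as a stand-in for the custom collect UDF, uses made-up sample rows, and leaves out the partition filter because the toy table has no partitions. The helper name collect_categories and the output keys are hypothetical; the keys are chosen to mirror the structure asked for in the quoted message.

```python
import sqlite3

# Hypothetical sample rows (page_url, token, original_category, weight).
# Column names follow the thread; the values are invented for illustration.
ROWS = [
    ("http://a.example", 20, "CN10-1-1", 0.5),
    ("http://a.example", 20, "CN10-1-1", 0.5),  # duplicate row
    ("http://a.example", 20, "CN11-2-2", 0.1),
    ("http://b.example", 7, "CN10-1-3", 0.02),
]

# Inner query removes duplicate (category, weight) pairs per page;
# outer query packs the remaining pairs into one row per page_url/token.
# group_concat stands in for the Hive collect UDF (an assumption).
QUERY = """
SELECT pcw.page_url,
       pcw.token,
       group_concat(pcw.original_category || '|' || pcw.weight, ',')
FROM (SELECT page_url, token, original_category, weight
      FROM media_visit_info
      GROUP BY page_url, token, original_category, weight) pcw
GROUP BY pcw.page_url, pcw.token
"""


def collect_categories(rows):
    """Load rows into an in-memory table and return one nested dict per page_url."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE media_visit_info ("
        "page_url TEXT, token INTEGER, original_category TEXT, weight REAL)"
    )
    conn.executemany("INSERT INTO media_visit_info VALUES (?, ?, ?, ?)", rows)

    result = {}
    for page_url, token, packed in conn.execute(QUERY):
        categories = []
        for item in packed.split(","):
            code, weight = item.split("|")
            categories.append({"code": code, "weight": float(weight)})
        # group_concat gives no ordering guarantee, so sort for stable output.
        categories.sort(key=lambda c: c["code"])
        result[page_url] = {"url": page_url, "token": token, "categorys": categories}
    conn.close()
    return result


if __name__ == "__main__":
    for record in collect_categories(ROWS).values():
        print(record)
```

Unpacking the 'code|weight' strings on the client side, as done here, is one way to arrive at the nested structure; Hive's own behaviour for collect depends on how the UDF is written.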

From: ch huang <justlo...@gmail.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Monday, August 19, 2013 2:04 AM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: question about hive SQL

Hi all,
       I am not very familiar with HQL. My problem is that I now have 2 queries:

Q1: select page_url, original_category, token from media_visit_info group by 
page_url, original_category, token limit 10
Q2: select original_category as code, weight from media_visit_info where 
page_url='X' group by original_category, weight;

The page_url value from Q1 should be passed to the WHERE condition of Q2, and 
the two query results should be combined like this:

{
url:http\\:www.baidu.com,
category:|CN10,
token:20,
categorys:
[
{code:|CN10-1-1,weight:0.5},
{code:|CN11-2-2,weight:0.1},
{code:|CN10-1-3,weight:0.02}
]
}

I do not know whether this can be written as one query (JOIN + SUBQUERY?). Can anyone help?
