I have two tables: pages(title, domain, url) and top_domains(domain). The top_domains table was created from a GROUP BY domain operation on the pages table.
Because the pages table is very large, I only want to sample 5 rows for each domain in top_domains. In a traditional programming language, I could just use a for loop to iterate over the domain values and perform a SELECT with a LIMIT 5 clause for each one. Is there a way to express this query in Hive?

- @tommychheng
Programmer and UC Irvine Graduate Student
Find a great grad school based on research interests: http://gradschoolnow.com
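
One possible way to express the per-domain limit as a single query is sketched below. It is only a sketch, and it assumes a Hive version that supports windowing functions (0.11 or later); older versions would need a different approach, such as a custom UDF. The aliases p, t, ranked and rn are made up for illustration. The idea is to number the rows within each domain and keep the first 5.

```sql
-- Sketch only: assumes Hive 0.11+ (windowing functions available).
-- Aliases p, t, ranked and rn are illustrative names, not from the original tables.
SELECT title, domain, url
FROM (
  SELECT p.title,
         p.domain,
         p.url,
         -- Number the rows within each domain, starting at 1.
         ROW_NUMBER() OVER (PARTITION BY p.domain ORDER BY p.url) AS rn
  FROM pages p
  -- Keep only pages whose domain appears in top_domains.
  JOIN top_domains t ON p.domain = t.domain
) ranked
-- At most 5 rows survive per domain.
WHERE rn <= 5;
```

Ordering by url gives the same 5 rows per domain on every run. If a random sample is wanted instead, ordering the window by rand() is a common variant, though it makes the result differ from run to run.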