How HIVE manages a join

2010-08-05 Thread Cappa Roberto
Hi, I cannot find any documentation on the algorithm HIVE uses to translate JOIN clauses into Map-Reduce tasks. In particular, suppose I have two tables A and B, each stored in a separate file, with each file split across Hadoop nodes. When I perform a JOIN with A.column = B.column,

How to merge small files

2010-08-05 Thread lei liu
When I run the following SQL: INSERT OVERWRITE TABLE tablename1 select_statement1 FROM from_statement, many zero-size files are written to Hadoop. How can I merge these small files? Thanks, LiuLei

Hypertable Meetup at Facebook Thu Sep 16 @ 6:00-9:00 PM

2010-08-05 Thread Sanjit Jhala
Hypertable-Hive and Performance Benchmark Discussion at Facebook. *Join us at Facebook for pizza, beer, and a FREE Hypertable t-shirt.* We (the Hypertable Development team) will be presenting the recently developed Hypertable Hive extension.

How to debug hive and hadoop

2010-08-05 Thread lei liu
I have used 'Remote Java Application' in Eclipse to debug Hive code. Now I want to debug Hive and Hadoop together. How can I do that? Thanks, LiuLei

java.io.FileNotFoundException: HIVE_PLAN (No such file or directory)

2010-08-05 Thread lei liu
I am using version 0.4.1. When multiple threads run roughly similar queries, Hive hits this bug. How can I resolve the problem in version 0.4.1? The problem is related to the session id; can I use 'set hive.session.id=1' to resolve it? Thanks, LiuLei

Re: why is slow when use OR clause instead of IN clause

2010-08-05 Thread lei liu
When there are one thousand OR clauses, Hive throws the exception below: Total MapReduce jobs = 1 Number of reduce tasks is set to 0 since there's no reduce operator java.lang.StackOverflowError at java.beans.Statement.<init>(Statement.java:60) at java.beans.Expression.<init>(Expression.java:47)