Take a look at HiveInputFormat and CombineHiveInputFormat; these are the
classes we wrap around your input formats so that Hive can read data from
multiple input formats in the same job.
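
To make the wrapping idea concrete, here is a minimal sketch in plain Java. It
does not use any Hadoop or Hive classes, and every name in it (RecordFormat,
DispatchingFormat, the /warehouse paths, the comma and tab delimiters) is made
up for illustration. It only shows the general pattern: the job is configured
with one wrapper class, and the wrapper picks the real format for each input
path.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a per-table input format:
// turns one raw line into a list of fields.
interface RecordFormat {
    List<String> parse(String rawLine);
}

// Hypothetical wrapper. The job sees only this one class; it delegates to
// whichever format was registered for the directory that holds the input path.
class DispatchingFormat {
    private final Map<String, RecordFormat> formatByPathPrefix = new LinkedHashMap<>();

    void register(String pathPrefix, RecordFormat format) {
        formatByPathPrefix.put(pathPrefix, format);
    }

    // Find the format whose registered prefix matches the given path.
    RecordFormat formatFor(String path) {
        for (Map.Entry<String, RecordFormat> entry : formatByPathPrefix.entrySet()) {
            if (path.startsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        throw new IllegalArgumentException("No format registered for " + path);
    }

    // Read one record from the given path using the matching format.
    List<String> read(String path, String rawLine) {
        return formatFor(path).parse(rawLine);
    }
}

class DispatchDemo {
    // Assumed layout: tmp7 is comma-delimited, tmp2 is tab-delimited.
    static DispatchingFormat build() {
        DispatchingFormat wrapper = new DispatchingFormat();
        wrapper.register("/warehouse/tmp7/", line -> Arrays.asList(line.split(",")));
        wrapper.register("/warehouse/tmp2/", line -> Arrays.asList(line.split("\t")));
        return wrapper;
    }

    public static void main(String[] args) {
        DispatchingFormat wrapper = build();
        System.out.println(wrapper.read("/warehouse/tmp7/part-0", "a,b"));
        System.out.println(wrapper.read("/warehouse/tmp2/part-0", "x\ty"));
    }
}
```

The real classes work in terms of splits and record readers rather than single
lines, so treat this purely as an illustration of dispatching by path.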

JVS

On Jul 1, 2010, at 10:16 AM, yan qi wrote:

Hi, Namit,

  Thanks a lot for your reply!

  I checked the source code. Given a query (select tmp7.* from tmp7 join tmp2 
on (tmp7.c2 = tmp2.c1)), only one MapReduce job is generated. As far as I 
know, the setInputFormat function is used in ExecDriver.java to set the job's 
InputFormat class.

  So I don't see any way to set two different InputFormat classes in one 
job. Or did I miss something here?

Thanks,


On Thu, Jul 1, 2010 at 10:00 AM, Namit Jain <nj...@facebook.com> wrote:
That's fine.
The two tables can have different input formats.

Sent from my iPhone

On Jul 1, 2010, at 9:51 AM, "yan qi" <wener.shan...@gmail.com> wrote:

> Hi,
>
>   I have a question about the JOIN operation in Hive.
>
>   For example, I have a query, like
>
>    select tmp7.* from tmp7 join tmp2 on (tmp7.c2 = tmp2.c1);
>
>   Clearly, there is a JOIN involved in the statement.
>    1. tmp2 and tmp7 are two tables.
>    2. c2 and c1 are columns belonging to tmp7 and tmp2 respectively.
>
>   I found that this query is executed in Hive with a MapReduce Job.
> Therefore, I am wondering if tmp2 and tmp7 are both assumed to share
> the same InputFormat class.
>
>   What if tmp2 and tmp7 are using different InputFormat classes to
> read records?
>
>
> Thanks,
>
> WS
>

