Re: S3/EMR Hive: Load contents of a single file

Keith Wiley Wed, 27 Mar 2013 10:03:07 -0700

Okay, I also saw your previous response which analyzed queries into two tables 
built around two files in the same directory.  I guess I was simply wrong in my 
understanding that a Hive table is fundamentally associated with a directory 
instead of a file.  Turns out, it be can either one.  A directory table uses 
all files in the directory while a file table uses one specific file and 
properly avoids sibling files.  My bad.


Thanks for the careful analysis and clarification.  TIL!

Cheers!

On Mar 27, 2013, at 02:58 , Tony Burton wrote:

> A bit more info - do an extended description of the table:
>  
> $ desc extended gsrc1;
>  
> And the “location” field is “location:s3://mybucket/path/to/data/src1.txt”
>  
> Do the same on a table created with a location pointing at the directory and 
> the same info gives (not surprisingly) “location:s3://mybucket/path/to/data/”
> 

________________________________________________________________________________
Keith Wiley     kwi...@keithwiley.com     keithwiley.com    music.keithwiley.com

"I used to be with it, but then they changed what it was.  Now, what I'm with
isn't it, and what's it seems weird and scary to me."
                                           --  Abe (Grandpa) Simpson
________________________________________________________________________________

Re: S3/EMR Hive: Load contents of a single file

Reply via email to