Okay, I also saw your previous response, which analyzed queries against two tables built around two files in the same directory. I guess I was simply wrong in my understanding that a Hive table is fundamentally associated with a directory rather than a file. Turns out, it can be either one. A directory table uses all files in the directory, while a file table uses one specific file and properly avoids sibling files. My bad.
Thanks for the careful analysis and clarification. TIL!

Cheers!

On Mar 27, 2013, at 02:58, Tony Burton wrote:

> A bit more info - do an extended description of the table:
>
> $ desc extended gsrc1;
>
> And the “location” field is “location:s3://mybucket/path/to/data/src1.txt”
>
> Do the same on a table created with a location pointing at the directory and
> the same info gives (not surprisingly) “location:s3://mybucket/path/to/data/”

________________________________________________________________________________
Keith Wiley   kwi...@keithwiley.com   keithwiley.com   music.keithwiley.com

"I used to be with it, but then they changed what it was. Now, what I'm with
isn't it, and what's it seems weird and scary to me."
    -- Abe (Grandpa) Simpson
________________________________________________________________________________