Ramki, I was going through that thread earlier, since Sanjeev said it worked, and I was doing some experiments as well. Like you, I was under the impression that Hive tables are associated with directories, and as pointed out, I was wrong.
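As a side note for the archive, the two single-file tricks discussed in this thread amount to roughly the following sketch (the table names and the HDFS path are made-up placeholders, not from Sanjeev's setup):

```sql
-- Workaround 1 (the "hack" from that thread): create the external table
-- without a LOCATION, then point it at a single file afterwards.
CREATE EXTERNAL TABLE access_single (line STRING);
ALTER TABLE access_single
  SET LOCATION 'hdfs://namenode:8020/user/flume/events/some_dir/part-00000';

-- Workaround 2 (Mark's suggestion): keep the table LOCATION on the
-- directory and filter on the virtual column INPUT__FILE__NAME at query time.
SELECT *
FROM   access_dir
WHERE  INPUT__FILE__NAME LIKE '%FlumeData.1371144648033';
```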
Basically, the idea of pointing a table to a file, as mentioned on that thread, is kind of a hack: create the table without a location, then alter the table to point to the file.

From Mark's answer, what he suggests is that we can use the virtual column INPUT__FILE__NAME to select which file we want to use while querying, in case there are multiple files inside a directory and you just want to use a specific one.

The bug I mentioned is about picking particular files from a directory matching a regex, not about the regex SerDe.

Correct my understanding if I got anything wrong.

On Fri, Jun 21, 2013 at 12:04 AM, Ramki Palle <ramki.pa...@gmail.com> wrote:

> Nitin,
>
> Can you go through the thread with the subject "S3/EMR Hive: Load contents
> of a single file" (Tue, 26 Mar, 17:11) at
>
> http://mail-archives.apache.org/mod_mbox/hive-user/201303.mbox/thread?1
>
> This gives the whole discussion about the topic of a table location
> pointing to a filename vs. a directory.
>
> Can you give your insight from this discussion and the discussion you
> mentioned at the stackoverflow link?
>
> Regards,
> Ramki.
>
>
> On Thu, Jun 20, 2013 at 11:14 AM, Nitin Pawar <nitinpawar...@gmail.com> wrote:
>
>> Also see this JIRA:
>> https://issues.apache.org/jira/browse/HIVE-951
>>
>> I think the issue you are facing is due to that JIRA.
>>
>>
>> On Thu, Jun 20, 2013 at 11:41 PM, Nitin Pawar <nitinpawar...@gmail.com> wrote:
>>
>>> Mark has answered this before:
>>>
>>> http://stackoverflow.com/questions/11269203/when-creating-an-external-table-in-hive-can-i-point-the-location-to-specific-fil
>>>
>>> If this link does not answer your question, do let us know.
>>>
>>>
>>> On Thu, Jun 20, 2013 at 11:33 PM, sanjeev sagar <sanjeev.sa...@gmail.com> wrote:
>>>
>>>> Two issues:
>>>>
>>>> 1. I've created external tables in Hive based on a file location
>>>> before, and it worked without any issue. It doesn't have to be a
>>>> directory.
>>>>
>>>> 2.
>>>> If there is more than one file in the directory, and you create an
>>>> external table based on the directory, then how does the table know
>>>> which file it needs to look in for the data?
>>>>
>>>> I tried to create the table based on the directory; it created the
>>>> table, but all the rows were NULL.
>>>>
>>>> -Sanjeev
>>>>
>>>>
>>>> On Thu, Jun 20, 2013 at 10:30 AM, Nitin Pawar <nitinpawar...@gmail.com> wrote:
>>>>
>>>>> In Hive, when you create a table and use the location to refer to an
>>>>> HDFS path, that path is supposed to be a directory.
>>>>> If the directory does not exist, Hive will try to create it, and if
>>>>> it is a file, it will throw an error because it is not a directory.
>>>>>
>>>>> That is the error you are getting: the location you referred to is a
>>>>> file. Change it to the directory and see if that works for you.
>>>>>
>>>>>
>>>>> On Thu, Jun 20, 2013 at 10:57 PM, sanjeev sagar <sanjeev.sa...@gmail.com> wrote:
>>>>>
>>>>>> I did mention in my mail that the HDFS file exists in that location.
>>>>>> See below.
>>>>>>
>>>>>> In HDFS: file exists
>>>>>>
>>>>>> hadoop fs -ls
>>>>>> /user/flume/events/request_logs/ar1.vgs.mypoints.com/13-06-13/FlumeData.1371144648033
>>>>>>
>>>>>> Found 1 items
>>>>>> -rw-r--r--   3 hdfs supergroup 2242037226 2013-06-13 11:14
>>>>>> /user/flume/events/request_logs/ar1.vgs.mypoints.com/13-06-13/FlumeData.1371144648033
>>>>>>
>>>>>> So the directory and the file both exist.
>>>>>>
>>>>>>
>>>>>> On Thu, Jun 20, 2013 at 10:24 AM, Nitin Pawar <nitinpawar...@gmail.com> wrote:
>>>>>>
>>>>>>> MetaException(message:hdfs://h1.vgs.mypoints.com:8020/user/flume/events/request_logs/ar1.vgs.mypoints.com/13-06-13/FlumeData.1371144648033
>>>>>>> is not a directory or unable to create one)
>>>>>>>
>>>>>>> It clearly says it is not a directory.
>>>>>>> Point to the directory and it will work.
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jun 20, 2013 at 10:52 PM, sanjeev sagar <sanjeev.sa...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hello Everyone, I'm running into the following Hive external
>>>>>>>> table issue.
>>>>>>>>
>>>>>>>> hive> CREATE EXTERNAL TABLE access(
>>>>>>>>     >   host STRING,
>>>>>>>>     >   identity STRING,
>>>>>>>>     >   user STRING,
>>>>>>>>     >   time STRING,
>>>>>>>>     >   request STRING,
>>>>>>>>     >   status STRING,
>>>>>>>>     >   size STRING,
>>>>>>>>     >   referer STRING,
>>>>>>>>     >   agent STRING)
>>>>>>>>     > ROW FORMAT SERDE
>>>>>>>>     >   'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
>>>>>>>>     > WITH SERDEPROPERTIES (
>>>>>>>>     >   "input.regex" = "([^ ]*) ([^ ]*) ([^ ]*) (-|\\[[^\\]]*\\]) ([^ \"]*|\"[^\"]*\") (-|[0-9]*) (-|[0-9]*)(?: ([^ \"]*|\"[^\"]*\") ([^ \"]*|\"[^\"]*\"))?",
>>>>>>>>     >   "output.format.string" = "%1$s %2$s %3$s %4$s %5$s %6$s %7$s %8$s %9$s"
>>>>>>>>     > )
>>>>>>>>     > STORED AS TEXTFILE
>>>>>>>>     > LOCATION
>>>>>>>>     > '/user/flume/events/request_logs/ar1.vgs.mypoints.com/13-06-13/FlumeData.1371144648033';
>>>>>>>>
>>>>>>>> FAILED: Error in metadata:
>>>>>>>> MetaException(message:hdfs://h1.vgs.mypoints.com:8020/user/flume/events/request_logs/ar1.vgs.mypoints.com/13-06-13/FlumeData.1371144648033
>>>>>>>> is not a directory or unable to create one)
>>>>>>>> FAILED: Execution Error, return code 1 from
>>>>>>>> org.apache.hadoop.hive.ql.exec.DDLTask
>>>>>>>>
>>>>>>>> In HDFS: file exists
>>>>>>>>
>>>>>>>> hadoop fs -ls
>>>>>>>> /user/flume/events/request_logs/ar1.vgs.mypoints.com/13-06-13/FlumeData.1371144648033
>>>>>>>>
>>>>>>>> Found 1 items
>>>>>>>> -rw-r--r--   3 hdfs supergroup 2242037226 2013-06-13 11:14
>>>>>>>> /user/flume/events/request_logs/ar1.vgs.mypoints.com/13-06-13/FlumeData.1371144648033
>>>>>>>>
>>>>>>>> I've downloaded the serde2 jar file too, installed it as
>>>>>>>> /usr/lib/hive/lib/hive-json-serde-0.2.jar, and bounced all the
>>>>>>>> Hadoop services after that.
>>>>>>>>
>>>>>>>> I even added the jar file manually in Hive and ran the above SQL,
>>>>>>>> but it still fails.
>>>>>>>>
>>>>>>>> hive> add jar /usr/lib/hive/lib/hive-json-serde-0.2.jar;
>>>>>>>> Added /usr/lib/hive/lib/hive-json-serde-0.2.jar to class path
>>>>>>>> Added resource: /usr/lib/hive/lib/hive-json-serde-0.2.jar
>>>>>>>>
>>>>>>>> Any help would be highly appreciated.
>>>>>>>>
>>>>>>>> -Sanjeev
>>>>>>>>
>>>>>>>> --
>>>>>>>> Sanjeev Sagar
>>>>>>>>
>>>>>>>> "Separate yourself from everything that separates you from
>>>>>>>> others!" - Nirankari Baba Hardev Singh ji
>>>>>>>
>>>>>>> --
>>>>>>> Nitin Pawar

--
Nitin Pawar
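For readers landing on this thread later: a separate issue worth ruling out is whether the `input.regex` in the DDL actually matches the log lines (a non-matching regex is a common cause of the all-NULL rows Sanjeev mentioned, since RegexSerDe requires the whole line to match). Here is a quick, illustrative check of that same pattern in Python; the sample log line is made up for the example, not taken from the actual Flume data:

```python
import re

# The pattern from the CREATE TABLE statement, written as a raw Python
# string (the Hive DDL doubles the backslashes; the regex itself does not).
pattern = re.compile(
    r'([^ ]*) ([^ ]*) ([^ ]*) (-|\[[^\]]*\]) ([^ "]*|"[^"]*") '
    r'(-|[0-9]*) (-|[0-9]*)(?: ([^ "]*|"[^"]*") ([^ "]*|"[^"]*"))?'
)

# A made-up Apache combined-log line, purely for illustration.
line = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
        '"GET /index.html HTTP/1.0" 200 2326 '
        '"http://example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"')

# RegexSerDe requires the entire line to match, hence fullmatch here.
m = pattern.fullmatch(line)
host, identity, user, time, request, status, size, referer, agent = m.groups()
print(host, status, size)  # -> 127.0.0.1 200 2326
```

If `fullmatch` returns None for a real line from the file, every column in the corresponding Hive row comes back NULL, independently of the directory-vs-file location problem.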