Hello,

I'm getting only NULLs when trying to read data from an external table for 
which the data was staged. I'm using Cloudera's hive-0.4.0+14.tar.gz with 
hadoop-0.20.1+152.tar.gz on a Centos machine.

My steps were:

# CREATE TABLE

CREATE EXTERNAL TABLE mailog_stg(month STRING, day STRING, time STRING, host 
STRING, logline STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "^([^ ]+) ([^ ]+) ([^ ]+) ([^ ]+) (.+)$",
  "output.format.string" = "%1$s %2$s %3$s %4$s %5$s"
)
STORED AS TEXTFILE
LOCATION '/user/data/staging/mailog';

# COPY DATA FILE

hadoop dfs -put /ken/sf45-cc2-maillogs/mail.20091102  /user/data/staging/mailog

# DO A SELECT

hive> select * from mailog_stg;
OK
NULL    NULL    NULL    NULL    NULL
NULL    NULL    NULL    NULL    NULL
NULL    NULL    NULL    NULL    NULL
NULL    NULL    NULL    NULL    NULL

I verified my data is out there:

[r...@jims-desktop ~]#  hadoop dfs -cat /user/data/staging/mailog/m*|wc
      4      77    1016

And in fact when I used LOAD DATA INPATH with a table defined the same way but 
without making it external, it worked fine.

Any suggestions how to debug this?

Thanks
Ken


Reply via email to