Hi Hive User,
I am running Hive 0.7 and Hadoop 0.20.2 on a 12-node EC2 cluster.
Cloud used: Amazon AWS.
The Hadoop and Hive distributions are the Apache releases.
I have created a Hive external table as follows:

create external table udr (
  time string,
  sessionid string,
  clientid string,
  url string,
  success string,
  originalresponsesize double,
  finalresponsesize double,
  processingtime bigint
)
row format delimited fields terminated by ','
LOCATION 's3n://genpactdemo/direct/';

and I have stored some files in the genpactdemo bucket, under the direct
folder.
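To make the expected row layout concrete, here is a minimal sketch (in Python, with invented sample values; only the column names, types, and the comma delimiter come from the table definition above) of how one delimited line maps onto the columns:

```python
# One hypothetical comma-delimited line; every value below is made up
# purely for illustration of the udr table's layout.
line = "2012-06-14 13:43:24,sess-1,client-9,http://example.com/a,true,1024.0,512.0,37"

columns = ["time", "sessionid", "clientid", "url", "success",
           "originalresponsesize", "finalresponsesize", "processingtime"]
fields = line.split(",")  # ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
row = dict(zip(columns, fields))

# Hive would read the last three columns as double/double/bigint;
# the rest stay strings.
record = {
    **row,
    "originalresponsesize": float(row["originalresponsesize"]),
    "finalresponsesize": float(row["finalresponsesize"]),
    "processingtime": int(row["processingtime"]),
}
```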
But when I run queries from the Hive CLI, they fail with:

Failed with exception java.io.IOException: java.lang.NullPointerException

The stack trace is as follows:
2012-06-14 13:43:24,483 WARN httpclient.RestS3Service (RestS3Service.java:performRequest(393)) - Response '/direct_%24folder%24' - Unexpected response code 404, expected 200
2012-06-14 13:43:24,624 WARN httpclient.RestS3Service (RestS3Service.java:performRequest(393)) - Response '/direct' - Unexpected response code 404, expected 200
2012-06-14 13:43:24,708 WARN httpclient.RestS3Service (RestS3Service.java:performRequest(393)) - Response '/direct' - Unexpected response code 404, expected 200
2012-06-14 13:43:24,724 WARN httpclient.RestS3Service (RestS3Service.java:performRequest(393)) - Response '/direct_%24folder%24' - Unexpected response code 404, expected 200
2012-06-14 13:43:24,778 WARN httpclient.RestS3Service (RestS3Service.java:performRequest(393)) - Response '/direct' - Unexpected response code 404, expected 200
2012-06-14 13:43:24,779 WARN httpclient.RestS3Service (RestS3Service.java:performRequest(402)) - Response '/direct' - Received error response with XML message
2012-06-14 13:43:24,780 ERROR CliDriver (SessionState.java:printError(343)) - Failed with exception java.io.IOException:java.lang.NullPointerException
java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:341)
    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:133)
    at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1114)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:106)
    at java.io.BufferedInputStream.close(BufferedInputStream.java:451)
    at java.io.FilterInputStream.close(FilterInputStream.java:155)
    at org.apache.hadoop.util.LineReader.close(LineReader.java:83)
    at org.apache.hadoop.mapred.LineRecordReader.close(LineRecordReader.java:171)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:336)
    ... 10 more
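Incidentally, decoding the URL-encoded path from the warnings shows which key the S3 layer is actually probing (a quick check using standard percent-decoding):

```python
from urllib.parse import unquote

# '%24' is the percent-encoding of '$', so the 404s above are for a
# '/direct_$folder$' key -- the "_$folder$" suffix is a common empty-folder
# marker convention among S3 tools.
path = unquote("/direct_%24folder%24")
print(path)  # -> /direct_$folder$
```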
Creating the table itself produces no errors on the Hive CLI, but I do
see the WARN and 404 messages in the logs. Is the problem in creating
the table itself, or in accessing the data in the folder during the
query?
Note: if I store the flat files directly in the bucket (not under a
directory inside the bucket), all queries work fine.
Could you please tell me the correct way to point a Hive table at a set
of files inside a directory of a bucket?
--
Thanks
With Regards
Chandan