Hi, I had to post this question to this list because I feel there might be a bug here.
I'm having problems with Hive on EC2 reading files on S3 that were written by other tools.

I have a lot of files and folders on S3 created by s3cmd and used by Elastic MapReduce (Hive), and they work interchangeably: files created by Hive on EMR can be read by s3cmd and vice versa. However, I'm having problems with Hive/Hadoop running on EC2. Both Hive 0.7 and 0.8 seem to create an additional "/" folder on S3.

For example, if I have a file s3://bucket/path/00000 created by s3cmd or by Hive on EMR and I try to create an external table from Hive on EC2:

    create external table wc(site string, cnt int)
    row format delimited fields terminated by '\t'
    stored as textfile
    location 's3://bucket/path';

the table does not see the EMR-created S3 objects; instead a new "/" folder appears in the bucket, i.e. <bucket> / "/" / path.

When I look at the debug output, Hive seems to be sending an extra "/" when creating the table. Here is a debug message; if you look at the path, there is both a "/" and a "%2F". Is this perhaps a bug in the code?

    hive> create external table wc(site string, cnt int) .... location 's3://masked/wcoverlay/';

    <StringToSign>GETWed, 07 Mar 2012 18:26:03 GMT/masked/%2Fwcoverlay</StringToSign><AWSAccessKeyId>.....

Am I missing something?

Thanks,
Balaji
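
P.S. To illustrate what I suspect is happening (this is only a sketch of the path handling, not Hive's or the S3 filesystem's actual code, and the variable names are made up): if the object key is derived from the table location with a leading "/" left on it, URL-encoding that key would produce exactly the "%2F" seen in the StringToSign above:

    # Hypothetical sketch of the suspected path handling -- not Hive source code
    from urllib.parse import quote

    bucket = "masked"
    location = "s3://masked/wcoverlay/"   # table location with a trailing slash

    # Naively stripping "s3://<bucket>" leaves a leading "/" on the object key
    key = location[len("s3://" + bucket):].rstrip("/")   # -> "/wcoverlay"

    # Building the request path as /<bucket>/<encoded key> then percent-encodes
    # that leading slash, matching the StringToSign in the debug message
    print("/" + bucket + "/" + quote(key, safe=""))      # -> /masked/%2Fwcoverlay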