How do I load data from S3 into Hive using Amazon EMR? I've booted a small cluster, and I want to load a 3-column TSV file written by Pig into a table like this:

    create table from_to (from_address string, to_address string, dt string);

When I run something like this:

    load data inpath 's3n://rjurney_public_web/from_to_date' into table from_to;

I get this error:

    FAILED: Error in semantic analysis: Line 1:17 Invalid path 's3n://rjurney_public_web/from_to_date': only "file" or "hdfs" file systems accepted. s3n file system is not supported.

There is no distcp on the master node of my EMR cluster, so I can't copy the file over to HDFS. I've read the documentation... and so far, after a day of trying, I still can't load data into Hive via EMR. What am I missing?

Thanks!

--
Russell Jurney twitter.com/rjurney datasyndrome.com
