Ok, I spoke too soon. Same error. Crapola. Still working on it.

On Tue, May 29, 2012 at 2:19 PM, Russell Jurney <[email protected]> wrote:
> I get an error when I create an external table. btw - I can partition on
> dt or from/to address. I'm just not clear on how to partition - my efforts
> fail.
>
> hive> create external table from_to(from_address string, to_address string, dt string)
>     > row format delimited fields terminated by '\t' stored as textfile location 's3n://rjurney_public_web/from_to_date';
> FAILED: Error in metadata: java.lang.IllegalArgumentException: Invalid hostname in URI s3n://rjurney_public_web/from_to_date
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
>
> However, I just upgraded to HIVE 0.9, and it works :) No reason to use
> the old stuff when I can scp the new one up.
>
> Thanks!
>
> On Tue, May 29, 2012 at 1:34 PM, Balaji Rao <[email protected]> wrote:
>
>> If you are using hive on EMR, you can create a table directly from the
>> data on S3:
>>
>> From hive, you can create tables that use S3 data like this:
>>
>> create external table from_to(from_address string, to_address string,
>> dt string) row format delimited fields terminated by '\t' stored as
>> textfile location 's3://rjurney_public_web/from_to_date';
>>
>> You could then:
>> select * from from_to
>>
>> Balaji
>>
>> On Tue, May 29, 2012 at 4:20 PM, Russell Jurney
>> <[email protected]> wrote:
>> > How do I load data from S3 into Hive using Amazon EMR? I've booted a small
>> > cluster, and I want to load a 3-column TSV file from Pig into a table like
>> > this:
>> >
>> > create table from_to (from_address string, to_address string, dt string);
>> >
>> > When I run something like this:
>> >
>> > load data inpath 's3n://rjurney_public_web/from_to_date' into table from_to;
>> >
>> > I get errors:
>> >
>> > FAILED: Error in semantic analysis: Line 1:17 Invalid path 's3n://rjurney_public_web/from_to_date': only "file" or "hdfs" file systems accepted. s3n file system is not supported.
>> >
>> > There is no distcp on the master node of my EMR cluster, so I can't copy it
>> > over. I've read the documentation... and so far after a day of trying, I
>> > can't load data into HIVE via EMR.
>> >
>> > What am I missing? Thanks!
>> > --
>> > Russell Jurney twitter.com/rjurney [email protected] datasyndrome.com
>
> --
> Russell Jurney twitter.com/rjurney [email protected] datasyndrome.com

--
Russell Jurney twitter.com/rjurney [email protected] datasyndrome.com
