I have a MapReduce job that reads files from Amazon S3 and processes the data
on local HDFS. The files are gzip-compressed text files (.gz). I tried to set
up the job as shown below, but it does not work. Does anyone know what might be
wrong? Do I need to add an extra step to unzip the files first? Thanks.


String S3_LOCATION = "s3n://access_key:private_key@bucket_name";

protected void prepareHadoopJob() throws Exception {

    this.getHadoopJob().setMapperClass(Mapper1.class);
    this.getHadoopJob().setInputFormatClass(TextInputFormat.class);

    FileInputFormat.addInputPath(this.getHadoopJob(), new Path(S3_LOCATION));

    this.getHadoopJob().setNumReduceTasks(0);
    this.getHadoopJob().setOutputFormatClass(TableOutputFormat.class);
    this.getHadoopJob().getConfiguration().set(TableOutputFormat.OUTPUT_TABLE,
            myTable.getTableName());
    this.getHadoopJob().setOutputKeyClass(ImmutableBytesWritable.class);
    this.getHadoopJob().setOutputValueClass(Put.class);
}
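
To rule out a problem with the .gz files themselves, independent of the job setup above, a quick standalone check might help. This is only a sketch using java.util.zip from the JDK. The class and method names (GzipLineCheck, readGzipLines, gzip) are made up for illustration, and it reads an in-memory sample rather than anything on S3.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of the MapReduce job.
public class GzipLineCheck {

    // Decompress a gzip stream and return its content as text lines.
    public static List<String> readGzipLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new GZIPInputStream(in), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    // Build a small gzip-compressed sample in memory (stand-in for a real file).
    public static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = gzip("row1\nrow2\n");
        for (String line : readGzipLines(new ByteArrayInputStream(sample))) {
            System.out.println(line);
        }
    }
}
```

Passing a FileInputStream for one downloaded .gz file to readGzipLines instead of the in-memory sample would show whether that file decompresses to the expected text lines.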



Dan Yi | Software Engineer, Analytics Engineering
Medio Systems Inc | 701 Pike St. #1500 Seattle, WA 98101
Predictive Analytics for a Connected World