Here's the code. If folks are interested, I can submit it as a patch as well.
Prasan Ary wrote:
Colin, is it possible for you to share some of the code with us? thx,
Prasan

Colin Evans [EMAIL PROTECTED] wrote: We ended up subclassing
TextInputFormat and adding
are I/O
bound, they will be able to read 100MB of data in just a few seconds at
most. Startup time for a hadoop job is typically 10 seconds or more.
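The quoted message is cut off, so the actual patch isn't shown in this thread. A hypothetical sketch of one common reason to subclass TextInputFormat in the old mapred API (an assumption on my part, not necessarily what Colin added): preventing Hadoop from splitting each input file, so that a whole file goes to a single map task.

```java
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.TextInputFormat;

// Illustrative only: make every input file a single split, so one mapper
// reads the whole file instead of Hadoop cutting it at block boundaries.
public class WholeFileTextInputFormat extends TextInputFormat {
    @Override
    protected boolean isSplitable(FileSystem fs, Path file) {
        // Returning false means each file becomes exactly one input split.
        return false;
    }
}
```

This ties into the split discussion below: by default FileInputFormat splits large files across mappers, which is why a job over one big file can still fan out to many tasks.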
On 4/4/08 12:58 PM, Prasan Ary wrote:
I have a question on how input files are split before they are given out to
Map functions.
Say I have
Anybody? Any thoughts on why this might be happening?
Here is what is happening directly from the ec2 screen. The ID and
Secret Key are the only things changed.
I'm running hadoop 0.15.3 from the public ami. I launched a 2-machine
cluster using the ec2 scripts in src/contrib/ec2/bin
Hi,
I am running hadoop 0.15.3 on 2 EC2 instances from a public ami
(ami-381df851). Our input files are on S3.
When I try to do a distcp for an input file from S3 onto HDFS on EC2, the
copy fails with an error that the file does not exist. However, if I run
copyToLocal from S3
That was a typo in my email. I do have s3:// in my command when it fails.
---
[EMAIL PROTECTED] wrote:
bin/hadoop distcp s3//:@/fileone.txt /somefolder_on_hdfs/fileone.txt :
Fails - Input source doesn't exist.
Should s3//... be s3://...?
Nicholas
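Nicholas's catch is the fix: the scheme is missing its colon, so Hadoop can't resolve the URI. A corrected form of the command might look like this (ID, SECRET, and BUCKET stand in for the credentials and bucket name that were scrubbed from the original post; the file paths are taken from it):

```shell
# Note "s3://" rather than "s3//" — the colon after the scheme is required.
bin/hadoop distcp s3://ID:SECRET@BUCKET/fileone.txt /somefolder_on_hdfs/fileone.txt
```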
and accessed from there.
--
Owen O'Malley [EMAIL PROTECTED] wrote:
On Mar 25, 2008, at 1:07 PM, Prasan Ary wrote:
I am running hadoop on EC2. I want to run a jar MR application on
EC2 such that input and output
I changed the configuration a little so that the MR jar file now runs on my
local hadoop cluster, but takes input files from S3.
I get the following output:
08/03/26 17:32:39 INFO mapred.FileInputFormat: Total input paths to process : 1
08/03/26 17:32:44 INFO mapred.JobClient: Running
I am running hadoop on EC2. I want to run a jar MR application on EC2 such that
input and output files are on S3.
I configured hadoop-site.xml so that the fs.default.name property points to my
s3 bucket with all required identification (e.g.
s3://ID:secret-key@bucket). I created an input
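For reference, the hadoop-site.xml fragment being described might look like the following (ID, SECRET, and BUCKET are placeholders for the poster's actual credentials and bucket name):

```xml
<!-- hadoop-site.xml: illustrative fragment, values are placeholders -->
<property>
  <name>fs.default.name</name>
  <value>s3://ID:SECRET@BUCKET</value>
</property>
```

One practical caveat worth knowing: S3 secret keys often contain a "/", which breaks this URI form; in that case the credentials are better supplied via separate configuration properties than embedded in the URI.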
I have two Map/Reduce jobs and both of them output a file each. Is there a way
I can give these output files names other than the default part- names?
thanks.
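The thread doesn't show an answer, but one approach in the old mapred API (assuming your Hadoop version ships it; check org.apache.hadoop.mapred.lib) is to subclass MultipleTextOutputFormat and override generateFileNameForKeyValue, which lets you rewrite the default leaf name:

```java
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat;

// Hypothetical sketch: replace the default "part-" prefix in output file
// names. Set this class as the job's output format.
public class RenamedOutputFormat extends MultipleTextOutputFormat<Text, Text> {
    @Override
    protected String generateFileNameForKeyValue(Text key, Text value, String name) {
        // "name" arrives as the default leaf name (e.g. "part-00000");
        // return whatever file name you want for this key/value instead.
        return name.replace("part", "myoutput");
    }
}
```

Each of the two jobs can use its own subclass (or its own prefix) so their outputs are distinguishable without post-hoc renaming.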
Hi All,
I am using Eclipse to write a map/reduce Java application that connects to
hadoop on a remote cluster.
Is there a way I can display intermediate results of map (or reduce), much
the same way as I would use System.out.println(variable_name) if I were
running any application on a
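The message is truncated, but the usual answer to this question is that prints from inside a task do work; they just land in the per-task stdout/stderr log files on the worker nodes (browsable through the JobTracker web UI), not in the Eclipse console, because the task runs remotely. A sketch in the old mapred API:

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Illustrative mapper: debugging output goes to the task's own logs,
// and status strings surface in the job's web UI.
public class DebugMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {
    public void map(LongWritable key, Text value,
                    OutputCollector<Text, Text> output, Reporter reporter)
            throws IOException {
        System.err.println("map saw: " + value);  // lands in the task's stderr log
        reporter.setStatus("processing offset " + key);  // visible in the web UI
        output.collect(new Text("debug"), value);
    }
}
```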