Ravi:
Currently there's no way to avoid the map stage and the sort and
shuffle that come with it. The only real option is to have an
identity mapper that passes the keys/values through, as you're doing
now.
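A rough sketch of that, assuming the new (org.apache.hadoop.mapreduce)
API, where the stock Mapper is already an identity mapper;
ReduceOnlyJob, MyPartitioner, and MyReducer are stand-ins for your own
classes:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class ReduceOnlyJob {
        public static void main(String[] args) throws Exception {
            Job job = new Job(new Configuration(), "reduce-only");
            job.setJarByClass(ReduceOnlyJob.class);
            // The base Mapper is an identity map: it emits its input unchanged.
            job.setMapperClass(Mapper.class);
            job.setPartitionerClass(MyPartitioner.class); // your partitioner
            job.setReducerClass(MyReducer.class);         // your reducer
            job.setOutputKeyClass(Text.class);   // adjust to your actual
            job.setOutputValueClass(Text.class); // key/value types
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }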
On Fri, Jul 9, 2010 at 4:07 PM, Chinni, Ravi wrote:
I am trying to develop a MR application. Due to the kind of application
I am trying to develop, the mapper is a dummy task (it passes its input
straight to its output) and I am only interested in having a partitioner
and reducer.
The MR framework allows us to set the number of reducers to 0. Is there
a way to similarly skip the map stage?
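(For reference, the reducer-count switch mentioned above is a one-liner
on the job object; with zero reducers the map output goes straight to
the output format and the sort/shuffle is skipped entirely. A sketch,
assuming the new-API Job:)

    job.setNumReduceTasks(0); // map-only job: no partitioner, sort/shuffle, or reducer runs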
Hi Alan,
You don't need to do this complex trickery if you write to a
SequenceFile. How do you create the SequenceFile? In your case it might
make sense to create a SequenceFile where the first object (the key) is
the file name or complete path and the second (the value) is the content.
Then you just call:
pr
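(For context, a rough sketch of what the writer side might look like,
assuming the 0.20-era SequenceFile.createWriter signature and local
input files; PackFiles and packed.seq are made-up names:)

    import java.io.DataInputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    public class PackFiles {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            SequenceFile.Writer writer = SequenceFile.createWriter(
                    fs, conf, new Path("packed.seq"), Text.class, Text.class);
            for (File f : new File(args[0]).listFiles()) {
                byte[] buf = new byte[(int) f.length()];
                DataInputStream in = new DataInputStream(new FileInputStream(f));
                in.readFully(buf);
                in.close();
                // key = full path, value = file contents
                writer.append(new Text(f.getAbsolutePath()),
                              new Text(new String(buf, "UTF-8")));
            }
            writer.close();
        }
    }

On the job side you would then read it back with
SequenceFileInputFormat, so each map task processes many
(path, contents) records instead of a single small file.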
Hi Alex,
My original files are ASCII text. I was using
and everything worked fine.
Because my files are small (just over 2MB on avg.) I get one map task
per file.
For my test I had 2000 files, totalling 5GB, and the whole run took
approx 40 minutes.
I read that I could improve performance by merging the small files into
larger ones.
Hi All,
I am facing a hard problem. I am running a MapReduce job using
streaming, but it fails with the following error:

Caught: java.lang.OutOfMemoryError: Java heap space
        at Nodemapper5.parseXML(Nodemapper5.groovy:25)
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): su
Today is your last chance to submit a CFP abstract for the 2010 Surge
Scalability Conference. The event is taking place on Sept 30 and Oct 1,
2010 in Baltimore, MD. Surge focuses on case studies that address
production failures and the re-engineering efforts that led to victory
in Web Application
Did you check the task tracker log and the log from your reducer to see
if anything was wrong?
Please also capture jstack output so that we can help you diagnose.
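(For example, assuming you can log in to the node running the stuck
task, the standard JDK tools are enough to grab the dump; the PID and
file name below are placeholders:)

    jps -l                          # list JVMs to find the child task's PID
    jstack <PID> > task-stack.txt   # write all thread stacks to a file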
On Friday, July 9, 2010, bmdevelopment wrote:
Hi, I updated to the version here:
http://github.com/kevinweil/hadoop-lzo
However, when I use lzop for intermediate compression I am still having
trouble: the reduce phase now freezes at 99% and eventually fails.
It's no immediate problem, because I can use the default codec, but it
may be of concern to others.
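(For context, enabling LZO for intermediate map output on 0.20-era
Hadoop is usually configured along these lines; a sketch assuming the
codec class shipped in the kevinweil/hadoop-lzo build:)

    // In the job driver, before submission (fully-qualified names for clarity):
    org.apache.hadoop.conf.Configuration conf =
            new org.apache.hadoop.conf.Configuration();
    // Compress the intermediate (map output) data with LZO.
    conf.setBoolean("mapred.compress.map.output", true);
    conf.setClass("mapred.map.output.compression.codec",
                  com.hadoop.compression.lzo.LzoCodec.class,
                  org.apache.hadoop.io.compress.CompressionCodec.class);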