Re: hadoop1.2.1 speedup model

2013-09-09 Thread Robert Evans
How many times did you run the experiment at each setting? What is the standard deviation for each of these settings? It could be that you are simply running into the error bounds of Hadoop. Hadoop is far from consistent in its performance. For our benchmarking we typically will run the test 5
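A minimal sketch of the check described above, assuming wall-clock job times have been recorded by hand for several runs at each setting. The setting names and timings are made-up placeholders, not measurements from this thread, and `summarize_runs` and `intervals_overlap` are hypothetical helper names.

```python
import statistics


def summarize_runs(times_by_setting):
    """Return {setting: (mean, sample standard deviation)} for repeated runs."""
    summary = {}
    for setting, times in times_by_setting.items():
        if len(times) < 2:
            # A standard deviation needs at least two samples.
            raise ValueError(f"need at least 2 runs for setting {setting!r}")
        summary[setting] = (statistics.mean(times), statistics.stdev(times))
    return summary


def intervals_overlap(a, b):
    """True if mean +/- one standard deviation of a and b overlap."""
    (mean_a, sd_a), (mean_b, sd_b) = a, b
    return mean_a - sd_a <= mean_b + sd_b and mean_b - sd_b <= mean_a + sd_a


# Hypothetical job times in seconds for two cluster sizes.
runs = {
    "4 nodes": [100.0, 110.0, 90.0],
    "8 nodes": [95.0, 105.0, 100.0],
}
summary = summarize_runs(runs)
for setting, (mean, sd) in summary.items():
    print(f"{setting}: mean={mean:.1f}s stdev={sd:.1f}s")
```

If the intervals for two settings overlap, the difference between them may be noise rather than a real speedup, which is the concern raised in the message.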

Re: [VOTE] Release Apache Hadoop 0.23.9

2013-07-02 Thread Robert Evans
+1 downloaded the release. Ran a couple of simple jobs and everything worked. On 7/1/13 12:20 PM, "Thomas Graves" wrote: >I've created a release candidate (RC0) for hadoop-0.23.9 that I would like >to release. > >The RC is available at: >http://people.apache.org/~tgraves/hadoop-0.23.9-candidate

Re: mapred.child.ulimit in MR2

2013-06-19 Thread Robert Evans
Sandy, I think it was something that was missed in the port to YARN and the dead code was cleaned up as part of HADOOP-8288. If you have a use case for it or are worried about backwards compatibility we can add it back in. It is not that hard; all it did was add 'ulimit -v ' to the shell script t

Re: InputFormat to regroup splits of underlying InputFormat to control number of map tasks

2013-06-19 Thread Robert Evans
This sounds similar to MultiFileInputFormat http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MultiFileInputFormat.java?revision=1239482&view=markup It would be nice if you could

Re: Visual debugging tools for hadoop

2013-06-18 Thread Robert Evans
Yes data flow visualizations definitely sound like something that would be good for Ambari. If you are interested in debugging Hadoop jobs there is also the Hadoop Development Tools project http://incubator.apache.org/projects/hdt.html It is taking the Eclipse plugin for Hadoop and really improv

Re: [VOTE] Release Apache Hadoop 0.23.8

2013-05-30 Thread Robert Evans
+1 Downloaded the release and ran a few basic tests. --Bobby On 5/28/13 11:00 AM, "Thomas Graves" wrote: > >I've created a release candidate (RC0) for hadoop-0.23.8 that I would like >to release. > >This release is a sustaining release with several important bug fixes in >it. The most critica

Re: [VOTE] Plan to create release candidate for 0.23.8

2013-05-20 Thread Robert Evans
+1 On 5/17/13 4:10 PM, "Thomas Graves" wrote: >Hello all, > >We've had a few critical issues come up in 0.23.7 that I think warrant a >0.23.8 release. The main one is MAPREDUCE-5211. There are a couple of >other issues that I want finished up and get in before we spin it. Those >include HDFS-

Re: JVM vs container memory configs

2013-05-03 Thread Robert Evans
For us we typically leave a 500MB difference between the heap and the container size. I think we can make this smaller, but we have not really tried. --Bobby On 5/3/13 11:20 AM, "Karthik Kambatla" wrote: >Hi > >While looking into MAPREDUCE-5207 (adding defaults for >mapreduce.{map|reduce}.memo
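A small sketch of the arithmetic behind that rule of thumb, assuming the container size is given in megabytes and the JVM heap is set with the standard `-Xmx` flag. The 500 MB headroom is the figure from the message above; `heap_opts_for_container` is a hypothetical helper, not a Hadoop API.

```python
def heap_opts_for_container(container_mb, headroom_mb=500):
    """Return a JVM heap flag that leaves headroom_mb below the container size."""
    if container_mb <= headroom_mb:
        raise ValueError("container must be larger than the headroom")
    return f"-Xmx{container_mb - headroom_mb}m"


# Example: a 2048 MB container leaves 1548 MB for the heap.
print(heap_opts_for_container(2048))
```

The headroom covers memory the JVM uses outside the heap, so that the whole process stays under the container limit.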

Re: Heads up - 2.0.5-beta

2013-05-03 Thread Robert Evans
I agree that "destructive" is not the correct word to describe features like snapshots and windows support. However, I also agree with Konstantin that any large feature will have a destabilizing effect on the code base, even if it is done on a branch and thoroughly tested before being merged in. H

Re: Versions - Confusion

2013-04-26 Thread Robert Evans
It is kind of complex. Up until 0.20 everything was fairly regular like you would expect. In 0.20 there was a split where security was added in to a branch and started to be numbered as 0.20.20X. But the other releases went on without security and became 0.21 and 0.22. 0.23 was created when YAR

Re: [VOTE] Release Apache Hadoop 2.0.4-alpha

2013-04-17 Thread Robert Evans
+1 (binding) Downloaded the tar ball and ran some simple jobs. --Bobby Evans On 4/17/13 2:01 PM, "Siddharth Seth" wrote: >+1 (binding) >Verified checksums and signatures. >Built from the source tar, deployed a single node cluster and tested a >couple of simple MR jobs. > >- Sid > > >On Fri, Ap

Re: [VOTE] Release Apache Hadoop 0.23.7

2013-04-16 Thread Robert Evans
+1 (binding) I downloaded the release and ran a few sanity tests on it. --Bobby On 4/11/13 2:55 PM, "Thomas Graves" wrote: >I've created a release candidate (RC0) for hadoop-0.23.7 that I would like >to release. > >This release is a sustaining release with several important bug fixes in >it. >

Re: Help on submitting a patch for an unassigned bug

2013-03-26 Thread Robert Evans
Also be aware that sometimes committers don't notice that a patch is not in patch available, so if you need a review and no one has started reviewing it, please send an e-mail to the dev list and we will do our best to take a look at it. --Bobby On 3/26/13 5:28 AM, "Harsh J" wrote: >Hi Niranjan

Re: [Vote] Merge branch-trunk-win to trunk

2013-02-28 Thread Robert Evans
out is an interesting one >>-- >> ie the idea that we would not merge windows support to trunk, but rather >> treat is as a "parallel code line" which lives in the ASF and has its >>own >> builds and releases. The windows team would periodically merge >>tru

Re: [Vote] Merge branch-trunk-win to trunk

2013-02-27 Thread Robert Evans
After this is merged in is Windows still going to be a second class citizen but happens to work for more than just development or is it a fully supported platform where if something breaks it can block a release? How do we as a community intend to keep Windows support from breaking? We don't have

Re: tests in mapreduce.lib excluded in jenkins?

2013-02-26 Thread Robert Evans
All of the pre-commit builds only run tests for the projects that had changes. This is a known issue, but was done because the pre-commit builds were taking a very long time. There have been a few proposals to improve the situation, like having any change in map/reduce run all of the map/reduce t

timeout is now requested to be on all tests

2013-02-20 Thread Robert Evans
Sorry about cross posting, but this will impact all developers and I wanted to give you all a heads-up. HADOOP-9112 was just checked in. This means that the pre commit build will now give a -1 for any patch with junit tests that do not include

Re: Doubt about map reduce version 2

2013-02-08 Thread Robert Evans
Suresh, The 1.0 line is still the stable line and improvements there can have a large impact on existing users. That being said I think there will be a lot of movement to Yarn/MRv2 starting in the second half of this year and all of next year. Also YARN scheduling is a larger area for study beca

Re: [VOTE] Release hadoop-2.0.3-alpha

2013-02-07 Thread Robert Evans
I downloaded the binary package and ran a few example jobs on a 3 node cluster. Everything seems to be working OK on it, I did see WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable For every shell command, but just l

Re: One output file per node

2012-12-13 Thread Robert Evans
Tejay, The way the scheduler works you are not guaranteed to get one reducer per node. Reducers are not scheduled based off of locality of any kind, and even if they were the scheduler typically treats rack local the same as node local. The partitioner interface only allows you to say what numer

Re: Shuffle phase: fine-grained control of data flow

2012-11-08 Thread Robert Evans
tleneck, because the time consumed in >disk seeks outnumbers that in data transmission. If map outputs fit in >memory, then network must be taken seriously. Also note that for evenly >distributed map outputs, current scheduling policy works just fine. > >Jiwei > > >On Wed, Nov

Re: Shuffle phase: fine-grained control of data flow

2012-11-07 Thread Robert Evans
Jiwei, I think you could use that knowledge to launch reducers closer to the map output, but I am not sure that it would make much difference. It may even slow things down. It is a question of several things 1) Can we get enough map tasks close to one another that it will make a difference? 2) D

Re: division by zero in getLocalPathForWrite()

2012-10-25 Thread Robert Evans
It looks like you are running with an older version of 2.0, though it does not really make much of a difference in this case. The issue shows up when getLocalPathForWrite thinks there is no space to write to on any of the disks it has configured. This could be because you do not have any

Re: pluggable resources

2012-10-22 Thread Robert Evans
I agree that having it be pluggable opens up a lot of new possibilities. +1 for the idea. Although I think in the short term we are having enough problems as it is with just CPU and memory that it may be a little while before we get to a pluggable solution. Once YARN-2 goes in, if you can get an

Re: Can some committer commit MAPREDUCE-4479?

2012-10-19 Thread Robert Evans
Looking at it now. Thanks for being a squeaky wheel :). --Bobby On 10/19/12 12:38 PM, "Mariappan Asokan" wrote: >This is a minor test update I made a while back. > >Thanks.

Re: API Design: getClusterMetrics

2012-10-10 Thread Robert Evans
Sapsi, I am not positive on this but I think the reason for this is to future proof the API. If in the future we want to add in new optional parameters, I believe that it is impossible in protocol buffers without having that request object. I could be wrong; I am not an expert on PB. --Bobby Evan

Re: Fix versions for commits branch-0.23

2012-10-09 Thread Robert Evans
I don't see much of a reason to have the same JIRA listed under both 0.23 and 2.0. I can see some advantage of being able to see what went into 0.23.X by looking at a 2.0.X CHANGES.txt, but unless the two are released at exactly the same time they will be out of date with each other in the best ca

Re: Commits breaking compilation of MR 'classic' tests

2012-09-26 Thread Robert Evans
MR 'classic' tests Fair, however there are still tests which need to be ported over. We can remove them after the port. On Sep 26, 2012, at 9:54 AM, Robert Evans wrote: As per my comment on the bug. I thought we were going to remove them. MAPREDUCE-4266 only needs a little bit more wo

Re: Commits breaking compilation of MR 'classic' tests

2012-09-26 Thread Robert Evans
As per my comment on the bug. I thought we were going to remove them. MAPREDUCE-4266 only needs a little bit more work, change a patch to a script, before they disappear entirely. I would much rather see dead code die than be maintained for a few tests that are mostly testing the dead code itself

Re: Speculative Execution...

2012-09-13 Thread Robert Evans
Under YARN (branch-2, branch-0.23, and trunk) the speculative execution decision is pluggable, and can be replaced by a user. If you could come up with a better solution to speculative execution that would be great. We have known for a while that it is not very good (most of the time we run a spec

Re: On the topic of task scheduling

2012-09-04 Thread Robert Evans
ly AM would also adjust its timing of requests as well so both work together for a common goal. --Bobby Evans On 9/4/12 8:59 AM, "Vasco Visser" wrote: >On Tue, Sep 4, 2012 at 3:11 PM, Robert Evans wrote: >> The other thing to point out too is that in order to solve this

Re: On the topic of task scheduling

2012-09-04 Thread Robert Evans
The other thing to point out too is that in order to solve this problem perfectly you literally have to solve the halting problem. You have to predict if the maps are going to finish quickly or slowly. If they finish quickly then you want to launch reduces quickly to start fetching data from the m

Re: Cannot create a new Jira issue for MapReduce

2012-08-09 Thread Robert Evans
It is a bit worse than that though. I found that it did create the JIRA, but it is in a bad state where you cannot put it in patch available or close it. So we may need to do some cleanup of these JIRAs later. --Bobby On 8/9/12 3:19 PM, "Ted Yu" wrote: >This has been reported by HBase develope

Re: Multi-level aggregation with combining the result of maps per node/rack

2012-07-31 Thread Robert Evans
Tsuyoshi, There has been a lot of work happening in the shuffle phase. It is being made pluggable in both 1.0 and 2.0/trunk (MAPREDUCE-4049). There is also some work being done to reuse containers in trunk/2.0 (MAPREDUCE-3902). This should have a similar, although perhaps more limited result, b

Re: Can we use String.intern inside WritableUtils#readString()?

2012-07-13 Thread Robert Evans
Yes, I filed a JIRA for something like this a while ago, MAPREDUCE-4303. I have not done anything with it for this very reason. There are some potential fixes for this: we could keep a somewhat small weak reference cache of these strings so that if a string is read multiple times it is deduplicated and

Re: Cyclic dependency in JobControl job DAG

2012-06-25 Thread Robert Evans
I personally think it is useful. I would say contribute it. (Moved common-dev to bcc, we try not to cross post on these lists) --Bobby Evans On 6/25/12 3:37 AM, "madhu phatak" wrote: Hi, In current implementation of JobControl, whenever there is a cyclic dependency between the jobs it throws

Re: try to fix hadoop streaming bug

2012-06-14 Thread Robert Evans
It looks like your jar's MANIFEST file is missing the Main Class attribute. It may have something to do with how you created the updated jar you are using. Hadoop is trying to run the jar, and because it did not find the MainClass in the jar's manifest it thinks you are supplying it as the nex

Re: Hadoop optimization for Lustre FS

2012-05-16 Thread Robert Evans
Zam, http://wiki.apache.org/hadoop/HowToContribute is a wiki that can tell you in more detail the steps you need to do for this. In general though to push the patch upstream you want to file a Map/Reduce JIRA, and attach your patch. After that several people from the community are likely to co

Re: Building first time

2012-05-09 Thread Robert Evans
http://wiki.apache.org/hadoop/HowToContribute is the best place to start. Checking the code in through git will not trigger a jenkins build, unless you have a special setup that goes beyond what Apache provides. You do not need to compile the entire tree to get Map/Reduce, but typically it is not a

Hadoop 3 precommit build issues

2012-04-13 Thread Robert Evans
I did a very non-scientific study and it looks like hadoop 3 is having issues with running the precommit build. It looks like some processes never died and are preventing the mini clusters from binding to needed sockets. Could someone please take a look. -1 core tests. The patch failed t

Re: [RESULT] - [VOTE] Rename hadoop branches post hadoop-1.x

2012-04-12 Thread Robert Evans
Steve, Todd is correct, we are running two yarn trains here at Yahoo. We are trying to stabilize 0.23 and get it pushed out to production, while also working on stabilizing branch-2. Once branch-2 truly stabilizes we will switch over to it and retire branch-0.23. We may call for a vote on a

Mixed Mode Environments

2012-02-02 Thread Robert Evans
I just noticed that HADOOP-7484 and MAPREDUCE-3500 recently got committed to trunk and 0.23. I missed them before they were committed. I am curious if we are dropping support for running Hadoop in mixed mode environments? Meaning I want Hadoop to run as 32-bit by default, because that is faster t

Re: Status of the completed containers (0.23)

2012-01-09 Thread Robert Evans
Praveen, Looking at the code, it does not appear to currently be used outside of testing. I really don't know. Perhaps in the future if it is extended then it might be used more. Or perhaps the author of the API added it in for completeness. Just speculating. --Bobby Evans On 1/9/12 7:42 A

Re: Reduce output is strange

2011-12-19 Thread Robert Evans
Oh I forgot to say that part of the Random Characters are actually random characters. Sequence files store a set of random characters as synch points within the file. This allows for splitting the file easily without a high risk that the random sequence appears inside the data itself just by c
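The idea of sync points can be illustrated with a deliberately simplified sketch. It is not the real SequenceFile layout: it only assumes that a fixed random marker (16 bytes here, an assumption of this sketch) is written between records, so that a reader starting at an arbitrary split offset can scan forward to the next marker and begin on a record boundary. `next_sync_offset` is a hypothetical helper.

```python
import os

SYNC_SIZE = 16  # marker length assumed for this sketch


def next_sync_offset(data, marker, start):
    """Return the offset just past the first marker found at or after start.

    Returns None when no marker occurs at or after start.
    """
    pos = data.find(marker, start)
    return None if pos == -1 else pos + len(marker)


# A random marker is very unlikely to appear inside the record data by chance.
marker = os.urandom(SYNC_SIZE)
blob = b"record-1" + marker + b"record-2" + marker + b"record-3"

# A reader handed a split starting in the middle of the data skips ahead
# to the first record boundary after its start offset.
offset = next_sync_offset(blob, marker, 10)
```

This is why the "random characters" show up when a sequence file is viewed as text: they are markers, not corrupted data.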

Re: Reduce output is strange

2011-12-19 Thread Robert Evans
It looks mostly correct to me. I am not an expert on sequence files, and I have not checked the text against the spec nor have I checked the binary numbers in it to be sure they add up to the correct lengths etc, but it looks good from a first glance. I can see the SEQ tag at the beginning to

Re: Multiple resource requests for a given node (or all nodes)?

2011-12-13 Thread Robert Evans
Arun, I am saying that I don't know what the correct solution is to updating the scheduler interface. Perhaps the correct solution is no change, I have not taken the time to think about it much. What I am saying is that there are a number of new features that are likely going to be going into

Re: Multiple resource requests for a given node (or all nodes)?

2011-12-12 Thread Robert Evans
I think there may be some need for a bigger redesign in how requests are made to the scheduler because the only use case really was map/reduce at the time it was designed. It works very well for that purpose but has missed a few other use cases. For example there could be something like HBase

Re: mvn eclipse:eclipse failing

2011-12-12 Thread Robert Evans
Ahmed, It is a known issue HDFS-2649 --Bobby Evans On 12/12/11 4:29 AM, "Ahmed Radwan" wrote: After checking the latest trunk and trying: mvn eclipse:eclipse I am getting the following error: [ERROR] Failed to execute goal org.apache.maven.plugins:maven-eclipse-plugin:2.8:eclipse (default-cl

Re: Incremental builds in 0.23 using Maven

2011-12-07 Thread Robert Evans
Praveen, One thing to be aware of with removing the clean is that I have run into situations, in both hadoop and in other projects, where an API changed as part of the update or something and maven did not realize it and did not rebuild something that depended on it. I then got a runtime error

Re: Automatically Documenting Apache Hadoop Configuration

2011-12-05 Thread Robert Evans
From my work on yarn trying to document the configs there and to standardize them, writing anything that is going to automatically detect config values through static analysis is going to be very difficult. This is because most of the configs in yarn are now built up using static string con

Re: Start Nodemanager with webapp disabled.

2011-10-05 Thread Robert Evans
The simplest way is to use ephemeral ports. Set the port number to 0 in the config and the node manager will pick a free port to listen on. It will then heartbeat back into the Resource Manager with the port it is listening on and the RM can pass that info off to whoever else needs it. I am n
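The ephemeral-port mechanism itself is an operating system feature and can be shown outside Hadoop. This is a minimal sketch using plain sockets: binding to port 0 asks the OS for a free port, and the process then reads back which port it was given so it can report it elsewhere. `listen_on_ephemeral_port` is a hypothetical helper and is not how the node manager is implemented.

```python
import socket


def listen_on_ephemeral_port(host="127.0.0.1"):
    """Bind to port 0 so the OS picks a free port; return (socket, port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0 means "any free port"
    sock.listen()
    port = sock.getsockname()[1]  # the port the OS actually assigned
    return sock, port


sock, port = listen_on_ephemeral_port()
print(f"listening on port {port}")
sock.close()
```

Because the port is only known after binding, the process has to publish it somewhere, which matches the heartbeat step described in the message.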

Re: Regarding 'branch-0.20-security'

2011-09-28 Thread Robert Evans
It is kind of a long history and I will try to leave out all of the politics involved to make it shorter. For a long time 0.20 has been the stable release of Hadoop. It is supposedly in sustaining releases now, but many new features keep going in because that is what most people use in product

Re: RecommenderJob Mahout Creating a data model

2011-09-14 Thread Robert Evans
This should probably be directed more toward the Mahout list than the Hadoop Map/reduce one. mahout-u...@apache.org --Bobby Evans On 9/14/11 6:28 AM, "Amit Sangroya" wrote: Hi all, I am trying to run the example from https://cwiki.apache.org/confluence/display/MAHOUT/Itembased+Collaborative+

500 error in review board

2011-09-12 Thread Robert Evans
Whenever I try to post a new patch to review board I get a 500 error. Something broke! (Error 500) It appears something broke when you tried to go to here. This is either a bug in Review Board or a server configuration error. Please report this to your administrator. Who should I talk to/rep

Re: Research projects for hadoop

2011-09-09 Thread Robert Evans
The biggest issue with Xen and other virtualization technologies is that often there is an IO penalty involved with using them. For many jobs this is not an acceptable trade off. I do know, however, that there has been some discussion about using Linux Containers for isolation of Map/Reduce pr

Re: MAPREDUCE-2864 Has been merged to trunk and 0.23

2011-09-09 Thread Robert Evans
A quick update. I found a bug in the script, and it has now been fixed. Please use this script instead. https://issues.apache.org/jira/secure/attachment/12493787/update.pl --Bobby Evans On 9/9/11 8:53 AM, "Robert Evans" wrote: > MAPREDUCE-2864 was an effort to rename and reorga

MAPREDUCE-2864 Has been merged to trunk and 0.23

2011-09-09 Thread Robert Evans
MAPREDUCE-2864 was an effort to rename and reorganize the YARN configuration parameters to make them consistent. If you are setting anything in your yarn-site.xml then you will need to update your configuration. The patch did not provide backwards compatible mappings because there has never been

Re: MRv1 in 0.23+

2011-09-07 Thread Robert Evans
There is a MiniYarnCluster and a MiniMRYarnCluster, it is just that the tests have not been ported over to use them yet. --Bobby On 9/7/11 2:01 PM, "Eli Collins" wrote: My understanding is that the MR1 code is currently needed to run the tests because there is no Mini MR cluster for MR2. So t

Re: Jenkins's Links to FindBugs warnings not useful

2011-09-02 Thread Robert Evans
You can do mvn findbugs:gui and then open up each of the findbugsXml.xml files manually. Or you should be able to run mvn site to generate HTML. You may need to modify the pom.xml file to include findbugs in the report section though. On 9/2/11 9:38 AM, "Vinod Kumar Vavilapalli" wrote: Oh,

Re: Get Hadoop 0.24.0-SNAPSHOT ready for Eclipse fails on retrieve hadoop-yarn-common jar

2011-09-02 Thread Robert Evans
I believe that if you take off the -e then it will work. If not run mvn eclipse:clean and then mvn eclipse:eclipse. It worked for me yesterday. --Bobby On 9/2/11 4:53 AM, "Mario Pastorelli" wrote: Hi all, I'm trying to download and prepare Hadoop trunk to be used on Eclipse using https://wi

Re: Trunk and 0.23 build failing with clean .m2 directory

2011-08-29 Thread Robert Evans
a jira. It should be a minor change: thanks mahadev On Mon, Aug 29, 2011 at 10:34 AM, Robert Evans wrote: > Thanks Alejandro, > > That really clears things up. Is there a JIRA you know of to change test-patch > to do mvn test -DskipTests instead of mvn compile? If not I can file

Re: which Eclipse plugin to use for Maven?

2011-08-29 Thread Robert Evans
Jim, The m2 plugin replaces the normal eclipse build system with maven. If you want to use M2 then you don't need to run mvn eclipse:eclipse at all. What mvn eclipse:eclipse does is it generates source code, and produces a .project and .classpath so that eclipse can use its normal build syst

Re: Trunk and 0.23 build failing with clean .m2 directory

2011-08-29 Thread Robert Evans
ore. Sometimes I had to descend into child > directories to mvn install them, before I could maven install parents. I'm > hoping/guessing that issue is fixed now > > On Mon, Aug 29, 2011 at 11:39 AM, Robert Evans > wrote: > > > Wow this is odd install works just fin

Re: Trunk and 0.23 build failing with clean .m2 directory

2011-08-29 Thread Robert Evans
home: /home/evans/bin/jdk1.6.0/jre Default locale: en_US, platform encoding: UTF-8 OS name: "linux", version: "2.6.18-238.12.1.el5", arch: "i386", family: "unix" Has anyone else seen this, or is there something messed up with my machine? Thanks, Bobby On 8/2

Trunk and 0.23 build failing with clean .m2 directory

2011-08-29 Thread Robert Evans
I am getting the following errors when I try to build either trunk or 0.23 with a clean maven cache. I don't get any errors if I use my old cache. [INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ hadoop-yarn-common --- [INFO] Compiling 2 source files to /home/evans/src/hadoop-gi

Re: DistCpV2 in 0.23

2011-08-26 Thread Robert Evans
I agree with Mithun. They are related but this goes beyond distcpv2 and should not block distcpv2 from going in. It would be very nice, however, to get the layout settled soon so that we all know where to find something when we want to work on it. Also +1 for Alejandro's I also prefer to keep

Re: Picking up local common changes in mr

2011-08-19 Thread Robert Evans
One thing to be aware of is that with -SNAPSHOT at the end of the version Maven will start looking at dates. So if you have a 0.23.0-SNAPSHOT that you personally modified/built in your .m2 repository and go to build something that depends on it. If the nightly build has pushed it to the apache

Re: Notes for working on mapreduce trunk after the MR-279 merge.

2011-08-18 Thread Robert Evans
It looks like git has not seen the changes yet, even though the last change was over 90 mins ago. Is there any way to kick git to pull in the changes sooner so I can rebase? Thanks, Bobby Evans On 8/18/11 7:49 AM, "Vinod Kumar Vavilapalli" wrote: MR-279 branch is merged into mapreduce trunk

Review Request: MAPREDUCE-2324 Job should fail if a reduce task can't be scheduled anywhere

2011-07-21 Thread Robert Evans
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1164/ --- Review request for hadoop-mapreduce, Todd Lipcon, Tom Graves, and Jeffrey Naisbit

Re: Problem while running eclipse-files for Next Gen Mapreduce branch

2011-07-08 Thread Robert Evans
hing else we need to add/update. Josh On Fri, Jul 8, 2011 at 7:25 AM, Robert Evans wrote: > mapreduce/INSTALL also has some important information in it, and be aware > that you do not have to install the avro plugin any more. Maven can > download it and install it automatically

Re: Problem while running eclipse-files for Next Gen Mapreduce branch

2011-07-08 Thread Robert Evans
mapreduce/INSTALL also has some important information in it, and be aware that you do not have to install the avro plugin any more. Maven can download it and install it automatically now, but the README was never updated. Also be sure to install protocol buffers. The build will fail without

Re: MR1 next steps

2011-07-08 Thread Robert Evans
+1 for #2, So long as there are no feature regressions, as was talked about in a different thread. --Bobby On 7/7/11 7:20 PM, "Luke Lu" wrote: On Thu, Jul 7, 2011 at 9:58 AM, Eli Collins wrote: > I think #2 makes the most sense. +1. Supporting legacy servers in 0.23 is really redundant, when

Re: Building and Deploying MRv2

2011-06-16 Thread Robert Evans
Forrest requires java5 unless you are using the beta of forrest, which has been in beta for a few years and can run on java6 --Bobby On 6/16/11 3:34 PM, "Thomas Graves" wrote: I know at one time maven 3.x didn't work so I've been using maven 2.x. Well I've never tried using java6 for java

Re: Reg ChainReducer usage

2011-06-02 Thread Robert Evans
Moving to mapreduce user. Ravi, The issue is with the shuffle. The chain reducer cannot re-shuffle the output of a previous reducer. If you want that then you need to run a second reduce only job. Instead usually the chain reducer would have a single reducer followed by 0 or more mappers, t