Hi, I am running a Hadoop MapReduce job against MongoDB. The database it reads from is 252 GB. During the job the number of open connections goes above 8,000, and we have already given it 9 GB of RAM, yet the job still crashes with an OutOfMemoryError when only about 8% of the mapping is done. Are these numbers normal, or did we miss something in the configuration? I attach my code below, in case the problem is there.
Mapper:

public class AveragePriceMapper extends Mapper<Object, BasicDBObject, Text, BSONWritable> {
    @Override
    public void map(final Object key, final BasicDBObject val, final Context context) throws IOException, InterruptedException {
        // currentId is a field defined elsewhere in the class (not shown here)
        String id = "";
        for (String propertyId : currentId.split(AveragePriceGlobal.SEPARATOR)) {
            id += val.get(propertyId) + AveragePriceGlobal.SEPARATOR;
        }
        BSONWritable bsonWritable = new BSONWritable(val);
        context.write(new Text(id), bsonWritable);
    }
}

Reducer:

public class AveragePriceReducer extends Reducer<Text, BSONWritable, Text, Text> {
    public void reduce(final Text pKey, final Iterable<BSONWritable> pValues, final Context pContext) throws IOException, InterruptedException {
        // continueLoop and currentId are fields defined elsewhere in the class (not shown here)
        while (pValues.iterator().hasNext() && continueLoop) {
            BSONWritable next = pValues.iterator().next();
            // Make some calculations
        }
        pContext.write(new Text(currentId), new Text(new MyClass(currentId, AveragePriceGlobal.COMMENT, 0, 0).toString()));
    }
}

The job configuration includes a query that filters the objects to analyze, so not all 252 GB are read; I have added a simplified sketch of the driver setup as a P.S. below.

Many thanks. Best regards,
Blanca
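P.S. For reference, the driver is wired up roughly like the sketch below (simplified; the class name AveragePriceJob, the input URI, the query string and the split size are placeholders rather than my real values):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import com.mongodb.hadoop.MongoInputFormat;
import com.mongodb.hadoop.io.BSONWritable;
import com.mongodb.hadoop.util.MongoConfigUtil;

public class AveragePriceJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Input collection; the URI and query below stand in for the real ones.
        MongoConfigUtil.setInputURI(conf, "mongodb://host:27017/mydb.prices");
        MongoConfigUtil.setQuery(conf, "{\"type\": \"sale\"}");

        // If I understand the mongo-hadoop connector correctly, this property
        // (value in MB) controls the size of each input split handed to a mapper.
        conf.set("mongo.input.split_size", "8");

        Job job = Job.getInstance(conf, "average price");
        job.setJarByClass(AveragePriceJob.class);
        job.setInputFormatClass(MongoInputFormat.class);
        job.setMapperClass(AveragePriceMapper.class);
        job.setReducerClass(AveragePriceReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(BSONWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileOutputFormat.setOutputPath(job, new Path(args[0]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}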