Re: static data sharing among map functions

2009-09-10 Thread indoos
Hi, here are two possible ways to share static data: 1. Use the distributed cache; see http://hadoop.apache.org/common/docs/current/mapred_tutorial.html#DistributedCache 2. Use the JobConf object; see http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/conf/Configuration.html#set%28java.

Re: Datanode high memory usage

2009-09-01 Thread indoos
Hi, the recommended RAM for the namenode, datanode, jobtracker and tasktracker is 1 GB. The datanode uses most of its memory to do the following: a. Continuously (at regular intervals) send heartbeat messages to the namenode to say 'I am alive and awake'. b. In case any data/file is added to DFS,

Re: cost model for MR programs

2009-08-28 Thread indoos
Hi, my suggestion would be that we should not compel ourselves to compare databases with Hadoop. However, here is something that is probably not even close to what you require, but might be helpful: 1. Number of nodes. These are the parameters to look for: - average time taken by a single Map

Re: Where does System.out.println() go?

2009-08-28 Thread indoos
Hi, sysout for MapReduce should be visible in the JobTracker web UI (port 50030), against the individual map and reduce tasks of the executed job. This UI anyway reads the individual logs created for each attempt in the logs/userlogs/attempt folders. Regards, Sanjay Mark Kerzner-2 wrote: > > Hi, > > when I ru
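
The per-attempt log folders mentioned in the reply above can also be read straight from disk. Here is a minimal sketch in plain Java (no Hadoop classes), assuming each attempt folder under logs/userlogs holds a file named stdout; the class name TaskStdoutFinder, the method findStdoutFiles and the default path are hypothetical and only for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class TaskStdoutFinder {

    // Returns the stdout files found one level below userlogsDir,
    // i.e. <userlogsDir>/<attempt folder>/stdout, sorted by path.
    // Assumption: each attempt folder stores System.out output in a file named "stdout".
    static List<Path> findStdoutFiles(Path userlogsDir) throws IOException {
        List<Path> found = new ArrayList<>();
        if (!Files.isDirectory(userlogsDir)) {
            return found; // nothing to list if the folder does not exist
        }
        try (DirectoryStream<Path> attempts = Files.newDirectoryStream(userlogsDir)) {
            for (Path attemptDir : attempts) {
                Path stdout = attemptDir.resolve("stdout");
                if (Files.isRegularFile(stdout)) {
                    found.add(stdout);
                }
            }
        }
        Collections.sort(found);
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Default path is relative to the Hadoop install directory (assumption).
        Path userlogs = Paths.get(args.length > 0 ? args[0] : "logs/userlogs");
        for (Path stdout : findStdoutFiles(userlogs)) {
            System.out.println("== " + stdout + " ==");
            for (String line : Files.readAllLines(stdout)) {
                System.out.println(line);
            }
        }
    }
}
```

Run it on the node that executed the task, passing the userlogs folder as the only argument. On a multi-node cluster each node keeps its own attempt folders, so the web UI is usually the quicker route.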