I have been experimenting with mapreduce.tasktracker.map.tasks.maximum to
reduce the load on my Cassandra cluster (the job reads through the Cassandra
ColumnFamilyInputFormat).  I'd like to find ways of throttling the map
operations in case they are affecting OLTP activity on the cluster.

What parameters can I use to limit the number of map tasks running concurrently
across the whole cluster?  mapreduce.tasktracker.map.tasks.maximum limits the
number of concurrent maps per task tracker, but can I set a limit at the job
level?
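
To make the per-tracker limit concrete, here is a rough sketch of the
arithmetic as I understand it.  The tracker count and the setting value are
made-up numbers, not from my cluster, and cluster_map_ceiling is just a
hypothetical helper name.  It only shows that the setting bounds the total
across all jobs, which is why it does not throttle one job on its own.

```python
def cluster_map_ceiling(num_tasktrackers: int, maps_per_tracker: int) -> int:
    """Upper bound on map tasks running at once across the cluster.

    maps_per_tracker stands for mapreduce.tasktracker.map.tasks.maximum,
    assumed here to be set to the same value on every task tracker.
    """
    if num_tasktrackers < 0 or maps_per_tracker < 0:
        raise ValueError("counts must be non-negative")
    # Each tracker runs at most maps_per_tracker maps, regardless of which
    # job they belong to, so the ceiling is shared by all running jobs.
    return num_tasktrackers * maps_per_tracker


# Illustrative values only: 10 task trackers, 2 map slots each.
print(cluster_map_ceiling(10, 2))
```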

Should I look at the "fair" scheduler?

Regards,
Michael
