On Jul 12, 2011, at 10:27 AM, Virajith Jalaparti wrote:

> I agree that the scheduler has lesser leeway when the replication factor is
> 1. However, I would still expect the number of data-local tasks to be more
> than 10% even when the replication factor is 1.

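How did you load your data? Did you load it from outside the grid or from one of the datanodes? If you loaded it from one of the datanodes, you'll have basically no real locality, especially with a replication factor of 1: HDFS writes the first replica of each block to the datanode the client is running on, so with a single replica every block ends up on that one node.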
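
To make the effect concrete, here is a rough Python sketch of block placement with a replication factor of 1. It is a toy model, not HDFS code: the function name, the node names, and the uniform random choice for an off-grid client are all assumptions for illustration. Real placement also takes racks, free space, and load into account.

```python
import random
from collections import Counter

def place_blocks(num_blocks, datanodes, writer=None, seed=0):
    """Toy model: return the datanode holding each block (replication factor 1).

    writer=None   -> client outside the grid; assume a random node per block.
    writer=<node> -> client on a datanode; the only replica stays on that node.
    """
    rng = random.Random(seed)
    if writer is not None:
        return [writer] * num_blocks
    return [rng.choice(datanodes) for _ in range(num_blocks)]

# Hypothetical 10-node cluster and a 200-block input file.
nodes = ["dn%02d" % i for i in range(10)]
loaded_on_grid = place_blocks(200, nodes, writer="dn00")
loaded_off_grid = place_blocks(200, nodes)

for label, placement in [("from dn00", loaded_on_grid),
                         ("from outside", loaded_off_grid)]:
    counts = Counter(placement)
    busiest = max(counts.values()) / len(placement)
    print("%-13s nodes holding data: %2d, share on busiest node: %.0f%%"
          % (label, len(counts), 100 * busiest))
```

When the load runs on dn00, only the task slots on dn00 can ever run a data-local map; every other node has to read its input over the network, which would be consistent with a very low data-local count.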