We are deploying Hadoop on Demand for a university research environment. Due to the nature of infrastructure, where other computationally intensive processes apart from HOD might be running on each machine. Each of this machine has a limited amount of disk storage available.
Would large cluster each machine with a single dedicated core be more efficient in comparison to multiple cores on fewer machines? Since it is HOD, I think network IO would be the same for both. Any help would be appreciated.
