Hello, based on experience with other software in virtualized environments I cannot really recommend this, though I am not sure how Spark reacts. You may see unpredictable task failures depending on utilization, and tasks that connect to external systems (databases etc.) may fail unexpectedly, which can be a problem for those systems (transactions not finishing etc.).
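For what it's worth, if you do experiment with it, the overcommit you describe is usually just a matter of advertising more cores than the machine physically has to the standalone worker. A minimal sketch of conf/spark-env.sh, assuming standalone mode (the values are only illustrative):

    # conf/spark-env.sh on each worker node
    # Advertise 48 task slots on a 24-core machine (2x overcommit)
    SPARK_WORKER_CORES=48
    # Memory is not overcommitted here; keep it within what the machine really has
    SPARK_WORKER_MEMORY=48g

Note that spark.task.cpus still defaults to 1, so each slot runs one task at a time; the overcommit only changes how many slots the scheduler sees per worker.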
Why not increase the number of tasks per core? Best regards

On Jan 9, 2015, at 06:46, "Xuelin Cao" <xuelincao2...@gmail.com> wrote:

> Hi,
>
> I'm wondering whether it is a good idea to overcommit CPU cores on
> the Spark cluster.
>
> For example, in our testing cluster, each worker machine has 24
> physical CPU cores. However, we are allowed to set the CPU core number to
> 48 or more in the Spark configuration file. As a result, we are allowed to
> launch more tasks than the number of physical CPU cores.
>
> The motivation for overcommitting CPU cores is that, in many cases, a task
> cannot consume 100% of a single CPU core (due to I/O, shuffle, etc.).
>
> So, overcommitting the CPU cores allows more tasks to run at the same
> time, and makes the resources be used more economically.
>
> But is there any reason why we should not do this? Has anyone tried it?
>
> [image: Inline image 1]