Hi, I am trying to set up a search cluster. Each node has 12 cores and 24 GB of memory. Distributed search works properly. However, when I stress the system with a workload driver (i.e., multiple client threads), the cores are only 20-25% utilized (i.e., 10-12 cores out of 48). When I reduce the number of nodes to two and then to one, utilization rises to 50% and 80% respectively (i.e., about 12 cores out of 24 and about 10 cores out of 12). The incoming and outgoing rates at the frontend are up to a hundred Mbit/s, so the network is not the bottleneck. The frontend is a 12-core machine, and only 1 of its cores is utilized during the experiment.
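To make the utilization figures above concrete, here is a small Python sketch of the arithmetic. The four-node count is an assumption derived from 48 cores at 12 cores per node, and the name busy_cores is purely illustrative.

```python
# Busy cores = number of nodes * cores per node * utilization fraction.
CORES_PER_NODE = 12  # from the hardware description above


def busy_cores(nodes: int, utilization: float) -> float:
    """Return the approximate number of fully busy cores in the cluster."""
    return nodes * CORES_PER_NODE * utilization


# (nodes, observed utilization) for each run described above.
# The 4-node figure is an assumption: 48 cores / 12 cores per node.
runs = [(4, 0.20), (4, 0.25), (2, 0.50), (1, 0.80)]

for nodes, util in runs:
    total = nodes * CORES_PER_NODE
    print(f"{nodes} node(s), {util:.0%} of {total} cores: "
          f"~{busy_cores(nodes, util):.1f} busy cores")
```

In every configuration the number of busy cores stays in roughly the same 10-12 range, regardless of how many nodes are in the cluster.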
I would like to ask whether I need to change some parameters in the frontend. I looked at the code but couldn't find anything useful. Thanks, -Stavros.

