Hello, here is some news. Yesterday I repeated the tests using only the Java Request sampler in the scenario, without variables, without sleep time (the next thing to try), and without listeners.
This way CPU efficiency is better: some time after the test starts, CPU usage begins to drop to about 70-90%. So I was able to raise the thread parallelism to 2-3, with some effect on the throughput. With that configuration, I reached about 2600 q/s.

