On Thu, Mar 05, 2020 at 06:15:36PM +0100, Paolo Bonzini wrote:
> On 05/03/20 18:08, Stefan Hajnoczi wrote:
> > +    /*
> > +     * Optimization: ->io_poll() handlers often contain RCU read critical
> > +     * sections and we therefore see many rcu_read_lock() -> rcu_read_unlock()
> > +     * -> rcu_read_lock() -> ... sequences with expensive memory
> > +     * synchronization primitives.  Make the entire polling loop an RCU
> > +     * critical section because nested rcu_read_lock()/rcu_read_unlock() calls
> > +     * are cheap.
> > +     */
> > +    RCU_READ_LOCK_GUARD();
> > +
>
> Looks good, but I suggest that you compile with --enable-membarrier as
> that makes RCU critical sections basically free.

Interesting, --enable-membarrier decreases performance from 105k to 97k
IOPS in the NVMe latency benchmark that I'm running.

Stefan
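
For readers following the thread, the reasoning in the quoted patch comment can be illustrated with a toy model. This is only a sketch under stated assumptions, not QEMU's RCU implementation: the names toy_rcu_read_lock(), toy_rcu_read_unlock(), toy_io_poll() and run_poll_loop() are hypothetical, and the model simply assumes that only the outermost lock/unlock pair issues a memory fence while nested calls adjust a per-thread depth counter. It says nothing about the --enable-membarrier result reported above.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical per-thread nesting depth; not QEMU's actual reader state. */
static _Thread_local unsigned toy_depth;
/* Counts how many (assumed expensive) fences were issued. */
static unsigned long toy_barriers;

static void toy_rcu_read_lock(void)
{
    if (toy_depth++ > 0) {
        return; /* nested call: only a counter increment */
    }
    /* Outermost entry: pay for the memory synchronization. */
    atomic_thread_fence(memory_order_seq_cst);
    toy_barriers++;
}

static void toy_rcu_read_unlock(void)
{
    assert(toy_depth > 0);
    if (--toy_depth > 0) {
        return; /* still inside an outer critical section */
    }
    /* Outermost exit: pay for the memory synchronization. */
    atomic_thread_fence(memory_order_seq_cst);
    toy_barriers++;
}

/* Stand-in for an ->io_poll() handler with its own critical section. */
static void toy_io_poll(void)
{
    toy_rcu_read_lock();
    /* ... read RCU-protected data here ... */
    toy_rcu_read_unlock();
}

/*
 * Run the polling loop, optionally wrapped in one outer critical section
 * (the role RCU_READ_LOCK_GUARD() plays in the patch), and return the
 * number of fences issued.
 */
static unsigned long run_poll_loop(int iterations, int guard)
{
    toy_barriers = 0;
    if (guard) {
        toy_rcu_read_lock();
    }
    for (int i = 0; i < iterations; i++) {
        toy_io_poll();
    }
    if (guard) {
        toy_rcu_read_unlock();
    }
    return toy_barriers;
}
```

In this model, run_poll_loop(1000, 0) issues 2000 fences (two per handler call), while run_poll_loop(1000, 1) issues 2, because every handler call nests inside the outer critical section.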