On 5/16/16 5:58 PM, Paul E. McKenney wrote:
On Mon, May 16, 2016 at 12:49:41PM -0700, Santosh Shilimkar wrote:
On 5/16/2016 10:34 AM, Paul E. McKenney wrote:
On Mon, May 16, 2016 at 09:33:57AM -0700, Santosh Shilimkar wrote:

[...]

Are you running CONFIG_NO_HZ_FULL=y?  If so, the problem might be that
you need more housekeeping CPUs than you currently have configured.

Yes, CONFIG_NO_HZ_FULL=y. Do you mean "CONFIG_NO_HZ_FULL_ALL=y" for
bookkeeping? It seems that without it the clock-event code will just use
CPU0 for things like broadcasting, which might become a bottleneck.
That could explain the hrtimer_interrupt() path getting slowed
down because of a bookkeeping bottleneck.

$ cat .config | grep NO_HZ
CONFIG_NO_HZ_COMMON=y
# CONFIG_NO_HZ_IDLE is not set
CONFIG_NO_HZ_FULL=y
# CONFIG_NO_HZ_FULL_ALL is not set
# CONFIG_NO_HZ_FULL_SYSIDLE is not set
CONFIG_NO_HZ=y
# CONFIG_RCU_FAST_NO_HZ is not set

Yes, CONFIG_NO_HZ_FULL_ALL=y would give you only one CPU for all
housekeeping tasks, including the RCU grace-period kthreads.  So you are
booting without any nohz_full boot parameter?  You can end up with the
same problem with CONFIG_NO_HZ_FULL=y and the nohz_full boot parameter
that you can with CONFIG_NO_HZ_FULL_ALL=y.
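
To make the housekeeping arithmetic above concrete, here is a minimal
sketch in shell. It assumes CPUs are numbered 0..N-1 and that the
nohz_full list uses the usual kernel CPU-list syntax (for example "1-7"
or "1-3,5"). The function name housekeeping_cpus is hypothetical and is
not part of any kernel tooling; it only prints the CPUs left outside
the list, which are the ones available for housekeeping work such as
the RCU grace-period kthreads.

```shell
#!/bin/sh
# Hypothetical helper: print the CPUs NOT covered by a nohz_full list,
# i.e. the CPUs that remain available for housekeeping.
#   $1: nohz_full CPU list, e.g. "1-7" or "1-3,5" (may be empty)
#   $2: total number of CPUs (ids 0..N-1 are assumed)
housekeeping_cpus() {
    list=$1
    ncpus=$2
    out=""
    cpu=0
    while [ "$cpu" -lt "$ncpus" ]; do
        isolated=0
        # Split the list on commas and test each range in turn.
        old_ifs=$IFS
        IFS=,
        for range in $list; do
            lo=${range%-*}   # "1-7" -> 1, "5" -> 5
            hi=${range#*-}   # "1-7" -> 7, "5" -> 5
            if [ "$cpu" -ge "$lo" ] && [ "$cpu" -le "$hi" ]; then
                isolated=1
            fi
        done
        IFS=$old_ifs
        # Collect CPUs that no range covered, comma-separated.
        if [ "$isolated" -eq 0 ]; then
            out="${out:+$out,}$cpu"
        fi
        cpu=$((cpu + 1))
    done
    printf '%s\n' "$out"
}
```

On an 8-CPU system, housekeeping_cpus "1-7" 8 prints 0, which matches
the single housekeeping CPU that CONFIG_NO_HZ_FULL_ALL=y leaves, while
housekeeping_cpus "2-7" 8 prints 0,1. An empty list leaves every CPU
available for housekeeping.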

I see. Yes, the systems are booting without the nohz_full boot parameter.
I will try to add more CPUs to it and update the thread
after verification, since it takes time to reproduce the issue.

Thanks for the discussion so far, Paul. It's very insightful for me.

Regards,
Santosh
