Re: [PATCH] debugobjects: scale the static pool size
On 11/20/2018 10:12 AM, Qian Cai wrote:
>
>> On Nov 20, 2018, at 8:50 AM, Waiman Long wrote:
>>
>> On 11/20/2018 01:42 AM, Qian Cai wrote:
>>> The current value of the early boot static pool size is not big enough
>>> for systems with a large number of CPUs when timer and/or workqueue
>>> objects are selected. As a result, systems that have 60+ CPUs with both
>>> timer and workqueue objects enabled could trigger "ODEBUG: Out of memory.
>>> ODEBUG disabled". Hence, fix it by computing the size according to
>>> CONFIG_NR_CPUS and CONFIG_DEBUG_OBJECTS_* options.
>>>
>>> Signed-off-by: Qian Cai
>>> ---
>>>  lib/debugobjects.c | 53 +
>>>  1 file changed, 52 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
>>> index 70935ed91125..372dc34206d5 100644
>>> --- a/lib/debugobjects.c
>>> +++ b/lib/debugobjects.c
>>> @@ -23,7 +23,53 @@
>>>  #define ODEBUG_HASH_BITS 14
>>>  #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS)
>>>
>>> +/*
>>> + * Some debug objects are allocated during the early boot. Enabling some
>>> + * options like timers or workqueue objects may increase the size required
>>> + * significantly with a large number of CPUs. For example,
>>> + *
>>> + * No. CPUs x 2 (worker pool) objects:
>>> + *
>>> + * start_kernel
>>> + *   workqueue_init_early
>>> + *     init_worker_pool
>>> + *       init_timer_key
>>> + *         debug_object_init
>>> + *
>>> + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
>>> + *
>>> + * sched_init
>>> + *   hrtick_rq_init
>>> + *     hrtimer_init
>>> + *
>>> + * CONFIG_DEBUG_OBJECTS_WORK:
>>> + * No. CPUs x 6 (workqueue) objects:
>>> + *
>>> + * workqueue_init_early
>>> + *   alloc_workqueue
>>> + *     __alloc_workqueue_key
>>> + *       alloc_and_link_pwqs
>>> + *         init_pwq
>>> + *
>>> + * Plus No. CPUs objects:
>>> + *
>>> + * perf_event_init
>>> + *   __init_srcu_struct
>>> + *     init_srcu_struct_fields
>>> + *       init_srcu_struct_nodes
>>> + *         __init_work
>>> + *
>>> + * Increase the number a bit more in case the implementations are changed in
>>> + * the future.
>>> + */
>>> +#if defined(CONFIG_NR_CPUS) && defined(CONFIG_DEBUG_OBJECTS_TIMERS) && \
>>> +    !defined(CONFIG_DEBUG_OBJECTS_WORK)
>>> +#define ODEBUG_POOL_SIZE (CONFIG_NR_CPUS * 10)
>>> +#elif defined(CONFIG_NR_CPUS) && defined(CONFIG_DEBUG_OBJECTS_WORK)
>>> +#define ODEBUG_POOL_SIZE (CONFIG_NR_CPUS * 30)
>>> +#else
>>>  #define ODEBUG_POOL_SIZE 1024
>>> +#endif /* CONFIG_NR_CPUS */
>>>  #define ODEBUG_POOL_MIN_LEVEL 256
>>>
>> CONFIG_NR_CPUS is always defined. You don't need to put that as a #if
>> condition. Where does the scaling factor 30 come from? It looks high to me.
> Hmm, looks like some architectures could have it undefined since it depends
> on CONFIG_SMP where the latter can be disabled. For example alpha,
>
> config NR_CPUS
>     int "Maximum number of CPUs (2-32)"
>     range 2 32
>     depends on SMP

include/linux/threads.h:

#ifndef CONFIG_NR_CPUS
/* FIXME: This should be fixed in the arch's Kconfig */
#define CONFIG_NR_CPUS 1
#endif

> Scaling factor 30 came from the data, with all the debug_objects options
> enabled, I have,
>
> 64-CPU: ODEBUG: 1114 of 1114 active objects replaced
> 256-CPU: ODEBUG: 4378 of 4378 active objects replaced
>
> I also give a bit of room for growth in the future since the implementation
> details could always change.

(4378-1114)/(256-64) = 17

So the max scaling factor is 17. I would say you could round it up to 20
at most.

Cheers,
Longman
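The estimate of 17 is the slope between the two boot-time samples quoted in this message. A minimal C sketch of that arithmetic follows. The struct and function names are hypothetical and exist only for this illustration; the only data used are the two ODEBUG lines quoted in the thread.

```c
#include <assert.h>

/* One boot-time measurement, taken with all debug_objects options enabled. */
struct odebug_sample {
	unsigned int cpus;	/* number of CPUs in the machine */
	unsigned int objects;	/* "active objects replaced" reported by ODEBUG */
};

/* Objects added per extra CPU between two samples (integer division). */
static inline unsigned int objects_per_cpu(struct odebug_sample lo,
					   struct odebug_sample hi)
{
	assert(hi.cpus > lo.cpus && hi.objects >= lo.objects);
	return (hi.objects - lo.objects) / (hi.cpus - lo.cpus);
}

/* Objects left over once the per-CPU share is subtracted from a sample. */
static inline unsigned int fixed_objects(struct odebug_sample s,
					 unsigned int per_cpu)
{
	assert(s.objects >= s.cpus * per_cpu);
	return s.objects - s.cpus * per_cpu;
}
```

Fed the two quoted samples (64 CPUs with 1114 objects, 256 CPUs with 4378 objects), objects_per_cpu() gives 17 and fixed_objects() gives 26 for either sample. The measured demand is therefore a small constant plus 17 objects per CPU, which matches the "guaranteed minimum plus a multiplier" form proposed elsewhere in the thread.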
Re: [PATCH] debugobjects: scale the static pool size
> On Nov 20, 2018, at 8:50 AM, Waiman Long wrote:
>
> On 11/20/2018 01:42 AM, Qian Cai wrote:
>> The current value of the early boot static pool size is not big enough
>> for systems with a large number of CPUs when timer and/or workqueue
>> objects are selected. As a result, systems that have 60+ CPUs with both
>> timer and workqueue objects enabled could trigger "ODEBUG: Out of memory.
>> ODEBUG disabled". Hence, fix it by computing the size according to
>> CONFIG_NR_CPUS and CONFIG_DEBUG_OBJECTS_* options.
>>
>> Signed-off-by: Qian Cai
>> ---
>>  lib/debugobjects.c | 53 +
>>  1 file changed, 52 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
>> index 70935ed91125..372dc34206d5 100644
>> --- a/lib/debugobjects.c
>> +++ b/lib/debugobjects.c
>> @@ -23,7 +23,53 @@
>>  #define ODEBUG_HASH_BITS 14
>>  #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS)
>>
>> +/*
>> + * Some debug objects are allocated during the early boot. Enabling some
>> + * options like timers or workqueue objects may increase the size required
>> + * significantly with a large number of CPUs. For example,
>> + *
>> + * No. CPUs x 2 (worker pool) objects:
>> + *
>> + * start_kernel
>> + *   workqueue_init_early
>> + *     init_worker_pool
>> + *       init_timer_key
>> + *         debug_object_init
>> + *
>> + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
>> + *
>> + * sched_init
>> + *   hrtick_rq_init
>> + *     hrtimer_init
>> + *
>> + * CONFIG_DEBUG_OBJECTS_WORK:
>> + * No. CPUs x 6 (workqueue) objects:
>> + *
>> + * workqueue_init_early
>> + *   alloc_workqueue
>> + *     __alloc_workqueue_key
>> + *       alloc_and_link_pwqs
>> + *         init_pwq
>> + *
>> + * Plus No. CPUs objects:
>> + *
>> + * perf_event_init
>> + *   __init_srcu_struct
>> + *     init_srcu_struct_fields
>> + *       init_srcu_struct_nodes
>> + *         __init_work
>> + *
>> + * Increase the number a bit more in case the implementations are changed in
>> + * the future.
>> + */
>> +#if defined(CONFIG_NR_CPUS) && defined(CONFIG_DEBUG_OBJECTS_TIMERS) && \
>> +    !defined(CONFIG_DEBUG_OBJECTS_WORK)
>> +#define ODEBUG_POOL_SIZE (CONFIG_NR_CPUS * 10)
>> +#elif defined(CONFIG_NR_CPUS) && defined(CONFIG_DEBUG_OBJECTS_WORK)
>> +#define ODEBUG_POOL_SIZE (CONFIG_NR_CPUS * 30)
>> +#else
>>  #define ODEBUG_POOL_SIZE 1024
>> +#endif /* CONFIG_NR_CPUS */
>>  #define ODEBUG_POOL_MIN_LEVEL 256
>>
>
> CONFIG_NR_CPUS is always defined. You don't need to put that as a #if
> condition. Where does the scaling factor 30 come from? It looks high to me.

Hmm, looks like some architectures could have it undefined since it depends
on CONFIG_SMP where the latter can be disabled. For example alpha,

config NR_CPUS
    int "Maximum number of CPUs (2-32)"
    range 2 32
    depends on SMP

Scaling factor 30 came from the data, with all the debug_objects options
enabled, I have,

64-CPU: ODEBUG: 1114 of 1114 active objects replaced
256-CPU: ODEBUG: 4378 of 4378 active objects replaced

I also give a bit of room for growth in the future since the implementation
details could always change.

>
> For a UP system, CONFIG_NR_CPUS will be 1. I think it is better to have a
> guaranteed minimum plus a multiplier of the # of configured CPUs.
> Something like
>
> 512 + CONFIG_NR_CPUS * <multiplier>
>
> where <multiplier> should be the sum of all early allocation objects
> that scale with the number of cpus. The guaranteed minimum will cover
> other miscellaneous objects.

That is a good catch. I’ll fix that.
Re: [PATCH] debugobjects: scale the static pool size
On 11/20/2018 01:42 AM, Qian Cai wrote:
> The current value of the early boot static pool size is not big enough
> for systems with a large number of CPUs when timer and/or workqueue
> objects are selected. As a result, systems that have 60+ CPUs with both
> timer and workqueue objects enabled could trigger "ODEBUG: Out of memory.
> ODEBUG disabled". Hence, fix it by computing the size according to
> CONFIG_NR_CPUS and CONFIG_DEBUG_OBJECTS_* options.
>
> Signed-off-by: Qian Cai
> ---
>  lib/debugobjects.c | 53 +
>  1 file changed, 52 insertions(+), 1 deletion(-)
>
> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> index 70935ed91125..372dc34206d5 100644
> --- a/lib/debugobjects.c
> +++ b/lib/debugobjects.c
> @@ -23,7 +23,53 @@
>  #define ODEBUG_HASH_BITS 14
>  #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS)
>
> +/*
> + * Some debug objects are allocated during the early boot. Enabling some
> + * options like timers or workqueue objects may increase the size required
> + * significantly with a large number of CPUs. For example,
> + *
> + * No. CPUs x 2 (worker pool) objects:
> + *
> + * start_kernel
> + *   workqueue_init_early
> + *     init_worker_pool
> + *       init_timer_key
> + *         debug_object_init
> + *
> + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
> + *
> + * sched_init
> + *   hrtick_rq_init
> + *     hrtimer_init
> + *
> + * CONFIG_DEBUG_OBJECTS_WORK:
> + * No. CPUs x 6 (workqueue) objects:
> + *
> + * workqueue_init_early
> + *   alloc_workqueue
> + *     __alloc_workqueue_key
> + *       alloc_and_link_pwqs
> + *         init_pwq
> + *
> + * Plus No. CPUs objects:
> + *
> + * perf_event_init
> + *   __init_srcu_struct
> + *     init_srcu_struct_fields
> + *       init_srcu_struct_nodes
> + *         __init_work
> + *
> + * Increase the number a bit more in case the implementations are changed in
> + * the future.
> + */
> +#if defined(CONFIG_NR_CPUS) && defined(CONFIG_DEBUG_OBJECTS_TIMERS) && \
> +    !defined(CONFIG_DEBUG_OBJECTS_WORK)
> +#define ODEBUG_POOL_SIZE (CONFIG_NR_CPUS * 10)
> +#elif defined(CONFIG_NR_CPUS) && defined(CONFIG_DEBUG_OBJECTS_WORK)
> +#define ODEBUG_POOL_SIZE (CONFIG_NR_CPUS * 30)
> +#else
>  #define ODEBUG_POOL_SIZE 1024
> +#endif /* CONFIG_NR_CPUS */
>  #define ODEBUG_POOL_MIN_LEVEL 256
>

CONFIG_NR_CPUS is always defined. You don't need to put that as a #if
condition. Where does the scaling factor 30 come from? It looks high to me.

For a UP system, CONFIG_NR_CPUS will be 1. I think it is better to have a
guaranteed minimum plus a multiplier of the # of configured CPUs.
Something like

    512 + CONFIG_NR_CPUS * <multiplier>

where <multiplier> should be the sum of all early allocation objects
that scale with the number of cpus. The guaranteed minimum will cover
other miscellaneous objects.

Cheers,
Longman
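The sizing rule suggested in this review can be sketched in C as follows. ODEBUG_POOL_BASE, ODEBUG_POOL_PER_CPU and odebug_pool_size() are hypothetical names, and the per-CPU value of 20 is only a placeholder for the multiplier that the review leaves open. The CONFIG_NR_CPUS fallback is there so that the fragment compiles outside a kernel tree.

```c
#include <assert.h>

/* Outside a kernel build CONFIG_NR_CPUS is not set; assume 1 (UP). */
#ifndef CONFIG_NR_CPUS
#define CONFIG_NR_CPUS 1
#endif

/* Hypothetical name; 512 is the guaranteed minimum named in the review. */
#define ODEBUG_POOL_BASE	512
/* Placeholder for the per-CPU multiplier, which the review leaves open. */
#define ODEBUG_POOL_PER_CPU	20

/* Compile-time pool size for the configured CPU count. */
#define ODEBUG_POOL_SIZE \
	(ODEBUG_POOL_BASE + CONFIG_NR_CPUS * ODEBUG_POOL_PER_CPU)

/* The pool can never drop below the base, even with CONFIG_NR_CPUS == 1. */
static_assert(ODEBUG_POOL_SIZE >= ODEBUG_POOL_BASE,
	      "pool size must include the guaranteed minimum");

/* Same formula as a function, so other CPU counts can be tried. */
static inline unsigned int odebug_pool_size(unsigned int nr_cpus)
{
	return ODEBUG_POOL_BASE + nr_cpus * ODEBUG_POOL_PER_CPU;
}
```

With these placeholder values a UP build gets 532 objects and a 64-CPU build gets 1792. Unlike a pure CONFIG_NR_CPUS multiple, the result stays above the base for small CPU counts.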
[PATCH] debugobjects: scale the static pool size
The current value of the early boot static pool size is not big enough
for systems with a large number of CPUs when timer and/or workqueue
objects are selected. As a result, systems that have 60+ CPUs with both
timer and workqueue objects enabled could trigger "ODEBUG: Out of memory.
ODEBUG disabled". Hence, fix it by computing the size according to
CONFIG_NR_CPUS and CONFIG_DEBUG_OBJECTS_* options.

Signed-off-by: Qian Cai
---
 lib/debugobjects.c | 53 +
 1 file changed, 52 insertions(+), 1 deletion(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 70935ed91125..372dc34206d5 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -23,7 +23,53 @@
 #define ODEBUG_HASH_BITS 14
 #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS)

+/*
+ * Some debug objects are allocated during the early boot. Enabling some
+ * options like timers or workqueue objects may increase the size required
+ * significantly with a large number of CPUs. For example,
+ *
+ * No. CPUs x 2 (worker pool) objects:
+ *
+ * start_kernel
+ *   workqueue_init_early
+ *     init_worker_pool
+ *       init_timer_key
+ *         debug_object_init
+ *
+ * No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
+ *
+ * sched_init
+ *   hrtick_rq_init
+ *     hrtimer_init
+ *
+ * CONFIG_DEBUG_OBJECTS_WORK:
+ * No. CPUs x 6 (workqueue) objects:
+ *
+ * workqueue_init_early
+ *   alloc_workqueue
+ *     __alloc_workqueue_key
+ *       alloc_and_link_pwqs
+ *         init_pwq
+ *
+ * Plus No. CPUs objects:
+ *
+ * perf_event_init
+ *   __init_srcu_struct
+ *     init_srcu_struct_fields
+ *       init_srcu_struct_nodes
+ *         __init_work
+ *
+ * Increase the number a bit more in case the implementations are changed in
+ * the future.
+ */
+#if defined(CONFIG_NR_CPUS) && defined(CONFIG_DEBUG_OBJECTS_TIMERS) && \
+    !defined(CONFIG_DEBUG_OBJECTS_WORK)
+#define ODEBUG_POOL_SIZE (CONFIG_NR_CPUS * 10)
+#elif defined(CONFIG_NR_CPUS) && defined(CONFIG_DEBUG_OBJECTS_WORK)
+#define ODEBUG_POOL_SIZE (CONFIG_NR_CPUS * 30)
+#else
 #define ODEBUG_POOL_SIZE 1024
+#endif /* CONFIG_NR_CPUS */
 #define ODEBUG_POOL_MIN_LEVEL 256

 #define ODEBUG_CHUNK_SHIFT PAGE_SHIFT
@@ -58,8 +104,13 @@ static int debug_objects_fixups __read_mostly;
 static int debug_objects_warnings __read_mostly;
 static int debug_objects_enabled __read_mostly
         = CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT;
+/*
+ * This is only used after the static objects are replaced, so it does not
+ * need to use the early boot static pool size; it has already been scaled
+ * according to actual No. CPUs in the box within debug_objects_mem_init().
+ */
 static int debug_objects_pool_size __read_mostly
-        = ODEBUG_POOL_SIZE;
+        = 1024;
 static int debug_objects_pool_min_level __read_mostly
         = ODEBUG_POOL_MIN_LEVEL;
 static struct debug_obj_descr *descr_test __read_mostly;
-- 
2.17.2 (Apple Git-113)
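To see what the preprocessor logic in this patch selects for a given configuration, the three branches can be modelled as an ordinary C function. This is a sketch only: patch_pool_size() and its parameters are hypothetical names, the two booleans stand in for CONFIG_DEBUG_OBJECTS_TIMERS and CONFIG_DEBUG_OBJECTS_WORK being defined, and CONFIG_NR_CPUS is assumed to be defined.

```c
#include <assert.h>
#include <stdbool.h>

/* Size used by the #else branch, i.e. the value before the patch. */
#define ODEBUG_POOL_SIZE_DEFAULT 1024

/*
 * Model of the #if/#elif/#else chain in the patch, evaluated at run time
 * so that different CPU counts and option combinations can be compared.
 */
static inline unsigned int patch_pool_size(unsigned int nr_cpus,
					   bool timers, bool work)
{
	assert(nr_cpus >= 1);

	if (timers && !work)		/* #if: timers only */
		return nr_cpus * 10;
	else if (work)			/* #elif: work, with or without timers */
		return nr_cpus * 30;
	else				/* #else: neither option */
		return ODEBUG_POOL_SIZE_DEFAULT;
}
```

For 64 CPUs with workqueue objects enabled this gives 1920, above the 1114 objects reported for the 64-CPU machine in the thread. For a single CPU it gives 30, far below the previous fixed size of 1024.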