On Mon, Mar 18, 2024 at 02:34:29PM -0400, Stefan Hajnoczi wrote:
> The coroutine pool implementation can hit the Linux vm.max_map_count
> limit, causing QEMU to abort with "failed to allocate memory for stack"
> or "failed to set up stack guard page" during coroutine creation.
>
> This happens because per-thread pools can grow to tens of thousands of
> coroutines. Each coroutine causes 2 virtual memory areas to be created.
This sounds quite alarming. What usage scenario justifies creating so
many coroutines? IIUC, coroutine stack size is 1 MB, so tens of
thousands of coroutines implies tens of GB of memory on stacks alone.

> Eventually vm.max_map_count is reached and memory-related syscalls fail.

On my system max_map_count is 1048576, quite a lot higher than tens of
thousands. Hitting that would imply ~500,000 coroutines and ~500 GB of
stacks!

> diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
> index 5fd2dbaf8b..2790959eaf 100644
> --- a/util/qemu-coroutine.c
> +++ b/util/qemu-coroutine.c
> +static unsigned int get_global_pool_hard_max_size(void)
> +{
> +#ifdef __linux__
> +    g_autofree char *contents = NULL;
> +    int max_map_count;
> +
> +    /*
> +     * Linux processes can have up to max_map_count virtual memory areas
> +     * (VMAs). mmap(2), mprotect(2), etc fail with ENOMEM beyond this
> +     * limit. We must limit the coroutine pool to a safe size to avoid
> +     * running out of VMAs.
> +     */
> +    if (g_file_get_contents("/proc/sys/vm/max_map_count", &contents, NULL,
> +                            NULL) &&
> +        qemu_strtoi(contents, NULL, 10, &max_map_count) == 0) {
> +        /*
> +         * This is a conservative upper bound that avoids exceeding
> +         * max_map_count. Leave half for non-coroutine users like library
> +         * dependencies, vhost-user, etc. Each coroutine takes up 2 VMAs
> +         * so halve the amount again.
> +         */
> +        return max_map_count / 4;

That's 256,000 coroutines, which still sounds incredibly large to me.

> +    }
> +#endif
> +
> +    return UINT_MAX;

Why UINT_MAX as a default? If we can't read procfs, we should assume a
much smaller sane default IMHO, one that corresponds to the current
Linux default max_map_count.
> +}
> +
> +static void __attribute__((constructor)) qemu_coroutine_init(void)
> +{
> +    qemu_mutex_init(&global_pool_lock);
> +    global_pool_hard_max_size = get_global_pool_hard_max_size();
> }
> --
> 2.44.0
>
>

With regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|