We have a number of abuses of percpu_foreach in the tree: percpu_foreach forbids sleeping in its callback, but a number of subsystems use it to allocate memory anyway. They do this because the percpu objects themselves may move around in memory, so they store pointers to separately allocated memory in the percpu objects instead.
Each ad hoc abuse of percpu_foreach also works only once, but in principle, in a future where we support CPU hotplug, we would need to rerun any percpu initialization or finalization logic on each hatching or dying CPU.

To address the abuse of percpu_foreach for allocation/deallocation, and to pave the way for future CPU hotplug, I propose:

	struct percpu *
	percpu_create(size_t size, percpu_callback_t ctor,
	    percpu_callback_t dtor, void *cookie);

With percpu_create, the idiom

	pc = percpu_create(size, ctor, dtor, cookie);
	...
	percpu_free(pc, size);

is something like what you would now get by abusing percpu_foreach as:

	pc = percpu_alloc(size);
	if (ctor)
		percpu_foreach(pc, ctor, cookie);
	...
	if (dtor)
		percpu_foreach(pc, dtor, cookie);
	percpu_free(pc, size);

However, unlike percpu_foreach, with percpu_create the ctor and dtor are allowed to sleep -- e.g., on memory allocation, or on waiting for users to drain before you can free memory.

With percpu_create, percpu_alloc(size) becomes an alias for percpu_create(size, NULL, NULL, NULL).

Note: percpu_create is not redundant with percpu_foreach_xcall discussed in another thread, because allocation is still not allowed in percpu_foreach_xcall.

The attached patch set

(a) implements percpu_create,
(b) adds it to the man page, and
(c) converts all the abuses of percpu_foreach for allocation or freeing that I found to use percpu_create instead.

The implementation changes the way percpu records are stored internally, from an offset `encrypted' as a pointer to a fictitious struct percpu type, to an allocated record of an actual struct percpu type that holds the dtor and cookie (and could hold the ctor if we needed for CPU hotplug). This means percpu_getref entails one additional memory reference, but it is to memory that changes very seldom so will probably be cached by the CPU in any hot code paths; I don't anticipate a noticeable performance impact from it.

Thoughts?
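
For illustration only, here is a rough userland sketch of the semantics described above: a record that remembers the dtor and cookie, a ctor run once per CPU at creation time, and a dtor run once per CPU at free time. It is not the patch and not kernel code. Every sim_* name, the fixed NCPU_SIM count, and the int CPU index in the callback signature are assumptions made up for this example; the real percpu_callback_t and struct percpu may well differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NCPU_SIM 4	/* hypothetical fixed CPU count for the simulation */

/* Assumed callback shape: (per-CPU object, cookie, CPU index). */
typedef void (*sim_callback_t)(void *obj, void *cookie, int cpu);

/* An allocated record holding the dtor and cookie, as the post describes. */
struct sim_percpu {
	size_t size;
	sim_callback_t dtor;
	void *cookie;
	unsigned char *storage;	/* NCPU_SIM slots of `size' bytes each */
};

static void *
sim_percpu_slot(struct sim_percpu *pc, int cpu)
{
	return pc->storage + (size_t)cpu * pc->size;
}

static struct sim_percpu *
sim_percpu_create(size_t size, sim_callback_t ctor, sim_callback_t dtor,
    void *cookie)
{
	struct sim_percpu *pc = malloc(sizeof(*pc));

	if (pc == NULL)
		return NULL;
	pc->storage = calloc(NCPU_SIM, size);
	if (pc->storage == NULL) {
		free(pc);
		return NULL;
	}
	pc->size = size;
	pc->dtor = dtor;
	pc->cookie = cookie;
	/* The ctor runs in a context where it may block, e.g. to allocate. */
	if (ctor != NULL)
		for (int cpu = 0; cpu < NCPU_SIM; cpu++)
			ctor(sim_percpu_slot(pc, cpu), cookie, cpu);
	return pc;
}

static void
sim_percpu_free(struct sim_percpu *pc, size_t size)
{
	assert(size == pc->size);
	/* The dtor comes from the record, so the caller need not pass it. */
	if (pc->dtor != NULL)
		for (int cpu = 0; cpu < NCPU_SIM; cpu++)
			pc->dtor(sim_percpu_slot(pc, cpu), pc->cookie, cpu);
	free(pc->storage);
	free(pc);
}

/* Example ctor: the per-CPU object holds a pointer to separate memory. */
static void
counter_ctor(void *obj, void *cookie, int cpu)
{
	int **slot = obj;
	int *live = cookie;

	*slot = malloc(sizeof(**slot));
	assert(*slot != NULL);
	**slot = cpu;
	(*live)++;
}

/* Example dtor: release what the ctor allocated. */
static void
counter_dtor(void *obj, void *cookie, int cpu)
{
	int **slot = obj;
	int *live = cookie;

	(void)cpu;
	free(*slot);
	*slot = NULL;
	(*live)--;
}

/* Returns 0 if every CPU was constructed and later destructed. */
static int
sim_demo(void)
{
	int live = 0;
	struct sim_percpu *pc = sim_percpu_create(sizeof(int *),
	    counter_ctor, counter_dtor, &live);

	if (pc == NULL)
		return -1;
	int after_create = live;
	sim_percpu_free(pc, sizeof(int *));
	return (after_create == NCPU_SIM && live == 0) ? 0 : 1;
}
```

In this sketch, passing NULL for both ctor and dtor reduces sim_percpu_create to a plain zeroed allocation, which mirrors the percpu_alloc alias mentioned above.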