This makes a lot of sense. But we are talking about the need for large-scale parallelism, not discrete events. Once a given unit of I/O work can be performed on a given socket or pipe, it's going to be time to farm it out to a worker.
Somewhere in this scheme we need to consider dispatching.
If I understand your meaning, that's one of the nice features of IOCPs. You can block your worker threads against an IOCP (the actual call is GetQueuedCompletionStatus()) and the NT kernel will dispatch threads in LIFO order to handle I/O completion notifications. Of course, we would hide all those details in an apr_pollset_* call.
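
To make the LIFO point concrete, here is a toy model in portable C. It is only a sketch of the dispatch order, not how the NT kernel implements it, and it makes no Win32 calls. The names port_model, port_park and port_dispatch are hypothetical, made up for this illustration. Parking a worker stands in for a thread blocking in GetQueuedCompletionStatus(), and dispatching stands in for a completion notification arriving.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WORKERS 64

/* Hypothetical model of the idle workers waiting on one completion port. */
typedef struct {
    int idle[MAX_WORKERS];  /* stack of worker ids, most recently parked on top */
    size_t n_idle;          /* number of workers currently parked */
} port_model;

void port_init(port_model *p)
{
    p->n_idle = 0;
}

/* A worker with nothing to do parks itself on the port. */
void port_park(port_model *p, int worker_id)
{
    assert(p->n_idle < MAX_WORKERS);
    p->idle[p->n_idle++] = worker_id;
}

/*
 * A completion arrives: hand it to the most recently parked worker.
 * Returns the worker id, or -1 if no worker is waiting (in that case a
 * real port would hold the completion until a worker asks for it).
 */
int port_dispatch(port_model *p)
{
    if (p->n_idle == 0)
        return -1;
    return p->idle[--p->n_idle];
}
```

If workers 1, 2 and 3 park in that order, the first completion goes to worker 3. If worker 3 finishes and parks again before the next completion, it is picked again, so under light load the same few threads do all the work and the rest stay asleep.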
Bill