K-P would not be a mere add-on to groff. K-P knows in advance the shape of the space into which a paragraph must fit, but groff doesn't. This means a whole lot of groff state must be carried along with the K-P dynamic program, which by itself merely needs to keep a set of candidate line-break points.
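To make the point concrete, here is a rough sketch in Python, not groff code and not the full K-P algorithm (no glue, penalties, or hyphenation). It is a least-squares line breaker in which each candidate breakpoint carries its own copy of the formatter state. The State class, the constants, and the pin argument are all made up for illustration: setting the line that holds the pinned word narrows the next IMAGE_LINES lines, so the line length depends on where earlier breaks fell.

```python
from dataclasses import dataclass

# Made-up measurements, in characters and lines (assumptions, not groff values).
WIDE, NARROW, IMAGE_LINES = 30, 18, 2


@dataclass(frozen=True)
class State:
    """Hypothetical stand-in for the formatter state tied to one candidate."""
    narrow_left: int = 0  # upcoming lines that must still be set narrow

    @property
    def line_length(self):
        return NARROW if self.narrow_left else WIDE

    def after_line(self, has_pin):
        # Setting the line that holds the pinned word starts the narrow region;
        # otherwise one narrow line (if any remain) has been used up.
        if has_pin:
            return State(IMAGE_LINES)
        return State(max(0, self.narrow_left - 1))


def break_paragraph(words, pin):
    """Return (cost, lines) minimizing squared trailing slack.

    pin is the index of the word that triggers the narrow region.
    Raises ValueError if some word cannot fit on any line.
    """
    n = len(words)
    # best[i] maps a state to the cheapest (cost, lines) that sets words[:i]
    # and leaves the formatter in that state. Keying by state as well as by
    # position is the extra baggage: the same breakpoint reached through
    # different earlier breaks may face different line lengths.
    best = [dict() for _ in range(n + 1)]
    best[0][State()] = (0, [])
    for i in range(n):
        for state, (cost, lines) in best[i].items():
            width = state.line_length
            length = -1  # cancels the space counted before the first word
            for j in range(i + 1, n + 1):
                length += len(words[j - 1]) + 1
                if length > width:
                    break
                slack = 0 if j == n else width - length  # last line is free
                nxt = state.after_line(i <= pin < j)
                cand = (cost + slack ** 2, lines + [" ".join(words[i:j])])
                if nxt not in best[j] or cand[0] < best[j][nxt][0]:
                    best[j][nxt] = cand
    return min(best[n].values(), key=lambda c: c[0])
```

Calling break_paragraph(text.split(), pin=2) returns the total cost and the list of lines. In this toy the state is a single integer; in groff it would be everything that requests and traps can change, which is where the expense comes from.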
For example, an image around which the text must flow can be pinned to a word (actually to whatever line that word finally appears in) by non-breaking 'll and 'in requests. The end of the image, a fixed distance ahead, can then be specified by .wh. All of these settings are made dynamically as type is set and thus vary with breakpoint choice.

I have suggested a conceptually simple, but probably exorbitantly expensive, implementation: run multiple copies of groff in parallel, each picking different breakpoints. Kill off all but the best instance at each paragraph end. The degree of parallelism could be limited to the maximum number of words on a line, at the cost of IPC among the instances of groff. Now it gets ugly!

Doug