I finally have Knuth's "Digital Typography" and have been letting his well-written words enlighten me. In [1] Simon outlined different strategies for page-breaking, closely following the approaches defined by Knuth. At first glance, I'd say that "best-fit" is probably the obvious strategy to select, especially if TeX is happy with it. It can't find the optimal solution, of course, but the additional overhead (memory and CPU power) of a look-ahead/total-fit strategy is simply too much and unnecessary for things like invoices and insurance policies, which are surely some of the most popular use cases of XSL-FO. Here, speed is extremely important. People writing documentation (maybe using DocBook) or glossy stock reports have additional requirements and don't mind the longer processing time and higher memory consumption. This leads me to the question of whether we shouldn't actually implement two page-breaking strategies (eventually, not both right now). For a speed-optimized algorithm, we could even think about ignoring side-floats.
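To make the "no look-ahead" idea a bit more concrete, here is a minimal sketch in Java. Everything in it is an assumption for illustration only: the PageBreaker interface, the class names and the reduction of the content to a plain array of rigid element heights are hypothetical and not existing FOP code. With rigid heights and no stretch or shrink, choosing the best break for one page in isolation boils down to filling the page as far as possible, so that is what the sketch does. Keeps, penalties, floats and footnotes are ignored.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical common contract that both strategies could implement. */
interface PageBreaker {
    /** Returns, for each page, the index one past its last element. */
    List<Integer> findBreaks(int[] elementHeights, int pageHeight);
}

/** Greedy breaker in the spirit of best-fit: decide page by page, never look back. */
class BestFitPageBreaker implements PageBreaker {
    @Override
    public List<Integer> findBreaks(int[] elementHeights, int pageHeight) {
        List<Integer> breaks = new ArrayList<>();
        int used = 0; // height already consumed on the current page
        for (int i = 0; i < elementHeights.length; i++) {
            int h = elementHeights[i];
            // Break before element i if it no longer fits. An oversized
            // element on an empty page is placed anyway (it overflows).
            if (used > 0 && used + h > pageHeight) {
                breaks.add(i);
                used = 0;
            }
            used += h;
        }
        if (elementHeights.length > 0) {
            breaks.add(elementHeights.length); // close the last page
        }
        return breaks;
    }
}

class PageBreakSketch {
    public static void main(String[] args) {
        int[] heights = {40, 40, 30, 50, 20, 60};
        PageBreaker breaker = new BestFitPageBreaker();
        System.out.println(breaker.findBreaks(heights, 100)); // prints [2, 5, 6]
    }
}
```

A total-fit implementation would sit behind the same interface but keep a set of candidate breaks alive and pick the sequence with the lowest overall cost, which is where the extra memory and CPU time would go.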
If we go for two strategies, we would have to make sure that both share a common model. For example, we still have to make sure that the line layout gets information on the available IPD for each line, but this will probably not be a big problem to add later. An enhanced/adjusted box/glue/penalty model sounds like a good idea to me, especially since Knuth hints at that in his book, too. There's also the question of whether part of the infrastructure from line breaking can be reused for page breaking, but I guess rather not.

As for the plan to implement a new page-breaking mechanism: I've got to do it now. :-) I'm sorry if this puts some pressure on some of you. I'm also not sure I'm up to tackling it yet, but I've got to do it anyway. Since I don't want to work with a series of patches like you guys did earlier, I'd like to create a branch to do that on as soon as we've agreed on a strategy. Any objections to that?

[1] http://wiki.apache.org/xmlgraphics-fop/PageLayout

Jeremias Maerki