On 04 Jun 2009, at 14:11, Simon Pepping wrote:
Hi Ben, Simon & Vincent,
<snip />
Indeed, it is a horrible hack with regard to the meaning of a
page-sequence. But it is an interesting solution to the problem of
influencing FOP's page breaking algorithm.
The very same thoughts over here. A really interesting showcase of
what FOP can/should do, but I'd go about the implementation
differently. Still a worthwhile overview of what needs to happen,
albeit behind the scenes, without requiring the user to do anything
special.
<snip />
B.T.W., why does the algorithm not stop at hard page breaks?
If I recall correctly from recent debug sessions, it does. Well, it's
not really the algorithm that stops...
If the FlowLM signals a forced page-break, the current block-list is
returned, the page-breaks are computed, and the areas are immediately
added to the area tree. After that, the PageBreaker resumes fetching
the following block-lists. The breaks for the latter part are computed
later by an entirely separate PageBreakingAlgorithm instance. In fact,
this is one scenario where line-breaking continues with a possibly
different available i-p-d (inline-progression-dimension).
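To make the control flow concrete, here is a minimal sketch (these are
not FOP's actual classes; the class and method names are made up for
illustration) of how a forced break splits the element sequence, so
that each segment can be handed to its own breaking-algorithm run:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative sketch, not FOP code: each segment up to a forced
 * break is returned as its own block-list, and in real FOP each
 * segment would then be processed by a separate
 * PageBreakingAlgorithm instance (possibly with a different i-p-d).
 */
public class ForcedBreakSketch {

    // "FORCED_BREAK" stands in for a forced-break Knuth element.
    static List<List<String>> splitAtForcedBreaks(List<String> elements) {
        List<List<String>> segments = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String el : elements) {
            if ("FORCED_BREAK".equals(el)) {
                segments.add(current);       // return current block-list
                current = new ArrayList<>(); // resume with a fresh one
            } else {
                current.add(el);
            }
        }
        segments.add(current);
        return segments;
    }

    public static void main(String[] args) {
        List<String> flow = List.of("block1", "block2", "FORCED_BREAK", "block3");
        System.out.println(splitAtForcedBreaks(flow));
    }
}
```

The point of the sketch is only that the algorithm itself never "sees"
elements past the forced break; the split happens before it runs.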
Span-changes are another example where FOP currently already processes
part of the page-sequence with a different PageBreakingAlgorithm.
I seem to recall that in the past this happened for hard line breaks.
This is indeed not so. A hard line-break merely triggers the end of
the current Paragraph and starts a new one (an empty one, if it
contains only a preserved linefeed, to produce a blank line), but the
main getNextKnuthElements() loop is not interrupted. The forced breaks
do, however, help the algorithm. I once ran a test with a document
containing a single fo:block holding the pre-formatted text of an
entire book. Without 'linefeed-treatment="preserve"', FOP needed at
least 768MB of heap to avoid running out of memory, because it had to
recompute all the line-breaks. Preserving the linefeeds, I needed only
64MB (maybe even less, but I don't think I tried that).
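The memory difference follows from the paragraph splitting described
above: with preserved linefeeds, the line-breaking algorithm only ever
works on one short Paragraph at a time instead of the whole text. A
hypothetical sketch of that splitting (again, not FOP's actual code):

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative sketch, not FOP code: with
 * linefeed-treatment="preserve", every preserved linefeed ends the
 * current Paragraph and starts a new one, so each Paragraph is
 * line-broken independently.
 */
public class PreservedLinefeedSketch {

    static List<String> splitIntoParagraphs(String text) {
        List<String> paragraphs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c == '\n') {
                // Preserved linefeed: close the current Paragraph.
                // An empty one produces a blank line in the output.
                paragraphs.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        paragraphs.add(current.toString());
        return paragraphs;
    }

    public static void main(String[] args) {
        String text = "first line\n\nthird line";
        // The empty middle entry corresponds to an empty Paragraph.
        System.out.println(splitIntoParagraphs(text));
    }
}
```

Each resulting paragraph is a small, independent line-breaking
problem, which is why the heap requirement drops so dramatically.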
Regards
Andreas