RE: Apache FOP 0.95 Patch

Ben Wuest Thu, 04 Jun 2009 05:53:47 -0700

I agree that this should happen behind the scenes without the user
having to specify anything.  I had to perform this work-around the way I
did because of our current time constraints.  Hopefully, this can lead
to something else. Unfortunately, my knowledge of the Apache FOP Source
only extends to the last week of work getting this work-around in place.

I am pretty sure, however, that to implement memory management
effectively for FOP (behind the scenes) for RTF and PDF, the two
handlers (RTFHandler and AreaTreeHandler) will have to be modified.
Having said that, I think it will be much easier to modify the RTF
rendering because it does not use the Page Breaking Algorithm.  

Regards,

Ben.

-----Original Message-----
From: Andreas Delmelle [mailto:andreas.delme...@telenet.be] 
Sent: Thursday, June 04, 2009 9:35 AM
To: fop-dev@xmlgraphics.apache.org
Subject: Re: Apache FOP 0.95 Patch

On 04 Jun 2009, at 14:11, Simon Pepping wrote:

Hi Ben, Simon & Vincent,

>> <snip />
>
> Indeed, it is a horrible hack with regard to the meaning of a
> page-sequence. But it is an interesting solution to the problem of
> influencing FOP's page breaking algorithm.

The very same thoughts over here. A really interesting showcase of  
what FOP can/should do, but I'd go about the implementation  
differently. Still a worthwhile overview of what needs to happen,  
albeit behind the scenes, without requiring the user to do anything  
special.

> <snip />
> B.T.W., why does the algorithm not stop at hard page breaks?

IIC from recent debug-sessions, it does. Well, it's not really the  
algorithm that stops...
If the FlowLM signals a forced page-break, the current block-list is  
returned, page-breaks are computed and the areas are immediately added  
to the tree. After that, the PageBreaker resumes fetching the  
following block-lists. The breaks for the latter part are computed  
later by an entirely separate PageBreakingAlgorithm. In fact, this is  
one scenario where the line-breaking continues with a possibly  
different available i-p-d.

Span-changes are another example where FOP currently already processes  
part of the page-sequence with a different PageBreakingAlgorithm.

> I seem to recall that in the past this happened for hard line breaks.

This is indeed not so. Hard line-breaks just trigger the end of the  
current Paragraph and start a new one (an empty one, if it only  
contains a preserved linefeed, to produce a blank line), but the main  
getNextKnuthElements() loop is not interrupted. The forced breaks do,  
however, help the algorithm. I once ran a test with a document  
containing one single fo:block with the pre-formatted text of an  
entire book. Without 'linefeed-treatment="preserve"', FOP needed at  
least 768MB to avoid running out of memory, because it had to  
recompute all the line-breaks. Preserving the linefeeds, I needed only  
64MB (maybe even lower, but I don't think I tried that).

Regards

Andreas

RE: Apache FOP 0.95 Patch

Reply via email to