date:20090604

DO NOT REPLY [Bug 47296] Referenced Fill URL not applied when PDF Encrypted

2009-06-04 Thread bugzilla

https://issues.apache.org/bugzilla/show_bug.cgi?id=47296





--- Comment #1 from Andreas L. Delmelle   2009-06-04 14:05:32 PST ---
Hi Lea

Sorry for the rather late reply...

Can you also attach the referenced PNG? That would make it slightly easier for
us to reproduce the issue using the attached SVG. Thanks!

Andreas

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.

Re: R: Apache FOP 0.95 Patch

2009-06-04 Thread Andreas Delmelle


On 04 Jun 2009, at 15:36, Laera Dario wrote:

Hi Dario


>> I once ran a test with a document
>> containing one single fo:block with the pre-formatted text of an
>> entire book. Without 'linefeed-treatment="preserve"', FOP needed at
>> least 768MB to avoid running out of memory, because it had to
>> recompute all the line-breaks. Preserving the linefeeds, I needed only
>> 64MB (maybe even lower, but I don't think I tried that).
>
> Andreas, this reminds me of an old thread:
> http://markmail.org/thread/j3zg47pfwjjn3y6v. Maybe the adjustment ratio
> has some responsibility for the high memory consumption.


Some, yes, but not all. It was a simple one-column layout, so the  
default size for a space should be roughly OK in relation to the line- 
length.
The fact remains that FOP has to recompute and reconsider all of the  
break-possibilities between every word, while using forced line-breaks  
automatically generates -INFINITE penalties, so the algorithm has no  
choice but to consider those breaks as definitive. A similar logic  
holds for page-sequences, I think. If you use huge amounts of tiny  
fo:blocks without forced page-breaks, line-breaking will not require  
that much memory, but page-breaking will...
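
To make the point about forced breaks concrete, here is a minimal sketch in Python. It is not FOP's code: a simplified minimum-raggedness optimizer stands in for the Knuth-style algorithm, and the names break_lines and break_with_forced are invented for illustration. The assumption it demonstrates is only that a forced break (a -INFINITE penalty) lets each segment be solved on its own, so the working tables never grow beyond the longest segment.

```python
from typing import List


def break_lines(words: List[str], width: int) -> List[str]:
    """Minimum-raggedness line breaking by dynamic programming.

    cost[i] is the best total cost for laying out words[i:]. Every word
    boundary is a candidate break, so the tables grow with len(words).
    """
    n = len(words)
    if n == 0:
        return [""]  # an empty paragraph still produces one blank line
    inf = float("inf")
    cost = [inf] * n + [0.0]
    nxt = [n] * (n + 1)
    for i in range(n - 1, -1, -1):
        length = -1
        for j in range(i + 1, n + 1):
            length += 1 + len(words[j - 1])
            if length > width and j > i + 1:
                break  # line too long; a single overlong word is tolerated
            # The last line is free; other lines pay their squared slack.
            slack = 0 if j == n else max(width - length, 0) ** 2
            if slack + cost[j] < cost[i]:
                cost[i] = slack + cost[j]
                nxt[i] = j
    lines, i = [], 0
    while i < n:
        lines.append(" ".join(words[i:nxt[i]]))
        i = nxt[i]
    return lines


def break_with_forced(text: str, width: int) -> List[str]:
    """Treat every linefeed as a forced break.

    The optimizer never looks across a forced break, so each segment is
    solved independently and its tables can be dropped before the next.
    """
    lines: List[str] = []
    for segment in text.split("\n"):
        lines.extend(break_lines(segment.split(), width))
    return lines
```

With a whole book in one call to break_lines, the tables cover every word at once; with break_with_forced, they cover one pre-formatted line at a time.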



Regards

Andreas

R: Apache FOP 0.95 Patch

2009-06-04 Thread Laera Dario

Hi Ben,

Very good idea. I'm facing the same issue with very long reports, and your
solution may be helpful.

> -----Original Message-----
> From: Andreas Delmelle [mailto:andreas.delme...@telenet.be]
> Sent: Thursday, 4 June 2009 14:35
> To: fop-dev@xmlgraphics.apache.org
> Subject: Re: Apache FOP 0.95 Patch



> I once ran a test with a document
> containing one single fo:block with the pre-formatted text of an
> entire book. Without 'linefeed-treatment="preserve"', FOP needed at
> least 768MB to avoid running out of memory, because it had to
> recompute all the line-breaks. Preserving the linefeeds, I needed only
> 64MB (maybe even lower, but I don't think I tried that).

Andreas, this reminds me of an old thread:
http://markmail.org/thread/j3zg47pfwjjn3y6v. Maybe the adjustment ratio has
some responsibility for the high memory consumption.


Dario




Re: Apache FOP 0.95 Patch

2009-06-04 Thread Andreas Delmelle


On 04 Jun 2009, at 14:53, Ben Wuest wrote:

Hi Ben


> I agree that this should happen behind the scenes without the user
> having to specify anything.  I had to perform this work-around the way I
> did because of our current time constraints.  Hopefully, this can lead
> to something else. Unfortunately, my knowledge of the Apache FOP Source
> only extends to the last week of work getting this work-around in place.


All the more credit to you then, if you managed to understand
enough to do it in one week! :-)


If you ever considered becoming a more steady contributor, you are  
very welcome to join us (as far as I'm concerned). Note that you will  
probably want to check our code-style guidelines in that case (minor  
details, but important for getting patches committed sooner rather  
than later).



> I am pretty sure, however, that to implement memory management
> effectively for FOP (behind the scenes) for RTF and PDF, the two
> handlers (RTFHandler and AreaTreeHandler) will have to be modified.
> Having said that, I think it will be much easier to modify the RTF
> rendering because it does not use the Page Breaking Algorithm.


Definitely true. For the AreaTreeHandler, the code that is currently  
in endPageSequence() would have to be relocated to trigger the  
doLayout() loop sooner. I've considered starting this at some point,  
but quickly ran into some 'walls'. The whole layout loop is  
implemented so ... tightly (?) As you noticed, disentangling that web  
is really far from trivial.


Anyway, thanks again for this contribution!


Andreas

RE: Apache FOP 0.95 Patch

2009-06-04 Thread Ben Wuest

I agree that this should happen behind the scenes without the user
having to specify anything.  I had to perform this work-around the way I
did because of our current time constraints.  Hopefully, this can lead
to something else. Unfortunately, my knowledge of the Apache FOP Source
only extends to the last week of work getting this work-around in place.

I am pretty sure, however, that to implement memory management
effectively for FOP (behind the scenes) for RTF and PDF, the two
handlers (RTFHandler and AreaTreeHandler) will have to be modified.
Having said that, I think it will be much easier to modify the RTF
rendering because it does not use the Page Breaking Algorithm.  

Regards,

Ben.

-----Original Message-----
From: Andreas Delmelle [mailto:andreas.delme...@telenet.be] 
Sent: Thursday, June 04, 2009 9:35 AM
To: fop-dev@xmlgraphics.apache.org
Subject: Re: Apache FOP 0.95 Patch

On 04 Jun 2009, at 14:11, Simon Pepping wrote:

Hi Ben, Simon & Vincent,

>> 
>
> Indeed, it is a horrible hack with regard to the meaning of a
> page-sequence. But it is an interesting solution to the problem of
> influencing FOP's page breaking algorithm.

The very same thoughts over here. A really interesting showcase of  
what FOP can/should do, but I'd go about the implementation  
differently. Still a worthwhile overview of what needs to happen,  
albeit behind the scenes, without requiring the user to do anything  
special.

> 
> B.T.W., why does the algorithm not stop at hard page breaks?

IIC from recent debug-sessions, it does. Well, it's not really the  
algorithm that stops...
If the FlowLM signals a forced page-break, the current block-list is  
returned, page-breaks are computed and the areas are immediately added  
to the tree. After that, the PageBreaker resumes fetching the  
following block-lists. The breaks for the latter part are computed  
later by an entirely separate PageBreakingAlgorithm. In fact, this is  
one scenario where the line-breaking continues with a possibly  
different available i-p-d.

Span-changes are another example where FOP currently already processes  
part of the page-sequence with a different PageBreakingAlgorithm.

> I seem to recall that in the past this happened for hard line breaks.

This is indeed not so. Hard line-breaks just trigger the end of the  
current Paragraph and start a new one (an empty one, if it only  
contains a preserved linefeed, to produce a blank line), but the main  
getNextKnuthElements() loop is not interrupted. The forced breaks do,  
however, help the algorithm. I once ran a test with a document  
containing one single fo:block with the pre-formatted text of an  
entire book. Without 'linefeed-treatment="preserve"', FOP needed at  
least 768MB to avoid running out of memory, because it had to  
recompute all the line-breaks. Preserving the linefeeds, I needed only  
64MB (maybe even lower, but I don't think I tried that).

Regards

Andreas

Re: Apache FOP 0.95 Patch

2009-06-04 Thread Andreas Delmelle


On 04 Jun 2009, at 14:11, Simon Pepping wrote:

Hi Ben, Simon & Vincent,





> Indeed, it is a horrible hack with regard to the meaning of a
> page-sequence. But it is an interesting solution to the problem of
> influencing FOP's page breaking algorithm.


The very same thoughts over here. A really interesting showcase of  
what FOP can/should do, but I'd go about the implementation  
differently. Still a worthwhile overview of what needs to happen,  
albeit behind the scenes, without requiring the user to do anything  
special.




> B.T.W., why does the algorithm not stop at hard page breaks?


IIC from recent debug-sessions, it does. Well, it's not really the  
algorithm that stops...
If the FlowLM signals a forced page-break, the current block-list is  
returned, page-breaks are computed and the areas are immediately added  
to the tree. After that, the PageBreaker resumes fetching the  
following block-lists. The breaks for the latter part are computed  
later by an entirely separate PageBreakingAlgorithm. In fact, this is  
one scenario where the line-breaking continues with a possibly  
different available i-p-d.


Span-changes are another example where FOP currently already processes  
part of the page-sequence with a different PageBreakingAlgorithm.



> I seem to recall that in the past this happened for hard line breaks.


This is indeed not so. Hard line-breaks just trigger the end of the  
current Paragraph and start a new one (an empty one, if it only  
contains a preserved linefeed, to produce a blank line), but the main  
getNextKnuthElements() loop is not interrupted. The forced breaks do,  
however, help the algorithm. I once ran a test with a document  
containing one single fo:block with the pre-formatted text of an  
entire book. Without 'linefeed-treatment="preserve"', FOP needed at  
least 768MB to avoid running out of memory, because it had to  
recompute all the line-breaks. Preserving the linefeeds, I needed only  
64MB (maybe even lower, but I don't think I tried that).
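
For anyone who wants to repeat that kind of test, the following Python sketch generates such a document. It is only an illustration: the function name single_block_fo, the master name "page" and the A4 page geometry are assumptions, not details of the original test. The FO element and attribute names themselves come from XSL-FO.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag: str) -> str:
    """Return the namespace-qualified name of an fo: element."""
    return f"{{{FO_NS}}}{tag}"


def single_block_fo(text: str, preserve_linefeeds: bool) -> str:
    """Build an FO document holding all of `text` in one single fo:block."""
    root = ET.Element(fo("root"))
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(
        masters,
        fo("simple-page-master"),
        {
            "master-name": "page",  # assumed name, referenced below
            "page-width": "210mm",
            "page-height": "297mm",
            "margin": "20mm",
        },
    )
    ET.SubElement(page, fo("region-body"))
    sequence = ET.SubElement(root, fo("page-sequence"), {"master-reference": "page"})
    flow = ET.SubElement(sequence, fo("flow"), {"flow-name": "xsl-region-body"})
    block = ET.SubElement(flow, fo("block"))
    if preserve_linefeeds:
        # Each linefeed in the text then acts as a forced line break.
        block.set("linefeed-treatment", "preserve")
    block.text = text
    return ET.tostring(root, encoding="unicode")
```

Writing both variants to disk and rendering each under different JVM heap limits (-Xmx) would be one way to compare memory needs; the figures will depend on the text and the FOP version.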




Regards

Andreas

Re: Apache FOP 0.95 Patch

2009-06-04 Thread Simon Pepping

On Thu, Jun 04, 2009 at 11:35:17AM +0100, Vincent Hennebert wrote:
> It is likely to interest other users who run into similar memory issues,
> and the good thing of having made it against the 0.95 release is that it
> won't be made obsolete by further changes in the code.
> 
> We are not going to apply this patch to the Trunk, though. This is
> a workaround that, although quite clever, distorts too much the
> semantics of the fo:page-sequence element. A page sequence really is
> a self-contained set of typographical material, that should start and
> end on its own pages (the common analogy is the chapter of a book).

Indeed, it is a horrible hack with regard to the meaning of a
page-sequence. But it is an interesting solution to the problem of
influencing FOP's page breaking algorithm. Effectively, it inserts
points where it guides the algorithm to restrict its optimization to
the region up to that point. As long as we do not have a fundamental
solution to the problem, causing OOM situations for users, this is a
prototype for a nice interim solution: some hint that tells the
page-breaking algorithm to stop here, calculate and ship the pages,
and then resume.
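
As a rough picture of such a hint, here is a small Python sketch. It is an assumption-laden toy, not FOP's PageBreaker: a greedy page filler stands in for the optimizing page-breaking algorithm, blocks are reduced to their heights, and FORCED_PAGE_BREAK, paginate and layout_stream are hypothetical names. It shows only the control flow: buffer blocks up to the hint, compute and ship those pages, then resume with an empty buffer.

```python
from typing import Iterable, Iterator, List, Union

# Hypothetical marker standing in for a hard page break in the input stream.
FORCED_PAGE_BREAK = object()


def paginate(blocks: List[int], page_height: int) -> List[List[int]]:
    """Greedy stand-in for the page breaker: fill each page top to bottom."""
    pages: List[List[int]] = []
    current: List[int] = []
    used = 0
    for height in blocks:
        if current and used + height > page_height:
            pages.append(current)
            current, used = [], 0
        current.append(height)
        used += height
    if current:
        pages.append(current)
    return pages


def layout_stream(
    items: Iterable[Union[int, object]], page_height: int
) -> Iterator[List[int]]:
    """Yield finished pages, buffering only the blocks since the last hint."""
    pending: List[int] = []
    for item in items:
        if item is FORCED_PAGE_BREAK:
            # Stop here, calculate and ship the pages, then resume.
            yield from paginate(pending, page_height)
            pending = []
        else:
            pending.append(item)
    yield from paginate(pending, page_height)
```

The memory held at any moment is bounded by the largest run of blocks between two hints rather than by the whole page-sequence.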

B.T.W., why does the algorithm not stop at hard page breaks? I seem to
recall that in the past this happened for hard line breaks. It may
have been me who undid that, in my work on block elements in line
elements. I may have done that mostly because it seemed an unnecessary
complication at that time. In the context of the current problem it
may be good to revisit that. A hard page break is a much better hint
than a pseudo-page-sequence.

Regards, Simon

-- 
Simon Pepping
home page: http://www.leverkruid.eu

DO NOT REPLY [Bug 47314] [PATCH] Suppress page breaks between page sequences

2009-06-04 Thread bugzilla

https://issues.apache.org/bugzilla/show_bug.cgi?id=47314


Laera Dario  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |lae...@ima.it





RE: Apache FOP 0.95 Patch

2009-06-04 Thread Ben Wuest

Vincent -

I agree this is a work-around and does distort the semantics of the 
fo:page-sequence element.  When I opened up the FOP 0.95 Source last week, it 
became apparent that trying to interject where FOP starts its layout would be 
time consuming. Currently the handlers are directly tied to the FO to only 
start rendering when a page-sequence is closed.  

I would like to point out that the patched code does not start rendering 
earlier than this.  It simply provides a method of continuing the rendering 
without a page break.  Again, this does not conform to the semantics of a 
page-sequence -- in that -- a page-sequence should be a set of pages that start 
and end on its own pages.  However, there was no other way (at least that I 
could see) within the confines of the time constraints I had to provide a 
work-around for FOP to manage the memory correctly.  
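
The idea of continuing without a page break can be sketched as follows in Python. This is not the patch itself: it assumes, following the patch description, that the unfinished last page of one page-sequence and its remaining BPD are handed to the next one. Blocks are reduced to their heights, page filling is greedy, and layout_sequence and layout_document are hypothetical names.

```python
from typing import List, Optional, Tuple

Page = List[int]
Carry = Tuple[Page, int]  # (unfinished page, remaining BPD on it)


def layout_sequence(
    blocks: List[int], page_bpd: int, carry: Optional[Carry] = None
) -> Tuple[List[Page], Carry]:
    """Lay out one page-sequence, optionally starting on a carried-over page."""
    if carry is None:
        current, remaining = [], page_bpd
    else:
        current, remaining = list(carry[0]), carry[1]
    finished: List[Page] = []
    for bpd in blocks:
        if current and bpd > remaining:
            finished.append(current)
            current, remaining = [], page_bpd
        current.append(bpd)
        remaining -= bpd
    return finished, (current, remaining)


def layout_document(
    sequences: List[List[int]], page_bpd: int, join: bool
) -> List[Page]:
    """join=True joins sequences; join=False breaks the page after each one."""
    pages: List[Page] = []
    carry: Optional[Carry] = None
    for blocks in sequences:
        finished, carry = layout_sequence(blocks, page_bpd, carry)
        pages.extend(finished)
        if not join:
            if carry[0]:
                pages.append(carry[0])  # close the page, as before the patch
            carry = None
    if carry is not None and carry[0]:
        pages.append(carry[0])
    return pages
```

Only one page-sequence is laid out at a time in either mode; the joined mode merely keeps the last, partly filled page open for the next sequence.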

So, having said this, the patch allows RTF and PDF Rendering to work in the
original fashion or in the new modified approach.  The default behavior is for a
page-sequence not to trigger a page break at the end of its content.  To make
your document render as it did pre-patch, the page-sequence is specified as:

<fo:page-sequence break-after="page"> ... Contents ... </fo:page-sequence>

I understand why you are not going to blend this into Trunk.  Maybe some of the 
work provided in this patch can be used to provide a new mechanism for 
improving the memory management of FOP in which rendering is not tied to the 
end of page-sequences. 

Cheers.

Ben.

-----Original Message-----
From: Vincent Hennebert [mailto:vhenneb...@gmail.com] 
Sent: Thursday, June 04, 2009 7:35 AM
To: fop-dev@xmlgraphics.apache.org
Cc: Chris Fanjoy; Jody Brownell; Glen Campbell
Subject: Re: Apache FOP 0.95 Patch

Hi Ben,

Thank you very much for your interest in FOP and your contribution. I've
opened a Bugzilla issue containing your patch so that it can easily be
referred to:
https://issues.apache.org/bugzilla/show_bug.cgi?id=47314

It is likely to interest other users who run into similar memory issues,
and the good thing of having made it against the 0.95 release is that it
won't be made obsolete by further changes in the code.

We are not going to apply this patch to the Trunk, though. This is
a workaround that, although quite clever, distorts too much the
semantics of the fo:page-sequence element. A page sequence really is
a self-contained set of typographical material, that should start and
end on its own pages (the common analogy is the chapter of a book).

Rather, at some point we will have to tackle that artificial limitation
of starting the layout only when the end of the page sequence has been
reached. It's possible to start earlier, like you have somehow proved
it. Also, we are planning to implement several layout options, providing
different trade-off between speed/memory consumption and typographical
quality. Eventually it should  no longer be necessary to split the
document into several page sequences to avoid memory issues.

Still, meanwhile your patch may save the lives of users who need a
quick solution to that problem.

I hope you understand.
Thanks again,
Vincent

Ben Wuest wrote:
> Hi -
> 
>  
> 
> We recently integrated Apache FOP 0.95 with our software to perform the 
> rendering of RTF and PDF reports.  This integration was very quick and 
> provided great results. However, due to the large amounts of data that our 
> software is required to handle, we began experiencing Out of Memory problems 
> with FOP.  We researched this and sent letters to the user community and 
> determined that we were experiencing OOM issues because each of our 
> reports existed in one page-sequence.  We concluded from the 
> community response, Web Forums, and analysis of the Apache FOP code itself 
> that FOP reads to the end of a page sequence and then begins to render.   
> With the large amounts of data ( 40 Mb FO files ) we quickly ran into 
> scalability issues with one page-sequence per report.  At this point we 
> divided up our reports into multiple page-sequences only to find that FOP 
> starts a new page on every page sequence and this behavior can not be changed 
> (through the means of altering the FO file).  Page breaking at unpredictable 
> locations (sometimes leaving half or ¾ pages empty) made the report 
> presentation visually unacceptable.
> 
>  
> 
> We have modified the Apache 0.95 code for PDF and RTF Rendering and would 
> like to offer this patch back to the community (the attached SVN diff is from 
> the 0.95 release baseline).   Listed below is an overview of the 
> modifications that have been made.  

> 
>  
> 
> 1.   Page Sequence Changes
> 
>  
> 
> The handling of the break-after attribute was added to the page-sequence.  
> This can only be set to auto (meaning that no page break will occur after the 
> page-sequence) or page (meaning that a page-break will occur after the 
> page-sequence).  There wasn't another suitabl

Re: Apache FOP 0.95 Patch

2009-06-04 Thread Vincent Hennebert

Hi Ben,

Thank you very much for your interest in FOP and your contribution. I’ve
opened a Bugzilla issue containing your patch so that it can easily be
referred to:
https://issues.apache.org/bugzilla/show_bug.cgi?id=47314

It is likely to interest other users who run into similar memory issues,
and the good thing of having made it against the 0.95 release is that it
won’t be made obsolete by further changes in the code.

We are not going to apply this patch to the Trunk, though. This is
a workaround that, although quite clever, distorts too much the
semantics of the fo:page-sequence element. A page sequence really is
a self-contained set of typographical material, that should start and
end on its own pages (the common analogy is the chapter of a book).

Rather, at some point we will have to tackle that artificial limitation
of starting the layout only when the end of the page sequence has been
reached. It’s possible to start earlier, like you have somehow proved
it. Also, we are planning to implement several layout options, providing
different trade-off between speed/memory consumption and typographical
quality. Eventually it should  no longer be necessary to split the
document into several page sequences to avoid memory issues.

Still, meanwhile your patch may save the lives of users who need a
quick solution to that problem.

I hope you understand.
Thanks again,
Vincent


Ben Wuest wrote:
> Hi -
> 
>  
> 
> We recently integrated Apache FOP 0.95 with our software to perform the 
> rendering of RTF and PDF reports.  This integration was very quick and 
> provided great results. However, due to the large amounts of data that our 
> software is required to handle, we began experiencing Out of Memory problems 
> with FOP.  We researched this and sent letters to the user community and 
> determined that we were experiencing OOM issues because each of our 
> reports existed in one page-sequence.  We concluded from the 
> community response, Web Forums, and analysis of the Apache FOP code itself 
> that FOP reads to the end of a page sequence and then begins to render.   
> With the large amounts of data ( 40 Mb FO files ) we quickly ran into 
> scalability issues with one page-sequence per report.  At this point we 
> divided up our reports into multiple page-sequences only to find that FOP 
> starts a new page on every page sequence and this behavior can not be changed 
> (through the means of altering the FO file).  Page breaking at unpredictable 
> locations (sometimes leaving half or ¾ pages empty) made the report 
> presentation visually unacceptable.
> 
>  
> 
> We have modified the Apache 0.95 code for PDF and RTF Rendering and would 
> like to offer this patch back to the community (the attached SVN diff is from 
> the 0.95 release baseline).   Listed below is an overview of the 
> modifications that have been made.  
> 
>  
> 
> 1.   Page Sequence Changes
> 
>  
> 
> The handling of the break-after attribute was added to the page-sequence.  
> This can only be set to auto (meaning that no page break will occur after the 
> page-sequence) or page (meaning that a page-break will occur after the 
> page-sequence).  There wasn't another suitable attribute to use (from the 
> xsl-fo standard) so break-after was employed.
> 
>  
> 
> 2.   PDF Rendering
> 
>  
> 
> In PDF rendering, the page-sequence handling was modified to consider the new 
> attribute on a page sequence (described in #1 above).  The rendering was 
> modified to save the last page of a page sequence for the next one and track 
> the available BPD.  In this fashion, when the next page sequence starts it 
> checks for an unfinished page on the previous page sequence and uses that to 
> begin its Page Breaking Algorithm.  This required some small changes in the 
> Page Break algorithm to read available BPD from carry over pages.
> 
>  
> 
> 3.   RTF Rendering
> 
>  
> 
> In RTF Rendering, the page-sequence handling was also modified to consider 
> the new attribute on a page sequence (described in #1 above). The RTFHandler 
> was modified to start a document Area for every page sequence and to release 
> the memory on page-sequence once it was complete (it is our impression from 
> JProfiler analysis that it currently does not do this).  We modified the 
> RTFElement class to allow for flags as to whether to write the prefix or 
> suffix.  This enables the rendering to actually write the RtfFile on every 
> page-sequence without closing the document.  In addition this means that 
> RtfFile is flushed (all memory) on every page-sequence (or RtfSection in the 
> context of the rendering).  
> 
>  
> 
>  
> 
> The result of this allows us to render 7000 page RTF and PDF reports that 
> contain continuous tables of over 20,000 records seamlessly (without page 
> breaks in the table that are not at an end of page)
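
The RTF part of that description (write the prefix once, flush on every page-sequence, write the suffix only at the very end) can be illustrated with a short Python sketch. It does not reproduce the RTFHandler or RtfFile classes: SectionWriter and its methods are hypothetical, and the "[BEGIN]" and "[END]" strings are placeholders for the real document prefix and suffix.

```python
import io
from typing import List, TextIO


class SectionWriter:
    """Hypothetical writer that flushes each section without closing the document."""

    PREFIX = "[BEGIN]"  # placeholder for the document prefix
    SUFFIX = "[END]"    # placeholder for the document suffix

    def __init__(self, out: TextIO) -> None:
        self.out = out
        self.started = False
        self.pending: List[str] = []  # content of the current section only

    def add(self, paragraph: str) -> None:
        self.pending.append(paragraph)

    def end_section(self) -> None:
        """Write the buffered section and release its memory."""
        if not self.started:
            self.out.write(self.PREFIX)  # the prefix is written exactly once
            self.started = True
        for paragraph in self.pending:
            self.out.write(paragraph + "\n")
        self.pending.clear()

    def close(self) -> None:
        """Flush whatever is left and only now write the suffix."""
        self.end_section()
        self.out.write(self.SUFFIX)
```

A caller would invoke end_section() wherever a page-sequence ends, so the buffer never holds more than one page-sequence worth of content.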

DO NOT REPLY [Bug 47314] New: [PATCH] Suppress page breaks between page sequences

2009-06-04 Thread bugzilla

https://issues.apache.org/bugzilla/show_bug.cgi?id=47314

   Summary: [PATCH] Suppress page breaks between page sequences
   Product: Fop
   Version: 0.95
  Platform: All
OS/Version: All
Status: NEW
  Severity: normal
  Priority: P2
 Component: page-master/layout
AssignedTo: fop-dev@xmlgraphics.apache.org
ReportedBy: vhenneb...@gmail.com


Created an attachment (id=23757)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=23757)
Patch against FOP 0.95

FOP currently starts layout only when the end of a page sequence has been
reached. This leads to memory issues when a document is made of one big page
sequence, that can't be easily broken down into several smaller ones.

Ben Wuest has submitted a patch against FOP 0.95 that makes it possible to
'join' page sequences together, i.e., to suppress the page break that must
occur between them. The document then looks as if it had been made of only one
page sequence, without the memory issues mentioned above. The PDF and RTF
renderers have been adapted accordingly.

See discussion here: http://markmail.org/thread/mfafprg4xtownmw4


DO NOT REPLY [Bug 46905] [PATCH] Implement keep-*.within-column

2009-06-04 Thread bugzilla

https://issues.apache.org/bugzilla/show_bug.cgi?id=46905





--- Comment #18 from Chris Bowditch   2009-06-04 00:37:12 PST ---
(In reply to comment #17)
> (In reply to comment #16)
> > This won't work. If keep-together.within-column="1" and
> > keep-together.within-page="always" then a break must be completely
> > forbidden at a page. Hinting penalties won't prevent that in every case,
> > for example if the only feasible page break is at such a place.
> OK, I thought so...
> I had this working for strength "always", with the modified implementation for
> Keep.compare() I suggested earlier (comment #4). Anyway, that case is easy.
> The more complicated case is keep-together.within-column="1" and on a nested
> block .within-page="10". Both column-breaks and page-breaks are allowed, but
> the page-breaks should preferably be made before/after the nested block. A
> page-break in the nested block would be permitted only if its content does not
> fit into one page.

I think it is an acceptable limitation that keep-*.within-column works only for
"always". It is already a big improvement on the current situation, where this is
treated as keep-*.within-page.



