Re: Skype-conference on page-breaking?
Thanks, Luca. I had a nice casual talk on the phone with Simon yesterday. Essentially, we only talked about very high-level stuff, especially the decision for a certain strategy (or two). You know I came up with the idea of creating a simpler best-fit strategy with no look-ahead for invoice-style documents, but maybe it would be possible to design your obvious total-fit strategy in a way that it could be used as a best-fit without look-ahead. The problem, as I mentioned already, is the possible change of available IPD within a page-sequence, which may require back-tracking and recalculation of vertical boxes.

Of course, if it's possible to stay with one page-breaking algorithm for all use cases, that would be best (because of the reduced effort), but only if the algorithm is reasonably fast for invoice-style documents. I'm repeatedly confronted with certain speed requirements in this case. Since modern high-volume single-feed printers handle about 180 pages per minute (continuous-feed systems handle over 4 times that speed, but I think that's neither relevant nor realistic here), FOP should be able to operate close to these 180 pages per minute for not-too-complex documents on a modern server. That means about 330 ms per page. Not much. Of course, in such an environment it is possible to distribute the formatting process over several blade servers, but I had to realize that certain companies tend to prefer spending 100'000 dollars on a big server rather than spending a lot less on a much faster CPU-power-oriented setup. It seems to be hard to say good-bye to the old host systems.

Well, that's just what reality looks like in my environment. Simon, for example, is much more interested in book-style documents, where there are other requirements. Speed is not a big issue, but quality is. In the end, I think we need to rate the chosen approach from these two points of view.
These are very contradictory requirements, and it seems quite important to me not to forget that here. Luca, do you think your total-fit approach could be written in a way that handles changing available IPDs, and that look-ahead could be disabled to improve processing speed at the cost of optimal break decisions?

If it's ok for you (and feasible) I'd like to integrate what you already have (in code) into that branch I was talking about. I would like to avoid recreating something you've already started, even if it doesn't work with the changes that happened in the last weeks. Even if we end up creating two different strategies, I'm sure that certain parts will be shared by both approaches, like the creation of Knuth-style elements for the PageLM.

Some more comments inline:

On 04.03.2005 13:23:01 Luca Furini wrote:
> Jeremias Maerki wrote:
> > Would you consider sharing what you already have? This may help us in the general discussion and may be a good starting point.
>
> Ok, I'll try to.
>
> The main change in the LineLM is that the line breaking algorithm does not select only the node in activeList with fewest demerits: all the nodes whose demerits are <= a threshold are used to create LineBreakPositions, so for each paragraph there is a set of layout options (for example, a paragraph could create 8 to 10 lines, 9 being the layout with fewest demerits).

Hmm, that's a feature that I would say only book-style documents will need. Invoice-style documents could live without it.
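A minimal sketch of the selection step described in the quoted paragraph, for readers following the thread. It assumes the comparison meant is "demerits at or below an absolute threshold"; the names ActiveNode, LayoutOptions, withinThreshold and best are hypothetical and are not taken from Luca's code or from FOP.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for one entry of the line breaker's active list. */
final class ActiveNode {
    final int lineCount;    // number of lines this layout produces
    final double demerits;  // total demerits accumulated for this layout

    ActiveNode(int lineCount, double demerits) {
        this.lineCount = lineCount;
        this.demerits = demerits;
    }
}

final class LayoutOptions {
    /** Keeps every node whose demerits do not exceed the threshold (assumed absolute). */
    static List<ActiveNode> withinThreshold(List<ActiveNode> activeList, double threshold) {
        List<ActiveNode> kept = new ArrayList<>();
        for (ActiveNode node : activeList) {
            if (node.demerits <= threshold) {
                kept.add(node);
            }
        }
        return kept;
    }

    /** The classic selection: only the node with the fewest demerits. */
    static ActiveNode best(List<ActiveNode> activeList) {
        if (activeList.isEmpty()) {
            throw new IllegalArgumentException("activeList is empty");
        }
        ActiveNode best = activeList.get(0);
        for (ActiveNode node : activeList) {
            if (node.demerits < best.demerits) {
                best = node;
            }
        }
        return best;
    }
}
```

With layouts of 8, 9 and 10 lines under the threshold, withinThreshold returns all three as layout options, while best still picks the 9-line layout if it has the fewest demerits.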
> According to the value of widows and orphans, the LineLM creates a sequence of elements: besides normal lines, represented by a box, there are optional lines, represented by
>     box(0) penalty(inf,0) glue(0,1,0) box(0)
> and removable lines
>     box(0) penalty(inf,0) glue(1,0,1) box(0)
> A few complications arise if not every possible layout allows breaks between lines, but they all can be solved using boxes, glues and penalties (for example, if a paragraph needs 3 or 4 lines, if it uses 3 it cannot be parted).

Also something that's not all too important for invoice-style documents, although it can't hurt to have it.

> The BlockLM, and a block stacking LM in general, adds elements representing its children's spaces and keep condition, for example adding a 0 penalty or an infinite penalty according to child1.mustKeepWithNext(), child2.mustKeepWithPrevious() and this.mustKeepTogether().

That's certainly a must-have in any case.

> The PageLM, once it has the list of elements representing a whole page-sequence (or the content before a forced page break), calls the same breaking algorithm, using only a different selection method which leaves only one node in activeList.

That's the part where I have a big question mark about changing available IPD. We may have to have a check that figures out if the available IPD changes within a page-sequence by inspecting the page-masters. That would allow us to switch automatically between total-fit and best-fit or maybe even first-fit. A remaining question mark is with side-floats as they influence
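The element sequences and keep penalties discussed in this message can be sketched roughly as follows. This is an illustration under stated assumptions, not code from FOP or from Luca's patch: KnuthElement, INFINITE and keepPenalty are hypothetical names, the argument order is assumed to be penalty(value, width) and glue(width, stretch, shrink), and sizes are counted in lines.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified Knuth-style element; not a class from the FOP code base. */
final class KnuthElement {
    /** Stand-in for the "inf" penalty value; the real constant is an assumption. */
    static final int INFINITE = 1000;

    enum Kind { BOX, GLUE, PENALTY }

    final Kind kind;
    final int width;    // natural size, counted in lines in this sketch
    final int stretch;  // how much a glue may grow
    final int shrink;   // how much a glue may shrink
    final int value;    // penalty value; 0 for boxes and glues

    private KnuthElement(Kind kind, int width, int stretch, int shrink, int value) {
        this.kind = kind;
        this.width = width;
        this.stretch = stretch;
        this.shrink = shrink;
        this.value = value;
    }

    static KnuthElement box(int width) {
        return new KnuthElement(Kind.BOX, width, 0, 0, 0);
    }

    static KnuthElement glue(int width, int stretch, int shrink) {
        return new KnuthElement(Kind.GLUE, width, stretch, shrink, 0);
    }

    static KnuthElement penalty(int value, int width) {
        return new KnuthElement(Kind.PENALTY, width, 0, 0, value);
    }

    /** box(0) penalty(inf,0) glue(0,1,0) box(0): a line that may be added. */
    static List<KnuthElement> optionalLine() {
        return Arrays.asList(box(0), penalty(INFINITE, 0), glue(0, 1, 0), box(0));
    }

    /** box(0) penalty(inf,0) glue(1,0,1) box(0): a line that may be dropped. */
    static List<KnuthElement> removableLine() {
        return Arrays.asList(box(0), penalty(INFINITE, 0), glue(1, 0, 1), box(0));
    }

    /** Penalty a block-stacking LM could place between two adjacent children. */
    static KnuthElement keepPenalty(boolean keepWithNext, boolean keepWithPrevious,
                                    boolean keepTogether) {
        boolean breakForbidden = keepWithNext || keepWithPrevious || keepTogether;
        return penalty(breakForbidden ? INFINITE : 0, 0);
    }
}
```

The three booleans correspond to child1.mustKeepWithNext(), child2.mustKeepWithPrevious() and this.mustKeepTogether() from the quoted text; any one of them being true forbids a break at that point.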
Re: Skype-conference on page-breaking?
I don't know why this is important to you but it's two to three months.

On 04.03.2005 12:40:04 Peter B. West wrote:
> Jeremias Maerki wrote:
> > Sounds very interesting. Would you consider sharing what you already have? This may help us in the general discussion and may be a good starting point.
> >
> > My problem is that I have to deliver working page breaking with keeps, breaks, multi-column, adjustable spacing etc. in a relatively short period of time.
>
> How short?
>
> Peter
> --
> Peter B. West http://cv.pbw.id.au/
> Project Folio http://defoe.sourceforge.net/folio/

Jeremias Maerki
Re: FOP at ApacheCon Europe 2005?
FYI, I've just given myself a shove, followed Bertrand's suggestion and submitted a session proposal for ApacheCon. I feel that our project should be present there. I was also thinking about something like "hidden treasures in the XML Graphics project" but I guess there's not so much meat on that bone to fill one hour.

ApacheCon Europe 2005 CFP submission
Submitter: Jeremias Maerki [EMAIL PROTECTED]
Title: Apache FOP: Optimizing speed and memory consumption
Level: Experienced
Style:
Orientation: Developer
Duration: 60
Categories:
Abstract: Apache FOP is the most popular XSL-FO implementation on the market. It is used in a wide variety of use cases to create documents in PDF, PostScript and other formats. This session will show a number of techniques to improve processing speed and hints on how to handle things like OutOfMemoryErrors. It will also contain a short info block about the state and the future of the project.

On 12.02.2005 10:57:15 Bertrand Delacretaz wrote:
> On 8 Feb. 05, at 19:29, Jeremias Maerki wrote:
> > Most of you will probably have heard that ApacheCon Europe will be happening in July. I think it would be great if FOP would somehow be visible there. There's a call for participation ending 2005-03-04. Any ideas?
>
> A recurring question in my consulting work is "is FOP fast or what?" or more precisely how to tune XSL-FO for FOP to run efficiently, mostly in view of avoiding memory bottlenecks.
>
> Me, I'm not using FOP hands-on enough these days to answer very precisely, I usually just tell them to test their performance on large documents very regularly during development, to avoid surprises. But maybe one of you FOP gurus could give a presentation with more precise information about this?
>
> Just my 2 cents.
>
> -Bertrand

Jeremias Maerki
Re: FOP at ApacheCon Europe 2005?
Fantastic! I hope to be able to do the same someday.

Glen

--- Jeremias Maerki [EMAIL PROTECTED] wrote:
> FYI, I've just given myself a shove, followed Bertrand's suggestion and submitted a session proposal for ApacheCon. I feel that our project should be present there. I was also thinking about something like "hidden treasures in the XML Graphics project" but I guess there's not so much meat on that bone to fill one hour.
>
> ApacheCon Europe 2005 CFP submission
> Submitter: Jeremias Maerki [EMAIL PROTECTED]
> Title: Apache FOP: Optimizing speed and memory consumption
> Level: Experienced
> Style:
> Orientation: Developer
> Duration: 60
> Categories:
> Abstract: Apache FOP is the most popular XSL-FO implementation on the market. It is used in a wide variety of use cases to create documents in PDF, PostScript and other formats. This session will show a number of techniques to improve processing speed and hints on how to handle things like OutOfMemoryErrors. It will also contain a short info block about the state and the future of the project.
>
> On 12.02.2005 10:57:15 Bertrand Delacretaz wrote:
> > On 8 Feb. 05, at 19:29, Jeremias Maerki wrote:
> > > Most of you will probably have heard that ApacheCon Europe will be happening in July. I think it would be great if FOP would somehow be visible there. There's a call for participation ending 2005-03-04. Any ideas?
> >
> > A recurring question in my consulting work is "is FOP fast or what?" or more precisely how to tune XSL-FO for FOP to run efficiently, mostly in view of avoiding memory bottlenecks.
> >
> > Me, I'm not using FOP hands-on enough these days to answer very precisely, I usually just tell them to test their performance on large documents very regularly during development, to avoid surprises. But maybe one of you FOP gurus could give a presentation with more precise information about this?
> >
> > Just my 2 cents.
> >
> > -Bertrand
>
> Jeremias Maerki
Re: Skype-conference on page-breaking?
Jeremias Maerki wrote:
> I don't know why this is important to you

Just curious.

> but it's two to three months.

Ouch. Good luck. You might want to keep an eye on Folio.

Peter

> On 04.03.2005 12:40:04 Peter B. West wrote:
> > Jeremias Maerki wrote:
> > > Sounds very interesting. Would you consider sharing what you already have? This may help us in the general discussion and may be a good starting point.
> > >
> > > My problem is that I have to deliver working page breaking with keeps, breaks, multi-column, adjustable spacing etc. in a relatively short period of time.
> >
> > How short?

--
Peter B. West http://cv.pbw.id.au/
Project Folio http://defoe.sourceforge.net/folio/
Re: future of FOP
Michael,

if you follow the fop-dev mailing list you will realize that development has not come to a stand-still. It is true that the last release is almost two years old. We're in a redesign phase which tries to address exactly the issue of keeps, among other things. The redesign took a lot longer than anticipated. But we're on the right track, so we can start releasing again later this year, complete with keeps. If you can't work around the missing keeps (they work on table-rows) and you need an immediate solution, you will need to switch to a different solution for the time being.

I understand that IBM is quite big in the document business. It would be very interesting if IBM committed to supporting FOP like they do for other open source projects here at the Apache Software Foundation. As far as I know IBM even has its own implementation of XSL-FO, although I don't know if it's actively maintained.

On 07.03.2005 16:27:33 Michael Iwaniewicz wrote:
> Dear FOP developers,
>
> we are a big sw-development and decided recently to change our old bookmaster/afp based print component to XSL-FO. As part of our solution we started to use FOP but ran into formatting problems in the area of the keep-together and keep-with-next options. We got the impression that the FOP development came to a kind of stand-still, since the current version is dated from 2003. I just wanted to ask you if our impression is correct. We now have to decide if we change from FOP to XEP or XSL-Formatter.
>
> Thanks for your help,
> Michael
>
> Michael Iwaniewicz
> CHIS Architecture
> Office: (43-1) 21145-6446
> Mobile: (43) (0) 664-618-5839

Jeremias Maerki
Re: future of FOP
Jeremias Maerki wrote:
<snip/>
> I understand that IBM is quite big in the document business. It would be very interesting if IBM committed to supporting FOP like they do for other open source projects here at the Apache Software Foundation. As far as I know IBM even has its own implementation of XSL-FO although I don't know if it's actively maintained.

I guess you mean the alphaworks XFC project? It is not maintained at all. I posted an "are you still alive?" question back in 2003, still waiting for a reply ;-)

Chris
Re: FOP at ApacheCon Europe 2005?
Jeremias Maerki wrote:
> I was also thinking about something like hidden treasures in the XML Graphics project but I guess there's not so much meat on that bone to fill one hour.

Well, there should be enough for an hour, at least in theory. I couldn't convince my boss (yet) that I have an important mission in Stuttgart in July. If I could, I'd probably talk about:
- Handling fonts in Java, why the AWT font and text rendering subsystem is lame, and what FOP, Batik and perhaps others would expect from an API.
- How to implement flowing text, line breaking and hyphenation efficiently; why the Java BreakIterator and other parts of the Java Unicode support sux0rs; what's behind TR14; Unicode normalization of text before looking it up in a dictionary, and efficient implementation of said dictionary for looking up all substrings in a word (using a trie, a PATRICIA tree or whatever)
- Talk about the question why the algorithms aren't simply copied from Gecko (the Mozilla layout engine)

Now that the deadline has been extended, I'll attempt it again.

J.Pietschmann
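For readers unfamiliar with the dictionary lookup mentioned in the second topic, here is a minimal sketch of a plain trie that reports every dictionary entry occurring as a substring of a word. It is only an illustration of the general technique: SubstringTrie and its methods are hypothetical names, it is not FOP's hyphenation code, and it does not attempt the PATRICIA-style compression the message also mentions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical plain trie; one node per character, no path compression. */
final class SubstringTrie {
    private static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        boolean terminal;  // true if a dictionary entry ends at this node
    }

    private final Node root = new Node();

    /** Adds one dictionary entry (for example a hyphenation pattern). */
    void add(String entry) {
        Node node = root;
        for (int i = 0; i < entry.length(); i++) {
            node = node.next.computeIfAbsent(entry.charAt(i), c -> new Node());
        }
        node.terminal = true;
    }

    /** Returns all entries found as substrings of the word, ordered by start position. */
    List<String> matchesIn(String word) {
        List<String> found = new ArrayList<>();
        for (int start = 0; start < word.length(); start++) {
            Node node = root;
            // Walk the trie from this start position until no entry can continue.
            for (int end = start; end < word.length(); end++) {
                node = node.next.get(word.charAt(end));
                if (node == null) {
                    break;
                }
                if (node.terminal) {
                    found.add(word.substring(start, end + 1));
                }
            }
        }
        return found;
    }
}
```

Each start position costs at most the length of the longest entry, so the whole word is scanned without probing the dictionary once per candidate substring.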
Re: FOP at ApacheCon Europe 2005?
Cool, that would be great stuff. Let's hope your boss lets you off the leash.

On 07.03.2005 23:57:50 J.Pietschmann wrote:
> Jeremias Maerki wrote:
> > I was also thinking about something like hidden treasures in the XML Graphics project but I guess there's not so much meat on that bone to fill one hour.
>
> Well, there should be enough for an hour, at least in theory. I couldn't convince my boss (yet) that I have an important mission in Stuttgart in July. If I could, I'd probably talk about:
> - Handling fonts in Java, why the AWT font and text rendering subsystem is lame, and what FOP, Batik and perhaps others would expect from an API.
> - How to implement flowing text, line breaking and hyphenation efficiently; why the Java BreakIterator and other parts of the Java Unicode support sux0rs; what's behind TR14; Unicode normalization of text before looking it up in a dictionary, and efficient implementation of said dictionary for looking up all substrings in a word (using a trie, a PATRICIA tree or whatever)
> - Talk about the question why the algorithms aren't simply copied from Gecko (the Mozilla layout engine)
>
> Now that the deadline has been extended, I'll attempt it again.
>
> J.Pietschmann

Jeremias Maerki
Re: DO NOT REPLY [Bug 33760] New: - [Patch] current AWTRenderer
I worked on my patch and tried to integrate your input. There are still many issues, but I think the basic structure is OK. You can find a patch attached to bug 33760. Comments inline:

On Mon, 28 Feb 2005, Victor Mote wrote:
> 1. FOray has factored the FOP font logic into a separate module, cleaned it up significantly, and made some modest improvements. A few weeks ago, I aXSL-ized it as well, which means that it is written to a (theoretically) independent interface:
> http://cvs.sourceforge.net/viewcvs.py/axsl/axsl/axsl-font/src/java/org/axsl/font/
> I think there is general support within FOP to implement the FOray/aXSL font work in the FOP 1.0 code, but so far no one has actually taken the time to do it. If you get into messing with fonts at all, I highly recommend that FOray be implemented before doing anything else. I will be happy to support efforts to that end.

From what I understand now, your approach sounds good to me. But I'm missing some major pieces of the picture ATM to start implementing your aXSL interface in FOP. Please let me come back to you when I feel more comfortable with the font mechanism.

On Mon, 28 Feb 2005, Jeremias Maerki wrote:
> > AbstractRenderer: I moved what I could reuse from PDFRenderer to AbstractRenderer: renderTextDecorations(), handleRegionTraits(), and added the needed empty methods.
>
> I think that was good although only time will tell if this will hold for all renderers to come.

In the end, I didn't modify AbstractRenderer, PDFRenderer and PSRenderer at all. The implementation of AWTRenderer is close to the other renderers, so moving some methods into AbstractRenderer should not be a big problem.

> > Speaking of startVParea(), could we rename it to something more meaningful? Proposition: TransformPosition, or something like this.
>
> Actually, I like startVParea() (or rather startViewportArea like I would rather call it) because only for viewport a new transformation matrix is necessary.

startViewportArea() is fine for me.
> I think the Java2D approach is not unlike the PDF/PS approach. Adobe was Sun's closest partner when they developed the Java2D API.
>
> > I implemented a simple .bmp rendering (BMPReader.java). If there's a better way to render .bmp (JAI?), let me know.
>
> This should not be necessary. We have a BMP implementation in org.apache.fop.images. The BMP bitmaps should be loaded through that mechanism.

OK, now I see. But how can I get an awt.Image from a FopImage?

> BTW, using Graphics.create() you should be able to create a copy of the current Graphics2D object. By pushing the old one on a stack and overwriting the graphics member variable you should be able to create the same effect as with currentState.push()/saveGraphicsState() in PDFRenderer.startVParea() and currentState.pop()/restoreGraphicsState() in endVParea(). When leaving a VP area you can simply restore an older Graphics2D object from the stack and continue painting. This will undo any transformations and state changes done in the copy used within the VP area. See second paragraph in javadocs of java.awt.Graphics.

Thanks for the hint. I did just that in AWTGraphicsState (same as PDFState). It holds all the context (font, colour, stroke, transformation) of the current graphics, and can act as a stack, too. I created an interface (RendererState) that could be implemented by all xxxState classes of the renderers. To be discussed...

I also added a Debug button to the AWTRenderer window, which outlines the blocks. This is just a test, and I would like to develop a full-fledged visual debugger [1].

If this code works for you, then I'll start to separate the Java2DRenderer and the AWTRenderer. Otherwise, please tell me how I can improve my code.

Renaud

[1] http://wiki.apache.org/xmlgraphics-fop/FopAndJava2D
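The Graphics.create() hint quoted in this message can be sketched as follows. This is a minimal illustration, not the AWTGraphicsState class from the patch: GraphicsStateStack and its method names are hypothetical, and the cast assumes the underlying context is a Graphics2D (as it is for contexts obtained from a BufferedImage).

```java
import java.awt.Graphics2D;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical save/restore helper built on java.awt.Graphics.create(). */
final class GraphicsStateStack {
    private final Deque<Graphics2D> saved = new ArrayDeque<>();
    private Graphics2D current;

    GraphicsStateStack(Graphics2D initial) {
        this.current = initial;
    }

    /** The context that painting code should use right now. */
    Graphics2D current() {
        return current;
    }

    /** Entering a viewport: remember the current context and continue on a copy. */
    void save() {
        saved.push(current);
        // create() returns a new Graphics object that is a copy of this one.
        current = (Graphics2D) current.create();
    }

    /** Leaving a viewport: discard the copy and resume with the saved context. */
    void restore() {
        current.dispose();
        current = saved.pop();
    }
}
```

Transformations, clips, fonts and colours set on the copy between save() and restore() do not leak into the saved context, which mirrors what saveGraphicsState()/restoreGraphicsState() achieve in the PDF renderer.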
DO NOT REPLY [Bug 33760] - [Patch] current AWTRenderer
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://issues.apache.org/bugzilla/show_bug.cgi?id=33760. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=33760

[EMAIL PROTECTED] changed:

What                          |Removed |Added
Attachment #14371 is obsolete |0       |1
Attachment #14372 is obsolete |0       |1

--- Additional Comments From [EMAIL PROTECTED] 2005-03-08 03:25 ---
Created an attachment (id=14426)
--> (http://issues.apache.org/bugzilla/attachment.cgi?id=14426&action=view)
patch against head for AWTRenderer

--
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.