Re: Skype-conference on page-breaking?

2005-03-07 Thread Jeremias Maerki
Thanks, Luca.

I've had a nice casual talk on the phone with Simon, yesterday.
Essentially, we only talked about very high-level stuff, especially the
decision for a certain strategy (or two). You know I came up with the
idea to create a simpler best-fit strategy with no look-ahead for
invoice-style documents but maybe it would be possible to design your
obvious total-fit strategy in a way that it could be used as a best-fit
without look-ahead. The problem, like I mentioned already, is the
possible change of available IPD within a page-sequence which results in
a possible back-tracking and recalculation of vertical boxes.

Of course, if it's possible to stay with one page-breaking algorithm for
all use cases that would be best (because of the reduced effort), but
only if the algorithm is reasonably fast for invoice-style documents.
I'm repeatedly confronted with certain speed requirements in this case.
Since modern high-volume single-feed printers handle about 180 pages per
minute (continuous feed systems handle over 4 times that speed, but I
think that's neither relevant, nor realistic here) FOP should be able to
operate close to these 180 pages per minute for not too complex
documents on a modern server. That means roughly 333ms per page. Not
much. Of course, in such an environment it is possible to distribute the
formatting process over several blade servers but I had to realize that
certain companies tend to prefer spending 100'000 dollars on a big
server rather than spending a lot less on a much faster
CPU-power-oriented setup. It seems to be hard to say good-bye to the old
host systems. Well, that's just what reality looks like in my environment.
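
To make the time budget concrete, here is a small sketch of the arithmetic.
The class and method names are made up for illustration, and the 800 ms
figure in main() is an invented measurement, not a FOP benchmark.

```java
import java.util.concurrent.TimeUnit;

class PageBudget {
    /** Milliseconds available per page at a given printer speed (pages per minute). */
    static double millisPerPage(double pagesPerMinute) {
        return TimeUnit.MINUTES.toMillis(1) / pagesPerMinute;
    }

    /** Parallel formatter instances needed if one instance needs measuredMillis per page. */
    static int instancesNeeded(double measuredMillis, double pagesPerMinute) {
        return (int) Math.ceil(measuredMillis / millisPerPage(pagesPerMinute));
    }

    public static void main(String[] args) {
        System.out.println("budget at 180 ppm (ms/page): " + millisPerPage(180));
        // 800 ms per page is a hypothetical measurement
        System.out.println("instances at 800 ms/page: " + instancesNeeded(800, 180));
    }
}
```

The second method only shows why distributing the work over several machines
is the obvious way out when a single formatter misses the budget.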

Simon, for example, is much more interested in book-style documents
where there are other requirements. Speed is not a big issue, but
quality is.

In the end, I think we need to rate the chosen approach from these two
points of view. These are rather contradictory requirements and it seems
quite important to me not to forget that here.

Luca, do you think your total-fit approach may be written in a way to
handle changing available IPDs and that look-ahead can be disabled to
improve processing speed at the cost of optimal break decisions? If it's
ok for you (and feasible) I'd like to integrate what you already have
(in code) into that branch I was talking about. I would like to avoid
recreating something you've already started, even if it doesn't work
with the changes that happened in the last weeks. Even if we may create
two different strategies I'm sure that certain parts will be shared by
both approaches, like the creation of Knuth-style elements for the
PageLM. 

Some more comments inline:

On 04.03.2005 13:23:01 Luca Furini wrote:
 
 Jeremias Maerki wrote:
 
 Would you consider sharing what you already
 have? This may help us in the general discussion and may be a good
 starting point.
 
 Ok, I'll try to.
 
 The main change in the LineLM is that the line breaking algorithm does not
 select only the node in activeList with fewest demerits: all the nodes
 whose demerits are <= a threshold are used to create LineBreakPositions,
 so for each paragraph there is a set of layout options (for example, a
 paragraph could create 8 to 10 lines, 9 being the layout with fewest
 demerits).

Hmm, that's a feature that I would say is something that only book-style
documents will need. Invoice-style documents could live without it.
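
A rough sketch of the difference between the two selection methods, as I
read the description. Node here is a hypothetical stand-in for an entry in
activeList (it is not the real FOP class), and the demerit values in any
usage are invented.

```java
import java.util.ArrayList;
import java.util.List;

class ThresholdSelection {
    /** Hypothetical active node: one candidate layout with its line count and demerits. */
    static final class Node {
        final int lineCount;
        final double demerits;

        Node(int lineCount, double demerits) {
            this.lineCount = lineCount;
            this.demerits = demerits;
        }
    }

    /** Classic selection: only the node with the fewest demerits survives. */
    static Node best(List<Node> activeList) {
        Node best = null;
        for (Node n : activeList) {
            if (best == null || n.demerits < best.demerits) {
                best = n;
            }
        }
        return best;
    }

    /** Variant described above: keep every node whose demerits are <= threshold. */
    static List<Node> withinThreshold(List<Node> activeList, double threshold) {
        List<Node> kept = new ArrayList<>();
        for (Node n : activeList) {
            if (n.demerits <= threshold) {
                kept.add(n);
            }
        }
        return kept;
    }
}
```

For invoice-style documents best() alone would do; withinThreshold() is what
yields the set of layout options (e.g. 8, 9 or 10 lines) for the page breaker.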

 According to the value of widows and orphans, the LineLM creates a
 sequence of elements: besides normal lines, represented by a box, there
 are optional lines, represented by
   box(0) penalty(inf,0) glue(0,1,0) box(0)
 and removable lines
   box(0) penalty(inf,0) glue(1,0,1) box(0)
 A few complications arise if not every possible layout allows breaks
 between lines, but they all can be solved using boxes, glues and
 penalties (for example, if a paragraph needs 3 or 4 lines, if it uses 3
 it cannot be parted).

Also something that's not all too important for invoice-style documents,
although it can't hurt to have it.
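
To check my understanding of the two sequences, a minimal sketch. I am
assuming the notation means glue(width, stretch, shrink) and
penalty(value, width), with all lengths counted in lines. The classes are
illustrative only, and the order "fixed lines, then removable, then
optional" is my guess, not necessarily what the LineLM emits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class KnuthSketch {
    static final int INFINITE = Integer.MAX_VALUE;

    static abstract class Element { }

    static final class Box extends Element {
        final int width;
        Box(int width) { this.width = width; }
    }

    static final class Glue extends Element {
        final int width, stretch, shrink;
        Glue(int width, int stretch, int shrink) {
            this.width = width;
            this.stretch = stretch;
            this.shrink = shrink;
        }
    }

    static final class Penalty extends Element {
        final int value, width;
        Penalty(int value, int width) { this.value = value; this.width = width; }
    }

    /** box(0) penalty(inf,0) glue(0,1,0) box(0): a line that may be added. */
    static List<Element> optionalLine() {
        return Arrays.<Element>asList(new Box(0), new Penalty(INFINITE, 0), new Glue(0, 1, 0), new Box(0));
    }

    /** box(0) penalty(inf,0) glue(1,0,1) box(0): a line that may be dropped. */
    static List<Element> removableLine() {
        return Arrays.<Element>asList(new Box(0), new Penalty(INFINITE, 0), new Glue(1, 0, 1), new Box(0));
    }

    /** A paragraph with some fixed lines plus one removable and one optional line. */
    static List<Element> paragraph(int fixedLines) {
        List<Element> seq = new ArrayList<>();
        for (int i = 0; i < fixedLines; i++) {
            seq.add(new Box(1));
        }
        seq.addAll(removableLine());
        seq.addAll(optionalLine());
        return seq;
    }

    /** Returns {minimum, natural, maximum} height in lines of an unbroken sequence. */
    static int[] lineRange(List<Element> seq) {
        int natural = 0, stretch = 0, shrink = 0;
        for (Element e : seq) {
            if (e instanceof Box) {
                natural += ((Box) e).width;
            } else if (e instanceof Glue) {
                Glue g = (Glue) e;
                natural += g.width;
                stretch += g.stretch;
                shrink += g.shrink;
            }
        }
        return new int[] { natural - shrink, natural, natural + stretch };
    }
}
```

With paragraph(8) this models the "8 to 10 lines, 9 being the layout with
fewest demerits" case: the natural height is 9 lines, the glue can shrink it
to 8 or stretch it to 10, and the infinite penalty keeps the glue from
becoming a break point.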

 The BlockLM, and a block stacking LM in general, adds elements
 representing its children's spaces and keep condition, for example
 adding a 0 penalty or an infinite penalty according to
 child1.mustKeepWithNext(), child2.mustKeepWithPrevious() and
 this.mustKeepTogether().

That's certainly a must-have in any case.
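
In code, the rule could look roughly like this. The Keeps interface is a
hypothetical minimal view of a layout manager; only the three method names
are taken from the description above.

```java
class KeepPenalties {
    static final int INFINITE = Integer.MAX_VALUE;

    /** Hypothetical minimal view of a layout manager's keep conditions. */
    interface Keeps {
        boolean mustKeepWithNext();
        boolean mustKeepWithPrevious();
        boolean mustKeepTogether();
    }

    /** Penalty value to place between two adjacent children of a block stacking LM. */
    static int penaltyBetween(Keeps parent, Keeps child1, Keeps child2) {
        boolean breakForbidden = parent.mustKeepTogether()
                || child1.mustKeepWithNext()
                || child2.mustKeepWithPrevious();
        return breakForbidden ? INFINITE : 0;
    }

    /** Convenience factory for fixed keep settings, for experiments only. */
    static Keeps keeps(final boolean withNext, final boolean withPrevious, final boolean together) {
        return new Keeps() {
            public boolean mustKeepWithNext() { return withNext; }
            public boolean mustKeepWithPrevious() { return withPrevious; }
            public boolean mustKeepTogether() { return together; }
        };
    }
}
```

This treats keeps as all-or-nothing; keep strengths other than "always"
would need finite penalty values in between, which this sketch ignores.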

 The PageLM, once it has the list of elements representing a whole
 page-sequence (or the content before a forced page break), calls the same
 breaking algorithm, using only a different selection method which leaves
 only one node in activeList.

That's the part where I have a big question mark about changing
available IPD. We may have to have a check that figures out if the
available IPD changes within a page-sequence by inspecting the
page-masters. That would allow us to switch automatically between
total-fit and best-fit or maybe even first-fit. A remaining question
mark is with side-floats as they influence 
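
The page-master check mentioned above could be sketched as follows. This
assumes the available IPD of each page in the sequence is already known as
an int (say, in millipoints) and that a changing IPD means falling back to
best-fit; both are assumptions of mine, and side-floats are not covered.

```java
class StrategyChooser {
    enum Strategy { TOTAL_FIT, BEST_FIT }

    /** True if every page in the sequence offers the same available IPD. */
    static boolean ipdIsConstant(int[] pageIpds) {
        for (int i = 1; i < pageIpds.length; i++) {
            if (pageIpds[i] != pageIpds[0]) {
                return false;
            }
        }
        return true;
    }

    /** Total-fit only when no back-tracking over changed IPDs can occur. */
    static Strategy choose(int[] pageIpds) {
        return ipdIsConstant(pageIpds) ? Strategy.TOTAL_FIT : Strategy.BEST_FIT;
    }
}
```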

Re: Skype-conference on page-breaking?

2005-03-07 Thread Jeremias Maerki
I don't know why this is important to you but it's two to three months.

On 04.03.2005 12:40:04 Peter B. West wrote:
 Jeremias Maerki wrote:
  Sounds very interesting. Would you consider sharing what you already
  have? This may help us in the general discussion and may be a good
  starting point.
  
  My problem is that I have to deliver working page breaking with keeps,
  breaks, multi-column, adjustable spacing etc. in a relatively short
  period of time.
  
 
 How short?
 
 Peter
 -- 
 Peter B. West http://cv.pbw.id.au/
 Project Folio http://defoe.sourceforge.net/folio/



Jeremias Maerki



Re: FOP at ApacheCon Europe 2005?

2005-03-07 Thread Jeremias Maerki
FYI, I've just given myself a shove, followed Bertrand's suggestion and
submitted a session proposal for ApacheCon. I feel that our project
should be present there. I was also thinking about something like
hidden treasures in the XML Graphics project but I guess there's not
so much meat on that bone to fill one hour.

 ApacheCon Europe 2005 CFP submission
 
 Submitter: Jeremias Maerki [EMAIL PROTECTED]
 Title: Apache FOP: Optimizing speed and memory consumption
 Level: Experienced
 Style: 
 Orientation: Developer
 Duration: 60
 Categories: 
 Abstract:
 
 Apache FOP is the most popular XSL-FO implementation on the
 market. It is used in a wide variety of use cases to create documents
 in PDF, PostScript and other formats. This session will show a
 number of techniques to improve processing speed and hints on how
 to handle things like OutOfMemoryErrors. It will also contain a
 short info block about the state and the future of the project.
 



On 12.02.2005 10:57:15 Bertrand Delacretaz wrote:
 Le 8 févr. 05, à 19:29, Jeremias Maerki a écrit :
 
  Most of you will probably have heard that ApacheCon Europe will be
  happening in July. I think it would be great if FOP would somehow be
  visible there. There's a call for participation ending 2005-03-04.
  Any ideas?
 
 A recurring question in my consulting work is "is FOP fast or what?" or
 more precisely "how to tune XSL-FO for FOP to run efficiently", mostly
 in view of avoiding memory bottlenecks.
 
 Me, I'm not using FOP hands-on enough these days to answer very 
 precisely, I usually just tell them to test their performance on large 
 documents very regularly during development, to avoid surprises.
 
 But maybe one of you FOP gurus could give a presentation with more 
 precise information about this?
 
 Just my 2 cents.
 
 -Bertrand



Jeremias Maerki



Re: FOP at ApacheCon Europe 2005?

2005-03-07 Thread Glen Mazza
Fantastic!  I hope to be able to do the same someday.

Glen





Re: Skype-conference on page-breaking?

2005-03-07 Thread Peter B. West
Jeremias Maerki wrote:
 I don't know why this is important to you

Just curious.

 but it's two to three months.

Ouch.  Good luck.  You might want to keep an eye on Folio.

Peter

 On 04.03.2005 12:40:04 Peter B. West wrote:
  Jeremias Maerki wrote:
   Sounds very interesting. Would you consider sharing what you already
   have? This may help us in the general discussion and may be a good
   starting point.
   
   My problem is that I have to deliver working page breaking with keeps,
   breaks, multi-column, adjustable spacing etc. in a relatively short
   period of time.
  
  How short?

--
Peter B. West http://cv.pbw.id.au/
Project Folio http://defoe.sourceforge.net/folio/


Re: future of FOP

2005-03-07 Thread Jeremias Maerki
Michael,

if you follow the fop-dev mailing list you will realize that the
development has not come to a stand-still. It is true that the last
release is almost two years old. We're in a redesign phase which tries
to address exactly the issue of keeps among other things. The redesign
took a lot longer than anticipated. But we're on the right track so we
can start releasing again later this year, complete with keeps.

If you can't work around the missing keeps (they work on table-rows) and
you need an immediate solution you will need to switch to a different
solution for the time being.

I understand that IBM is quite big in the document business. It would be
very interesting if IBM committed to supporting FOP like they do for
other open source projects here at the Apache Software Foundation. As
far as I know IBM even has its own implementation of XSL-FO although I
don't know if it's actively maintained.

On 07.03.2005 16:27:33 Michael Iwaniewicz wrote:
 
 Dear FOP developers,
 
 we are a big sw-development and decided recently to change our old
 bookmaster/afp based print component to XSL-FO. As part of our
 solution we started to use FOP but ran into formatting problems in the
 area of the keep-together and keep-with-next options.
 
 We got the impression that the FOP development came to a kind of
 stand-still, since the current version is dated from 2003. I just wanted
 to ask you if our impression is correct. We have now to decide if we
 change from FOP to XEP or XSL-Formatter.
 
 Thanks for your help, Michael
 
 
 Michael Iwaniewicz
 CHIS Architecture
 Office: (43-1) 21145-6446
 Mobile:(43) (0) 664-618-5839



Jeremias Maerki



Re: future of FOP

2005-03-07 Thread Chris Bowditch
Jeremias Maerki wrote:
 <snip/>
 I understand that IBM is quite big in the document business. It would be
 very interesting if IBM committed to supporting FOP like they do for
 other open source projects here at the Apache Software Foundation. As
 far as I know IBM even has its own implementation of XSL-FO although I
 don't know if it's actively maintained.

I guess you mean the alphaworks XFC project? It is not maintained at all. I
posted an "are you still alive" question back in 2003, still waiting for a
reply ;-)

Chris



Re: FOP at ApacheCon Europe 2005?

2005-03-07 Thread J.Pietschmann
Jeremias Maerki wrote:
 I was also thinking about something like
 hidden treasures in the XML Graphics project but I guess there's not
 so much meat on that bone to fill one hour.

Well, there should be enough for an hour, at least in theory.
I couldn't convince (yet) my boss that I have an important mission
in Stuttgart in July. If I could, I'd probably talk about:
- Handling fonts in Java, why the AWT font and text rendering
 subsystem is lame, and what FOP, Batik and perhaps others would
 expect from an API.
- How to implement flowing text, line breaking and hyphenation
 efficiently; why the Java BreakIterator and other parts of the Java
 Unicode support sux0rs; what's behind TR14; Unicode normalization of
 text before looking it up in a dictionary, and efficient implementation
 of said dictionary for looking up all substrings in a word (using a
 trie, a PATRICIA tree or whatever)
- Talk about the question why the algorithms aren't simply copied
 from Gecko (the Mozilla layout engine)
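
For the dictionary item above, a minimal sketch of the trie idea: store the
patterns once, then walk the trie from every start position of the word to
find all stored substrings. The class is illustrative only (plain char
keys, no Unicode normalization, no PATRICIA-style path compression) and the
sample patterns are made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PatternTrie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean terminal; // true if a stored pattern ends at this node
    }

    private final Node root = new Node();

    /** Adds one pattern to the dictionary. */
    void add(String pattern) {
        Node n = root;
        for (int i = 0; i < pattern.length(); i++) {
            n = n.children.computeIfAbsent(pattern.charAt(i), c -> new Node());
        }
        n.terminal = true;
    }

    /** Returns every stored pattern occurring as a substring of word, ordered by start position. */
    List<String> matchesIn(String word) {
        List<String> found = new ArrayList<>();
        for (int start = 0; start < word.length(); start++) {
            Node n = root;
            for (int end = start; end < word.length(); end++) {
                n = n.children.get(word.charAt(end));
                if (n == null) {
                    break; // no stored pattern continues with this character
                }
                if (n.terminal) {
                    found.add(word.substring(start, end + 1));
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        PatternTrie trie = new PatternTrie();
        for (String p : new String[] {"hy", "phen", "hen", "at"}) {
            trie.add(p);
        }
        System.out.println(trie.matchesIn("hyphenation"));
    }
}
```

Each start position costs at most the length of the longest pattern, so the
lookup never has to enumerate all substrings of the word explicitly.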
Now that the deadline has been extended, I'll attempt it again.
J.Pietschmann


Re: FOP at ApacheCon Europe 2005?

2005-03-07 Thread Jeremias Maerki
Cool, that would be great stuff. Let's hope your boss lets you off the
leash.

On 07.03.2005 23:57:50 J.Pietschmann wrote:
 Jeremias Maerki wrote:
  I was also thinking about something like
  hidden treasures in the XML Graphics project but I guess there's not
  so much meat on that bone to fill one hour.
 
 Well, there should be enough for an hour, at least in theory.
 I couldn't convince (yet) my boss that I have an important mission
 in Stuttgart in July. If I could, I'd probably talk about:
 - Handling fonts in Java, why the AWT font and text rendering
   subsystem is lame, and what FOP, Batik and perhaps others would
   expect from an API.
 - How to implement flowing text, line breaking and hyphenation
   efficiently; why the Java BreakIterator and other parts of the Java
   Unicode support sux0rs; what's behind TR14; Unicode normalization of
   text before looking it up in a dictionary, and efficient implementation
   of said dictionary for looking up all substrings in a word (using a
   trie, a PATRICIA tree or whatever)
 - Talk about the question why the algorithms aren't simply copied
   from Gecko (the Mozilla layout engine)
 
 Now that the deadline has been extended, I'll attempt it again.
 
 J.Pietschmann



Jeremias Maerki



Re: DO NOT REPLY [Bug 33760] New: - [Patch] current AWTRenderer

2005-03-07 Thread Renaud Richardet
I worked on my patch and tried to integrate your input. There are
still many issues, but I think the basic structure is OK. You can find
a patch attached to bug 33760.

Comments inline:

On Mon, 28 Feb 2005, Victor Mote wrote:

 1. FOray has factored the FOP font logic into a separate module, cleaned it
 up significantly, and made some modest improvements. A few weeks ago, I
 aXSL-ized it as well, which means that it is written to a (theoretically)
 independent interface:
 http://cvs.sourceforge.net/viewcvs.py/axsl/axsl/axsl-font/src/java/org/axsl/font/
 I think there is general support within FOP to implement the FOray/aXSL font
 work in the FOP 1.0 code, but so far no one has actually taken the time to
 do it. If you get into messing with fonts at all, I highly recommend that
 FOray be implemented before doing anything else. I will be happy to support
 efforts to that end.

From what I understand now, your approach sounds good to me. But I'm
missing some major pieces of the picture ATM to start implementing
your aXSL interface in FOP. Please let me come back to you when I
feel more comfortable with the font mechanism.


On Mon, 28 Feb 2005 , Jeremias Maerki wrote:

  AbstractRenderer: I moved what I could reuse from PDFRenderer to
  AbstractRenderer: renderTextDecorations(), handleRegionTraits(), and added
  the needed empty methods.
 
 I think that was good although only time will tell if this will hold for
 all renderers to come. 

In the end, I didn't modify AbstractRenderer, PDFRenderer and
PSRenderer at all.
The implementation of AWTRenderer is close to the other renderers, so
that putting some methods in AbstractRenderer should not be a big
problem.

  Speaking of startVParea(), could we rename it to something more meaningful?
  Proposition: TransformPosition, or something like this.
 
 Actually, I like startVParea() (or rather startViewportArea like I would
 rather call it) because only for viewport a new transformation matrix is
 necessary. 

startViewportArea() is fine for me.

 I think the Java2D approach is not unlike the
 PDF/PS approach. 

Adobe was Sun's closest partner when they developed the Java2D API.

  I implemented a simple .bmp rendering (BMPReader.java). 
  If there's a better way to render .bmp (JAI?), let me know. 
 
 This should not be necessary. We have a BMP implementation in 
 org.apache.fop.images.
 The BMP bitmaps should be loaded through that mechanism. 

OK, now I see. But how can I get an awt.Image from a FopImage?

 BTW, using Graphics.create() you should be able to create a copy of the
 current Graphics2D object. By pushing the old one on a stack and
 overwriting the graphics member variable you should be able to create
 the same effect as with currentState.push()/saveGraphicsState() in
 PDFRenderer.startVParea() and currentState.pop()/restoreGraphicsState()
 in endVParea(). When leaving a VP area you can simply restore an older
 Graphics2D object from the stack and continue painting. This will undo
 any transformations and state changes done in the copy used within the VP
 area. See second paragraph in javadocs of java.awt.Graphics.

Thanks for the hint. I did just that in AWTGraphicsState (same as
PDFState). It holds all the context (font, colour, stroke,
transformation) of the current graphics, and can act as a stack, too.
I created an interface (RendererState) that could be implemented by
all xxxState of the renderers. To be discussed...
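
For reference, the Graphics.create() approach from the hint boils down to
something like the sketch below. GraphicsStateStack is a made-up name for
illustration and is not the AWTGraphicsState class from the patch; only
Graphics.create(), Graphics2D.transform() and dispose() are real AWT calls.

```java
import java.awt.Graphics2D;
import java.awt.geom.AffineTransform;
import java.util.ArrayDeque;
import java.util.Deque;

class GraphicsStateStack {
    private final Deque<Graphics2D> saved = new ArrayDeque<>();
    private Graphics2D graphics;

    GraphicsStateStack(Graphics2D initial) {
        this.graphics = initial;
    }

    /** The context all painting should currently go to. */
    Graphics2D current() {
        return graphics;
    }

    /** Entering a viewport: remember the current context and continue on a copy. */
    void startViewportArea(AffineTransform transform) {
        saved.push(graphics);
        graphics = (Graphics2D) graphics.create();
        graphics.transform(transform);
    }

    /** Leaving a viewport: discard the copy and resume with the saved context. */
    void endViewportArea() {
        graphics.dispose();
        graphics = saved.pop();
    }
}
```

Because the copy carries its own transform, clip, font, colour and stroke,
nothing has to be undone by hand when the viewport ends.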

I also added a Debug button on the AWTRenderer-Windows, which
outlines the blocks. This is just a test, and I would like to develop
a full-fledged visual debugger [1].

If this code works for you, then I'll start to separate the
Java2DRenderer and the AWTRenderer. Otherwise, please tell me how I
can improve my code.

Renaud

[1] http://wiki.apache.org/xmlgraphics-fop/FopAndJava2D


DO NOT REPLY [Bug 33760] - [Patch] current AWTRenderer

2005-03-07 Thread bugzilla
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
http://issues.apache.org/bugzilla/show_bug.cgi?id=33760.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=33760


[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #14371|0                           |1
        is obsolete|                            |
  Attachment #14372|0                           |1
        is obsolete|                            |




--- Additional Comments From [EMAIL PROTECTED]  2005-03-08 03:25 ---
Created an attachment (id=14426)
 --> (http://issues.apache.org/bugzilla/attachment.cgi?id=14426&action=view)
patch against head for AWTRenderer


-- 
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.