Re: Merging jfor into FOP - what's the plan?
On Thursday 29 November 2001 12:44, Keiron Liddle wrote:
> So are things like static areas, markers, page numbers etc. possible with
> rtf or are these type of things simply not possible.

Keiron, as far as I know RTF does support the following, although jfor currently does not support most of them. In parentheses is my understanding of each concept, to make sure we're on the same wavelength:

static areas - yes (headers and footers)
markers - yes (references like "see page N")
page numbers - yes (dynamic auto-numbering)

But things like page numbers must be left to the RTF reader to compute: FOP will need to emit an *RTF code* that lets the RTF reader compute the page numbers, rather than computing them itself.

- Bertrand

- To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, email: [EMAIL PROTECTED]
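To illustrate the point about leaving page numbers to the RTF reader, here is a minimal Java sketch. It assumes the standard RTF control words `\footer` (footer destination) and `\chpgn` (current page number); the class and method names are hypothetical and are not part of jfor or FOP.

```java
// Hypothetical sketch: not jfor or FOP API.
final class RtfPageNumberSketch {

    /** Builds a tiny RTF document whose footer shows the current page number. */
    static String build(String bodyText) {
        StringBuilder rtf = new StringBuilder();
        rtf.append("{\\rtf1\\ansi\\deff0");
        rtf.append("{\\fonttbl{\\f0 Times New Roman;}}");
        // Footer group: \chpgn stands for "current page number".
        // The writer emits no digits; the RTF reader fills them in when it lays out pages.
        rtf.append("{\\footer\\pard\\qc Page \\chpgn\\par}");
        rtf.append("\\pard ").append(escape(bodyText)).append("\\par");
        rtf.append("}");
        return rtf.toString();
    }

    /** Escapes backslash and braces only; non-ASCII text is not handled in this sketch. */
    static String escape(String text) {
        return text.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}");
    }

    public static void main(String[] args) {
        System.out.println(build("Hello from XSL-FO"));
    }
}
```

The converter never knows how many pages the document has; it only marks where the number belongs.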
Re: Merging jfor into FOP - what's the plan?
On 2001.11.27 12:40 Bertrand Delacretaz wrote:
> Without knowing too much about FOP internals, I think a processing chain
> along these lines might help:
>
> parsing if needed
> -> SAX events
> -> FO attributes processing (validation, inheritance)
> -> StructureRenderer
>
> StructureRenderer is
> EITHER Layout + PrintRenderer
> OR StructureProcessor (RTF, MIF, etc.)
>
> What we need to find out is how much the existing FOP and these
> "structure renderers" have in common.

This sounds like the sort of approach that we need. If possible, we could have a "layout processor" which normally reads the fo objects and creates an area tree. An alternate implementation would instead create the output document directly. The fo object tree does all the handling of attributes.

So are things like static areas, markers and page numbers possible with rtf, or are these types of things simply not possible?
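The "layout processor or direct output" idea can be sketched roughly as below. This is only an illustration of the proposed shape: every type name here is hypothetical, none of it is existing FOP or jfor code, and the "layout" is a placeholder.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed chain; these are not real FOP or jfor classes.
final class ProcessingChainSketch {

    /** A formatting object whose attributes are assumed to be already resolved. */
    static final class FoNode {
        final String name;
        final String text;
        FoNode(String name, String text) { this.name = name; this.text = text; }
    }

    /** Last stage of the chain: consumes FO objects and produces output. */
    interface StructureRenderer {
        String render(List<FoNode> nodes);
    }

    /** Variant 1: lay out into areas first; the structure (node name) is lost here. */
    static final class LayoutAndPrint implements StructureRenderer {
        public String render(List<FoNode> nodes) {
            StringBuilder out = new StringBuilder();
            int y = 0;
            for (FoNode n : nodes) {
                // Placeholder "layout": stack one area per node, 12 units apart.
                out.append("area y=").append(y).append(" '").append(n.text).append("'\n");
                y += 12;
            }
            return out.toString();
        }
    }

    /** Variant 2: skip layout and translate the structure directly (RTF, unescaped). */
    static final class RtfStructureProcessor implements StructureRenderer {
        public String render(List<FoNode> nodes) {
            StringBuilder out = new StringBuilder("{\\rtf1\\ansi ");
            for (FoNode n : nodes) {
                if (n.name.equals("fo:block")) {
                    out.append("\\pard ").append(n.text).append("\\par ");
                }
            }
            return out.append("}").toString();
        }
    }

    public static void main(String[] args) {
        List<FoNode> nodes = new ArrayList<>();
        nodes.add(new FoNode("fo:block", "First paragraph"));
        nodes.add(new FoNode("fo:block", "Second paragraph"));
        StructureRenderer[] choices = { new LayoutAndPrint(), new RtfStructureProcessor() };
        for (StructureRenderer r : choices) {
            System.out.println(r.getClass().getSimpleName() + ":");
            System.out.println(r.render(nodes));
        }
    }
}
```

Both variants receive the same FO objects, so attribute handling stays in one place; only the last stage differs.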
RE: Merging jfor into FOP - what's the plan?
The latest RTF spec (1.7), pertaining to Word 2002, is at:
http://download.microsoft.com/download/Word2002/Install/1.7/W98NT42KMeXP/EN-US/W2KRTFSF.exe

It is a self-extracting exe with the Word doc inside.

Scott Sanders

-Original Message-
From: Bertrand Delacretaz [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, November 27, 2001 3:40 AM
To: [EMAIL PROTECTED]
Subject: Re: Merging jfor into FOP - what's the plan?

The RTF spec that we use in jfor is (mostly) V1.5 from Microsoft, who since moved on to 1.6 (at least), but apparently 1.5 is the most widely supported spec. A google search shows it at http://www.dubois.ws/software/RTF, it might be harder to find at Microsoft as it's not the latest.
Re: Merging jfor into FOP - what's the plan?
Bertrand et al,

It looks as though the principle of disentangling the FO and Area tree builds, with communication by a stream of FOEvents, would also be useful in this context.

Peter

Bertrand Delacretaz wrote:
> Without knowing too much about FOP internals, I think a processing chain
> along these lines might help:
>
> parsing if needed
> -> SAX events
> -> FO attributes processing (validation, inheritance)
> -> StructureRenderer
>
> StructureRenderer is
> EITHER Layout + PrintRenderer
> OR StructureProcessor (RTF, MIF, etc.)
>
> What we need to find out is how much the existing FOP and these "structure
> renderers" have in common.
>
> - Bertrand

--
Peter B. West [EMAIL PROTECTED] http://powerup.com.au/~pbwest
"Lord, to whom shall we go?"
Re: Merging jfor into FOP - what's the plan?
Hi Arved,

> What are your recommendations for someone to come up to speed with RTF?

I'd recommend staying away from it unless you really have to ;-)
Seriously, to someone accustomed to clear and well-defined specs, RTF is somewhat messy: what it really is is a documented internal format, not a spec that has been agreed upon by a carefully selected committee.

The RTF spec that we use in jfor is (mostly) V1.5 from Microsoft, who have since moved on to 1.6 (at least), but apparently 1.5 is the most widely supported spec. A Google search shows it at http://www.dubois.ws/software/RTF; it might be harder to find at Microsoft as it's not the latest.

The rtflib package of jfor (available at www.jfor.org) encapsulates our knowledge of RTF and is fairly simple and understandable, but it is still too element-oriented.
One important thing to realize (we realized it too late here) is that RTF is more flow-based or stack-based than element-based: not everything that is opened has to be closed; it's more like a flow with embedded attribute changes.

> As I understand it, RTF is presented
> to a user-agent which does a fair amount of layout; higher-level structures
> are still present in the RTF.

Right - but there are both structure and presentation codes, so an RTF document could be both.
Jfor has a strong bent towards structure, as the user's goal is usually to get an editable RTF document in which as much of the original document structure as possible is preserved for convenience.
Precise appearance usually comes second, as applying a new wordprocessor style sheet can change a lot of it.

RTF is both a presentation and a structure format, and also a moving target, because the "spec" is expanded and rewritten with nearly every new version of winword.
There are many grey areas in the spec, meaning the only possible test is opening the generated RTF in the desired wordprocessors (and often watching it crash...).

> This is not so different from MIF

Agreed. We are working with MIF for another project, and didn't choose FOP for that because of the lack of precise control over the MIF output.

I tend to see these formats as:
-PDF for finished high-quality output ("presentation language"), layout 100% done by FOP
-MIF for semi-finished high-quality output ("typography language"), layout done by Framemaker according to MIF instructions.
-RTF for editable structure + presentation output ("wordprocessing language"), layout done by wordprocessor.

So I fully agree that MIF and RTF "renderers" have a lot in common - they must be able to get as much information as possible about the original document structure, and in my view do not need any layout computations.

> In a sense with RTF and MIF (and HTML for anyone who really desperately
> wants to see FO->HTML) we are talking about translators as opposed to
> formatters and renderers...

yes - that's why I called jfor a "converter" instead of a "formatter"

Without knowing too much about FOP internals, I think a processing chain along these lines might help:

parsing if needed
-> SAX events
-> FO attributes processing (validation, inheritance)
-> StructureRenderer

StructureRenderer is
EITHER Layout + PrintRenderer
OR StructureProcessor (RTF, MIF, etc.)

What we need to find out is how much the existing FOP and these "structure renderers" have in common.

- Bertrand
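The remark that RTF is flow-based rather than element-based can be made concrete with a small Java sketch. It relies on standard RTF behaviour (`\b` and `\i` switch a character attribute on, `\b0` switches bold off, and a `{...}` group restores the previous formatting state when it ends); the writer class itself is hypothetical and is not jfor's rtflib.

```java
// Hypothetical sketch of a flow-style RTF writer; not jfor's rtflib.
final class RtfFlowSketch {
    private final StringBuilder out = new StringBuilder("{\\rtf1\\ansi ");

    /** Appends text as-is (no escaping in this sketch). */
    RtfFlowSketch text(String s) { out.append(s); return this; }

    /** Emits a control word such as "b" or "b0"; the trailing space delimits it. */
    RtfFlowSketch control(String word) { out.append('\\').append(word).append(' '); return this; }

    /** Opens a group: the reader pushes the current formatting state. */
    RtfFlowSketch openGroup() { out.append('{'); return this; }

    /** Closes a group: the reader pops back to the state before the group. */
    RtfFlowSketch closeGroup() { out.append('}'); return this; }

    /** Closes the outermost document group and returns the RTF text. */
    String finish() { return out.append('}').toString(); }

    public static void main(String[] args) {
        String rtf = new RtfFlowSketch()
            .text("plain ")
            .control("b").text("bold ")            // bold switched on in the flow
            .control("i").text("bold italic ")     // italic added; nothing is "closed"
            .control("b0").text("italic only ")    // bold switched off, italic still on
            .openGroup().control("ul").text("underlined").closeGroup() // group end undoes \ul
            .text(" italic again")
            .finish();
        System.out.println(rtf);
    }
}
```

Note that `\b` and `\i` are never paired with a closing tag the way XML elements are, which is why an element-oriented library maps awkwardly onto the format.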
Re: Merging jfor into FOP - what's the plan?
Hi, Bertrand

What are your recommendations for someone to come up to speed with RTF? I (and possibly others) need to understand RTF better in order to assist.

The existing renderers for PDF, Postscript, XML and AWT can all handle raw areas...they do no layout whatsoever. As I understand it, RTF is presented to a user-agent which does a fair amount of layout; higher-level structures are still present in the RTF.

This is not so different from MIF, and in fact, when the MIFRenderer was originally written, there _were_ some problems (as I recall) in working from the area tree directly. For example, MIF understands tables - this information needed to be passed along whereas other renderers no longer cared about such semantics.

Since the MIFRenderer is somewhat moribund (I think), jfor really becomes the prototype for a different class of formatter/renderers, operating in parallel with the existing code for PDF etc. It would be interesting to see if we can do things in such a way as to resurrect MIF also, since I think it never ought to have been a renderer in the first place.

In a sense with RTF and MIF (and HTML for anyone who really desperately wants to see FO->HTML) we are talking about translators as opposed to formatters and renderers...again, correct me if I am wrong, but the output of the translator is presented to a user-agent that will actually be doing layout.

Regards,
Arved Sandstrom

At 08:43 AM 11/27/01 +0100, Bertrand Delacretaz wrote:
>Hi Keiron,
>
>If there is not going to be a FOP release in the next few weeks, I
>agree that a minimal integration does not make sense.
>
>Currently the jfor conversion is driven directly from SAX events, so the
>first thing that comes to mind is driving it from the FO tree.
>
>You're right that, contrary to print renderers, the RTF one will need to know
>about the structure of the original document.
>
>Does the FO tree handle things like attribute inheritance (i.e. a block
>inherits the font definition from an ancestor block), or is this handled
>while doing the layout? Such inheritance is currently missing in jfor.
>
>To summarize:
>-jfor needs to know about the original document structure: speaks for option
>(A), plugging jfor right after the FO tree stage if I understand well.
>
>-BUT jfor could probably benefit from some operations done at the layout
>stage (attributes inheritance, others?) : speaks for option (B), extending
>the renderer interface to let the renderers know (cleanly) about the original
>document structure.
>
>If you can give me some pointers about where to look at in the code to
>evaluate (A) and (B), I'll have a look.
>
>- Bertrand
Re: Merging jfor into FOP - what's the plan?
Hi Keiron,

If there is not going to be a FOP release in the next few weeks, I agree that a minimal integration does not make sense.

Currently the jfor conversion is driven directly from SAX events, so the first thing that comes to mind is driving it from the FO tree.

You're right that, contrary to print renderers, the RTF one will need to know about the structure of the original document.

Does the FO tree handle things like attribute inheritance (e.g. a block inherits the font definition from an ancestor block), or is this handled while doing the layout? Such inheritance is currently missing in jfor.

To summarize:
-jfor needs to know about the original document structure: this argues for option (A), plugging jfor in right after the FO tree stage, if I understand correctly.

-BUT jfor could probably benefit from some operations done at the layout stage (attribute inheritance, others?): this argues for option (B), extending the renderer interface to let the renderers know (cleanly) about the original document structure.

If you can give me some pointers about where to look in the code to evaluate (A) and (B), I'll have a look.

- Bertrand
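For readers unfamiliar with the inheritance question raised above, here is a minimal Java sketch of what "a block inherits the font definition from an ancestor block" means in practice. The class is hypothetical and does not reflect how FOP's FO tree is implemented; it also treats every property as inherited, which is a simplification, since XSL-FO marks only some properties (such as font-family) as inherited.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not FOP's FO tree and not jfor code.
final class InheritanceSketch {

    static final class FoElement {
        final String name;
        final FoElement parent;                               // null for the root
        final Map<String, String> specified = new HashMap<>(); // values set on this element only

        FoElement(String name, FoElement parent) {
            this.name = name;
            this.parent = parent;
        }

        FoElement set(String property, String value) {
            specified.put(property, value);
            return this;
        }

        /** Returns the nearest specified value, walking up the ancestors; null if none. */
        String resolve(String property) {
            for (FoElement e = this; e != null; e = e.parent) {
                String value = e.specified.get(property);
                if (value != null) {
                    return value;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        FoElement outer = new FoElement("fo:block", null).set("font-family", "Helvetica");
        FoElement inner = new FoElement("fo:block", outer);
        // The inner block specifies no font, so the lookup falls back to the ancestor.
        System.out.println(inner.resolve("font-family"));
    }
}
```

If this resolution happens in the FO tree stage, a converter plugged in right after it would receive fully resolved attributes without needing any layout.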
Re: Merging jfor into FOP - what's the plan?
Hi Bertrand,

For the short term I think that (1) would be the thing to do, but since there won't be a release of FOP for a while there may be no point doing anything for the short term.

As for how it will eventually end up working with the rest of FOP: can you give us a quick rundown of what is involved in creating an rtf document from xsl fo? What sort of information is passed from the fo to the rtf? How is layout considered, etc.?

FOP normally converts from fo to the output in a few steps. First the fo is turned into the formatting object tree. This is then turned into an area tree. This area tree represents the final layout with data that any renderer can handle. The renderer then uses this area tree to create the pages.
This means that the renderer knows nothing about the original document and does not have a concept of lists, tables etc.

I should also point out that the MIF renderer used references to the formatting object tree to determine things like tables, in order to create tables in the output. This sort of thing is being revisited as it causes problems.

Regards,
Keiron.

On 2001.11.23 13:32 Bertrand Delacretaz wrote:
> (repost - I think the first one didn't get through)
>
> Now that the introductions are done, I'd like to initiate the discussion
> about how to actually merge jfor into FOP.
>
> Currently I have one major code contribution to integrate into the jfor
> code base. I expect to be done in a week and would like to release a last
> "non-FOP" version of jfor with these changes.
>
> Regarding the merging of jfor, I see three options:
>
> 1) inclusion of the jfor.jar in the FOP distribution, "user-level"
> integration where a -rtf switch of FOP causes jfor to run instead of FOP
>
> Makes it possible for users to generate RTF + PDF without needing a
> separate download. No benefits on the developer side. We might get a lot of
> questions like "why is the RTF output so poor compared to PDF".
>
> 2) same but modify jfor to use the existing FOP infrastructure: startup,
> parser, configuration, logging, etc..
>
> 3) full integration of jfor as a FOP renderer, taking advantage of the
> FOP analysis of the XSL-FO document.
> IMHO this needs to bypass the layout stage to stay quick and translate as
> much of the document structure as possible to RTF.
>
> Considering that I won't have much time in the next few weeks, my
> suggestion would be to first go ahead with 1) and simultaneously
> studying and discussing how to best reach 2) and 3).
>
> Any thoughts?
>
> - Bertrand
Re: Merging jfor into FOP - what's the plan?
On Friday 23 November 2001 20:13, Art Welch wrote:
>. . .
> Would it be possible to have one RTFRenderer
> and then have an option use either the full FOP layout or bypass the FOP
> layout for quick RTF?. . .

I don't know about using the full FOP layout - last time I tried (at the beginning of this year) it looked hard. My understanding is that the Renderers receive "graphical area" events from FOP, whereas jfor works more at the "document elements" level.

But I don't know much about what's currently going on with the layout mechanism - maybe it would be easier now.

- Bertrand