The latest RTF Spec (1.7), pertaining to Word 2002, is at:
http://download.microsoft.com/download/Word2002/Install/1.7/W98NT42KMeXP/EN-US/W2KRTFSF.exe
It is a self-extracting exe with the Word doc inside.

Scott Sanders

-----Original Message-----
From: Bertrand Delacretaz [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, November 27, 2001 3:40 AM
To: [EMAIL PROTECTED]
Subject: Re: Merging jfor into FOP - what's the plan?

Hi Arved,

> What are your recommendations for someone to come up to speed with RTF?

I'd recommend staying away from it unless you really have to ;-)

Seriously, to someone accustomed to clear and well-defined specs, RTF is somewhat messy. What it really is is a documented internal format, not a spec that has been agreed upon by a carefully selected committee.

The RTF spec that we use in jfor is (mostly) V1.5 from Microsoft, who have since moved on to 1.6 (at least), but apparently 1.5 is the most widely supported spec. A Google search shows it at http://www.dubois.ws/software/RTF, it might be harder to find at Microsoft as it's not the latest.

The rtflib package of jfor (available at www.jfor.org) encapsulates our knowledge of RTF and is fairly simple and understandable, but it is still too element-oriented. One important thing to realize (we realized it too late here) is that RTF is more flow-based or stack-based than element-based: not everything that is opened has to be closed, it's more like a flow with embedded attribute changes.

> As I understand it, RTF is presented
> to a user-agent which does a fair amount of layout; higher-level structures
> are still present in the RTF.

Right - but there are both structure and presentation codes, so an RTF document could be both. Jfor has a strong bent towards structure, as the user's goal is usually to get an editable RTF document, in which as much of the original document structure as possible must be preserved for convenience. Precise appearance usually comes second, as applying a new wordprocessor style sheet can change a lot of it.
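To make the "flow with embedded attribute changes" point concrete, here is a minimal sketch, not taken from jfor's rtflib. The class name `RtfFlowSketch` and the method `runs` are hypothetical, and the code assumes a tiny subset of RTF: only groups and the `\b` / `\b0` control words are interpreted, while other control words are skipped and escapes such as `\{` are not handled. It shows that `\b` is never "closed": it changes the current state, and the end of a group restores whatever state was saved when that group opened.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical illustration of RTF's stack-based attribute handling. */
public class RtfFlowSketch {

    /** Returns each text run prefixed with "B:" (bold) or "N:" (not bold). */
    public static List<String> runs(String rtf) {
        List<String> out = new ArrayList<>();
        Deque<Boolean> saved = new ArrayDeque<>(); // states saved by '{'
        boolean bold = false;                      // current attribute state
        StringBuilder text = new StringBuilder();
        int i = 0;
        while (i < rtf.length()) {
            char c = rtf.charAt(i);
            if (c == '{' || c == '}' || c == '\\') {
                flush(text, bold, out); // a run ends wherever the state may change
            }
            if (c == '{') {
                saved.push(bold);       // group start: remember the current state
                i++;
            } else if (c == '}') {
                bold = saved.isEmpty() ? false : saved.pop(); // group end: restore
                i++;
            } else if (c == '\\') {
                int j = i + 1;          // control word: letters, then optional digits
                while (j < rtf.length() && Character.isLetter(rtf.charAt(j))) j++;
                String word = rtf.substring(i + 1, j);
                int k = j;
                while (k < rtf.length() && Character.isDigit(rtf.charAt(k))) k++;
                String param = rtf.substring(j, k);
                if (word.equals("b")) bold = !param.equals("0");
                // one space after a control word is a delimiter, not text
                if (k < rtf.length() && rtf.charAt(k) == ' ') k++;
                i = k;
            } else {
                text.append(c);
                i++;
            }
        }
        flush(text, bold, out);
        return out;
    }

    private static void flush(StringBuilder text, boolean bold, List<String> out) {
        if (text.length() > 0) {
            out.add((bold ? "B:" : "N:") + text);
            text.setLength(0);
        }
    }

    public static void main(String[] args) {
        // \b is switched on and never switched off in its own group;
        // the inner group turns it off only until its closing brace.
        System.out.println(runs("{\\rtf1 plain \\b bold {\\b0 inner} still bold}"));
    }
}
```

An element-oriented model would want a matching "end bold" for every "start bold". Here the only pairing is the braces, and everything else is a state change inside the flow.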
RTF is both a presentation and a structure format, as well as a moving target, since the "spec" is expanded and rewritten with nearly every new version of winword. There are many grey areas in the spec, meaning the only possible test is opening the generated RTF in the desired wordprocessors (and often watching it crash...).

> <snip>
> This is not so different from MIF

Agreed. We are working with MIF for another project, and didn't choose FOP for that because of the lack of precise control over the MIF output.

I tend to see these formats as:
-PDF for finished high-quality output ("presentation language"), layout 100% done by FOP
-MIF for semi-finished high-quality output ("typography language"), layout done by Framemaker according to MIF instructions
-RTF for editable structure + presentation output ("wordprocessing language"), layout done by wordprocessor

So I fully agree that MIF and RTF "renderers" have a lot in common - they must be able to get as much information as possible about the original document structure, and in my view do not need any layout computations.

> In a sense with RTF and MIF (and HTML for anyone who really desperately
> wants to see FO->HTML) we are talking about translators as opposed to
> formatters and renderers...

Yes - that's why I called jfor a "converter" instead of a "formatter".

Without knowing too much about FOP internals, I think a processing chain along these lines might help:

parsing if needed -> SAX events -> FO attributes processing (validation, inheritance) -> StructureRenderer

StructureRenderer is EITHER
  Layout + PrintRenderer
OR
  StructureProcessor (RTF, MIF, etc.)

What we need to find out is how much the existing FOP and these "structure renderers" have in common.
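The proposed split could be sketched roughly as follows. This is an assumption-laden illustration, not FOP or jfor code: the interface methods, the class names `ChainSketch`, `RtfStructureProcessor` and `LayoutThenPrint`, and the fixed-width word wrapping used as a stand-in for real layout are all hypothetical. The idea it shows is that both branches consume the same stream of already-processed FO events, and only one of them computes any layout.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of one event stream feeding two kinds of back end. */
public class ChainSketch {

    /** Receives FO events after attribute processing (names are assumptions). */
    public interface StructureRenderer {
        void startBlock(String fontWeight);
        void text(String chars);
        void endBlock();
        String result();
    }

    /** Converter branch: keeps structure, leaves layout to the word processor. */
    public static class RtfStructureProcessor implements StructureRenderer {
        private final StringBuilder out = new StringBuilder("{\\rtf1 ");
        public void startBlock(String fontWeight) {
            out.append("{");
            if ("bold".equals(fontWeight)) out.append("\\b ");
        }
        public void text(String chars) { out.append(chars); }
        public void endBlock() { out.append("}\\par "); }
        public String result() { return out + "}"; }
    }

    /** Formatter branch: stand-in for Layout + PrintRenderer, breaks lines itself. */
    public static class LayoutThenPrint implements StructureRenderer {
        private final int width;
        private final List<String> lines = new ArrayList<>();
        private final StringBuilder block = new StringBuilder();
        public LayoutThenPrint(int width) { this.width = width; }
        public void startBlock(String fontWeight) { block.setLength(0); }
        public void text(String chars) { block.append(chars); }
        public void endBlock() {
            // Greedy word wrap as a toy replacement for real layout computations.
            StringBuilder line = new StringBuilder();
            for (String word : block.toString().trim().split("\\s+")) {
                if (line.length() > 0 && line.length() + 1 + word.length() > width) {
                    lines.add(line.toString());
                    line.setLength(0);
                }
                if (line.length() > 0) line.append(' ');
                line.append(word);
            }
            if (line.length() > 0) lines.add(line.toString());
        }
        public String result() { return String.join("\n", lines); }
    }

    /** Stand-in for the shared front end: the same events go to either branch. */
    public static String run(StructureRenderer renderer) {
        renderer.startBlock("bold");
        renderer.text("Hello structured world");
        renderer.endBlock();
        return renderer.result();
    }

    public static void main(String[] args) {
        System.out.println(run(new RtfStructureProcessor()));
        System.out.println(run(new LayoutThenPrint(12)));
    }
}
```

In this sketch, everything before the `StructureRenderer` interface would be the part that FOP and the "structure renderers" could share.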
- Bertrand

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, email: [EMAIL PROTECTED]