At 03:10 PM 9/6/2002 +0200, you wrote:
>Thanks DR.
>
>I'll prefer to convert an RTF doc. into HTML before displaying on the
>browser.


OK.

That's somewhat more complicated, but not impossible.

First of all, then, you don't need to worry about the MIME type stuff.  If 
you want to serve up HTML, then just leave the MIME type as "text/html" 
(the default).


What you would need to do is use an RTF parser (e.g., JavaCC with an RTF 
grammar) to parse the RTF document.  You would then need to adapt that 
parser to decide what to do when different RTF control codes come 
along.  For example, output <b>text</b> when you hit a control code for 
bold, etc.  Since you're doing this in a servlet or JSP, the output stream 
or writer that you would use would be the HttpServletResponse's.

This parsing and converting is NOT a trivial task.  The RTF spec 
(http://www.wotsit.org/download.asp?f=rtf15) is very detailed and contains 
MANY control codes including support for tables, font changes, etc.  It 
would be a VERY large task for you to attempt to provide support for the 
whole RTF spec.  Better options would be either to support NO control 
codes, or to support only a small subset of them (e.g., font changes).

Supporting no control codes would mean that you just ignore them all, and 
only output the text in the document.  This would work, and the document 
would be reasonably readable as HTML, but it would be missing all its 
formatting.  Providing support for a few RTF control codes would probably 
be a better choice though, since it would improve the formatting quite a bit.
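
To make the "small subset" idea concrete, here is a rough sketch.  It is NOT 
JavaCC output and not my actual code: it's a hand-written scanner, and the 
class and method names (RtfToHtmlSketch, convert, toHtml) are made up for 
illustration.  It maps only \b, \i and \par to HTML and silently drops every 
other control word.  It also ignores group structure, so text inside header 
groups such as the font table would leak into the output; a real parser 
would have to skip those.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical illustration only: a hand-written scanner, not a JavaCC parser.
class RtfToHtmlSketch {

    /** Scans the RTF text once and writes HTML for a tiny subset of control words. */
    static void convert(String rtf, Writer out) throws IOException {
        int i = 0;
        int n = rtf.length();
        while (i < n) {
            char c = rtf.charAt(i);
            if (c == '\\' && i + 1 < n) {
                char next = rtf.charAt(i + 1);
                if (isAsciiLetter(next)) {
                    // Control word: letters, optional numeric parameter, optional one-space delimiter.
                    int j = i + 1;
                    while (j < n && isAsciiLetter(rtf.charAt(j))) j++;
                    String word = rtf.substring(i + 1, j);
                    int paramStart = j;
                    if (j + 1 < n && rtf.charAt(j) == '-' && isDigit(rtf.charAt(j + 1))) j++;
                    while (j < n && isDigit(rtf.charAt(j))) j++;
                    String param = rtf.substring(paramStart, j);
                    if (j < n && rtf.charAt(j) == ' ') j++;
                    out.write(htmlFor(word, param));
                    i = j;
                } else {
                    // Control symbol: keep escaped \ { } as literal text, drop the rest.
                    if (next == '\\' || next == '{' || next == '}') writeEscaped(next, out);
                    // \'hh (hex-escaped character) is dropped entirely in this sketch.
                    i += (next == '\'') ? 4 : 2;
                }
            } else if (c == '{' || c == '}' || c == '\r' || c == '\n') {
                i++; // group braces and raw line breaks produce no output here
            } else {
                writeEscaped(c, out);
                i++;
            }
        }
    }

    /** Convenience wrapper that returns the HTML as a String. */
    static String toHtml(String rtf) {
        StringWriter sw = new StringWriter();
        try {
            convert(rtf, sw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sw.toString();
    }

    // The supported subset. Returning "" for everything gives the "no control codes" option.
    private static String htmlFor(String word, String param) {
        boolean off = param.equals("0");
        switch (word) {
            case "b":   return off ? "</b>" : "<b>";
            case "i":   return off ? "</i>" : "<i>";
            case "par": return "<br>\n";
            default:    return "";
        }
    }

    // Escapes the characters that are special in HTML text.
    private static void writeEscaped(char c, Writer out) throws IOException {
        switch (c) {
            case '&': out.write("&amp;"); break;
            case '<': out.write("&lt;"); break;
            case '>': out.write("&gt;"); break;
            default:  out.write(c);
        }
    }

    private static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }
}
```

In a servlet you would pass response.getWriter() as the Writer.  Changing 
htmlFor() to return "" for every word gives the text-only variant.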


By the way, you might also want to take a look at another RTF grammar that 
someone posted over at the JavaCC Grammar Repository 
(http://www.cobase.cs.ucla.edu/pub/javacc/#Rsection).  Eric Friedman's 
grammar appears to have more built-in support for RTF control codes than 
mine does.  So maybe you could just use his and then you wouldn't have to 
write as much code of your own to handle the control codes.


Anyway, as far as JavaCC in general, the way it works is that you give it 
an InputStream (or Reader) which it parses.  If you have your RTF doc 
stored in a database, then you would need to turn it into an input stream.
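
As a small sketch of that last step, assuming you have already read the 
column into a byte[] or a String: the standard java.io classes below wrap it 
so a parser can read from it.  The class and method names (RtfSourceSketch, 
asStream, asReader) are just placeholders.  If you are using JDBC, 
ResultSet.getBinaryStream() and ResultSet.getCharacterStream() can also hand 
you a stream or reader directly.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper: wraps in-memory RTF content for a stream-based parser.
class RtfSourceSketch {

    /** Wraps RTF bytes (e.g. from a BLOB column) in an InputStream. */
    static InputStream asStream(byte[] rtfBytes) {
        return new ByteArrayInputStream(rtfBytes);
    }

    /** Wraps RTF text (e.g. from a CLOB or VARCHAR column) in a Reader. */
    static Reader asReader(String rtfText) {
        return new StringReader(rtfText);
    }
}
```

Either result can then be passed to the constructor of the parser class 
that JavaCC generates from your grammar.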


FYI, while I was working on my project I found several stand-alone software 
applications that could turn RTF docs into HTML.  You can find them too on 
Google.  But I wound up not using any of them because I wanted my software 
to run as a servlet/JSP.  Since these were stand-alone apps (and not free) 
they weren't a good fit.  It sounds like you want something similar to what 
I wanted, so they probably won't be useful to you either, but I thought I'd 
mention them anyway.


Hope this helps.  Email back if not.


DR


