Thank you, I will do that.

> Karl Koch wrote:
>
> >I apologise in advance if some of what I write here has been said before.
> >The last three answers to my question suggested pattern matching
> >solutions and Swing. Pattern matching was introduced in Java 1.4, and
> >Swing is something I cannot use, since I work with Java 1.1 on a PDA.
>
> I see.
>
> In this case you can read your HTML file line by line and then write
> something like this:
>
> String line;
> int startPos, endPos;
> StringBuffer text = new StringBuffer();
> while ((line = reader.readLine()) != null) {
>     // End of the opening tag, then the start of the next tag after it.
>     startPos = line.indexOf(">");
>     endPos = line.indexOf("<", startPos + 1);
>     if (startPos >= 0 && endPos > startPos)
>         text.append(line.substring(startPos + 1, endPos));
> }
>
> This is just sample code that should work if each line of the HTML file
> contains a single element, with its opening and closing tag on that line.
> It can be a starting point for you.
>
> Hope it helps,
>
> Best,
>
> Sergiu
>
> >I am wondering if somebody knows a simple piece of source code with low
> >requirements that runs under these tight constraints.
> >
> >Thank you all,
> >Karl
> >
> >>No one has yet mentioned using ParserDelegator and ParserCallback, which
> >>are part of HTMLEditorKit in Swing. I have been successfully using
> >>these classes to parse out the text of an HTML file. You just need to
> >>extend HTMLEditorKit.ParserCallback and override the various methods
> >>that are called when different tags are encountered.
> >>
> >>On Feb 1, 2005, at 3:14 AM, Jingkang Zhang wrote:
> >>
> >>>Three HTML parsers (the Lucene web application demo, CyberNeko HTML
> >>>Parser, JTidy) are mentioned in Lucene FAQ 1.3.27. Which is the best?
> >>>Can it filter the tags that are auto-created by MS Word's
> >>>'Save As HTML' function?
> >>
> >>--
> >>Bill Tschumy
> >>Otherwise -- Austin, TX
> >>http://www.otherwise.com
> >>
> >>---------------------------------------------------------------------
> >>To unsubscribe, e-mail: [EMAIL PROTECTED]
> >>For additional commands, e-mail: [EMAIL PROTECTED]
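For what it is worth, the line-by-line idea quoted above can be pushed a little further so that it copes with several tags on one line and with a tag that is split across two lines. The following is only a minimal sketch, not a tested PDA solution: the class name TagStripper and its strip method are made up for illustration, and I am assuming that String.charAt and StringBuffer are all that is needed, which should keep it within what Java 1.1 offers.

```java
// Hypothetical helper (the name is an assumption, not from any library).
// It drops everything between '<' and '>' and remembers across calls
// whether the previous line ended in the middle of a tag.
class TagStripper {
    private boolean inTag = false;

    // Returns the text of one line with the tag characters removed.
    public String strip(String line) {
        StringBuffer out = new StringBuffer();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '<') {
                inTag = true;          // a tag starts here
            } else if (c == '>' && inTag) {
                inTag = false;         // the current tag ends here
            } else if (!inTag) {
                out.append(c);         // ordinary text, keep it
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The second and third lines together form one split <a ...> tag.
        String[] lines = {
            "<p>Hello <b>world</b>",
            "<a href=\"x.html\"",
            "   >link</a></p>"
        };
        TagStripper stripper = new TagStripper();
        for (int i = 0; i < lines.length; i++) {
            System.out.println(stripper.strip(lines[i]));
        }
    }
}
```

In the reading loop you would call strip(line) once per line and append the result to your StringBuffer. Note that this sketch does not decode entities such as &amp;amp;, does not skip the contents of script or style elements, and will be confused by a '>' inside a comment or attribute value, so treat it as a starting point only.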