Hi
Apologies.

Can somebody please tell me how to add a constructor to 'org.apache.lucene.demo.html.HtmlParser.java' that takes a String argument, so that the tags are stripped from that String and the text without tags can be read back?

Currently 'org.apache.lucene.demo.html.HtmlParser.java' accepts the full path of a file and then reads its content to strip the tags.

Thanks in advance,
Karthik

-----Original Message-----
From: Daniel Naber [mailto:[EMAIL PROTECTED]
Sent: Saturday, September 25, 2004 12:47 AM
To: Lucene Users List
Subject: Re: demo IndexHTML parser breaks unicode?

On Friday 24 September 2004 19:58, Fred Toth wrote:
> I've got unicode in my source HTML. In particular, within meta tags,
> and it's getting broken by the indexer. Note that I'm not trying to
> query on any of this, just store and retrieve document titles with
> unicode characters.

Please try again with the code from CVS. Christoph Goller committed a fix
for this problem (at least I think it was this problem) 1-3 weeks ago.

Regards
Daniel

--
http://www.danielnaber.de

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
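
To illustrate the String-in, String-out behaviour asked about at the top of this message, here is a minimal sketch in plain Java. It is not the Lucene demo parser and does not modify it: the class name TagStripper and the method stripTags are hypothetical names made up for this example. It only drops everything between '<' and '>', and it does not handle comments, script blocks, or character entities, so treat it as a starting point rather than a replacement for a real HTML parser.

```java
public class TagStripper {

    /**
     * Hypothetical helper (not part of Lucene): returns the input with
     * everything between '<' and '>' removed. Entities such as &amp;
     * are left untouched.
     */
    public static String stripTags(String html) {
        StringBuilder out = new StringBuilder(html.length());
        boolean inTag = false;
        for (int i = 0; i < html.length(); i++) {
            char c = html.charAt(i);
            if (c == '<') {
                inTag = true;           // start skipping tag content
            } else if (c == '>' && inTag) {
                inTag = false;          // tag closed, resume copying
            } else if (!inTag) {
                out.append(c);          // ordinary text outside any tag
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripTags("<p>Hello <b>world</b></p>"));
    }
}
```

If the demo parser in your Lucene version also has a constructor that takes a java.io.Reader (an assumption worth checking in the source, not something confirmed in this thread), wrapping the String in a java.io.StringReader might avoid writing a new constructor at all.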