Hi


Apologies...

Can somebody please tell me how to add a constructor to
'org.apache.lucene.demo.html.HtmlParser.java' that takes the HTML as a
String argument, so that the parser strips the HTML tags and returns the
String without tags?
Currently 'org.apache.lucene.demo.html.HtmlParser.java' accepts the full
path of a file and then reads the file's content to strip the tags.
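
To illustrate the behaviour I am after, here is a minimal sketch. It does
not use the Lucene demo parser at all; it uses the HTML parser that ships
with the JDK (javax.swing.text.html) purely as a stand-in. The class name
TagStripper and the method strip() are made up for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper, not part of Lucene: takes HTML as a String
// (instead of a file path) and returns the text without tags.
public class TagStripper {

    public static String strip(String html) throws IOException {
        final StringBuilder text = new StringBuilder();

        // The callback is invoked by the JDK parser for each run of text
        // found between tags; tag events are simply ignored.
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                if (text.length() > 0) {
                    text.append(' ');   // separate adjacent text runs
                }
                text.append(data);
            }
        };

        // A StringReader lets the parser read from memory rather than a file.
        // The last argument tells the parser to ignore charset declarations.
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return text.toString();
    }
}
```

Something like TagStripper.strip("<p>Hello <b>world</b></p>") should then
give back only the words, with no markup. What I would like is the same
kind of String-in, String-out entry point on the demo parser.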




Thanks in advance,
Karthik


-----Original Message-----
From: Daniel Naber [mailto:[EMAIL PROTECTED]
Sent: Saturday, September 25, 2004 12:47 AM
To: Lucene Users List
Subject: Re: demo IndexHTML parser breaks unicode?


On Friday 24 September 2004 19:58, Fred Toth wrote:

> I've got unicode in my source HTML. In particular, within meta tags,
> and it's getting broken by the indexer. Note that I'm not trying to
> query on any of this, just store and retrieve document titles with
> unicode characters.

Please try again with the code from CVS, Christoph Goller committed a fix
for this problem (at least I think it was this problem) 1-3 weeks ago.

Regards
 Daniel

--
http://www.danielnaber.de

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

