Indexing HTML

Leo Galambos Tue, 03 Dec 2002 11:33:20 -0800

I tried to use IndexHTML (the demo) with Lucene 1.2 to index *.CZ, but
Lucene often falls into a never-ending loop. I have analyzed my data, so I
know which file(s) cause Lucene to hang. I don't see anything special in
the file(s), so I think they pass through the parser into the main Lucene
routines (and then the problem could be in the Merger).


Could you help me, please?

One of the problematic files:
http://com-os2.ms.mff.cuni.cz/bugs/f01529.txt
My program (based on the Lucene demo):
http://com-os2.ms.mff.cuni.cz/bugs/IndexHTML.java

Thank you very much.

-g-


