I tried to use IndexHTML (from the demo) with Lucene 1.2 to index *.CZ, but Lucene often falls into a never-ending loop. I have analyzed my data, so I know which file(s) make Lucene hang. I don't see anything special in those files, so I think they pass through the parser and reach the main Lucene routines (in which case the problem could be in Merger).
Could you help me, please?

One of the problematic files: http://com-os2.ms.mff.cuni.cz/bugs/f01529.txt
My program (based on the Lucene demo): http://com-os2.ms.mff.cuni.cz/bugs/IndexHTML.java

Thank you very much.
-g-