I am having problems indexing our website using Lucene 1.2.

To parse our HTML pages, I use the HTMLParser in lucene-demo-1.2.jar. At the
end of my index creation script, I receive several "Pipe broken" error
messages:

java.io.IOException: Pipe broken
        at java.io.PipedReader.receive(PipedReader.java:122)
        at java.io.PipedReader.receive(PipedReader.java:148)
        at java.io.PipedWriter.write(PipedWriter.java:123)
        at java.io.Writer.write(Writer.java:148)
        at java.io.Writer.write(Writer.java:124)
        at org.apache.lucene.demo.html.HTMLParser.addText(Unknown Source)
        at org.apache.lucene.demo.html.HTMLParser.HTMLDocument(Unknown Source)
        at org.apache.lucene.demo.html.ParserThread.run(Unknown Source)

This error occurs only when I am parsing HTML documents. JSPs are parsed
without problems. The number of errors thrown varies every time the script
is run (normally 3-5 errors per run). The generated index is valid and can
be used. However, I'm worried that some of my files are not being indexed
properly.
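
For reference, here is a minimal sketch of one way a piped writer can fail:
the reading side stops consuming before the writing side has finished. This
is only an assumption about the kind of failure in the trace above, not a
confirmed diagnosis of HTMLParser. The class and method names (PipeDemo,
writeFailsAfterReaderCloses) are hypothetical, and the exact exception
message differs between JDK versions.

```java
import java.io.IOException;
import java.io.PipedReader;
import java.io.PipedWriter;

public class PipeDemo {
    // Hypothetical helper: returns true if a write fails once the reader
    // has been closed before all of the text was consumed.
    static boolean writeFailsAfterReaderCloses() {
        try {
            PipedWriter writer = new PipedWriter();
            PipedReader reader = new PipedReader(writer);

            // Small enough to fit in the pipe buffer, so this does not block.
            writer.write("title");

            // The consumer reads only the first part of the text...
            char[] buf = new char[5];
            reader.read(buf, 0, 5);

            // ...and then stops while the producer still has text to send.
            reader.close();

            try {
                writer.write("body text the consumer never reads");
                return false;
            } catch (IOException expected) {
                // The producer is told that the other end has gone away.
                return true;
            }
        } catch (IOException e) {
            throw new IllegalStateException("unexpected failure during setup", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("write failed: " + writeFailsAfterReaderCloses());
    }
}
```

If the errors here arise in a similar way, the text read before the reader
stopped would still have reached the consumer, but that would need to be
checked against the actual indexing code.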

The following configuration is used:
Red Hat Linux 7.1
Java version 1.3.1_04
Apache 2.0.39
Tomcat 4.0.3
Apache Tomcat connector (ModJK) 4.0.4

I've tried running it on a machine that uses Red Hat Linux 7.3 (the rest of
the configuration is the same), and the problem does not occur there. Any
help is appreciated. Thanks in advance!
