Luis Filipe Nassif created TIKA-1007:
----------------------------------------
Summary: Improve Concurrency of ParsingReader
Key: TIKA-1007
URL: https://issues.apache.org/jira/browse/TIKA-1007
Project: Tika
Issue Type: Improvement
Components: parser
Affects Versions: 1.2
Environment: jre 1.7.0_05 x64, Windows 7 Enterprise x64
Reporter: Luis Filipe Nassif
Attachments: FastPipedReader.java, FastPipedWriter.java,
ModifiedParsingReader.java, ModifiedParsingReaderTest.java,
ParsingReaderTest.java
As discussed in TIKA-885, the PipedReader and PipedWriter classes have a bug
that prevents them from executing concurrently: they notify each other only
when the pipe is full or empty, not after each char is read from or written to
the pipe. This limits the concurrency of the reader and writer sides of
ParsingReader. If you run the attached ParsingReaderTest.java, you will see
that only one processor is used (25% CPU on my quad-core machine). So I
modified ParsingReader to use modified versions of PipedReader and PipedWriter
that work concurrently. If you run the attached ModifiedParsingReaderTest.java,
you will see that 2 processors are used (50% CPU on my machine). The attached
FastPipedReader.java and FastPipedWriter.java are for demonstration purposes
only: I took the base code from the net and changed it, so it may be subject to
license restrictions.
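To illustrate the kind of change described above, here is a minimal sketch of a bounded char pipe that wakes the other side after every single read or write, instead of only when the buffer becomes full or empty. It is not the attached FastPipedReader/FastPipedWriter code and not part of Tika or the JDK: the class name NotifyingCharPipe, its methods, and the roundTrip helper are hypothetical names chosen for this example.

```java
import java.io.IOException;

/** Hypothetical bounded char pipe that signals the peer after every transfer. */
public class NotifyingCharPipe {
    private final char[] buffer;
    private int head;        // index of the next char to read
    private int count;       // number of chars currently buffered
    private boolean closed;  // set once the writer is done

    public NotifyingCharPipe(int capacity) {
        buffer = new char[capacity];
    }

    /** Blocks only while the buffer is full, then wakes any waiting reader. */
    public synchronized void write(char c) throws IOException, InterruptedException {
        while (count == buffer.length && !closed) {
            wait();
        }
        if (closed) {
            throw new IOException("pipe closed");
        }
        buffer[(head + count) % buffer.length] = c;
        count++;
        notifyAll(); // notify after each char, not only when the pipe fills up
    }

    /** Blocks only while the buffer is empty; returns -1 at end of stream. */
    public synchronized int read() throws InterruptedException {
        while (count == 0 && !closed) {
            wait();
        }
        if (count == 0) {
            return -1; // closed and fully drained
        }
        char c = buffer[head];
        head = (head + 1) % buffer.length;
        count--;
        notifyAll(); // notify after each char, not only when the pipe empties
        return c;
    }

    /** Marks end of stream and wakes every waiting thread. */
    public synchronized void close() {
        closed = true;
        notifyAll();
    }

    /** Pumps text through the pipe on a writer thread; returns what the reader saw. */
    public static String roundTrip(String text, int capacity) throws Exception {
        NotifyingCharPipe pipe = new NotifyingCharPipe(capacity);
        Thread writer = new Thread(() -> {
            try {
                for (int i = 0; i < text.length(); i++) {
                    pipe.write(text.charAt(i));
                }
            } catch (IOException | InterruptedException e) {
                // Nothing to recover here; closing below lets the reader finish.
            } finally {
                pipe.close();
            }
        });
        writer.start();
        StringBuilder out = new StringBuilder();
        int c;
        while ((c = pipe.read()) != -1) {
            out.append((char) c);
        }
        writer.join();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("concurrent pipe", 4));
    }
}
```

Notifying on every char keeps both threads runnable but adds signalling overhead; a real replacement would probably notify per block of chars instead.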