Possible ConcurrentModificationException while accessing Metadata produced by 
ParsingReader
-------------------------------------------------------------------------------------------

                 Key: TIKA-885
                 URL: https://issues.apache.org/jira/browse/TIKA-885
             Project: Tika
          Issue Type: Improvement
          Components: metadata, parser
    Affects Versions: 1.0
         Environment: jre 1.6_25 x64 and Windows7 Enterprise x64
            Reporter: Luis Filipe Nassif
            Priority: Minor


Oracle's PipedReader and PipedWriter classes have a bug that prevents them from 
running concurrently: they notify each other only when the pipe is full or 
empty, not after each char is read from or written to the pipe. So I modified 
ParsingReader to use modified versions of PipedReader and PipedWriter, similar 
to the GNU versions, that do work concurrently. However, with certain files I 
sometimes get the following error:

java.util.ConcurrentModificationException
                at java.util.HashMap$HashIterator.nextEntry(Unknown Source)
                at java.util.HashMap$KeyIterator.next(Unknown Source)
                at java.util.AbstractCollection.toArray(Unknown Source)
                at org.apache.tika.metadata.Metadata.names(Metadata.java:146)
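
The trace above comes from HashMap's fail-fast key iterator. As a minimal, 
single-threaded sketch of that mechanism (not the actual Tika code; the class 
name FailFastSketch and the sample keys are made up for illustration), 
adding a key after an iterator has been created makes the next call to 
next() throw:

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical illustration class; it is not part of Tika.
class FailFastSketch {

    // Returns true if the key iterator notices a structural modification
    // that happened after the iterator was created.
    static boolean iteratorDetectsModification() {
        Map<String, String[]> metadata = new HashMap<String, String[]>();
        metadata.put("Content-Type", new String[] {"text/plain"});
        metadata.put("title", new String[] {"example"});

        Iterator<String> names = metadata.keySet().iterator();
        names.next();

        // Structural modification: a new key is added while iterating.
        // In the reported scenario a second thread would do this.
        metadata.put("Author", new String[] {"someone"});

        try {
            names.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(iteratorDetectsModification());
    }
}
```

With two threads the same check fires only when the timing lines up, which 
would be consistent with the error appearing only sometimes.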

The exception occurs because the ParsingReader.ParsingTask thread writes 
metadata while the ParsingReader thread is reading it, for files that contain 
metadata beyond their initial bytes. It does not occur with the current 
implementation, because Java's PipedReader and PipedWriter block each other. 
That blocking is a performance bug that affects ParsingReader, but it could be 
fixed in a future Java release. I think it would be a defensive approach to 
synchronize access to the private Metadata.metadata Map, which could avoid a 
possible future problem when using ParsingReader.
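
A rough sketch of what synchronized access could look like follows. It is not 
a patch against Metadata: the class SynchronizedMetadataSketch, its 
simplified set/get/names methods and the demo helper are assumptions made 
only for illustration. Every method that touches the map holds the same lock, 
so names() can no longer run while another thread is modifying the map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a metadata holder; not the real Tika class.
class SynchronizedMetadataSketch {

    private final Map<String, String[]> metadata =
            new HashMap<String, String[]>();

    // Writers and readers share the instance lock.
    public synchronized void set(String name, String value) {
        metadata.put(name, new String[] {value});
    }

    public synchronized String get(String name) {
        String[] values = metadata.get(name);
        return values == null ? null : values[0];
    }

    // Copies the key set while holding the lock, so the caller receives
    // a snapshot that later writes cannot invalidate.
    public synchronized String[] names() {
        return metadata.keySet().toArray(new String[0]);
    }

    // One thread writes distinct keys while the calling thread keeps
    // listing names. Returns the number of names seen at the end.
    static int demo(final int writes) throws InterruptedException {
        final SynchronizedMetadataSketch m = new SynchronizedMetadataSketch();
        Thread writer = new Thread(new Runnable() {
            public void run() {
                for (int i = 0; i < writes; i++) {
                    m.set("key-" + i, "value");
                }
            }
        });
        writer.start();
        while (writer.isAlive()) {
            m.names();
        }
        writer.join();
        return m.names().length;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo(10000));
    }
}
```

The cost is one lock acquisition per call, which is probably small next to 
the parsing work itself, but that has not been measured here.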


        
