[ 
https://issues.apache.org/jira/browse/TIKA-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542477#comment-13542477
 ] 

Peter Nordquist commented on TIKA-1040:
---------------------------------------

I'm also running into this issue, on Windows 7 with Java version 1.6.0_37.  I 
did a little digging and I think it is caused by the mp4parser library 
memory-mapping the file through its FileChannels.  This is a known Java issue, 
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4715154: on Windows, the 
mapped region needs to be garbage collected before the file can be deleted.
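
To make the suspected cause concrete, here is a minimal sketch that uses only 
the JDK, no Tika or mp4parser.  The class and method names are made up for 
illustration.  It maps a temp file, closes the channel, and then tries to 
delete the file while the mapped buffer is still reachable.  Going by the bug 
report above I would expect the delete to fail on Windows; on Linux it 
normally succeeds.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical demo class, not part of Tika or mp4parser.
class MappedDeleteDemo {

    /**
     * Maps a small temp file, closes its channel, then tries to delete the
     * file while the mapping is still reachable.
     * Returns whether File.delete() reported success.
     */
    static boolean deleteWhileMapped() throws IOException {
        File tmp = File.createTempFile("mapped-demo-", ".tmp");
        RandomAccessFile raf = new RandomAccessFile(tmp, "rw");
        MappedByteBuffer buffer;
        try {
            raf.write(new byte[] {1, 2, 3, 4});
            FileChannel channel = raf.getChannel();
            buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, 4);
        } finally {
            // Closing the file also closes the channel, but the mapping
            // itself stays valid until the buffer is garbage collected.
            raf.close();
        }

        boolean deleted = tmp.delete();

        // Read from the buffer after the delete attempt so the mapping is
        // certainly still alive at that point.
        if (buffer.get(0) != 1) {
            throw new IllegalStateException("unexpected mapped content");
        }
        if (!deleted) {
            tmp.deleteOnExit(); // best-effort cleanup for the demo file
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("deleted while mapped: " + deleteWhileMapped());
    }
}
```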

At org.apache.tika.parser.mp4.MP4Parser.parse(MP4Parser.java:117) there's a 
call to grab the FileChannel.  When the input is a plain stream, this writes 
the stream to a file in the temp directory and takes the FileChannel from a 
stream on that file.  The channel is then passed to the IsoFile constructor.  
IsoFile parses the file with the help of the ChannelHelper class, which uses 
mapped memory on line 33.  The call chain, starting in the parser, is below 
(the opposite order of an exception stacktrace).  An important thing to note is 
that IsoFile uses the PropertyBoxParserImpl class by default.  Its super class 
is AbstractBoxParser, and PropertyBoxParserImpl does not override the parseBox 
method, so I omitted PropertyBoxParserImpl from the stacktrace.  A couple of 
other locations in the mp4parser library do the same memory mapping and are 
not listed here, but this is the clearest example.

org.apache.tika.parser.mp4.MP4Parser.parse(MP4Parser.java:117)
com.coremedia.iso.IsoFile.<init>(IsoFile.java:49)
com.coremedia.iso.IsoFile.parse(IsoFile.java:80)
com.coremedia.iso.AbstractBoxParser.parseBox(AbstractBoxParser.java:50)
com.coremedia.iso.ChannelHelper.readFully(ChannelHelper.java:33)

A workaround should be to pass the result of 
org.apache.tika.io.TikaInputStream.get(java.io.File) as the input stream to 
the parsers (either parse method that takes a file should also work).  
org.apache.tika.parser.mp4.MP4Parser.parse(MP4Parser.java:115) would then 
reuse the existing TikaInputStream, and the existing file should become the 
source for the FileChannels, so there should be no temporary file for Tika to 
delete.  However, I would like to pass plain InputStreams, and that is the 
case where TemporaryResources tries to delete the temporary file.  For now I 
guess I'm stuck with writing the content to a file myself and using the 
workaround above.
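
A rough sketch of the "write the content myself" step, using only the JDK.  
The class and method names are made up for illustration.  The returned file 
would then be handed to TikaInputStream.get(java.io.File) as described above; 
that call is shown only as a comment so the sketch compiles without Tika on 
the classpath.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of Tika.
class SpoolToFile {

    /**
     * Copies the whole stream into a temp file owned by the caller and
     * returns that file.  The caller decides when to delete it.
     */
    static File spool(InputStream in) throws IOException {
        File file = File.createTempFile("my-spool-", ".tmp");
        OutputStream out = new FileOutputStream(file);
        try {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } finally {
            out.close();
        }
        return file;
    }

    // Intended use (needs Tika, so it is left as a comment):
    //
    //   File file = SpoolToFile.spool(myInputStream);
    //   InputStream tikaStream = TikaInputStream.get(file);
    //   ... pass tikaStream to the parser, then close it ...
    //   if (!file.delete()) { /* retry later, outside the parse call */ }
}
```

Deleting the spooled file can still fail on Windows while a mapping is alive, 
but the failure then happens in my own code, where I can retry later, instead 
of surfacing as an exception from the parse call.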


                
> Could not delete temporary file
> -------------------------------
>
>                 Key: TIKA-1040
>                 URL: https://issues.apache.org/jira/browse/TIKA-1040
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.2
>         Environment: Windows XP 64
>            Reporter: Carlos S. Zamudio
>
> Although I found an entry that suggested this had been resolved in 1.2, I 
> continue to receive the exception below when attempting to extract metadata 
> from a video file. In my case the file type is in the Quicktime MOV format.
> org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from 
> org.apache.tika.parser.mp4.MP4Parser@4413ee
>       at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:248)
>       at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>       at 
> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
>       at test.TikaExamples_1pt2a.testTikaMetadata(TikaExamples_1pt2a.java:223)
>       at test.TikaExamples_1pt2a.main(TikaExamples_1pt2a.java:60)
> Caused by: java.io.IOException: Could not delete temporary file 
> C:\DOCUME~1\CARLOS~1.SLA\LOCALS~1\Temp\apache-tika-1430602345143256975.tmp
>       at 
> org.apache.tika.io.TemporaryResources$1.close(TemporaryResources.java:70)
>       at 
> org.apache.tika.io.TemporaryResources.close(TemporaryResources.java:121)
>       at org.apache.tika.io.TikaInputStream.close(TikaInputStream.java:637)
>       at org.apache.tika.parser.mp4.MP4Parser.parse(MP4Parser.java:119)
>       at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>       ... 4 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
