Hi all,
I'm currently adding full-text indexing to my product and have run into a small issue.
I'm using Tika 0.6, and this is the stack trace:
Caused by: java.io.IOException: Stream closed
at java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:145)
at java.io.BufferedInputStream.reset(BufferedInputStream.java:414)
at org.apache.tika.mime.MimeTypes.detect(MimeTypes.java:532)
I'm using VFS, and the stream returned by the FileObject is of class:
org.apache.commons.vfs.provider.DefaultFileContent$FileContentInputStream
This stream supports 'mark' but automatically closes itself when the last
byte is reached. So if the file is tiny (size < limit), the following code
fails:
stream.mark(limit);
mimeTypes.detect(stream,metadata);
stream.reset();
For this specific case, it would be safer if MimeTypes didn't read the last
byte.
Can this be solved in Tika, or do you think it's a VFS bug that I should
report to them instead?
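
To make the failure concrete, here is a minimal sketch that uses neither VFS nor Tika. The selfClosing helper is a hypothetical stand-in for the VFS stream (it only assumes the behaviour described above: mark is supported, and the stream closes itself once a read reports end of stream), and readUpTo is an assumed stand-in for the detection step, not the actual MimeTypes.detect logic.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class StreamClosedRepro {
    // Hypothetical stand-in for the VFS stream: supports mark/reset but
    // closes itself as soon as a bulk read reports end of stream.
    static InputStream selfClosing(byte[] data) {
        return new BufferedInputStream(new ByteArrayInputStream(data)) {
            @Override
            public synchronized int read(byte[] b, int off, int len) throws IOException {
                int n = super.read(b, off, len);
                if (n == -1) {
                    close(); // mimic the auto-close behaviour
                }
                return n;
            }
        };
    }

    // Stand-in for detection: keeps reading until `limit` bytes or end of stream.
    static int readUpTo(InputStream in, int limit) throws IOException {
        byte[] buf = new byte[limit];
        int total = 0;
        int n;
        while (total < limit && (n = in.read(buf, total, limit - total)) != -1) {
            total += n;
        }
        return total;
    }

    // Runs mark / read / reset; returns null on success, else the error message.
    static String markReadReset(InputStream in, int limit) {
        try {
            in.mark(limit);
            readUpTo(in, limit);
            in.reset();
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // 3-byte "file", 8-byte limit: the read loop hits end of stream,
        // the stream closes itself, and reset() fails (prints the error message).
        System.out.println(markReadReset(selfClosing(new byte[] {1, 2, 3}), 8));
        // 16-byte "file", 8-byte limit: end of stream is never reached,
        // so reset() succeeds (prints "null").
        System.out.println(markReadReset(selfClosing(new byte[16]), 8));
    }
}
```

The only difference between the two calls is whether the input is smaller than the limit, which matches the size < limit condition above.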
One more detail:
In my case, I'm using AutoDetectParser.
To work around the issue, I currently add the following code before calling it:
stream = new BufferedInputStream(stream);
I got this idea from reading the following in AutoDetectParser.parse():
if (!stream.markSupported()) {
    stream = new BufferedInputStream(stream);
}
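
The difference between the conditional wrap quoted above and the unconditional wrap used as a workaround can be sketched as follows. This is an illustration only: selfClosing is a hypothetical stand-in for the VFS stream, and wrapIfUnmarkable, wrapAlways and survivesReset are names made up for this example. Since the stand-in reports markSupported() as true, the conditional wrap leaves it untouched, whereas the unconditional wrap adds an outer buffer that handles mark/reset itself.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class AlwaysWrapSketch {
    // Hypothetical stand-in for the VFS stream (not the real class):
    // markSupported() is true, and it closes itself at end of stream.
    static InputStream selfClosing(byte[] data) {
        return new BufferedInputStream(new ByteArrayInputStream(data)) {
            @Override
            public synchronized int read(byte[] b, int off, int len) throws IOException {
                int n = super.read(b, off, len);
                if (n == -1) {
                    close();
                }
                return n;
            }
        };
    }

    // Same check as the quoted snippet: a no-op here, since mark is supported.
    static InputStream wrapIfUnmarkable(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    // The workaround: always add an outer buffer that owns mark/reset.
    static InputStream wrapAlways(InputStream in) {
        return new BufferedInputStream(in);
    }

    // Marks, reads up to `limit` bytes, resets; true if reset() succeeded.
    static boolean survivesReset(InputStream in, int limit) {
        try {
            in.mark(limit);
            byte[] buf = new byte[limit];
            int total = 0;
            int n;
            while (total < limit && (n = in.read(buf, total, limit - total)) != -1) {
                total += n;
            }
            in.reset();
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] tiny = {1, 2, 3};
        // Conditional wrap: the stream is used directly, so reset() fails.
        System.out.println(survivesReset(wrapIfUnmarkable(selfClosing(tiny)), 8)); // false
        // Unconditional wrap: the outer buffer is still open, so reset() works.
        System.out.println(survivesReset(wrapAlways(selfClosing(tiny)), 8)); // true
    }
}
```

This sketch only checks that reset() succeeds on the outer stream; it makes no claim about how the real VFS stream behaves on reads after it has closed itself.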
Regards,
KERDUDOU Ronan
VIRAGE Group (France)
+33 2 53 55 10 22
http://www.viragegroup.com/