I am also having the issue where the contents of my zip (or kmz) files are
not being processed; only the file names are processed.  Solr seems to
recognize the kmz extension and open the file, but it does not recurse into
the contents.
The patch you mention has been around for a while.  I am running Solr 4.8.1,
and the Tika jar looks to be 1.5, so I would expect the patch to be
included already.  Do I need additional configuration?  My config is as
follows:
<dataConfig>
  <dataSource type="BinFileDataSource" />
  <document>
    <entity name="kmlfiles" dataSource="null" rootEntity="false"
            baseDir="mydirectory" fileName=".*\.kmz$" onError="skip"
            processor="FileListEntityProcessor" recursive="false">
      <field defs........................
      />
      <entity name="kmlImport" processor="TikaEntityProcessor"
              datasource="kmlfiles" htmlMapper="identity"
              transformer="TemplateTransformer"
              url="${kmlfiles.fileAbsolutePath}">
        <more field defs....
        />
      </entity>
    </entity>
  </document>
</dataConfig>

I am running the import with the dataImport option on the admin page.
Thanks for any assistance.  I'm on a closed network, and getting patches
onto it is not trivial.
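
One way to rule out a problem with the archives themselves, independent of
Solr and Tika, is to list their entries directly.  The following is only a
minimal sketch in Python: it assumes nothing beyond a .kmz being an ordinary
zip archive, and the function name and the example path are hypothetical,
not part of the setup described above.

```python
import zipfile


def list_kmz_entries(path):
    """Return the names of all entries stored in the .kmz (zip) archive at path."""
    with zipfile.ZipFile(path) as archive:
        # namelist() returns every member in archive order, typically the
        # main .kml document plus any bundled resources such as icons.
        return archive.namelist()


# Hypothetical usage (the path is a placeholder, not a real file):
#   for name in list_kmz_entries("mydirectory/example.kmz"):
#       print(name)
```

If this lists the expected .kml entries, the archives are readable, and the
missing content more likely comes from how the import handles embedded
documents than from the files.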




--
View this message in context: 
http://lucene.472066.n3.nabble.com/ExtractingRequestHandler-indexing-zip-files-tp4138172p4157650.html
Sent from the Solr - User mailing list archive at Nabble.com.
