> It is *strongly* recommended to *not* use the Tika that's embedded within Solr, but instead to do the processing outside of Solr in a program of your own and index the results.
+1 http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201601.mbox/%3CBY2PR09MB11210EDFCFA297528940B07C7F30%40BY2PR09MB112.namprd09.prod.outlook.com%3E
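
For anyone wondering what "a program of your own" might look like, here is a minimal sketch. It is only an illustration, not the method from the linked thread. It assumes a standalone Tika Server on its default port 9998, a Solr collection hypothetically named `mycollection`, and a `content_txt` field (which relies on a `*_txt` dynamic field being present in the schema). Adjust all of these to your setup.

```python
import json
import urllib.request

# Assumptions: Tika Server running separately from Solr on its default port,
# and a Solr collection called "mycollection" (hypothetical name).
TIKA_URL = "http://localhost:9998/tika"
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycollection/update?commit=true"


def build_tika_request(data: bytes) -> urllib.request.Request:
    """PUT the raw file bytes to Tika Server and ask for plain text back."""
    return urllib.request.Request(
        TIKA_URL, data=data, method="PUT", headers={"Accept": "text/plain"}
    )


def build_solr_request(doc_id: str, text: str) -> urllib.request.Request:
    """POST a single document to Solr's JSON update handler."""
    # "content_txt" is an assumed field name; use whatever your schema defines.
    body = json.dumps([{"id": doc_id, "content_txt": text}]).encode("utf-8")
    return urllib.request.Request(
        SOLR_UPDATE_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def index_file(path: str) -> None:
    """Extract text outside Solr, then send only the result to Solr."""
    with open(path, "rb") as f:
        data = f.read()
    # Extraction runs in the separate Tika process, not inside Solr.
    with urllib.request.urlopen(build_tika_request(data), timeout=60) as resp:
        text = resp.read().decode("utf-8", errors="replace")
    # The file path doubles as the document id here, purely for illustration.
    with urllib.request.urlopen(build_solr_request(path, text), timeout=60) as resp:
        resp.read()
```

Calling `index_file("/path/to/report.pdf")` would extract the text and index it, provided both services are reachable. A real indexer would probably also batch documents, commit less often, and handle extraction failures per file.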