Hi,

I am looking into using Solr to index a large database that has
documents (mostly PDF and MS Office) stored as CLOBs in several tables.
It is my understanding that the DIH as provided in Solr 1.4 cannot
index these CLOBs yet, and that SOLR-1358 should provide exactly this.
So I was wondering what the 'recommended' way of solving this is.
Should it be done with a custom text extractor of some sort, set on
the column/field?

Thanks,
Jorg