Hello,

I have been given the task of indexing, in Solr 7.7.1, PDF files that are stored in a SqlBase database. I have done half the job: I can index all the table fields and search in them, except for the field that stores the PDF file content. As I am totally new to Solr, I have spent a lot of time trying, without success, to understand how to get Solr to extract and index the field with the PDF content. I need help.

Regards,
Aruna

In solrconfig.xml I have:

    <lib dir="${solr.install.dir:../../../..}/contrib/dataimporthandler/lib" regex=".*\.jar" />
    <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-dataimporthandler-.*\.jar" />
    <lib dir="${solr.install.dir:../../../..}/contrib/extraction/lib" regex=".*\.jar" />
    <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-cell-\d.*\.jar" />

    <requestHandler name="/update/extract" startup="lazy" class="solr.extraction.ExtractingRequestHandler" >
      <lst name="defaults">
        <str name="lowernames">true</str>
        <str name="fmap.meta">ignored_</str>
        <str name="fmap.content">_text_</str>
      </lst>
    </requestHandler>

    <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
      <lst name="defaults">
        <str name="config">db-data-config.xml</str>
      </lst>
    </requestHandler>

db-data-config.xml:

    <dataConfig>
      <dataSource type="JdbcDataSource" driver="jdbc.unify.sqlbase.SqlbaseDriver" url="jdbc:sqlbase://localhost:2155/PDFDOCS" user="sysadm" password="sysadm" />
      <document>
        <entity name="PDFDOCUMENTS" query="select ID, PDOCUMENT, UNIT from SYSADM.DOCS">
          <field column="ID" name="idx" />
          <field column="PDOCUMENT" name="PDF" />
          <field column="UNIT" name="division" />
        </entity>
      </document>
    </dataConfig>
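
For reference, the configuration above maps PDOCUMENT straight to a Solr field, so no text extraction step runs on it. One commonly described approach for binary columns in the DataImportHandler is to nest a TikaEntityProcessor entity that reads the column through a FieldStreamDataSource. The following is only an untested sketch of that idea. It assumes that PDOCUMENT is a BLOB (or byte array) holding the raw PDF bytes, that the SqlBase JDBC driver returns it as such, and that the dataimporthandler-extras jar, which contains TikaEntityProcessor, is picked up by the existing lib directives. The data source names "db" and "bin" and the inner entity name "pdf" are illustrative and do not come from the original configuration.

```xml
<dataConfig>
  <!-- Same JDBC connection as above; the name "db" is an illustrative label -->
  <dataSource name="db" type="JdbcDataSource"
              driver="jdbc.unify.sqlbase.SqlbaseDriver"
              url="jdbc:sqlbase://localhost:2155/PDFDOCS"
              user="sysadm" password="sysadm" />
  <!-- Exposes a binary column of the parent row as a stream; "bin" is an illustrative label -->
  <dataSource name="bin" type="FieldStreamDataSource" />
  <document>
    <entity name="PDFDOCUMENTS" dataSource="db"
            query="select ID, PDOCUMENT, UNIT from SYSADM.DOCS">
      <field column="ID" name="idx" />
      <field column="UNIT" name="division" />
      <!-- Assumption: PDOCUMENT arrives as a Blob or byte[]; Tika extracts its plain text -->
      <entity name="pdf" processor="TikaEntityProcessor" dataSource="bin"
              dataField="PDFDOCUMENTS.PDOCUMENT" format="text">
        <!-- TikaEntityProcessor emits the extracted body in the "text" column -->
        <field column="text" name="PDF" />
      </entity>
    </entity>
  </document>
</dataConfig>
```

In this sketch the PDF field would need to be declared in the schema as an indexed text field, and the /update/extract handler is not involved, because the extraction happens inside the import itself.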