Hi there,

I am new to Apache Solr and am currently exploring how to use it to
search PDF files.

<https://lucene.apache.org/solr/guide/6_6/uploading-structured-data-store-data-with-the-data-import-handler.html#the-tikaentityprocessor>

I am able to index PDF files that reside on the same server as Solr using
"BinFileDataSource", as shown in the example below.

Now I want to know whether there is a way to point baseDir at a folder
located on a different server.

Could you please suggest an example of how to access PDF files on another server?


<dataConfig>
  <dataSource type="BinFileDataSource"/> <!--Local filesystem-->
  <document>
    <entity name="K2FileEntity" processor="FileListEntityProcessor"
            dataSource="null"
            recursive="true"
            baseDir="C:/solr-6.6.1/server/solr/core_K2_Depot/Depot"
            fileName=".*pdf" rootEntity="false">

            <field column="file" name="id"/>
            <field column="fileSize" name="size" />
            <field column="fileLastModified" name="lastmodified" />

              <entity name="pdf" processor="TikaEntityProcessor"
                      onError="skip"
                      url="${K2FileEntity.fileAbsolutePath}" format="text">

                <field column="title" name="title" meta="true"/>
                <field column="dc:format" name="format" meta="true"/>
                <field column="text" name="text"/>

              </entity>
    </entity>
  </document>
</dataConfig>
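
For illustration, this is the kind of change I have in mind. It is only an
untested sketch: it assumes the remote folder is exposed as a network share
that the account running Solr can read, and the UNC path \\fileserver\Depot
is a placeholder rather than a real location. I do not know whether
FileListEntityProcessor accepts such a path, which is part of my question.

```xml
<dataConfig>
  <dataSource type="BinFileDataSource"/>
  <document>
    <!-- Hypothetical: "fileserver" and "Depot" are placeholder names for a
         network share. Assumes the Solr process has read access to it. -->
    <entity name="K2FileEntity" processor="FileListEntityProcessor"
            dataSource="null"
            recursive="true"
            baseDir="\\fileserver\Depot"
            fileName=".*pdf" rootEntity="false">

      <field column="file" name="id"/>

      <!-- Unchanged from the local setup: Tika reads each listed file. -->
      <entity name="pdf" processor="TikaEntityProcessor"
              onError="skip"
              url="${K2FileEntity.fileAbsolutePath}" format="text">
        <field column="text" name="text"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

If a share is not the right approach, an example using a different data
source for remote files would be just as welcome.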


Kind regards,
Karan
