I'm very new to SOLR and haven't been able to find an answer to this issue:

Using SOLR 5.1.0, we have a working FileListEntityProcessor setup, but we would like to exclude several directories that live below the baseDir.

The data-config.xml file looks like:

<dataConfig>
    <dataSource name="abc" type="BinFileDataSource" />
    <document>
        <entity name="f" processor="FileListEntityProcessor" baseDir="/abc/def/" recursive="true" rootEntity="false">
            <field column="fileAbsolutePath" name="filename" />
            <field column="file" name="title" />
            <entity name="page" dataSource="abc" processor="TikaEntityProcessor" url="${f.fileAbsolutePath}" format="text" onError="skip">
                <field column="text" name="filedata" />
            </entity>
        </entity>
    </document>
</dataConfig>

So we'd like to index /abc/def/123 and /abc/def/456 but not index /abc/def/789 and /abc/def/xyz etc.

We can currently index all the files under /abc/def, and that works fine, but I can't figure out how to exclude entire subdirectories that contain the same file types as the directories we do want to index.
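
One possible workaround, as an untested sketch, would be to drop the single entity on /abc/def/ and define one FileListEntityProcessor entity per directory we do want, so the unwanted directories are never walked at all. This assumes the wanted directories can be listed explicitly, and the entity names f123, page123, f456 and page456 are made up for illustration:

```xml
<dataConfig>
    <dataSource name="abc" type="BinFileDataSource" />
    <document>
        <!-- Untested sketch: one entity per wanted directory, so /abc/def/789 and /abc/def/xyz are never visited. -->
        <!-- Entity names f123, page123, f456 and page456 are hypothetical. -->
        <entity name="f123" processor="FileListEntityProcessor" baseDir="/abc/def/123/" recursive="true" rootEntity="false">
            <field column="fileAbsolutePath" name="filename" />
            <field column="file" name="title" />
            <entity name="page123" dataSource="abc" processor="TikaEntityProcessor" url="${f123.fileAbsolutePath}" format="text" onError="skip">
                <field column="text" name="filedata" />
            </entity>
        </entity>
        <entity name="f456" processor="FileListEntityProcessor" baseDir="/abc/def/456/" recursive="true" rootEntity="false">
            <field column="fileAbsolutePath" name="filename" />
            <field column="file" name="title" />
            <entity name="page456" dataSource="abc" processor="TikaEntityProcessor" url="${f456.fileAbsolutePath}" format="text" onError="skip">
                <field column="text" name="filedata" />
            </entity>
        </entity>
    </document>
</dataConfig>
```

That duplicates the whole entity for every wanted directory, though, so it would get unwieldy quickly if the list grows; a way to exclude directories from a single entity would be much preferable.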

Any help would be appreciated.

Joe

--
Joe Fidanza
609 279 6211
Systems Administrator
Center for Communications Research
805 Bunn Drive
Princeton, NJ 08540
