I'm very new to Solr and haven't been able to find an answer to this issue.
Using Solr 5.1.0, we have a working FileListEntityProcessor setup but
would like to exclude several directories that live below the baseDir.
The data-config.xml file looks like this:
<dataConfig>
  <dataSource name="abc" type="BinFileDataSource" />
  <document>
    <entity name="f" processor="FileListEntityProcessor"
            baseDir="/abc/def/" recursive="true" rootEntity="false" >
      <field column="fileAbsolutePath" name="filename" />
      <field column="file" name="title" />
      <entity name="page" dataSource="abc"
              processor="TikaEntityProcessor" url="${f.fileAbsolutePath}"
              format="text" onError="skip" >
        <field column="text" name="filedata" />
      </entity>
    </entity>
  </document>
</dataConfig>
So we'd like to index /abc/def/123 and /abc/def/456 but not index
/abc/def/789 and /abc/def/xyz etc.
We can currently index all the files under /abc/def. That works fine but
I can't figure out how to exclude entire subdirectories that have the
same file types in them as the directories that we do want to index.
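One possible workaround, shown here only as an untested sketch, would be
to drop the single baseDir and declare one FileListEntityProcessor entity
per directory we want indexed. The entity names f123, f456, page123 and
page456 are made up for illustration; the remaining attributes are copied
from the config above. This gets unwieldy as the list of wanted
directories grows, so a real exclude option would still be preferable.

```xml
<dataConfig>
  <dataSource name="abc" type="BinFileDataSource" />
  <document>
    <!-- Hypothetical entity for the first directory we want indexed -->
    <entity name="f123" processor="FileListEntityProcessor"
            baseDir="/abc/def/123" recursive="true" rootEntity="false" >
      <field column="fileAbsolutePath" name="filename" />
      <field column="file" name="title" />
      <entity name="page123" dataSource="abc"
              processor="TikaEntityProcessor" url="${f123.fileAbsolutePath}"
              format="text" onError="skip" >
        <field column="text" name="filedata" />
      </entity>
    </entity>
    <!-- Hypothetical entity for the second directory we want indexed -->
    <entity name="f456" processor="FileListEntityProcessor"
            baseDir="/abc/def/456" recursive="true" rootEntity="false" >
      <field column="fileAbsolutePath" name="filename" />
      <field column="file" name="title" />
      <entity name="page456" dataSource="abc"
              processor="TikaEntityProcessor" url="${f456.fileAbsolutePath}"
              format="text" onError="skip" >
        <field column="text" name="filedata" />
      </entity>
    </entity>
    <!-- /abc/def/789 and /abc/def/xyz are never listed, so never walked -->
  </document>
</dataConfig>
```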
Any help would be appreciated.
Joe
--
Joe Fidanza
609 279 6211
Systems Administrator
Center for Communications Research
805 Bunn Drive
Princeton, NJ 08540