Some time ago I faced a roughly similar challenge. After many trials and tests I ended up writing my own programs to fetch files, select which ones are allowed to be indexed, and feed them into Solr (POST style). This work is open source and can be found at https://netlab1.net/, in the web page section titled "Presentations of long term utility", item "Solr/Lucene Search Service". It consists of a set of docs, three small PHP programs, and a bundle with a Solr schema etc., all within one downloadable zip file.

For filtering the found files, my solution uses a list of regular expressions, which are simple to state and to process. The docs discuss the rules. Luckily, the code that deals with the rules and does the filtering is very short and simple; see convertfilter() and filterbyname() in crawler.php. You may wish to consider them, or equivalents, for inclusion in your system, whatever that may be.
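
To give a rough idea of how little code this kind of name filtering needs, here is a minimal sketch in Python. It is not the PHP code from crawler.php and does not reproduce its rule syntax. The rule format (an ordered list of allow/deny patterns where the first match wins), the function names compile_rules(), is_indexable() and filter_paths(), and the example paths are all assumptions made up for illustration.

```python
import re

# Hypothetical rules, checked in order; the first matching pattern decides.
# "deny" excludes a path from indexing, "allow" lets it through.
RULES = [
    ("deny", r"^/private/"),           # skip everything under /private/
    ("deny", r"\.(tmp|bak)$"),         # skip temporary and backup files
    ("allow", r"\.(html?|pdf|txt)$"),  # index common document types
]


def compile_rules(rules):
    """Turn (action, pattern) pairs into (is_allowed, compiled_regex) pairs."""
    return [
        (action == "allow", re.compile(pattern, re.IGNORECASE))
        for action, pattern in rules
    ]


def is_indexable(path, compiled, default=False):
    """Return the verdict of the first rule whose regex matches the path."""
    for allowed, regex in compiled:
        if regex.search(path):
            return allowed
    return default  # no rule matched: fall back to the default verdict


def filter_paths(paths, compiled):
    """Keep only the paths that the rules allow to be indexed."""
    return [p for p in paths if is_indexable(p, compiled)]


if __name__ == "__main__":
    compiled = compile_rules(RULES)
    found = ["/docs/a.html", "/private/b.html", "/docs/c.bak", "/docs/e.exe"]
    for path in filter_paths(found, compiled):
        print(path)
```

Excluding a whole folder is then just one more "deny" pattern placed ahead of the "allow" rules; only the paths that survive the filter are handed on to whatever posts documents to Solr.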
    Thanks,
    Joe D.

On 27/08/2020 20:32, Alexandre Rafalovitch wrote:
If you are indexing from Drupal into Solr, that's a question for
Drupal's Solr module. If you are doing it some other way, which way
are you doing it? The bin/post command?

Most likely this is not a Solr question, but one for whatever you
have feeding data into Solr.

Regards,
   Alex.

On Thu, 27 Aug 2020 at 15:21, Staley, Phil R - DCF
<phil.sta...@wisconsin.gov> wrote:
Can you, and if so how do you, exclude a specific folder/directory from
indexing in Solr version 7.x or 8.x? Also, our CMS is Drupal 8.

Thanks,

Phil Staley
DCF Webmaster
608 422-6569
phil.sta...@wisconsin.gov


