First question: When indexing content in a directory, Solr's normal
behavior is to recursively index all the files found in that directory
and its subdirectories.  However, it turns out that when the files are of
the form *.eml (email), Solr won't do that.  I can use a wildcard to get
it to index the current directory, but it won't recurse.

I note this message that's displayed when I begin indexing: "Entering
auto mode. File endings considered are
xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log"

Is there a way to get it to recurse through files with other
extensions, such as .eml?  When I manually add all the subdirectory
content, Solr seems to parse it very well, recognizing all the
standard email metadata.  I just can't get it to do the indexing
recursively.

Second question: if I want to index files from many different source
directories, is there a way to specify these different sources in one
command? (Right now I have to issue a separate indexing command for each
directory - which means I have to sit around and wait till each is
finished.)

Third question: I have a very large directory structure that includes a
couple of subdirectories I'd like to exclude from indexing.  Is there a
way to index recursively, but exclude specified directories?
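
To make the three questions concrete, here is a minimal Python sketch of
the behavior I'm after, independent of Solr itself.  The function name
collect_files and the directory names in the usage note are hypothetical,
made up for illustration; the sketch only builds a file list and does not
call any Solr command.

```python
import os


def collect_files(roots, extensions=(".eml",), exclude_dirs=()):
    """Return paths of files under each root whose name ends with one of
    the given extensions, skipping any directory named in exclude_dirs.

    Hypothetical helper for illustration only; not part of Solr.
    """
    excluded = set(exclude_dirs)
    wanted = tuple(ext.lower() for ext in extensions)
    matches = []
    for root in roots:
        for dirpath, dirnames, filenames in os.walk(root):
            # Prune excluded directory names in place so os.walk
            # does not descend into them.
            dirnames[:] = sorted(d for d in dirnames if d not in excluded)
            for name in sorted(filenames):
                # Case-insensitive match on the file extension.
                if name.lower().endswith(wanted):
                    matches.append(os.path.join(dirpath, name))
    return matches
```

For example, collect_files(["mail/2019", "mail/2020"],
exclude_dirs=["spam"]) would gather every .eml file under both trees
except those inside directories named "spam".  What I'd like is for the
indexing command to do the equivalent in one run.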
