Here is what I want to do: for a given folder and all its subfolders on my
physical drive, mirror its contents, including the contents of archives,
parsing XML, JSON, HTML, text, etc. with their respective parsers, skipping
invalid files, and adding all other files as raw. I want archive files
(*.zip, *.docx) to be added as raw, but I also want the text inside archive
files like docx (MS Word) to be indexed, along with any files inside the
archives that match a filter.
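
To make the traversal concrete, here is a minimal Python sketch of the walk I
have in mind. It is only an illustration, not database code: the name
walk_with_archives and the "!/" separator for entries inside an archive are
made up for this example, and it assumes that a docx file can be opened as an
ordinary zip container.

```python
import os
import zipfile

# Assumption: docx files are zip containers and can be read with zipfile.
ARCHIVE_SUFFIXES = (".zip", ".docx")

def walk_with_archives(root):
    """Yield (logical_path, raw_bytes) for every file under root,
    plus one pair for every entry found inside an archive."""
    for folder, _subfolders, names in os.walk(root):
        for name in sorted(names):
            path = os.path.join(folder, name)
            with open(path, "rb") as handle:
                # The file itself (archives included) is always yielded as raw bytes.
                yield path, handle.read()
            if name.lower().endswith(ARCHIVE_SUFFIXES) and zipfile.is_zipfile(path):
                with zipfile.ZipFile(path) as archive:
                    for entry in archive.namelist():
                        if entry.endswith("/"):
                            continue  # directory entry, nothing to read
                        # Hypothetical naming scheme: archive path + "!/" + entry name.
                        yield path + "!/" + entry, archive.read(entry)
```

Each yielded pair could then be handed to whichever parser matches its name,
or stored as raw when none does.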

Note: It would be nice if there were a single db:add method that allowed me
to specify a map of filters to parsers with options, where all files that
do not match a filter (or are invalid) would optionally be added as raw.
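
As a rough illustration of the behaviour I am asking for (again a Python
sketch, not the db:add function itself), the filter-to-parser map with a raw
fallback could look like the following. The names PARSERS, classify and
add_raw are hypothetical, and the three parsers are stand-ins for whatever
the database would use internally.

```python
import fnmatch
import json
import xml.etree.ElementTree as ET

def parse_xml(data):
    return ET.fromstring(data)

def parse_json(data):
    return json.loads(data)

def parse_text(data):
    return data.decode("utf-8")

# Hypothetical filter map: glob pattern -> parser function.
PARSERS = {"*.xml": parse_xml, "*.json": parse_json, "*.txt": parse_text}

def classify(name, data, parsers=PARSERS, add_raw=True):
    """Return ("parsed", value) if a filter matches and the parser succeeds;
    otherwise ("raw", data), or ("skipped", None) when add_raw is False."""
    for pattern, parser in parsers.items():
        if fnmatch.fnmatchcase(name.lower(), pattern):
            try:
                return "parsed", parser(data)
            except (ValueError, ET.ParseError):
                break  # invalid input: fall through to the raw/skip decision
    return ("raw", data) if add_raw else ("skipped", None)
```

With this sketch, classify("a.json", b'{bad') falls back to raw, while the
same call with add_raw=False skips the file instead.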
