Re: Parsing and indexing parts of the input file paths

Andrew Musselman Tue, 21 Jul 2015 18:00:06 -0700

Erik, thanks. Let's assume the prefix, "/user/andrew/", will be known and
can be put into config.  Would this be config-only, or would it require some
code?  If I need to write code, could you point me to some classes to start
with, and to some up-to-date docs?

Same for the update processor: is there an example I could read?

On Tue, Jul 21, 2015 at 11:19 AM, Erik Hatcher <erik.hatc...@gmail.com>
wrote:

> If this is only for search, then an analysis chain could be crafted,
> likely with the pattern regex filter in the mix, to pull out pieces of the
> path.  How will you know the prefix of the file though?
>
> There’s also the ability to do this sort of thing in an update processor,
> most easily using the script update processor, using a bit of JavaScript to
> pull out the piece(s) you want to index (and even store at this point).
>
> —
> Erik Hatcher, Senior Solutions Architect
> http://www.lucidworks.com
>
>
>
>
> On Jul 21, 2015, at 1:31 PM, Andrew Musselman <andrew.mussel...@gmail.com>
> wrote:
>
> Dear user and dev lists,
>
> We are loading files from a directory and would like to index a portion of
> each file path as a field as well as the text inside the file.
>
> E.g., on HDFS we have this file path:
>
> /user/andrew/1234/1234/file.pdf
>
> And we would like the "1234" token parsed from the file path and indexed
> as an additional field that can be searched on.
>
> From my initial searches I can't see how to do this easily, so would I
> need to write some custom code, or a plugin?
>
> Thanks!
>
>
>
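
For reference, the script update processor approach Erik describes could look
roughly like the sketch below. This is a minimal, untested-against-Solr sketch
under stated assumptions, not a definitive implementation: it assumes the file
path arrives in the "id" field, that the token should land in a field named
"path_token_s", and that the prefix is hard-coded rather than read from
config. Those names are hypothetical placeholders. The processAdd(cmd) hook,
cmd.solrDoc, getFieldValue and setField follow the conventions of Solr's
example update script.

```javascript
// Hypothetical settings for this sketch; adjust to the real schema/config.
var PATH_PREFIX = "/user/andrew/";   // known prefix from the thread
var SOURCE_FIELD = "id";             // assumption: field holding the file path
var TARGET_FIELD = "path_token_s";   // assumption: field to receive the token

// Return the first path segment after the prefix, or null when the path
// does not start with the prefix or no segment follows it.
function extractPathToken(path, prefix) {
  if (typeof path !== "string" || path.indexOf(prefix) !== 0) {
    return null;
  }
  var rest = path.substring(prefix.length);
  var token = rest.split("/")[0];
  return token.length > 0 ? token : null;
}

// Called by the script update processor for each document being added.
function processAdd(cmd) {
  var doc = cmd.solrDoc;
  // String(...) converts whatever the document returns into a JS string.
  var path = String(doc.getFieldValue(SOURCE_FIELD));
  var token = extractPathToken(path, PATH_PREFIX);
  if (token !== null) {
    doc.setField(TARGET_FIELD, token);
  }
}

// No-op stubs for the other hooks that the example update script defines.
function processDelete(cmd) {}
function processMergeIndexes(cmd) {}
function processCommit(cmd) {}
function processRollback(cmd) {}
function finish() {}
```

With the path from the original question, /user/andrew/1234/1234/file.pdf,
extractPathToken returns "1234", and processAdd copies that value into the
target field. Paths that do not start with the prefix are left unchanged.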
