Erik, thanks. Let's assume the prefix "/user/andrew/" will be known and can be put into config. Would this be config-only, or would it require some code? If I need to write code, could you point me to some classes to start with, and to some up-to-date docs?
Same for the update processor, is there an example I could read?

On Tue, Jul 21, 2015 at 11:19 AM, Erik Hatcher <erik.hatc...@gmail.com> wrote:

> If this is only for search, then an analysis chain could be crafted,
> likely with the pattern regex filter in the mix, to pull out pieces of the
> path. How will you know the prefix of the file though?
>
> There’s also the ability to do this sort of thing in an update processor,
> most easily using the script update processor, using a bit of JavaScript to
> pull out the piece(s) you want to index (and even store at this point).
>
> —
> Erik Hatcher, Senior Solutions Architect
> http://www.lucidworks.com
>
> > On Jul 21, 2015, at 1:31 PM, Andrew Musselman <andrew.mussel...@gmail.com>
> > wrote:
> >
> > Dear user and dev lists,
> >
> > We are loading files from a directory and would like to index a portion of
> > each file path as a field as well as the text inside the file.
> >
> > E.g., on HDFS we have this file path:
> >
> > /user/andrew/1234/1234/file.pdf
> >
> > And we would like the "1234" token parsed from the file path and indexed
> > as an additional field that can be searched on.
> >
> > From my initial searches I can't see how to do this easily, so would I
> > need to write some custom code, or a plugin?
> >
> > Thanks!
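
For anyone following the thread, here is a rough sketch of what the script update processor route might look like. It is not a tested Solr setup. It assumes the file path arrives in the "id" field, that the token should go into a field called "path_token_s", and that the prefix is hard-coded as PREFIX; all three are placeholders chosen for illustration. The helper extractPathToken is likewise hypothetical. The processAdd(cmd) hook and cmd.solrDoc follow the sample update script that ships with Solr, and the script would be referenced from an update processor chain in solrconfig.xml.

```javascript
// Assumption: the known prefix; in practice this could come from config.
var PREFIX = "/user/andrew/";

// Hypothetical helper: return the first path segment after the prefix,
// or null if the path does not start with the prefix or has no segment.
function extractPathToken(path, prefix) {
  if (typeof path !== "string" || path.indexOf(prefix) !== 0) {
    return null;
  }
  var rest = path.substring(prefix.length);
  var segment = rest.split("/")[0];
  return segment.length > 0 ? segment : null;
}

// Hook invoked by the script update processor for each added document.
function processAdd(cmd) {
  var doc = cmd.solrDoc; // org.apache.solr.common.SolrInputDocument
  // Assumption: the file path is stored in the "id" field.
  var path = doc.getFieldValue("id");
  var token = extractPathToken(path == null ? null : String(path), PREFIX);
  if (token !== null) {
    // Assumption: "path_token_s" is a placeholder target field name.
    doc.setField("path_token_s", token);
  }
}

// The sample script defines the remaining hooks too; empty stubs here.
function processDelete(cmd) {}
function processMergeIndexes(cmd) {}
function processCommit(cmd) {}
function processRollback(cmd) {}
function finish() {}
```

With the path from the original question, extractPathToken("/user/andrew/1234/1234/file.pdf", PREFIX) picks out "1234". A path that does not start with the prefix is left untouched.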