Which can only happen if I post it to a web service, and won't happen if I do it through config?
On Tue, Jul 21, 2015 at 2:19 PM, Upayavira <u...@odoko.co.uk> wrote:

> yes, unless it has been added consciously as a separate field.
>
> On Tue, Jul 21, 2015, at 09:40 PM, Andrew Musselman wrote:
> > Thanks, so by the time we would get to an Analyzer the file path is
> > forgotten?
> >
> > https://cwiki.apache.org/confluence/display/solr/Analyzers
> >
> > On Tue, Jul 21, 2015 at 1:27 PM, Upayavira <u...@odoko.co.uk> wrote:
> >
> > > Solr generally does not interact with the file system in that way (with
> > > the exception of the DIH).
> > >
> > > It is the job of the code that pushes a file to Solr to process the
> > > filename and send that along with the request.
> > >
> > > See here for more info:
> > >
> > > https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Solr+Cell+using+Apache+Tika
> > >
> > > You could provide literal.filename=blah/blah
> > >
> > > Upayavira
> > >
> > >
> > > On Tue, Jul 21, 2015, at 07:37 PM, Andrew Musselman wrote:
> > > > I'm not sure, it's a remote team but will get more info. For now,
> > > > assuming that a certain directory is specified, like "/user/andrew/",
> > > > and a regex is applied to capture anything two directories below
> > > > matching "*/*/*.pdf".
> > > >
> > > > Would there be a way to capture the wild-carded values and index them
> > > > as fields?
> > > >
> > > > On Tue, Jul 21, 2015 at 11:20 AM, Upayavira <u...@odoko.co.uk> wrote:
> > > >
> > > > > Keeping to the user list (the right place for this question).
> > > > >
> > > > > More information is needed here - how are you getting these
> > > > > documents into Solr? Are you posting them to /update/extract? Or
> > > > > using DIH, or?
> > > > >
> > > > > Upayavira
> > > > >
> > > > > On Tue, Jul 21, 2015, at 06:31 PM, Andrew Musselman wrote:
> > > > > > Dear user and dev lists,
> > > > > >
> > > > > > We are loading files from a directory and would like to index a
> > > > > > portion of each file path as a field as well as the text inside
> > > > > > the file.
> > > > > >
> > > > > > E.g., on HDFS we have this file path:
> > > > > >
> > > > > > /user/andrew/1234/1234/file.pdf
> > > > > >
> > > > > > And we would like the "1234" token parsed from the file path and
> > > > > > indexed as an additional field that can be searched on.
> > > > > >
> > > > > > From my initial searches I can't see how to do this easily, so
> > > > > > would I need to write some custom code, or a plugin?
> > > > > >
> > > > > > Thanks!
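
For reference, here is a minimal sketch of the client-side approach suggested in the thread: the code that pushes the file to Solr parses the path itself and sends the captured pieces as literal.* parameters on the /update/extract request. The regex, the field names dir1 and dir2, the "id" unique key, the base URL, and the helper names are all assumptions made for illustration; none of them come from the thread, and the fields would also have to exist in the schema. The sketch only builds the request URL and does not contact Solr.

```python
import re
from urllib.parse import urlencode

# Hypothetical pattern for paths like /user/andrew/1234/1234/file.pdf:
# captures the two directory levels below the base directory.
PATH_PATTERN = re.compile(r"^/user/andrew/([^/]+)/([^/]+)/[^/]+\.pdf$")


def extract_params(file_path, doc_id):
    """Build Solr Cell request parameters carrying the path tokens."""
    match = PATH_PATTERN.match(file_path)
    if match is None:
        raise ValueError("path does not match expected layout: " + file_path)
    first_dir, second_dir = match.groups()
    return {
        # Assumes the schema's unique key field is named "id".
        "literal.id": doc_id,
        # literal.filename is the parameter mentioned in the thread.
        "literal.filename": file_path,
        # dir1 and dir2 are made-up field names for this example.
        "literal.dir1": first_dir,
        "literal.dir2": second_dir,
    }


def build_extract_url(solr_base, file_path, doc_id):
    """Return the /update/extract URL with the literal.* values encoded."""
    query = urlencode(extract_params(file_path, doc_id))
    return solr_base.rstrip("/") + "/update/extract?" + query


if __name__ == "__main__":
    # The base URL and core name are placeholders.
    print(build_extract_url("http://localhost:8983/solr/mycore",
                            "/user/andrew/1234/1234/file.pdf", "doc1"))
```

The file contents would then be sent as the body of an HTTP POST to that URL. Because the path tokens travel as explicit request parameters, they arrive in Solr as ordinary fields and do not depend on Solr knowing anything about the file system.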