Thanks, Raymond. Since I had been indexing the other delimited files directly with Solr from the terminal (without a client), I thought it would be possible to index the file names of the JSON files the same way. But as you say, I'm already parsing the search results in Python, so I might as well build the index through Python too. I might have to explore something like pysolr.
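A rough sketch of what building the JSON index through Python might look like, in case it helps anyone searching the archives later. It assumes each JSON file holds one flat object (the URL, author and title tags) and that pysolr is installed. The field name `filename`, the helper names `build_docs` and `index_directory`, and the batch size are all made up for illustration; nothing here has been run against my actual data.

```python
import json
from pathlib import Path


def build_docs(json_dir):
    """Yield one Solr document per JSON file, with the file name added."""
    for path in sorted(Path(json_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            doc = json.load(fh)  # assumed: one flat JSON object per file
        # "filename" is a hypothetical field name. The stem drops ".json" so
        # the value can be matched against the name of the paired txt file.
        doc["filename"] = path.stem
        yield doc


def index_directory(json_dir, solr_url, batch_size=1000):
    """Send the documents to Solr in batches through pysolr."""
    import pysolr  # third-party package: pip install pysolr

    solr = pysolr.Solr(solr_url, timeout=30)
    batch = []
    for doc in build_docs(json_dir):
        batch.append(doc)
        if len(batch) >= batch_size:
            solr.add(batch)
            batch = []
    if batch:
        solr.add(batch)  # send whatever is left over
    solr.commit()
```

It would be called with something like `index_directory("json_dir", "http://localhost:8983/solr/metadata")`, where both the directory and the core name are placeholders. If the txt index stores the file name with its extension, the two values would need to be normalised the same way before joining them downstream.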
Thanks again!

On 21 May 2018 at 02:49, Raymond Xie <xie3208...@gmail.com> wrote:

> Would you consider including the filename as another metadata field to be
> indexed? I think your downstream Python can do that easily.
>
> *------------------------------------------------*
> *Sincerely yours,*
>
> *Raymond*
>
> On Fri, May 18, 2018 at 3:47 PM, S.Ashwath <ashwat...@gmail.com> wrote:
>
> > Hello,
> >
> > I have 2 directories: 1 with txt files and the other with corresponding
> > JSON (metadata) files (around 90000 of each). There is one JSON file for
> > each CSV file, and they share the same name (they don't share any other
> > fields).
> >
> > The txt files just have plain text. I mapped each line to a field called
> > 'sentence' and included the file name as a field using the data import
> > handler. No problems here.
> >
> > Each JSON file has metadata with 3 tags: a URL, author and title (for the
> > content in the corresponding txt file).
> > When I index the JSON file (I just used the _default schema, and posted
> > the fields to the schema, as explained in the official Solr tutorial), *I
> > don't know how to get the file name into the index as a field.* As far as
> > I know, there's no way to use the data import handler for JSON files. I've
> > read that I can pass a literal through the bin/post tool, but again, as
> > far as I understand, I can't pass in the file name dynamically as a
> > literal.
> >
> > I NEED to get the file name; it is the only way in which I can associate
> > the metadata with each sentence in the txt files in my downstream Python
> > code.
> >
> > So if anybody has a suggestion about how I should index the JSON file
> > name along with the JSON content (or even some workaround), I'd be
> > eternally grateful.
> >
> > Regards,
> >
> > Ash