I'm using the solrconfig.xml from the distribution, ./server/solr/configsets/_default/conf/solrconfig.xml
But this problem extends to the index as well; using the initial example, if I search for
<str name="parsedquery">metadata_txt:ab00001</str>
(instead of ab00001.tif), my result set includes ab00001.tif, ab00001.jpg, ab00001.png, etc., so the tokens in the index are split on dot as well, not just the query.

I'm doing something wrong, or I'm misunderstanding something!!
~~Bill

On Tue, May 2, 2023 at 1:02 PM Mikhail Khludnev wrote:

> Analyzer is configured in schema.xml. But literally, splitting on dot is
> what I expect from StandardTokenizer.
>
> On Tue, May 2, 2023 at 8:48 PM Bill Tantzen wrote:
>
> > Mikhail,
> > Thanks for the quick reply. Here is the parser info:
> >
> > <str name="QParser">LuceneQParser</str>
> >
> > ~~Bill
> >
> > On Tue, May 2, 2023 at 12:43 PM Mikhail Khludnev wrote:
> >
> > > Hello Bill,
> > > Which analyzer is configured for metadata_txt? Perhaps you need to
> > > tune it accordingly.
> > >
> > > On Tue, May 2, 2023 at 7:40 PM Bill Tantzen wrote:
> > >
> > > > In my solr 9.2 schema, I am leveraging the dynamicField
> > > >
> > > > <dynamicField name="*_txt" type="text_general" indexed="true" stored="true"/>
> > > >
> > > > which tokenizes with solr.StandardTokenizerFactory for index and query.
> > > >
> > > > However, when I query with, for example,
> > > > <str name="q">metadata_txt:XYZ.tif</str>
> > > >
> > > > I see many more hits than I expect. When I add debug=true to the
> > > > query, I see:
> > > > <str name="rawquerystring">metadata_txt:XYZ.tif</str>
> > > > <str name="querystring">metadata_txt:XYZ.tif</str>
> > > > <str name="parsedquery">metadata_txt:XYZ metadata_txt:tif</str>
> > > >
> > > > But I expect that dots not followed by whitespace will be kept as
> > > > part of the token, that is, the parsed query should remain
> > > > "metadata_txt:XYZ.tif" but solr appears to be splitting into two tokens.
> > > >
> > > > Can somebody point out what I am misunderstanding?
> > > > Thanks,
> > > > ~~Bill
> > >
> > > --
> > > Sincerely yours
> > > Mikhail Khludnev
> > > https://t.me/MUST_SEARCH
> > > A caveat: Cyrillic!
> >
> > --
> > Human wheels spin round and round
> > While the clock keeps the pace... -- John Mellencamp
> > ________________________________________________________________
> > Bill Tantzen University of Minnesota Libraries
> > 612-626-9949 (U of M) 612-325-1777 (cell)
>
> --
> Sincerely yours
> Mikhail Khludnev
> https://t.me/MUST_SEARCH
> A caveat: Cyrillic!

--
Human wheels spin round and round
While the clock keeps the pace... -- John Mellencamp
________________________________________________________________
Bill Tantzen University of Minnesota Libraries
612-626-9949 (U of M) 612-325-1777 (cell)
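
Since the analyzer lives in the schema rather than in solrconfig.xml, one possible way to keep filenames such as ab00001.tif as single tokens is a separate field type that splits on whitespace only. This is a minimal sketch under assumptions, not something confirmed in this thread: the names text_filename and *_fn are hypothetical, while solr.WhitespaceTokenizerFactory and solr.LowerCaseFilterFactory are standard Solr factories.

```xml
<!-- Hypothetical field type: tokenizes on whitespace only, so "ab00001.tif"
     is kept as one token instead of being split at the dot. -->
<fieldType name="text_filename" class="solr.TextField" positionIncrementGap="100">
  <!-- A single <analyzer> element applies to both index and query time. -->
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical dynamic field that uses the field type above. -->
<dynamicField name="*_fn" type="text_filename" indexed="true" stored="true"/>
```

Existing documents would have to be reindexed before a change like this affects the tokens already in the index. Whitespace-only tokenization also leaves adjacent punctuation attached to a token (for example a trailing comma), so it is probably worth checking the output on the Analysis screen of the Solr Admin UI before relying on it.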
