[ 
https://issues.apache.org/jira/browse/SOLR-16139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646316#comment-17646316
 ] 

Daniel Lowe edited comment on SOLR-16139 at 12/12/22 8:09 PM:
--------------------------------------------------------------

We also ran into this issue when migrating from Solr 8.4 to Solr 9.x. If a 
JSON facet has a sub-facet and the main facet is on a {{SortableTextField}}, 
the sub-facet is incomplete unless the original and analyzed values of the 
main buckets happen to be the same, and it is ignored entirely if none of the 
original and analyzed bucket names match.
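
As a rough illustration of that failure mode, here is a toy model in plain Java with no Solr classes. The class and method names are hypothetical, and lowercasing merely stands in for whatever analysis the field actually applies. It shows how a lookup using the displayed bucket name as a raw term can miss documents that an analyzed lookup finds:

```java
import java.util.List;
import java.util.Locale;

public class BucketMismatchSketch {
    // Stand-in for the field's analysis chain (assumption: a keyword
    // tokenizer followed by lowercasing). Not Solr's actual code.
    static String analyze(String original) {
        return original.toLowerCase(Locale.ROOT);
    }

    // Counts documents whose indexed (analyzed) term equals the given term
    // exactly, the way a raw term lookup would: the input is not analyzed.
    static long rawTermCount(List<String> originals, String term) {
        return originals.stream()
                .map(BucketMismatchSketch::analyze)
                .filter(term::equals)
                .count();
    }

    // Same lookup, but the bucket value is analyzed before matching.
    static long analyzedCount(List<String> originals, String bucketValue) {
        return rawTermCount(originals, analyze(bucketValue));
    }

    public static void main(String[] args) {
        List<String> docs = List.of("Red Apple", "red apple", "pear");
        // Bucket name as displayed (original value) used as a raw term: no match.
        System.out.println(rawTermCount(docs, "Red Apple"));   // prints 0
        // Analyzing the bucket name first finds both documents.
        System.out.println(analyzedCount(docs, "Red Apple"));  // prints 2
        // When original and analyzed values coincide, the raw lookup works.
        System.out.println(rawTermCount(docs, "pear"));        // prints 1
    }
}
```

In this model only the "pear" bucket would get a correct sub-facet domain, which mirrors the "incomplete, or entirely ignored" symptom described above.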

[~magibney] your suggested fix, which restores the original behaviour, 
empirically fixed the issue for us. In our use case we are in fact using a 
{{KeywordTokenizer}}, which I think avoids the trappy behaviour described in 
SOLR-13056. As far as I'm aware, {{SortableTextField}} remains the only way to 
have a field that acts like a string field for faceting/sorting/streaming but 
still allows, for example, case insensitivity when searching against the 
field.

I'd personally prefer it if this fix became part of Solr. If I'm 
understanding correctly, though, the underlying issue is that the faceting 
code now requests {{getFieldTermQuery}} rather than {{getFieldQuery}}. If 
bucket names are allowed to differ from the indexed value, 
{{getFieldTermQuery}} doesn't seem to be the right method.

Getting a bit pedantic: {{getFieldTermQuery}} claims that no analysis should 
be performed, but the default implementation just calls {{getFieldQuery}}, 
which doesn't make this guarantee.
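
To make that last point concrete, here is a minimal sketch using simplified stand-in classes. These are not Solr's real classes or signatures (the real methods take more parameters and return query objects); the nested types and the string "query" representation are assumptions for illustration only. It shows how a "no analysis" contract can be broken silently when the default simply delegates:

```java
import java.util.Locale;

public class TermQueryContractSketch {
    // Simplified stand-in for a field type. Queries are modeled as strings.
    static class BaseFieldType {
        // May or may not analyze the external value, depending on the subclass.
        String getFieldQuery(String field, String externalVal) {
            return field + ":" + externalVal;
        }

        // Documented intent: no analysis of externalVal.
        // Default behaviour: delegate to getFieldQuery.
        String getFieldTermQuery(String field, String externalVal) {
            return getFieldQuery(field, externalVal);
        }
    }

    // Hypothetical text-like type that analyzes (here: lowercases) in
    // getFieldQuery but does not override getFieldTermQuery.
    static class AnalyzingFieldType extends BaseFieldType {
        @Override
        String getFieldQuery(String field, String externalVal) {
            return field + ":" + externalVal.toLowerCase(Locale.ROOT);
        }
    }

    public static void main(String[] args) {
        BaseFieldType ft = new AnalyzingFieldType();
        // The "term" query ends up analyzed because of the delegation.
        System.out.println(ft.getFieldTermQuery("name_s", "Red Apple"));
        // prints name_s:red apple
    }
}
```

Whether a given field type is affected depends on which of the two methods it overrides, so this only illustrates the gap in the contract, not the behaviour of any particular Solr field type.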


> [Regression] JSON stat facet functions not working on analysed String 
> (SortableTextField)
> -----------------------------------------------------------------------------------------
>
>                 Key: SOLR-16139
>                 URL: https://issues.apache.org/jira/browse/SOLR-16139
>             Project: Solr
>          Issue Type: Bug
>          Components: faceting
>    Affects Versions: 8.11.1
>            Reporter: Jan Verbeeck
>            Priority: Critical
>              Labels: faceting, json
>
> After updating Solr on my dev environment to version 8.11.1 I noticed nested 
> json stat facets were not working anymore. After downgrading to 8.11.0 they 
> worked as before.
> It seems the problem is related to using an analysed string fieldtype like 
> 'solr.SortableTextField' and then using a subfacet with a stat facet function 
> like 'avg'.
>  
> I have managed to reproduce this behaviour using docker. You can find the 
> repository here: 
> [solr-json-repro|https://github.com/verbeeckjan/solr-json-repro]
>  
> The only change I made to the default managed-schema is to change the type of 
> the dynamic field '*_s' to 'text_gen_sort'.



