Many thanks.
We are going to use your recommendation to change our Solr.
Kind regards,
Zhiqing
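
For reference, the recommendation quoted below (a copyField that directs the input to one field for sorting and another for faceting) could be sketched roughly as follows. This is only an illustration under assumptions, not a tested configuration: the field names `name`, `name_sort` and `name_facet_txt`, the type name `collated_sort`, and the `locale`/`strength` settings are all made up for the example, and `solr.ICUCollationField` is assumed to be available (it ships in the analysis-extras module). The script only builds and prints the JSON bodies for the Schema API and the JSON Request API; it does not contact Solr.

```python
import json

# Hypothetical field names, chosen only for this sketch.
SOURCE_FIELD = "name"            # raw input as sent by the client
SORT_FIELD = "name_sort"         # normalized sort key for the full value
FACET_FIELD = "name_facet_txt"   # tokenized values used for faceting


def schema_commands():
    """Build a Schema API command body that fans one input out to two fields."""
    return {
        # Assumed collation settings; adjust locale/strength as needed.
        "add-field-type": {
            "name": "collated_sort",  # hypothetical type name
            "class": "solr.ICUCollationField",
            "locale": "",
            "strength": "primary",
        },
        "add-field": [
            {"name": SOURCE_FIELD, "type": "string", "stored": True},
            {"name": SORT_FIELD, "type": "collated_sort", "stored": False},
            # text_general is the TextField type from the _default configset.
            {"name": FACET_FIELD, "type": "text_general", "stored": False},
        ],
        "add-copy-field": {
            "source": SOURCE_FIELD,
            "dest": [SORT_FIELD, FACET_FIELD],
        },
    }


def nested_facet_request():
    """Build the nested facet query from the thread, pointed at the TextField."""
    return {
        "query": "*:*",
        "sort": f"{SORT_FIELD} asc",
        "facet": {
            "categories": {
                "type": "terms",
                "field": FACET_FIELD,  # faceting here relies on uninversion
                "limit": -1,
                "facet": {
                    "sex_s": {"type": "terms", "field": "sex_s", "limit": -1}
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(schema_commands(), indent=2))
    print(json.dumps(nested_facet_request(), indent=2))
```

The first body would be posted to the collection's `/schema` endpoint and the second to `/query`. As noted in the quoted reply, faceting over the TextField copy has no docValues to use and therefore pays the heap cost of uninversion.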

On Wed, 27 Apr 2022 at 23:27, Michael Gibney <[email protected]> wrote:

> Do you want faceting to be based on tokenized values, or the original input
> as a monolithic string? In any case, the patch attached to SOLR-13056 is
> unlikely to help. The patch associated with SOLR-8362 might help (is
> _designed_ to help with this kind of situation, in fact!). But that's a
> monumental patch and I wouldn't recommend using it provisionally.
>
> The good news is that SortableTextField is kind of a convenience, so
> depending on whether you want to facet on the original string or the
> post-tokenization values, you can probably achieve the outcome you want by
> leveraging creative copyFields, etc...
>
> One thing that I've always wondered about is the utility of
> SortableTextField given that IIUC the sort values are not normalized
> (casefolding, etc.). You configure one-and-only-one index-time analyzer,
> which (again, IIUC) is used for tokenization. But the sort value (which one
> might ordinarily normalize with KeywordTokenizer or something?) is based on
> the pre-analysis raw input.
>
> I'm making a bunch of assumptions here, but my recommendation if you want
> normalized sort on full value _and_ faceting on post-analysis token values:
> use a copyField to direct input to two separate fields -- one for sorting
> (maybe ICUCollationField?) and one for faceting (TextField). The faceting
> would require uninversion (no docValues for faceting over TextField). Some
> interesting general discussion about post-tokenization faceting use cases
> (mostly advising against) can be found here [1].
>
> [1] https://issues.apache.org/jira/browse/LUCENE-10023
>
> Michael
>
> On Wed, Apr 27, 2022 at 5:01 PM WU, Zhiqing <[email protected]> wrote:
>
> > Hi Michael,
> > Thanks for your responsible recommendation.
> > Yes, we could use TextField in our application but still hope to use
> > SortableTextField due to its Sorting functions
> > I have read your previous comments (Mar, 2019) in
> > https://issues.apache.org/jira/browse/SOLR-13056
> > Could your previous patch solve or partially solve the problem?
> > Kind regards,
> > Zhiqing
> >
> > On Tue, 26 Apr 2022 at 01:03, Michael Gibney <[email protected]>
> > wrote:
> >
> > > I was hoping that would "just work"; since it didn't, I dug a little
> > > more and I'm afraid that explicitly setting `method:uif` has no effect
> > > -- if docValues are there, they will be used:
> > >
> > > https://github.com/apache/solr/blob/c99af207c761ec34812ef1cc3054eb2804b7448b/solr/core/src/java/org/apache/solr/search/facet/FacetField.java#L161-L167
> > >
> > > Pending SOLR-8362 (or some other more narrow solution?), I think the
> > > only responsible recommendation is: don't use SortableTextField for
> > > faceting. Would it work to use TextField instead? TextField has to be
> > > uninverted, but at least it meets the requirement of indexed values
> > > being compatible with values over which bulk facet collection takes
> > > place.
> > >
> > > On Mon, Apr 25, 2022 at 3:52 PM WU, Zhiqing <[email protected]> wrote:
> > >
> > > > Hi Michael,
> > > > Thanks for your quick reply and related information.
> > > > I added "method":"uif" at 3 different places but it does not address
> > > > my problem -
> > > >
> > > > 1.
> > > > {
> > > >   "query": "*:*",
> > > >   "method":"uif",
> > > >   "facet": {
> > > >     "categories": {
> > > >       "type": "terms",
> > > >       "field": "name_txt_sort",
> > > >       "limit": -1,
> > > >       "facet": {
> > > >         "sex_s": {
> > > >           "type": "terms",
> > > >           "field": "sex_s",
> > > >           "limit": -1
> > > >         }
> > > >       }
> > > >     }
> > > >   }
> > > > }
> > > >
> > > > Response:
> > > > "error":{
> > > >   "metadata":[
> > > >     "error-class"...]}
> > > >
> > > > 2.
> > > > {
> > > >   "query": "*:*",
> > > >   "facet": {
> > > >     "method":"uif",
> > > >     "categories": {
> > > >       "type": "terms",
> > > >       "field": "name_txt_sort",
> > > >       "limit": -1,
> > > >       "facet": {
> > > >         "sex_s": {
> > > >           "type": "terms",
> > > >           "field": "sex_s",
> > > >           "limit": -1
> > > >         }
> > > >       }
> > > >     }
> > > >   }
> > > > }
> > > >
> > > > Response:
> > > > "error":{
> > > >   "metadata":[
> > > >     "error-class", ...
> > > >
> > > > 3.
> > > > {
> > > >   "query": "*:*",
> > > >   "facet": {
> > > >     "categories": {
> > > >       "method":"uif",
> > > >       "type": "terms",
> > > >       "field": "name_txt_sort",
> > > >       "limit": -1,
> > > >       "facet": {
> > > >         "sex_s": {
> > > >           "type": "terms",
> > > >           "field": "sex_s",
> > > >           "limit": -1
> > > >         }
> > > >       }
> > > >     }
> > > >   }
> > > > }
> > > >
> > > > Response:
> > > > "facets":{
> > > >   "count":3,
> > > >   "categories":{
> > > >     "buckets":[{
> > > >         "val":"Amelia Harris",
> > > >         "count":1},
> > > >       {
> > > >         "val":"George Smith",
> > > >         "count":1},
> > > >       {
> > > >         "val":"Olivia Wilson",
> > > >         "count":1}]}}}
> > > >
> > > > Should I try "method":"uif" at another place?
> > > > Kind regards,
> > > > Zhiqing
> > > >
> > > > On Mon, 25 Apr 2022 at 17:47, Michael Gibney <[email protected]>
> > > > wrote:
> > > >
> > > > > This is related to https://issues.apache.org/jira/browse/SOLR-13056
> > > > >
> > > > > I'm curious: if you set `method:uif` on the top-level facet, are
> > > > > you able to achieve the desired results? (Note that `method:uif`
> > > > > incurs the same heap memory overhead -- uninverting the indexed
> > > > > values -- as faceting over a regular TextField). Doing this (if it
> > > > > works as I think it might) could address the core problem with
> > > > > faceting on SortableTextField: that DocValues for SortableTextField
> > > > > are appropriate for _sorting_, but are different from the _indexed_
> > > > > values that would be used for refinement and nested domain
> > > > > filtering.
> > > > >
> > > > > See also https://issues.apache.org/jira/browse/SOLR-8362
> > > > >
> > > > > On Mon, Apr 25, 2022 at 11:59 AM WU, Zhiqing <[email protected]> wrote:
> > > > >
> > > > > > Hello,
> > > > > > I do not know why Nested Facets (
> > > > > > https://solr.apache.org/guide/8_11/json-facet-api.html#nested-facets
> > > > > > ) does not work for _txt_sort field (SortableTextField).
> > > > > >
> > > > > > To reproduce the problem,
> > > > > > I created a new collection (Config set: _default) and add the
> > > > > > following to the collection
> > > > > > {
> > > > > >   "name_txt_sort": ["Amelia Harris"],
> > > > > >   "name_txt": ["Amelia Harris"],
> > > > > >   "sex_s": "female"
> > > > > > },
> > > > > > {
> > > > > >   "name_txt_sort": ["Olivia Wilson"],
> > > > > >   "name_txt": ["Olivia Wilson"],
> > > > > >   "sex_s": "female"
> > > > > > },
> > > > > > {
> > > > > >   "name_txt_sort": ["George Smith"],
> > > > > >   "name_txt": ["George Smith"],
> > > > > >   "sex_s": "male"
> > > > > > }
> > > > > >
> > > > > > If my query is:
> > > > > > {
> > > > > >   "query": "*:*",
> > > > > >   "facet": {
> > > > > >     "categories": {
> > > > > >       "type": "terms",
> > > > > >       "field": "name_txt",
> > > > > >       "limit": -1,
> > > > > >       "facet": {
> > > > > >         "sex_s": {
> > > > > >           "type": "terms",
> > > > > >           "field": "sex_s",
> > > > > >           "limit": -1
> > > > > >         }
> > > > > >       }
> > > > > >     }
> > > > > >   }
> > > > > > }
> > > > > >
> > > > > > The output is correct:
> > > > > > ============================
> > > > > > "facets":{
> > > > > >   "count":3,
> > > > > >   "categories":{
> > > > > >     "buckets":[{
> > > > > >         "val":"amelia",
> > > > > >         "count":1,
> > > > > >         "sex_s":{
> > > > > >           "buckets":[{
> > > > > >               "val":"female",
> > > > > >               "count":1}]}},
> > > > > >       {
> > > > > >         "val":"george",
> > > > > >         "count":1,
> > > > > > ...
> > > > > > ============================
> > > > > >
> > > > > > However, if I change
> > > > > > "field": "name_txt"
> > > > > > to
> > > > > > "field": "name_txt_sort"
> > > > > > in my query, only one level group result is shown:
> > > > > > ================================
> > > > > > "facets":{
> > > > > >   "count":3,
> > > > > >   "categories":{
> > > > > >     "buckets":[{
> > > > > >         "val":"Amelia Harris",
> > > > > >         "count":1},
> > > > > >       {
> > > > > >         "val":"George Smith",
> > > > > >         "count":1},
> > > > > >       {
> > > > > >         "val":"Olivia Wilson",
> > > > > >         "count":1}]}}}
> > > > > > ====================================
> > > > > >
> > > > > > I know for _txt field, its fieldType is "text_general" and class
> > > > > > is "solr.TextField"
> > > > > > for _txt_sort field, its fieldType is "text_gen_sort" and class
> > > > > > is "solr.SortableTextField"
> > > > > >
> > > > > > It seems SortableTextField will influence Nested Facets but I
> > > > > > could not find any related document.
> > > > > > Is it a bug or SortableTextField is not acceptable in Nested
> > > > > > Facets?
> > > > > > Many thanks in advance.
> > > > > > Kind regards,
> > > > > > Zhiqing
