Many thanks.
We are going to follow your recommendation and change our Solr setup accordingly.
Kind regards,
Zhiqing

On Wed, 27 Apr 2022 at 23:27, Michael Gibney <[email protected]>
wrote:

> Do you want faceting to be based on tokenized values, or the original input
> as a monolithic string? In any case, the patch attached to SOLR-13056 is
> unlikely to help. The patch associated with SOLR-8362 might help (is
> _designed_ to help with this kind of situation, in fact!). But that's a
> monumental patch and I wouldn't recommend using it provisionally.
>
> The good news is that SortableTextField is kind of a convenience, so
> depending on whether you want to facet on the original string or the
> post-tokenization values, you can probably achieve the outcome you want by
> leveraging creative copyFields, etc...
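>
> For the original-string case, a minimal sketch of the copyField idea. It
> assumes the _default configset (where *_ss is expected to map to a
> multi-valued string field with docValues) and a hypothetical destination
> field name, name_ss. The body would be POSTed to the collection's Schema
> API endpoint:
>
> ```json
> {
>   "add-copy-field": {
>     "source": "name_txt_sort",
>     "dest": "name_ss"
>   }
> }
> ```
>
> After reindexing, sorting could stay on name_txt_sort while the terms
> facet uses "field": "name_ss". A string field's docValues hold the same
> values as its indexed terms, so nested facets would be expected to work
> there, though this has not been verified against the setup in this thread.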
>
> One thing that I've always wondered about is the utility of
> SortableTextField given that IIUC the sort values are not normalized
> (casefolding, etc.). You configure one-and-only-one index-time analyzer,
> which (again, IIUC) is used for tokenization. But the sort value (which one
> might ordinarily normalize with KeywordTokenizer or something?) is based on
> the pre-analysis raw input.
>
> I'm making a bunch of assumptions here, but my recommendation if you want
> normalized sort on full value _and_ faceting on post-analysis token values:
> use a copyField to direct input to two separate fields -- one for sorting
> (maybe ICUCollationField?) and one for faceting (TextField). The faceting
> would require uninversion (no docValues for faceting over TextField). Some
> interesting general discussion about post-tokenization faceting use cases
> (mostly advising against) can be found here [1].
>
> [1] https://issues.apache.org/jira/browse/LUCENE-10023
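>
> A rough sketch of the two-field setup described above, as a Schema API
> request body. All field and type names here (name_s, name_sort,
> name_facet, collated_sort) are hypothetical, and ICUCollationField is
> assumed to be available (it ships in the analysis-extras module, which
> has to be enabled separately):
>
> ```json
> {
>   "add-field-type": {
>     "name": "collated_sort",
>     "class": "solr.ICUCollationField",
>     "locale": "",
>     "strength": "primary"
>   },
>   "add-field": [
>     {"name": "name_sort", "type": "collated_sort", "stored": false, "docValues": true},
>     {"name": "name_facet", "type": "text_general", "stored": false, "uninvertible": true}
>   ],
>   "add-copy-field": {
>     "source": "name_s",
>     "dest": ["name_sort", "name_facet"]
>   }
> }
> ```
>
> With something like this in place, documents would be indexed with only
> name_s; sorting would target name_sort and the terms facet would target
> name_facet. Faceting on name_facet still pays the uninversion cost
> mentioned above.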
>
> Michael
>
> On Wed, Apr 27, 2022 at 5:01 PM WU, Zhiqing <[email protected]> wrote:
>
> > Hi Michael,
> > Thanks for your responsible recommendation.
> > Yes, we could use TextField in our application, but we still hope to use
> > SortableTextField because of its sorting functionality.
> > I have read your earlier comments (March 2019) on
> > https://issues.apache.org/jira/browse/SOLR-13056
> > Could your earlier patch solve, or partially solve, the problem?
> > Kind regards,
> > Zhiqing
> >
> > On Tue, 26 Apr 2022 at 01:03, Michael Gibney <[email protected]>
> > wrote:
> >
> > > I was hoping that would "just work"; since it didn't, I dug a little more
> > > and I'm afraid that explicitly setting `method:uif` has no effect -- if
> > > docValues are there, they will be used:
> > >
> > >
> > >
> >
> https://github.com/apache/solr/blob/c99af207c761ec34812ef1cc3054eb2804b7448b/solr/core/src/java/org/apache/solr/search/facet/FacetField.java#L161-L167
> > >
> > > Pending SOLR-8362 (or some other more narrow solution?), I think the only
> > > responsible recommendation is: don't use SortableTextField for faceting.
> > > Would it work to use TextField instead? TextField has to be uninverted, but
> > > at least it meets the requirement of indexed values being compatible with
> > > values over which bulk facet collection takes place.
> > >
> > > On Mon, Apr 25, 2022 at 3:52 PM WU, Zhiqing <[email protected]> wrote:
> > >
> > > > Hi Michael,
> > > > Thanks for your quick reply and related information.
> > > > I added "method":"uif" in 3 different places, but it does not solve my
> > > > problem:
> > > > 1.
> > > > {
> > > >   "query": "*:*",
> > > >   "method":"uif",
> > > >   "facet": {
> > > >     "categories": {
> > > >       "type": "terms",
> > > >       "field": "name_txt_sort",
> > > >       "limit": -1,
> > > >       "facet": {
> > > >         "sex_s": {
> > > >           "type": "terms",
> > > >           "field": "sex_s",
> > > >           "limit": -1
> > > >         }
> > > >       }
> > > >     }
> > > >   }
> > > > }
> > > >
> > > > Response:
> > > > "error":{
> > > >     "metadata":[
> > > >       "error-class"...]}
> > > >
> > > > 2.
> > > > {
> > > >   "query": "*:*",
> > > >   "facet": {
> > > >     "method":"uif",
> > > >     "categories": {
> > > >       "type": "terms",
> > > >       "field": "name_txt_sort",
> > > >       "limit": -1,
> > > >       "facet": {
> > > >         "sex_s": {
> > > >           "type": "terms",
> > > >           "field": "sex_s",
> > > >           "limit": -1
> > > >         }
> > > >       }
> > > >     }
> > > >   }
> > > > }
> > > >
> > > > Response:
> > > > "error":{
> > > >     "metadata":[
> > > >       "error-class", ...
> > > >
> > > > 3.
> > > > {
> > > >   "query": "*:*",
> > > >   "facet": {
> > > >     "categories": {
> > > >       "method":"uif",
> > > >       "type": "terms",
> > > >       "field": "name_txt_sort",
> > > >       "limit": -1,
> > > >       "facet": {
> > > >         "sex_s": {
> > > >           "type": "terms",
> > > >           "field": "sex_s",
> > > >           "limit": -1
> > > >         }
> > > >       }
> > > >     }
> > > >   }
> > > > }
> > > >
> > > > Response:
> > > > "facets":{
> > > >     "count":3,
> > > >     "categories":{
> > > >       "buckets":[{
> > > >           "val":"Amelia Harris",
> > > >           "count":1},
> > > >         {
> > > >           "val":"George Smith",
> > > >           "count":1},
> > > >         {
> > > >           "val":"Olivia Wilson",
> > > >           "count":1}]}}}
> > > >
> > > > Should I try "method":"uif" at another place?
> > > > Kind regards,
> > > > Zhiqing
> > > >
> > > > On Mon, 25 Apr 2022 at 17:47, Michael Gibney <[email protected]>
> > > > wrote:
> > > >
> > > > > This is related to https://issues.apache.org/jira/browse/SOLR-13056
> > > > >
> > > > > I'm curious: if you set `method:uif` on the top-level facet, are you able
> > > > > to achieve the desired results? (Note that `method:uif` incurs the same
> > > > > heap memory overhead -- uninverting the indexed values -- as faceting over
> > > > > a regular TextField). Doing this (if it works as I think it might) could
> > > > > address the core problem with faceting on SortableTextField: that DocValues
> > > > > for SortableTextField are appropriate for _sorting_, but are different from
> > > > > the _indexed_ values that would be used for refinement and nested domain
> > > > > filtering.
> > > > >
> > > > > See also https://issues.apache.org/jira/browse/SOLR-8362
> > > > >
> > > > > On Mon, Apr 25, 2022 at 11:59 AM WU, Zhiqing <[email protected]> wrote:
> > > > >
> > > > > > Hello,
> > > > > > I do not know why Nested Facets
> > > > > > (https://solr.apache.org/guide/8_11/json-facet-api.html#nested-facets)
> > > > > > do not work for a _txt_sort field (SortableTextField).
> > > > > >
> > > > > > To reproduce the problem, I created a new collection (config set:
> > > > > > _default) and added the following documents to it:
> > > > > > {
> > > > > >     "name_txt_sort": ["Amelia Harris"],
> > > > > >     "name_txt": ["Amelia Harris"],
> > > > > >     "sex_s": "female"
> > > > > > },
> > > > > > {
> > > > > >     "name_txt_sort": ["Olivia Wilson"],
> > > > > >     "name_txt": ["Olivia Wilson"],
> > > > > >     "sex_s": "female"
> > > > > > },
> > > > > > {
> > > > > >     "name_txt_sort": ["George Smith"],
> > > > > >     "name_txt": ["George Smith"],
> > > > > >     "sex_s": "male"
> > > > > > }
> > > > > >
> > > > > > If my query is:
> > > > > > {
> > > > > >   "query": "*:*",
> > > > > >   "facet": {
> > > > > >     "categories": {
> > > > > >       "type": "terms",
> > > > > >       "field": "name_txt",
> > > > > >       "limit": -1,
> > > > > >       "facet": {
> > > > > >         "sex_s": {
> > > > > >           "type": "terms",
> > > > > >           "field": "sex_s",
> > > > > >           "limit": -1
> > > > > >         }
> > > > > >       }
> > > > > >     }
> > > > > >   }
> > > > > > }
> > > > > >
> > > > > > The output is correct:
> > > > > > ============================
> > > > > > "facets":{
> > > > > >     "count":3,
> > > > > >     "categories":{
> > > > > >       "buckets":[{
> > > > > >           "val":"amelia",
> > > > > >           "count":1,
> > > > > >           "sex_s":{
> > > > > >             "buckets":[{
> > > > > >                 "val":"female",
> > > > > >                 "count":1}]}},
> > > > > >         {
> > > > > >           "val":"george",
> > > > > >           "count":1,
> > > > > >            ...
> > > > > > ============================
> > > > > >
> > > > > > However, if I change
> > > > > > "field": "name_txt"
> > > > > > to
> > > > > > "field": "name_txt_sort"
> > > > > > in my query, only the top-level buckets are shown:
> > > > > > ================================
> > > > > >   "facets":{
> > > > > >     "count":3,
> > > > > >     "categories":{
> > > > > >       "buckets":[{
> > > > > >           "val":"Amelia Harris",
> > > > > >           "count":1},
> > > > > >         {
> > > > > >           "val":"George Smith",
> > > > > >           "count":1},
> > > > > >         {
> > > > > >           "val":"Olivia Wilson",
> > > > > >           "count":1}]}}}
> > > > > > ====================================
> > > > > >
> > > > > > I know that for a _txt field, the fieldType is "text_general" and the
> > > > > > class is "solr.TextField";
> > > > > > for a _txt_sort field, the fieldType is "text_gen_sort" and the class
> > > > > > is "solr.SortableTextField".
> > > > > >
> > > > > > It seems SortableTextField affects Nested Facets, but I could not
> > > > > > find any related documentation.
> > > > > > Is it a bug, or is SortableTextField not supported in Nested Facets?
> > > > > > Many thanks in advance.
> > > > > > Kind regards,
> > > > > > Zhiqing
> > > > > >
> > > > >
> > > >
> > >
> >
>
