Hi Julia,

As I'm working with indexes that are updated infrequently and queried very
frequently, I would duplicate that data with copyField directives at index
time. Writing a custom facet processor comes with the risk that it might
break with a Solr upgrade.

Are you talking about millions of unique users per document, or just
overall? I wouldn't worry about duplicating a few hundred values per
document at index time. Whether or not they are duplicated doesn't change
how many facet values you have to return at query time anyway.
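
To make the expected result concrete, here is a rough sketch in plain Python
(not Solr code) of what faceting on a combined field would count for the two
example docs. The destination field name all_users is only an assumption for
illustration, and the request body at the end is a minimal JSON Facet API
terms facet against that hypothetical field.

```python
from collections import Counter

# The two example documents from the original question.
docs = [
    {"field1": ["user1", "user2"], "field2": ["user3", "user4"]},
    {"field1": ["user1", "user3"], "field2": ["user2"]},
]


def combine(doc):
    """Mimic two copyField directives feeding one destination field.

    The destination (called all_users here) is a hypothetical name.
    """
    return doc["field1"] + doc["field2"]


def facet_counts(documents):
    """Count, per value, how many documents contain it."""
    counts = Counter()
    for doc in documents:
        # set(): a value present in both fields still counts once per document
        counts.update(set(combine(doc)))
    return dict(counts)


counts = facet_counts(docs)
print(sorted(counts.items()))

# Sketch of a JSON Facet API request body for the combined field.
# "users" is an arbitrary facet label; "all_users" is the assumed field name.
facet_request = {
    "query": "*:*",
    "facet": {"users": {"type": "terms", "field": "all_users"}},
}
```

The counts come out the same as in the original question, which is the point:
the combined field costs index space, not extra work at query time.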

Thomas

On Wed, 4 Jan 2023 at 15:37, Julia Gilenko wrote
<jgile...@proofpoint.com.invalid>:

> Hi everyone,
>
> We have two multi-valued fields, both containing usernames, and we'd like
> to compute the combined counts across both fields. For example, if we were
> to facet on these two docs:
>
> doc1: { field1: [user1, user2], field2: [user3, user4] }
> doc2: { field1: [user1, user3], field2: [user2] }
>
> we'd expect the following counts:
>
> user1: 2
> user2: 2
> user3: 2
> user4: 1
>
> I know one option is to create a new field that combines the two and facet
> on that, but these lists can be large and we could have millions of unique
> users, so we're looking at implementing a custom facet processor to avoid
> duplicating data. Looks like one way would be to subclass SimpleFacets and
> register a new FacetComponent, but this seems to use the legacy faceting
> methods. Is there a way to do something similar with the JSON API? Is this
> even advisable?
>
> Thanks,
> Julia
>
