[
https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064414#comment-13064414
]
Bill Bell commented on SOLR-2242:
---------------------------------
Yonik,
Yes, I know about group.ngroups. But the use case still stands. We need a way
to add up facet terms without actually counting them.
I had the restructured facet_fields XML like you recommended (twice). And the
issue is it breaks ALL sharding. The reason why it breaks distribution is that
it is looking for <int> and not <lst>... Several people have wanted me to
change the name to count, to term, to distinct... I really don't care what the
name is, since it makes sense when you try it. I think changing the
distribution is a MUCH larger project. If you want to jump in on the
sharding/distribution to make it work with lists, then please help. The format
change is a HUGE issue. The magic names could also be an issue but ONLY if you
use this new feature. It is not an issue for all APIs and usage - which is why
I added it as a magic variable.
Do we have any examples with Boolean? I have not seen any... Do we use
True/False or on/off? Do you mean like facet=true ? The reason why I have a 1
and 2 is to get the count of terms, but only return a smaller set (internal
limit=-1, but user types limit=5). That is the reason for that. I believe it is
very useful.
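
A minimal sketch of the idea described above: count every term as if
limit=-1 were set internally, then hand back only the number of entries the
user asked for. The function name, its arguments, and the sample prices are
hypothetical and are not taken from the patch.

```python
from collections import Counter


def facet_terms(values, user_limit, mincount=1):
    """Count all terms (internal limit=-1), return only user_limit of them."""
    counts = Counter(values)
    # Keep every term that reaches mincount, regardless of the user's limit.
    all_terms = [(term, n) for term, n in counts.items() if n >= mincount]
    # Highest count first; ties broken by term so the order is stable.
    all_terms.sort(key=lambda tc: (-tc[1], tc[0]))
    num_facet_terms = len(all_terms)
    # A negative limit means "no limit", mirroring facet.limit=-1.
    returned = all_terms if user_limit < 0 else all_terms[:user_limit]
    return num_facet_terms, returned


prices = ["0.0", "0.0", "0.0", "11.5", "19.95", "74.99", "92.0", "179.99", "185.0"]
total, top = facet_terms(prices, user_limit=5)
print(total, top)
```

Here the distinct count covers all seven terms even though only five are
returned to the caller.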
Having numFacetTerms look like every other term pretty much works with
sharding/distribution. It just gets added together like any other facet count.
One server returns numFacetTerms=5, the other returns numFacetTerms=10, and the
combined result returns 15. It may break some new feature with distribution or
something I am not aware of and not using...
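
The merge behaviour described above can be sketched as a plain per-name
summation across shard responses. This is an illustration only, not Solr's
actual distributed faceting code; merge_facet_counts and the shard
dictionaries are hypothetical.

```python
def merge_facet_counts(shard_responses):
    """Sum counts by name, treating numFacetTerms like any other entry."""
    merged = {}
    for response in shard_responses:
        for name, count in response.items():
            merged[name] = merged.get(name, 0) + count
    return merged


# Hypothetical per-shard facet_fields entries for one field.
shard_a = {"numFacetTerms": 5, "0.0": 2, "11.5": 1}
shard_b = {"numFacetTerms": 10, "0.0": 1, "19.95": 1}

merged = merge_facet_counts([shard_a, shard_b])
print(merged["numFacetTerms"])
```

Note that in this sketch the term "0.0" exists on both shards, so the summed
numFacetTerms counts it once per shard. The sum matches the true distinct
count only when the shards do not share terms.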
Concerning building in memory. Having it cached is what I was trying to
achieve. If there is another way to cache the result then let me know other
options. Not having it cached at all is a huge performance problem. If you are
using mode 2, it does not matter that much since you need to return the list
and in most cases you have it in memory... Mode 1 hides it a bit and builds the
entire list in memory when we only need to cache the one value... Again -
without breaking something else, not sure how to achieve that.
As long as there are no more gotchas in distribution, most of the other things
you are listing (XML, name change, boolean) are close to preferences, and the
XML format change would be a huge issue. So can we commit? I would also like
to avoid caching the entire list in memory when using this, and need some
assistance with that.
1. Any other distribution/sharding issues with adding a magic variable in
facet_field for a new feature?
2. Where and how do we store a cache value without using the array that is
present so we don't cache the whole facet term list when we only need to cache
the resulting number?
Thanks.
> Get distinct count of names for a facet field
> ---------------------------------------------
>
> Key: SOLR-2242
> URL: https://issues.apache.org/jira/browse/SOLR-2242
> Project: Solr
> Issue Type: New Feature
> Components: Response Writers
> Affects Versions: 4.0
> Reporter: Bill Bell
> Assignee: Simon Willnauer
> Priority: Minor
> Fix For: 4.0
>
> Attachments: NumFacetTermsFacetsTest.java,
> SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch,
> SOLR-2242.shard.patch, SOLR-2242.shard.patch,
> SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch,
> SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for
> distinct values. This is normal behavior. This patch tells you how many
> distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
> <lst name="price">
> <int name="numFacetTerms">14</int>
> <int name="0.0">3</int>
> <int name="11.5">1</int>
> <int name="19.95">1</int>
> <int name="74.99">1</int>
> <int name="92.0">1</int>
> <int name="179.99">1</int>
> <int name="185.0">1</int>
> <int name="279.95">1</int>
> <int name="329.95">1</int>
> <int name="350.0">1</int>
> <int name="399.0">1</int>
> <int name="479.95">1</int>
> <int name="649.99">1</int>
> <int name="2199.0">1</int>
> </lst>
> </lst>
> {code}
> Several people use this to get the group.field count (the # of groups).
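
A client reading the facet_fields response quoted above has to separate the
magic numFacetTerms entry from the real terms. A minimal sketch using
Python's standard xml.etree.ElementTree follows; split_facet_field is a
hypothetical helper, and the XML is the sample from the issue description.

```python
import xml.etree.ElementTree as ET

RESPONSE = """
<lst name="facet_fields">
  <lst name="price">
    <int name="numFacetTerms">14</int>
    <int name="0.0">3</int><int name="11.5">1</int>
    <int name="19.95">1</int><int name="74.99">1</int>
    <int name="92.0">1</int><int name="179.99">1</int>
    <int name="185.0">1</int><int name="279.95">1</int>
    <int name="329.95">1</int><int name="350.0">1</int>
    <int name="399.0">1</int><int name="479.95">1</int>
    <int name="649.99">1</int><int name="2199.0">1</int>
  </lst>
</lst>
"""


def split_facet_field(field_elem, magic_name="numFacetTerms"):
    """Return (distinct term count or None, {term: count}) for one field."""
    num_terms = None
    terms = {}
    for entry in field_elem.findall("int"):
        name = entry.get("name")
        if name == magic_name:
            # The magic entry carries the distinct count, not a term count.
            num_terms = int(entry.text)
        else:
            terms[name] = int(entry.text)
    return num_terms, terms


root = ET.fromstring(RESPONSE)
price = root.find("lst[@name='price']")
num_terms, terms = split_facet_field(price)
print(num_terms, len(terms))
```

A field that really contains a term called numFacetTerms would collide with
the magic entry in this sketch, which is the naming concern raised in the
comment.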