Re: Handling intersection facets of many values

2014-11-20 Thread Toke Eskildsen
On Wed, 2014-11-19 at 23:53 +0100, Peter Sturge wrote: Yes, the 'lots-of-booleans' thing is a bit prohibitive as it won't realistically scale to large value sets. "Large" is extremely relative in Solr Land, but I would be wary of going beyond 10K.

Re: Handling intersection facets of many values

2014-11-20 Thread Michael Sokolov
If you're willing to write some Java, you can do something more efficient by intersecting two terms enumerations. This works with constant memory for any number of values in two fields: as with intersecting any two sorted lists, you leapfrog between them. I have an example if you're
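
The leapfrog idea can be sketched in plain Java. This is not the example Michael refers to, and it does not use the Lucene terms API; it only assumes two sorted, duplicate-free sequences of strings standing in for the terms of the two fields. The class name, method name, and sample values are hypothetical. Matches are handed to a callback, so the intersection itself keeps only the two current values in memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

public class LeapfrogIntersection {

    // Emits every value present in both sequences.
    // Assumes both iterators yield values in ascending String order, without duplicates.
    static void intersect(Iterator<String> left, Iterator<String> right, Consumer<String> out) {
        if (!left.hasNext() || !right.hasNext()) {
            return;
        }
        String a = left.next();
        String b = right.next();
        while (true) {
            int cmp = a.compareTo(b);
            if (cmp == 0) {
                // Same value on both sides: report it and advance both.
                out.accept(a);
                if (!left.hasNext() || !right.hasNext()) {
                    return;
                }
                a = left.next();
                b = right.next();
            } else if (cmp < 0) {
                // Left is behind: move it forward.
                if (!left.hasNext()) {
                    return;
                }
                a = left.next();
            } else {
                // Right is behind: move it forward.
                if (!right.hasNext()) {
                    return;
                }
                b = right.next();
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins for the sorted terms of 'addr' and 'dest'.
        List<String> addr = Arrays.asList("alpha", "bravo", "delta", "golf");
        List<String> dest = Arrays.asList("bravo", "charlie", "golf");
        List<String> common = new ArrayList<>();
        intersect(addr.iterator(), dest.iterator(), common::add);
        System.out.println(common);
    }
}
```

This sketch steps one value at a time on whichever side is behind. A real terms enumeration that supports seeking could jump straight to the other side's current value, which is where the leapfrogging pays off when one field has far more values than the other.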

Handling intersection facets of many values

2014-11-19 Thread Peter Sturge
Hi Solr Group, Got an interesting use case (to me, at least), perhaps someone could give some insight on how best to achieve this? I've got a core that has about 7 million entries, with a field called 'addr'. By definition, every entry has a unique 'addr' value, so there are 7 million unique values

RE: Handling intersection facets of many values

2014-11-19 Thread Toke Eskildsen
Peter Sturge [peter.stu...@gmail.com] wrote: [addr 7M unique, dest 1K unique] What is the best/only/most efficient way to construct a search whereby I get back an (ideally faceted) list of values for 'dest' that occur in 'addr'? I assume the actual values are defined by a query? As the

Re: Handling intersection facets of many values

2014-11-19 Thread Peter Sturge
Hi Toke, Thanks for your input. I guess you mean take the 1k or so values and build a boolean query from them? If that's not what you mean, my apologies. I'd thought of doing that - the trouble I had was the unique values could be 20k, or 15,167 or any arbitrary and potentially high-ish number -

RE: Handling intersection facets of many values

2014-11-19 Thread Toke Eskildsen
Peter Sturge [peter.stu...@gmail.com] wrote: I guess you mean take the 1k or so values and build a boolean query from them? Not really. Let me try again: 1) Perform a facet call with facet.limit=-1 on dest to get the relevant dest values. The result will always be 1000 values or fewer. Take

Re: Handling intersection facets of many values

2014-11-19 Thread Peter Sturge
Hi Toke, Yes, the 'lots-of-booleans' thing is a bit prohibitive as it won't realistically scale to large value sets. I've been wrestling with joins this evening and have managed to get these working - and it works very nicely - and across cores (although not shards yet afaik)! For anyone looking