Re: Sorting memory-efficiently by any numeric field (dates too?)

2013-11-12 Thread Erick Erickson
Siiigggh. Yet another "brilliant" idea bites the dust.

Thanks!
Erick

On Tue, Nov 12, 2013 at 8:49 PM, Yonik Seeley wrote:
> On Tue, Nov 12, 2013 at 7:01 PM, Erick Erickson wrote:
> > Yonik:
> >
> > Of course I'm not really up on the details of sorting, but aren't there
> > various control s

Re: Sorting memory-efficiently by any numeric field (dates too?)

2013-11-12 Thread Yonik Seeley
On Tue, Nov 12, 2013 at 7:01 PM, Erick Erickson wrote:
> Yonik:
>
> Of course I'm not really up on the details of sorting, but aren't there
> various control structures that are allocated for a sort but not for
> scoring? I'm thinking of long[maxDoc] type structures in addition to
> the actual val

Re: Sorting memory-efficiently by any numeric field (dates too?)

2013-11-12 Thread Erick Erickson
Yonik: Of course I'm not really up on the details of sorting, but aren't there various control structures that are allocated for a sort but not for scoring? I'm thinking of long[maxDoc] type structures in addition to the actual values in the FieldCache. I've been thinking about docValues for this

Re: Sorting memory-efficiently by any numeric field (dates too?)

2013-11-12 Thread Yonik Seeley
For a reasonable top-N, the space efficiency should still be the same, as it is really just dominated by the FieldCache representation (whether it is in-memory or disk-based docvalues). Directly sorting on that numeric field vs deriving a score from the field and sorting on that shouldn't really be that dif
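
A minimal sketch of the point above, in Python rather than Lucene's own Java and not taken from Lucene's code: a top-N collector only needs a heap of N entries on top of the per-document values array, whether it orders by the raw field value or by a score derived from it. The names `top_n_by_field`, `values`, `matching_docs`, and `key` are hypothetical; `values` stands in for whatever FieldCache or docvalues representation holds one number per document.

```python
import heapq


def top_n_by_field(values, matching_docs, n, key=lambda v: v):
    """Return the n doc ids with the largest key(values[doc]), best first.

    values        -- one numeric value per doc id (assumed stand-in for the
                     FieldCache / docvalues array; this is the big structure)
    matching_docs -- iterable of doc ids that matched the query
    key           -- identity for a plain sort, or any function that derives
                     a score from the field value
    """
    heap = []  # min-heap of (sort key, doc id); never holds more than n entries
    for doc in matching_docs:
        entry = (key(values[doc]), doc)
        if len(heap) < n:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            # New entry beats the current worst of the top n: swap it in.
            heapq.heapreplace(heap, entry)
    return [doc for _, doc in sorted(heap, reverse=True)]


values = [5, 1, 9, 3, 7]
by_field = top_n_by_field(values, range(len(values)), 2)
by_score = top_n_by_field(values, range(len(values)), 2, key=lambda v: -v)
```

In both calls the extra memory beyond `values` is the N-entry heap, so the choice between sorting on the field and sorting on a derived score does not change the footprint in this sketch.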

RE: Sorting memory-efficiently by any numeric field (dates too?)

2013-11-12 Thread Petersen, Robert
Hi Erick, I like your idea. FWIW, please also leave room for boost by function query, which takes many numeric fields as input but results in a single value. I don't know if this counts as a really clever function, but here's one that I currently use: {!boost b=pow(sum(log(sum(product(boosted,90