Re: Large scale sorting

jian chen Mon, 09 Apr 2007 15:41:59 -0700

Hi, Paul,

Thanks for your reply. Regarding your previous email about the need for a
disk-based sorting solution, I mostly agree with your points. One incentive
for your approach is that we no longer need to warm up the index when the
index is huge.


In our application, we have to sync up the index pretty frequently, and the
warm-up of the index is killing it.

To address your concern about a single sort locale, what about creating a
sort field for each sort locale? So, if you have, say, 10 locales, you will
have 10 sort fields, each using the same mechanism as constructing the norms.
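
A minimal sketch of how the integer value for one such per-locale sort field
might be computed, assuming the field values for all documents are available
as strings at build time. The class and method names (LocaleSortKeys,
ranksFor) are hypothetical and not part of Lucene; java.text.Collator stands
in for whatever locale-aware comparison the application actually uses.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical helper: one integer sort key per document, per locale. */
public class LocaleSortKeys {

    /**
     * Returns ranks[docNo] = position of that document's value when all
     * values are ordered by the collation rules of the given locale.
     * Values the collator considers equal share the same rank.
     */
    public static int[] ranksFor(String[] valuesByDoc, Locale locale) {
        Collator collator = Collator.getInstance(locale);

        // Sort document numbers by their field value under this locale.
        Integer[] docs = new Integer[valuesByDoc.length];
        for (int i = 0; i < docs.length; i++) {
            docs[i] = i;
        }
        Arrays.sort(docs, (a, b) -> collator.compare(valuesByDoc[a], valuesByDoc[b]));

        // Assign increasing ranks, reusing the rank for collator-equal values.
        int[] ranks = new int[valuesByDoc.length];
        int rank = 0;
        for (int i = 0; i < docs.length; i++) {
            if (i > 0 && collator.compare(valuesByDoc[docs[i - 1]], valuesByDoc[docs[i]]) != 0) {
                rank++;
            }
            ranks[docs[i]] = rank;
        }
        return ranks;
    }
}
```

Calling ranksFor once per supported locale would give one integer array per
sort field. The cost is that every locale needs its own array, and the ranks
have to be recomputed when new values change the ordering.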

At query time, in the HitCollector, for each doc id matched, you can load
the field value (an integer) through the IndexReader. (This requires
enhancing the IndexReader so that it can load the sort field values.) Then
you can use that value to accept or reject the doc, or factor it into the
score.
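
A minimal sketch of the per-document disk lookup, assuming each sort field is
stored as a flat file of 4-byte integers indexed by doc number, so that the
value for a document sits at a base offset plus docNo * 4. SortKeyFile and
its methods are hypothetical names; this is not an existing IndexReader API.

```java
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical fixed-width sort-key file: one 4-byte int per document. */
public class SortKeyFile implements AutoCloseable {
    private static final int BYTES_PER_KEY = 4;

    private final RandomAccessFile file;
    private final long baseOffset; // where the keys start within the file

    public SortKeyFile(File path, long baseOffset) throws IOException {
        this.file = new RandomAccessFile(path, "r");
        this.baseOffset = baseOffset;
    }

    /** Writes keys[docNo] as consecutive big-endian ints. */
    public static void write(File path, int[] keys) throws IOException {
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(path))) {
            for (int key : keys) {
                out.writeInt(key);
            }
        }
    }

    /** Reads the key for one document with a single seek; nothing is cached. */
    public int keyFor(int docNo) throws IOException {
        file.seek(baseOffset + (long) docNo * BYTES_PER_KEY);
        return file.readInt();
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

Inside a HitCollector, the collect(doc, score) callback could call
keyFor(doc) on the file for the requested locale and use the result to filter
or rank the hit. Each lookup is a seek plus a 4-byte read, so this trades
query-time I/O for not having to warm up an in-memory cache.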

What do you think?

Jian



On 4/9/07, Paul Smith <[EMAIL PROTECTED]> wrote:


>
> Now, if we could use integers to represent the sort field values, which is
> typically the case for most applications, maybe we can afford to have the
> sort field values stored on disk and do a disk lookup for each document
> matched? The lookup of the sort field value would be as simple as seeking
> to offset + docNo * 4.
>
> This way, we use the same approach as constructing the norms (proper
> merging for incremental indexing), but at search time we don't load the
> sort field values into memory; instead, we just keep them on disk.
>
> Will this approach be good enough?

While a nifty idea, I think this only works for a single sort
locale. I initially came up with a similar idea: the terms are
already stored in 'sorted' order, so one might be able to use a
term's position for sorting. The problem is that the ordering of
the terms is different in different locales.

Paul

