Thanks Erick. Your summary about doc IDs is very helpful.
I tested the second-level sort with a small data set (10K records) and didn't see a significant impact. I will test with 10M records at some point later.

Steve

On Mon, Aug 24, 2015 at 11:03 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> Getting the most recent doc first in the case of a tie
> will _not_ "just happen". I don't think you really get the
> nuance here...
>
> You index doc1, and doc2 later. Let's
> claim that doc1 gets internal Lucene doc ID of 1 and
> doc2 gets an internal doc ID of 2. So far you're golden.
> Let's further claim that doc1 is in a different segment than
> doc2. Sometime later, as you add/update/delete docs,
> segments are merged and doc1 and doc2 may or may
> not be in the merged segment. At that point, doc1 can get an
> internal Lucene doc ID of, say, 823 and doc2 can get an internal
> doc ID of, say, 64. So their relative order is changed.
>
> You have to have a secondary sort criterion then. And it has to be
> something monotonically increasing by time that won't ever change
> like internal doc IDs can. Adding a timestamp
> to every doc is certainly an option. Adding your own counter
> is also reasonable.
>
> But this is a _secondary_ sort, so it's not even consulted if the
> first sort (score) is not a tie. You can get a sense of how this would
> affect your query time/CPU usage/RAM by just specifying
> sort=score desc,id asc
> where id is your <uniqueKey> field. This won't do what you want,
> but it will simulate it without having to re-index.
>
> Best,
> Erick
>
> On Mon, Aug 24, 2015 at 11:54 AM, Steven White <swhite4...@gmail.com>
> wrote:
> > Thanks Hoss.
> >
> > I understand the dynamic nature of doc-IDs. All that I care about is the
> > most recent docs be at the top of the hit list when there is a tie. From
> > your reply, it is not clear if that's what happens. If not, then I have to
> > sort, but this is something I want to avoid so it won't add cost to my
> > queries (CPU and RAM).
> >
> > Can you help me answer those two questions?
> >
> > Steve
> >
> > On Mon, Aug 24, 2015 at 2:16 PM, Chris Hostetter <hossman_luc...@fucit.org>
> > wrote:
> >
> >>
> >> : A follow up question. Is the sub-sorting on the lucene internal doc IDs
> >> : ascending or descending order? That is, do the most recently indexed doc
> >>
> >> you can not make any generic assumptions about the order of the internal
> >> lucene doc IDs -- the secondary sort on the internal IDs is stable (and
> >> FWIW: ascending) for static indexes, but as mentioned before: the *actual*
> >> order of the IDs changes as the index changes -- if there is an index
> >> merge, the ids can be totally different and docs can be re-arranged into a
> >> diff order...
> >>
> >> : > However, internal Lucene Ids can change when index changes. (merges,
> >> : > updates etc).
> >>
> >> ...
> >>
> >> : show up first in this set of docs that have tied score? If not, how can I
> >> : have the most recent be first? Do I have to sort on lucene's internal doc
> >>
> >> add a "timestamp" or "counter" field when you index your documents that
> >> means whatever you want it to mean (order added, order updated, order
> >> according to some external sort criteria from some external system) and
> >> then do an explicit sort on that.
> >>
> >>
> >> -Hoss
> >> http://www.lucidworks.com/
> >>
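
For anyone reading this thread later, here is a minimal Python sketch (a toy simulation, not Solr or Lucene code) of the tie-breaking point discussed above. The `Doc` class, its field names, and the two `rank_*` helpers are made up for illustration; the doc IDs 1/2 and 823/64 are the ones from Erick's example. On the Solr side the equivalent would presumably be something like sort=score desc,timestamp desc, assuming you add a field named timestamp at index time.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    key: str          # the uniqueKey value
    score: float      # relevance score (tied in this example)
    timestamp: int    # hypothetical field set once at index time
    internal_id: int  # internal doc ID, which can change after a merge


def rank_by_internal_id(docs):
    # Score descending, ties broken by internal doc ID ascending.
    ordered = sorted(docs, key=lambda d: (-d.score, d.internal_id))
    return [d.key for d in ordered]


def rank_by_timestamp(docs):
    # Score descending, ties broken by timestamp descending (newest first).
    ordered = sorted(docs, key=lambda d: (-d.score, -d.timestamp))
    return [d.key for d in ordered]


# doc1 was indexed before doc2; both have the same score.
before_merge = [Doc("doc1", 1.0, 100, 1), Doc("doc2", 1.0, 200, 2)]
# After a segment merge the internal IDs are reassigned.
after_merge = [Doc("doc1", 1.0, 100, 823), Doc("doc2", 1.0, 200, 64)]

print(rank_by_internal_id(before_merge))  # ['doc1', 'doc2']
print(rank_by_internal_id(after_merge))   # ['doc2', 'doc1']
print(rank_by_timestamp(before_merge))    # ['doc2', 'doc1']
print(rank_by_timestamp(after_merge))     # ['doc2', 'doc1']
```

The internal-ID tie-break flips after the merge, while the explicit timestamp tie-break gives the same order both times.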