On Thursday 13 March 2008 00:42:59 Erick Erickson wrote:
> I certainly found that lazy loading changed my speed dramatically, but
> that was on a particularly field-heavy index.
>
> I wonder if TermEnum/TermDocs would be fast enough on an indexed
> (UN_TOKENIZED???) field for a unique id.
>
> Mostly, I'm hoping you'll try this and tell me if it works so I don't have
> to sometime <G>....

I added a "uid" field to our existing fields. After the load there were
some gaps in the values for this field; presumably those were documents
where adding the doc failed and adding the fallback doc also failed.

The index contains 20004 documents. I ran each test for 10 iterations,
and the times below are the average of the last 5, as it took around 5
rounds to warm up.

Filter building, for a filter returning 1000 randomly selected documents:

  Time to build filter by UID (100% Derby)  - 93ms
  Additional time to build filter by DocID  - 12ms (13% penalty)

A 13% penalty is acceptable IMO. The problem comes next.

Bulk operation building, for a query returning around 2800 documents:

  Time to build the bulkop by DocID (100% Hits)         - 6ms
  Time to fetch the "uid" field from the document       - 152ms (2600% penalty)
  Time to do the DB query (not counting commit though)  - 10ms

For interest's sake I also timed fetching the document with no
FieldSelector; that takes around 410ms for the same documents. So there
is still a big benefit in using the field selector, it just isn't anywhere
near enough to get it close to the time it takes to retrieve the doc IDs.

Daniel
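
The measurement scheme described above (10 rounds per test, averaging only
the last 5 so that warm-up is excluded) can be sketched roughly as follows.
This is an illustration only: WarmBench and its methods are hypothetical
names, not the code used to produce the numbers in this message, and the
workload in main is a stand-in for the real filter build or field fetch.

```java
// Hypothetical helper, not from the original post: times a task over several
// rounds and averages only the trailing rounds, so warm-up rounds are ignored.
final class WarmBench {

    /** Averages the last `keep` entries of `samples`. */
    static double averageOfLast(long[] samples, int keep) {
        if (keep <= 0 || keep > samples.length) {
            throw new IllegalArgumentException("keep must be in 1.." + samples.length);
        }
        long sum = 0;
        for (int i = samples.length - keep; i < samples.length; i++) {
            sum += samples[i];
        }
        return (double) sum / keep;
    }

    /** Runs `task` `rounds` times and returns each round's elapsed nanoseconds. */
    static long[] timeRounds(Runnable task, int rounds) {
        long[] elapsed = new long[rounds];
        for (int i = 0; i < rounds; i++) {
            long start = System.nanoTime();
            task.run();
            elapsed[i] = System.nanoTime() - start;
        }
        return elapsed;
    }

    /** Extra cost expressed as a percentage of the baseline cost. */
    static double penaltyPercent(double baseline, double extra) {
        return 100.0 * extra / baseline;
    }

    public static void main(String[] args) {
        // Placeholder workload; replace with the operation being measured.
        Runnable task = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i);
            }
            if (sb.length() == 0) {
                throw new IllegalStateException("unexpected empty result");
            }
        };

        long[] nanos = timeRounds(task, 10);                  // 10 rounds in total
        double avgMs = averageOfLast(nanos, 5) / 1_000_000.0; // keep the last 5
        System.out.println("average of last 5 rounds: " + avgMs + " ms");

        // Filter-building figures from the message: 93ms baseline, 12ms extra.
        System.out.println("penalty: " + penaltyPercent(93, 12) + "%");
    }
}
```

With the filter-building figures above, penaltyPercent(93, 12) comes out
just under 13, which matches the rounded 13% quoted for the DocID filter.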