Re: Best Practices for getting Strings from a position range

Grant Ingersoll Wed, 15 Aug 2007 08:11:14 -0700


On Aug 15, 2007, at 10:46 AM, Peter Keegan wrote:

Grant,

I built an index as described here:
http://www.nabble.com/SpanQuery-and-database-join-tf4262902.html

Many documents have only 1 or 2 rows, some have dozens.
Here is a typical query without spans:
+((+contents:quaker +contents:cereal) (+boost50:quaker +boost50:cereal))
+literals:co$us), sort=<custom:"feedbabe":
[EMAIL PROTECTED]>,"dateactiveR"!


Here is a typical query with spans:

+spanNear([adliterals:jb$1, adliterals:co$us], 8, false)
+(+((+contents:quaker +contents:cereal) (+boost50:quaker +boost50:cereal))
+literals:co$us), sort=<custom:"feedbabe":
[EMAIL PROTECTED]>,"dateactiveR"!
The addition of the spanNear clause caused the 10X decrease in
throughput. I could probably change the way rows are indexed and use
ordered terms, which seems to be a bit faster (only 5X decrease).

In looking at the code, it makes sense that an ordered SpanNearQuery
would be faster.
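
One way to see why is the rough sketch below. It is a simplified model and not the actual Lucene ordered-spans code: each clause is reduced to a sorted array of term positions within a single document, every span is assumed to have length 1, and the class and method names (OrderedNearSketch, matches) are made up for illustration. Because the clauses must appear in order, each clause only needs a forward-moving cursor and no priority queue.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of ordered near matching in one document. */
class OrderedNearSketch {

    /**
     * clausePositions[i] holds the sorted positions of clause i.
     * Returns {start, end} pairs where the clauses occur in order and the
     * number of unmatched positions inside the window is at most slop.
     */
    static List<int[]> matches(int[][] clausePositions, int slop) {
        int n = clausePositions.length;
        int[] idx = new int[n]; // one forward-only cursor per clause
        List<int[]> result = new ArrayList<>();
        if (n == 0) {
            return result;
        }
        while (idx[0] < clausePositions[0].length) {
            int first = clausePositions[0][idx[0]];
            int prevEnd = first + 1; // spans are assumed to have length 1
            for (int i = 1; i < n; i++) {
                // Move clause i forward until it starts at or after the end of clause i-1.
                while (idx[i] < clausePositions[i].length
                        && clausePositions[i][idx[i]] < prevEnd) {
                    idx[i]++;
                }
                if (idx[i] == clausePositions[i].length) {
                    return result; // clause i is exhausted, no further matches possible
                }
                prevEnd = clausePositions[i][idx[i]] + 1;
            }
            int gap = prevEnd - first - n; // positions in the window not covered by a clause
            if (gap <= slop) {
                result.add(new int[] {first, prevEnd});
            }
            idx[0]++; // try the next occurrence of the first clause
        }
        return result;
    }
}
```

For example, OrderedNearSketch.matches(new int[][] {{1, 10}, {3, 20}}, 8) reports a single window starting at position 1 and ending at position 4 (end exclusive).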

I am still trying to dig into the logistics of the unordered
SpanNearQuery, as it is the only thing hanging me up on adding
payload access to Spans. I need to step through and debug. As your
stack trace showed, there is a lot of work taking place to manage the
priority queue that is created. I just don't understand the relation
between the SpanCells, the "ordered" List and the PriorityQueue
"queue" just yet. It seems the SpanCells make a linked list, the
"ordered" list is for getting the spans from the sub queries and the
queue seems to rearrange the ordered list.

If anyone wants to chip in with pseudocode explaining what is going
on in NearSpansUnordered.java it would be helpful.
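
As a starting point for that discussion, here is a minimal sketch of the general idea behind unordered near matching. It is not the code in NearSpansUnordered.java and does not model the SpanCells linked list or the "ordered" list. It assumes each clause is just a sorted array of term positions in one document, that every span has length 1, and that the names used (UnorderedNearSketch, Cursor, matches) are hypothetical. The priority queue keeps the clause with the smallest start on top, the largest end seen so far is tracked separately, and the smallest clause is advanced after each check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical, simplified model of unordered near matching in one document. */
class UnorderedNearSketch {

    /** Cursor over the sorted positions of one clause (a stand-in for a sub-span). */
    static final class Cursor {
        final int[] positions;
        int index = 0;

        Cursor(int[] positions) {
            this.positions = positions;
        }

        int start() {
            return positions[index];
        }

        int end() {
            return positions[index] + 1; // spans are assumed to have length 1
        }

        boolean advance() {
            return ++index < positions.length;
        }
    }

    /** Returns {start, end} pairs for every window in which all clauses fit within slop. */
    static List<int[]> matches(int[][] clausePositions, int slop) {
        List<int[]> result = new ArrayList<>();
        if (clausePositions.length == 0) {
            return result;
        }
        // Min-heap ordered by the current start position of each clause.
        PriorityQueue<Cursor> queue =
                new PriorityQueue<>((a, b) -> Integer.compare(a.start(), b.start()));
        int maxEnd = Integer.MIN_VALUE;
        for (int[] positions : clausePositions) {
            if (positions.length == 0) {
                return result; // a clause with no occurrences can never match
            }
            Cursor c = new Cursor(positions);
            queue.add(c);
            maxEnd = Math.max(maxEnd, c.end());
        }
        int totalLength = clausePositions.length; // one position per clause
        while (true) {
            Cursor min = queue.poll(); // clause with the smallest start
            int gap = maxEnd - min.start() - totalLength;
            if (gap <= slop) {
                result.add(new int[] {min.start(), maxEnd});
            }
            if (!min.advance()) {
                return result; // one clause is exhausted, so no more windows exist
            }
            maxEnd = Math.max(maxEnd, min.end());
            queue.add(min); // re-insert so the heap reorders by the new start
        }
    }
}
```

With clause positions {1, 10} and {3, 20} and a slop of 8, this sketch reports two windows, one from position 1 to 4 and one from 3 to 11 (ends exclusive). The per-step heap poll and re-insert is the kind of priority queue maintenance that an ordered match can avoid.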
