David Johnson wrote:
I think I was again focusing on range queries and giving Lucene some way of
filtering out subsets of the document set, so that the whole document set
wouldn't have to be walked.  For the date range query the from and to dates
would most likely share some set of most significant bytes - these bytes
could just be passed to Lucene as a direct match, thereby reducing the subset
of the collection that would be walked.  If the range query is fixed, this
"optimization" would be unnecessary.  Nevertheless, I still wonder if there
is additional information that could be stored in Lucene to augment the
index and improve query processing.
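The shared most-significant-bytes idea can be sketched in plain Java (this is an illustration, not Jackrabbit code): if dates are indexed as yyyyMMdd strings, the two bounds of a range often agree on a leading prefix, and that prefix could be handed to Lucene as a direct (prefix) match so only documents under it need the full range check.

```java
public class RangePrefix {

    // Returns the longest common prefix of the two range bounds.
    // For "20070701".."20070731" that is "200707", i.e. one month.
    static String commonPrefix(String from, String to) {
        int i = 0;
        int max = Math.min(from.length(), to.length());
        while (i < max && from.charAt(i) == to.charAt(i)) {
            i++;
        }
        return from.substring(0, i);
    }

    public static void main(String[] args) {
        // yyyyMMdd strings sort lexicographically, so the shared
        // prefix bounds the candidate subset for the range walk.
        System.out.println(commonPrefix("20070701", "20070731")); // 200707
    }
}
```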

Ah, now I see. Yes, that might help in some cases. For example, you could say: get me all documents with a year value of 2007 and a month value of 7, which would be equivalent to a range query from 2007-07-01 to 2007-07-31.
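The equivalence between a (year, month) value match and the corresponding date range can be made concrete with java.time (a sketch, not anything from the Jackrabbit codebase):

```java
import java.time.LocalDate;
import java.time.YearMonth;

public class YearMonthRange {

    // Maps a (year, month) value match to the equivalent date range:
    // the first and last day of that month.
    static LocalDate[] toRange(int year, int month) {
        YearMonth ym = YearMonth.of(year, month);
        return new LocalDate[] { ym.atDay(1), ym.atEndOfMonth() };
    }

    public static void main(String[] args) {
        LocalDate[] r = toRange(2007, 7);
        // year=2007, month=7 covers exactly 2007-07-01 .. 2007-07-31
        System.out.println(r[0] + " .. " + r[1]);
    }
}
```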

In this case I was considering using the node UUID as the cross-index join
parameter.  Still, there is the problem of combining the results from two
different indexes.

there are two issues with this approach:
1) getting the UUID requires lucene to load the document
2) implementing an *efficient* join across system boundaries is not easy, even if the documents are sorted.
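To illustrate point 2: even in the best case, where both indexes can deliver their UUIDs in sorted order, the join is a merge over the two streams. This is a minimal single-process sketch (hypothetical names, not Jackrabbit code); across system boundaries each advance of a cursor may cost a round trip, which is what makes an efficient distributed join hard.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeJoin {

    // Merge join over two sorted UUID lists: linear in the combined
    // size, advancing whichever side holds the smaller key.
    static List<String> join(List<String> left, List<String> right) {
        List<String> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < left.size() && j < right.size()) {
            int cmp = left.get(i).compareTo(right.get(j));
            if (cmp == 0) {
                out.add(left.get(i)); // UUID present in both indexes
                i++;
                j++;
            } else if (cmp < 0) {
                i++;
            } else {
                j++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(join(List.of("a", "b", "d"), List.of("b", "c", "d")));
    }
}
```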

3) Use the database to provide the indexing structures.

To me this seems to be a very interesting option, though it requires
considerable effort.

Yes, I agree, this is an interesting option, and does seem that it would
take a fair amount of effort.  Your comments on the user list in this same
thread seem like a start to the thought process needed.  I am not very
familiar with the details of the PM, although I do think that bringing
together data storage and indexing will help with improving query processing
speed, as well as help with some data integrity issues that have been
discussed in other threads.

Over the weekend, I will see if I can come up with a solution to the range
query issue discussed above.

great.

regards
 marcel
