[ https://issues.apache.org/jira/browse/LUCENE-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650974#action_12650974 ]
Earwin Burrfoot commented on LUCENE-1461:
-----------------------------------------

bq. RangeQuery no longer relies on the sort order of the terms, which means tricks like padding numeric terms are no longer needed, I think?

I do rely on sort order for speed and simplicity, though I never used padding for numeric/date terms :)
All dates/numbers/anything else special are converted to strings using a base-2^15^ encoding (to keep the high bit at 0, since 0xFFFF is used somewhere in Lucene's internals as an EOS marker, darn it!), plus an adjustment that preserves sort order for negative numbers in the face of Java's unsigned char. This transformation is insanely fast and produces well-compressed results (I have FAT read->mem/write->mem+disk indexes).

bq. b) prefix the terms with a precision marker. The prefix is important for the sort order, so that all terms of one precision are in one "bunch" and not distributed between higher precision terms.

And you can no longer use this field for sorting, as it has more than one term for each document.

bq. For my last implementation, based on filters I did not use a BooleanQuery with OR'ed ranges because of resource usage

I'm using filters here too.

bq. Allowing each field to provide its own Comparator may still be helpful then

But you still store strings in the index. So essentially you'll convert your value from T to String, store it, retrieve it, convert it back to T in such a custom comparator, and finally compare. Why should I need that second conversion and custom comparators if I can have an order-preserving bijective T<->String relation?
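To illustrate the kind of order-preserving T<->String mapping described above, here is a minimal sketch for a Java long. It is not the actual code referred to in the comment: the class name SortableLongCodec, its encode/decode methods, and the choice of a fixed 5-char width are assumptions made for the example. It flips the sign bit so that negative values sort before positive ones, then emits big-endian 15-bit chunks so every char stays below 0x8000.

```java
// Hypothetical sketch: order-preserving long <-> String codec using 15 bits per char.
// Names and layout are illustrative assumptions, not taken from the Lucene code base.
final class SortableLongCodec {
    private static final int BITS_PER_CHAR = 15;   // keeps the high bit of every char at 0
    private static final int CHARS = 5;            // 5 * 15 = 75 bits, enough for 64
    private static final int MASK = 0x7FFF;

    /** Encodes a long so that String.compareTo order matches numeric order. */
    static String encode(long value) {
        // Flipping the sign bit maps signed order onto unsigned order.
        long u = value ^ Long.MIN_VALUE;
        char[] out = new char[CHARS];
        // Fill from the least significant chunk backwards (big-endian result).
        for (int i = CHARS - 1; i >= 0; i--) {
            out[i] = (char) (u & MASK);
            u >>>= BITS_PER_CHAR;
        }
        return new String(out);
    }

    /** Inverse of encode; assumes the input was produced by encode. */
    static long decode(String s) {
        long u = 0;
        for (int i = 0; i < CHARS; i++) {
            u = (u << BITS_PER_CHAR) | s.charAt(i);
        }
        return u ^ Long.MIN_VALUE;
    }
}
```

Because every encoded term has the same length and String.compareTo compares chars as unsigned 16-bit values, the lexicographic order of the terms matches the numeric order of the original values, so no custom comparator is needed on the way back.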
> Cached filter for a single term field
> -------------------------------------
>
> Key: LUCENE-1461
> URL: https://issues.apache.org/jira/browse/LUCENE-1461
> Project: Lucene - Java
> Issue Type: New Feature
> Reporter: Tim Sturge
> Assignee: Michael McCandless
> Attachments: DisjointMultiFilter.java, LUCENE-1461.patch,
> LUCENE-1461a.patch, LUCENE-1461b.patch, RangeMultiFilter.java,
> RangeMultiFilter.java, TermMultiFilter.java
>
>
> These classes implement inexpensive range filtering over a field containing a
> single term. They do this by building an integer array of term numbers
> (storing the term->number mapping in a TreeMap) and then implementing a fast
> integer comparison based DocSetIdIterator.
> This code is currently being used to do age range filtering, but could also
> be used to do other date filtering or in any application where there need to
> be multiple filters based on the same single term field. I have an untested
> implementation of single term filtering and have considered but not yet
> implemented term set filtering (useful for location based searches) as well.
> The code here is fairly rough; it works but lacks javadocs and toString() and
> hashCode() methods etc. I'm posting it here to discover if there is other
> interest in this feature; I don't mind fixing it up but would hate to go to
> the effort if it's not going to make it into Lucene.