[ https://issues.apache.org/jira/browse/LUCENE-1470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Uwe Schindler updated LUCENE-1470: ---------------------------------- Attachment: TrieUtils.java I modified the proposal TrieUtils a little bit. Maybe this class could get the new NumberUtils with the extra option to trie encode the values. - Added 32bit support - Merged the unsigned int/long handling into prefixCode methods. Making the "sortableBits" public API is not needed^. For TrieRangeFilter the raw bits will not be needed (I need compareable ints/longs). - The conversion from doubles and floats was renamed and returns the standard signed long/int: doubleToSortableLong and floatToSortableInt, Date is removed (as just Date.getTime() can be used, as everybody knows). - Still missing are Long/IntParser for FieldCache and a SortField factory. It's still untested! I will implement tomorrow (now its time to go to bed) the TrieRangeFilter in two variants (one for 32 bit ints another for 64 bit longs). The min/max values are the ints/longs. The trie coding is also done using the shift value and bit magic. The results of the range split are then encoded using TrieUtils.xxxToPrefixCode(). > Add TrieRangeQuery to contrib > ----------------------------- > > Key: LUCENE-1470 > URL: https://issues.apache.org/jira/browse/LUCENE-1470 > Project: Lucene - Java > Issue Type: New Feature > Components: contrib/* > Affects Versions: 2.4 > Reporter: Uwe Schindler > Assignee: Uwe Schindler > Fix For: 2.9 > > Attachments: fixbuild-LUCENE-1470.patch, fixbuild-LUCENE-1470.patch, > LUCENE-1470-readme.patch, LUCENE-1470.patch, LUCENE-1470.patch, > LUCENE-1470.patch, LUCENE-1470.patch, LUCENE-1470.patch, LUCENE-1470.patch, > LUCENE-1470.patch, TrieUtils.java, TrieUtils.java > > > According to the thread in java-dev > (http://www.gossamer-threads.com/lists/lucene/java-dev/67807 and > http://www.gossamer-threads.com/lists/lucene/java-dev/67839), I want to > include my fast numerical range query implementation into lucene > contrib-queries. > I implemented (based on RangeFilter) another approach for faster > RangeQueries, based on longs stored in index in a special format. > The idea behind this is to store the longs in different precision in index > and partition the query range in such a way, that the outer boundaries are > search using terms from the highest precision, but the center of the search > Range with lower precision. The implementation stores the longs in 8 > different precisions (using a class called TrieUtils). It also has support > for Doubles, using the IEEE 754 floating-point "double format" bit layout > with some bit mappings to make them binary sortable. The approach is used in > rather big indexes, query times are even on low performance desktop > computers <<100 ms (!) for very big ranges on indexes with 500000 docs. > I called this RangeQuery variant and format "TrieRangeRange" query because > the idea looks like the well-known Trie structures (but it is not identical > to real tries, but algorithms are related to it). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. --------------------------------------------------------------------- To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org