[
https://issues.apache.org/jira/browse/LUCENE-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12659319#action_12659319
]
Uwe Schindler commented on LUCENE-1496:
---------------------------------------
I looked into the code of NumberUtils:
The encoding is very similar to the one of TrieUtils (used in TrieRangeQuery,
see LUCENE-1470,
http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//org/apache/lucene/search/trie/TrieUtils.html).
The only difference between TrieUtils and NumberUtils is the more compact
encoding in NumberUtils (TrieUtils.VARIANT_8BIT uses one character per byte,
whereas NumberUtils uses 14 bits per character). TrieUtils also works correctly
with String.compareTo() (that was the intention behind TrieUtils).
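To illustrate what "one character per byte" and "works with String.compareTo()"
mean in practice, here is a minimal sketch. It is not the actual TrieUtils code:
the class name SortableLongSketch and its methods are hypothetical, and the real
VARIANT_8BIT may use a different character range. The idea shown is only that
flipping the sign bit and writing the bytes most significant first gives strings
whose lexicographic order matches the numeric order.

```java
// Hypothetical illustration, not the TrieUtils API.
final class SortableLongSketch {

    // Encodes a long as 8 chars, one char per byte, most significant byte first.
    static String encode(long value) {
        // Flip the sign bit so negative values sort before positive ones
        // when the bits are compared as an unsigned quantity.
        long bits = value ^ Long.MIN_VALUE;
        char[] out = new char[8];
        for (int i = 7; i >= 0; i--) {
            out[i] = (char) (bits & 0xFF);
            bits >>>= 8;
        }
        return new String(out);
    }

    // Reverses encode(): reassembles the 8 bytes and flips the sign bit back.
    static long decode(String s) {
        long bits = 0;
        for (int i = 0; i < 8; i++) {
            bits = (bits << 8) | (s.charAt(i) & 0xFF);
        }
        return bits ^ Long.MIN_VALUE;
    }
}
```

With such an encoding, encode(a).compareTo(encode(b)) has the same sign as
Long.compare(a, b), which is what a term-based range query or sort relies on.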
In my opinion, TrieUtils has some additional advantages:
- Doubles are encoded in a correctly sortable way (even Double.XXX_INFINITY!),
using the IEEE binary representation of doubles with some bit alignments.
- Direct support for Dates and longs
- Built-in comparator for the new SortField constructor (LUCENE-1478) and a
nice SortField factory. This maps all encoded values to a FieldCache with long
values (even for dates or doubles, because there is no difference: longs have
the fastest encoding/decoding speed, and for sorting the real values are not
needed).
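As a rough sketch of how a double can be mapped to a sortable long via its IEEE
bit pattern: the exact bit manipulation in TrieUtils may differ, and the class
and method names below are hypothetical. The assumption here is the common
trick of inverting the lower 63 bits of negative values, so that comparing the
resulting longs gives the same order as comparing the doubles.

```java
// Hypothetical illustration, not the TrieUtils API.
final class SortableDoubleSketch {

    // Maps a double to a long whose signed order matches the double's order.
    static long toSortableLong(double value) {
        long bits = Double.doubleToLongBits(value);
        // For negative doubles (sign bit set) a larger magnitude means a
        // smaller value, so invert the lower 63 bits; positives are unchanged.
        return bits ^ ((bits >> 63) & Long.MAX_VALUE);
    }

    // Inverse of toSortableLong(); the same operation undoes itself because
    // the sign bit is never modified.
    static double fromSortableLong(long sortable) {
        long bits = sortable ^ ((sortable >> 63) & Long.MAX_VALUE);
        return Double.longBitsToDouble(bits);
    }
}
```

Both infinities fall out naturally: Double.NEGATIVE_INFINITY maps below every
finite value and Double.POSITIVE_INFINITY above. A long produced this way can
then be sorted or cached like any other long.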
The only problem is that indexes encoded with the old NumberUtils are not
readable by TrieUtils. But if we include such things in Lucene, we should not
duplicate code and create yet another new encoding.
For the more compact encoding, TrieUtils could be extended to also support a
"14bit" Trie variant (which would not work for real trie encoding, but could be
used to simply store longs very compactly). On the other hand, anybody who uses
NumberUtils may also be interested in TrieRangeQuery, and should then use
TrieUtils.VARIANT_8BIT.
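To make the size difference concrete, a "14bit" variant could look roughly like
the sketch below. This is neither the existing NumberUtils code nor a proposed
TrieUtils patch; the class name and the layout (5 characters, the first one
carrying the top 8 bits) are assumptions for illustration only.

```java
// Hypothetical illustration of a compact, order-preserving 14-bit encoding.
final class Compact14BitSketch {

    // Encodes a long into 5 chars carrying 14 bits each (the first char
    // only needs the top 8 bits, since 64 = 8 + 4 * 14).
    static String encode(long value) {
        long bits = value ^ Long.MIN_VALUE; // flip sign bit for unsigned ordering
        char[] out = new char[5];
        for (int i = 4; i >= 0; i--) {
            out[i] = (char) (bits & 0x3FFF);
            bits >>>= 14;
        }
        return new String(out);
    }

    // Reverses encode(): reassembles the 14-bit groups and flips the sign bit back.
    static long decode(String s) {
        long bits = 0;
        for (int i = 0; i < 5; i++) {
            bits = (bits << 14) | (s.charAt(i) & 0x3FFF);
        }
        return bits ^ Long.MIN_VALUE;
    }
}
```

This needs 5 characters per long instead of 8, and the strings still sort
correctly with String.compareTo(), but the character boundaries no longer line
up with the byte-wise prefixes a trie range query would need.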
So I think we should perhaps leave NumberUtils in Solr and use TrieUtils in
Lucene. LocalLucene should then also use TrieUtils. And Solr may in the future
switch to Trie encoding with the next major version, too.
> Move solr NumberUtils to lucene
> -------------------------------
>
> Key: LUCENE-1496
> URL: https://issues.apache.org/jira/browse/LUCENE-1496
> Project: Lucene - Java
> Issue Type: Task
> Reporter: Ryan McKinley
> Priority: Trivial
> Fix For: 2.9
>
>
> solr includes a NumberUtils class with some general utilities for dealing
> with tokens and numbers.
> This should be in lucene rather than solr.
--
This message is automatically generated by JIRA.