[ 
https://issues.apache.org/jira/browse/LUCENE-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839852#action_12839852
 ] 

Toke Eskildsen commented on LUCENE-1990:
----------------------------------------

Some thoughts on avoiding the generic division by experimenting with reciprocal 
multiplication: For aligned, the sane numbers of values/block are [3, 5, 6, 7, 
8, 9, 10, 16, 21, 32, 64]. I tried testing index from 0 to Integer.MAX_VALUE 
with these divisors and reciprocal multiplication. It worked perfectly for all 
divisors except [5, 7, 9, 10, 21]. Unfortunately it already fails for divisor 
21 at index 252645140, which makes it useless as a full replacement. If one 
were so inclined, it would be possible to select the aligned implementation 
based on valueCount, with fallback to the "slow" version. The gain of using 
fast division seems quite substantial, as it makes aligned 14-40% faster than 
packed (note: only tested on a single machine). However, re-introducing aligned 
with four different implementations (Aligned32, Aligned32Fast, Aligned64, 
Aligned64Fast) is rather daunting and it would make the selection code really 
messy.
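
A minimal sketch of the reciprocal multiplication discussed above. The comment 
does not say which constants were used, so this assumes a rounded-up 32-bit 
fixed-point reciprocal, ceil(2^32 / divisor), and a right shift by 32. Under 
that assumption the first mismatch for divisor 21 lands on the index quoted 
above. The class and method names (ReciprocalDivision, fastDiv, isExactUpTo) 
are hypothetical and not taken from the patch.

```java
class ReciprocalDivision {
    /** Assumed scheme: rounded-up 32-bit fixed-point reciprocal, ceil(2^32 / divisor). */
    static long reciprocal(int divisor) {
        return ((1L << 32) + divisor - 1) / divisor;
    }

    /** Approximates index / divisor for a non-negative index without a division. */
    static int fastDiv(int index, long reciprocal) {
        return (int) ((index * reciprocal) >>> 32);
    }

    /**
     * Conservative check: true if fastDiv is guaranteed exact for every index in
     * [0, maxIndex]. With e = reciprocal * divisor - 2^32, the result cannot be
     * off as long as index * e < 2^32. A false result means "not guaranteed".
     */
    static boolean isExactUpTo(int divisor, int maxIndex) {
        long e = reciprocal(divisor) * divisor - (1L << 32);
        return e == 0 || (long) maxIndex * e < (1L << 32);
    }

    public static void main(String[] args) {
        long r21 = reciprocal(21);
        System.out.println(fastDiv(252645139, r21) == 252645139 / 21); // true
        System.out.println(fastDiv(252645140, r21) == 252645140 / 21); // false
        for (int d : new int[] {3, 5, 6, 7, 8, 9, 10, 16, 21, 32, 64}) {
            System.out.println(d + ": exact for all int indexes? "
                + isExactUpTo(d, Integer.MAX_VALUE));
        }
    }
}
```

A check like isExactUpTo(valuesPerBlock, valueCount - 1) is one way the 
selection based on valueCount could be expressed: use the fast path when it 
returns true and fall back to the plain division otherwise.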

I can see that there are well-known tricks to get around the rounding errors. 
Some are described at http://www.cs.uiowa.edu/~jones/bcd/divide.html#fixed . I 
don't know if these extra tricks would negate the 14-40% speed gain though. 
Since I would like to get the patch out the door, I vote for keeping aligned 
disabled and just noting that more bit fiddling might make it attractive at 
some point.
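
As an illustration of what such a rounding fix can look like, here is a sketch 
of one common correction step. It is not necessarily one of the tricks on the 
linked page, and it again assumes the rounded-up reciprocal ceil(2^32 / divisor) 
with a shift by 32. With that reciprocal and a non-negative int index, the 
estimate is never too small and is too large by at most one, so a single 
multiply-and-compare repairs it. Whether the extra work preserves the measured 
speed gain has not been tested here. The names are hypothetical.

```java
class CorrectedReciprocalDivision {
    /** Assumed scheme: rounded-up 32-bit fixed-point reciprocal, ceil(2^32 / divisor). */
    static long reciprocal(int divisor) {
        return ((1L << 32) + divisor - 1) / divisor;
    }

    /** Reciprocal multiplication followed by a one-step correction. */
    static int correctedDiv(int index, int divisor, long reciprocal) {
        int q = (int) ((index * reciprocal) >>> 32);
        // The estimate can only overshoot, and by at most one, so step back if needed.
        if ((long) q * divisor > index) {
            q--;
        }
        return q;
    }

    public static void main(String[] args) {
        long r21 = reciprocal(21);
        // The index reported as failing for the uncorrected version.
        System.out.println(correctedDiv(252645140, 21, r21) == 252645140 / 21); // true
    }
}
```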

> Add unsigned packed int impls in oal.util
> -----------------------------------------
>
>                 Key: LUCENE-1990
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1990
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: Flex Branch
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: Flex Branch
>
>         Attachments: generated_performance-te20100226.txt, 
> LUCENE-1990-te20100122.patch, LUCENE-1990-te20100210.patch, 
> LUCENE-1990-te20100212.patch, LUCENE-1990-te20100223.patch, 
> LUCENE-1990-te20100226.patch, LUCENE-1990-te20100226b.patch, 
> LUCENE-1990-te20100226c.patch, LUCENE-1990-te20100301.patch, 
> LUCENE-1990.patch, LUCENE-1990_PerformanceMeasurements20100104.zip, 
> perf-mkm-20100227.txt, performance-20100301.txt, performance-te20100226.txt
>
>
> There are various places in Lucene that could take advantage of an
> efficient packed unsigned int/long impl.  EG the terms dict index in
> the standard codec in LUCENE-1458 could substantially reduce its RAM
> usage.  FieldCache.StringIndex could as well.  And I think "load into
> RAM" codecs like the one in TestExternalCodecs could use this too.
> I'm picturing something very basic like:
> {code}
> interface PackedUnsignedLongs  {
>   long get(long index);
>   void set(long index, long value);
> }
> {code}
> Plus maybe an iterator for getting and maybe also for setting.  If it
> helps, most of the usages of this inside Lucene will be "write once"
> so eg the set could make that an assumption/requirement.
> And a factory somewhere:
> {code}
>   PackedUnsignedLongs create(int count, long maxValue);
> {code}
> I think we should simply autogen the code (we can start from the
> autogen code in LUCENE-1410), or, if there is a good existing impl
> that has a compatible license that'd be great.
> I don't have time near-term to do this... so if anyone has the itch,
> please jump!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


