Improvement of PForDelta Codec
------------------------------
Key: LUCENE-2903
URL: https://issues.apache.org/jira/browse/LUCENE-2903
Project: Lucene - Java
Issue Type: Improvement
Reporter: hao yan
There are 3 versions of PForDelta implementations in the Bulk Branch:
FrameOfRef, PatchedFrameOfRef, and PatchedFrameOfRef2.
FrameOfRef is a very basic one, essentially a binary encoding (it may
result in a huge index size).
PatchedFrameOfRef is the implementation based on the original version of
PForDelta in the literature.
PatchedFrameOfRef2 is my previous implementation, which is improved this
time. (The codec name is changed to NewPForDelta.)
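
To illustrate why a plain frame-of-reference encoding can inflate the index
while a patched one does not, here is a minimal sketch. It is not code from the
Bulk Branch: the class and method names are hypothetical, the 90% coverage
threshold is an assumption, and each exception is simply charged 32 bits,
ignoring how real implementations store exception positions.

```java
import java.util.Arrays;

// Hypothetical sketch, not the Bulk Branch code: compares the payload size of
// a plain frame-of-reference block with a patched one.
final class ForVsPforSketch {
    /** Bits needed to represent a non-negative value; 0 is charged 1 bit. */
    public static int bitsNeeded(int v) {
        return Math.max(1, 32 - Integer.numberOfLeadingZeros(v));
    }

    /** Plain frame of reference: every slot is as wide as the largest value. */
    public static int frameOfRefBits(int[] block) {
        int width = 1;
        for (int v : block) {
            width = Math.max(width, bitsNeeded(v));
        }
        return width * block.length;
    }

    /**
     * Patched variant: pick a width b that covers about 90% of the values
     * (assumed threshold) and charge 32 bits for each value that does not fit.
     * Assumes a non-empty block.
     */
    public static int patchedBits(int[] block) {
        int[] sorted = block.clone();
        Arrays.sort(sorted);
        int idx = (int) Math.ceil(sorted.length * 0.9) - 1;
        int b = bitsNeeded(sorted[idx]);
        int exceptions = 0;
        for (int v : block) {
            if (bitsNeeded(v) > b) {
                exceptions++;
            }
        }
        return b * block.length + 32 * exceptions;
    }

    public static void main(String[] args) {
        int[] block = new int[128];
        Arrays.fill(block, 5);   // 127 small deltas, 3 bits each
        block[17] = 1_000_000;   // one outlier that needs 20 bits
        System.out.println(frameOfRefBits(block)); // 2560
        System.out.println(patchedBits(block));    // 416
    }
}
```

With a single large outlier in a block of 128 small values, the plain encoding
widens every slot to 20 bits, while the patched encoding keeps 3-bit slots and
pays only for the one exception.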
In particular, the changes are:
1. I fixed the bug in my previous version (in Lucene-1410.patch), where the old
PForDelta did not support very large exceptions (since
Simple16 does not support very large numbers). This has been fixed in
the new LCPForDelta.
2. I changed the PForDeltaFixedIntBlockCodec. It is now faster than the other
two PForDelta implementations in the bulk branch (FrameOfRef and
PatchedFrameOfRef). The codec's name is "NewPForDelta", as you can see in the
CodecProvider and PForDeltaFixedIntBlockCodec.
3. The performance test results are:
1) My "NewPForDelta" codec is faster than FrameOfRef and PatchedFrameOfRef for
almost all kinds of queries, and slightly worse than BulkVInt.
2) My "NewPForDelta" codec results in the smallest index size among all 4
methods (FrameOfRef, PatchedFrameOfRef, BulkVInt, and itself).
3) All performance test results were obtained by running with "-server" instead
of "-client".
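
The Simple16 limitation mentioned in item 1 comes from its word layout: each
32-bit word spends 4 bits on a selector and leaves 28 bits for data, so a single
value above 2^28 - 1 cannot be stored. The sketch below shows that limit and one
possible workaround, splitting a large exception into two parts that both fit.
The issue text does not say how the patch actually handles this, so the split
scheme and all names here are assumptions for illustration only.

```java
// Hypothetical sketch: the 28-bit ceiling of Simple16 and one way around it.
// This is not the scheme used in the patch, which the issue does not describe.
final class Simple16LimitSketch {
    /** Simple16 uses 4 selector bits per 32-bit word, leaving 28 data bits. */
    public static final int DATA_BITS = 28;
    public static final int MAX_SIMPLE16_VALUE = (1 << DATA_BITS) - 1;

    /** True if a single value fits in the data bits of one Simple16 word. */
    public static boolean fitsInSimple16(int v) {
        return v >= 0 && v <= MAX_SIMPLE16_VALUE;
    }

    /** Splits a value into its low 28 bits and its remaining high bits. */
    public static int[] split(int v) {
        return new int[] { v & MAX_SIMPLE16_VALUE, v >>> DATA_BITS };
    }

    /** Reassembles a value from the two parts produced by split. */
    public static int join(int low, int high) {
        return (high << DATA_BITS) | low;
    }

    public static void main(String[] args) {
        int big = 1 << 30;                        // too large for one word
        System.out.println(fitsInSimple16(big));  // false
        int[] parts = split(big);
        // Both parts are small enough to be encoded individually.
        System.out.println(fitsInSimple16(parts[0]) && fitsInSimple16(parts[1]));
        System.out.println(join(parts[0], parts[1]) == big);
    }
}
```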