Is it essential that your hash should be uint32 not int32?

I would assume you can get a good performance and 32 bit of hash if
you stay with int32's:

a) fill your tables with (integer % Math.pow(2,32)) | 0 to force
uint32 into int32 and
b) make your table a typed array instead of normal array: new
Int32Array(256); [not necessary if you are running x64 version of V8]
c) stop doing >>> 0 at the end;

--
Vyacheslav Egorov

On Wed, May 30, 2012 at 5:21 PM, Joran Greef <jo...@ronomon.com> wrote:
> The difference comes through when changing "integer % Math.pow(2,32)"
> to "integer % Math.pow(2,31)" when pre-generating the hash tables.
>
> Hash tables containing 256 random 31-bit unsigned integers are pre-generated
> for every byte of key. The hash operates on fixed-length 20 byte keys.
>
> Each byte of the key is XOR-red with one of the integers in the table,
> depending on the position of the byte in the key, so the XOR is dealing with
> an 8-bit unsigned integer and a 31-bit unsigned integer.
>
> The result is cast to an unsigned integer by >>> 0.
>
> On Wednesday, May 30, 2012 5:00:55 PM UTC+2, Vyacheslav Egorov wrote:
>>
>> Minor correction: 31-bit tagged _signed_ integers are used on ia32, on
>> x64 you get 32-bit tagged _signed_ integers. Neither though are wide
>> enough to contain all values from unsigned 32-bit integer range.
>>
>> Thus if you are really using them as 32bit _unsigned_ integers, e.g.
>> you are doing something like:
>>
>> var a = (b ^ c) >>> 0; // force into uint32 and then use in
>> non-int32-truncating manner.
>>
>> then unfortunately even V8's optimizing compiler gets confused. It
>> does not have designated uint32 representation and does not try to
>> infer when int32 can be safely used instead of uint32 (another example
>> of this bug: http://code.google.com/p/v8/issues/detail?id=2097).
>>
>> I suggest you post your code here if possible so that we could take a
>> look.
>>
>> --
>> Vyacheslav Egorov
>>
>> On Wed, May 30, 2012 at 4:40 PM, Jakob Kummerow <jkumme...@chromium.org>
>> wrote:
>> > As long as you're running unoptimized code on a 32-bit V8, this is
>> > expected:
>> > 31-bit integers are stored directly as "small integers", the 32nd bit is
>> > used to tag them as such, whereas 32-bit integers are converted to
>> > doubles
>> > and stored as objects on the heap, which makes accessing them more
>> > expensive.
>> >
>> > When your code runs long enough for the optimizer to kick in, it should
>> > recognize this situation, use untagged 32-bit integer values in
>> > optimized
>> > code, and the difference between 31-bit and 32-bit values should go
>> > away. If
>> > it doesn't, please post a reduced test case that exhibits the behavior
>> > so
>> > that we can investigate. (Running the code for a second or so should be
>> > enough to get the full effect of optimization and make the initial
>> > difference negligible.)
>> >
>> >
>> > On Wed, May 30, 2012 at 4:31 PM, Joran Greef <jo...@ronomon.com> wrote:
>> >>
>> >> I am implementing a table hash
>> >> (http://en.wikipedia.org/wiki/Tabulation_hashing) and noticed that a
>> >> table
>> >> hash using a table of 31-bit unsigned integers is almost an order of
>> >> magnitude faster than a table hash using a table of 32-bit unsigned
>> >> integers.
>> >>
>> >> The former has an average hash time of 0.00007ms per 20 byte key for a
>> >> 31-bit hash, and the latter has an average hash time of 0.00034ms per
>> >> 20
>> >> byte key for a 32-bit hash.
>> >>
>> >> I figured that XOR on 8-bit integers would be faster than XOR on 16-bit
>> >> integers would be faster than XOR on 24-bit integers would be faster
>> >> than
>> >> XOR on 32-bit integers, but did not anticipate such a difference
>> >> between
>> >> 31-bit and 32-bit integers.
>> >>
>> >> Is there something regarding XOR that I may be missing that could
>> >> explain
>> >> the difference?
>> >>
>> >>
>> > --
>> > v8-users mailing list
>> > v8-users@googlegroups.com
>> > http://groups.google.com/group/v8-users
>
> --
> v8-users mailing list
> v8-users@googlegroups.com
> http://groups.google.com/group/v8-users

-- 
v8-users mailing list
v8-users@googlegroups.com
http://groups.google.com/group/v8-users

Reply via email to