Vyacheslav, thanks for your suggestions. I tried them out, and across 200 
random keys at 100,000 iterations each it's about 0.000048ms per key (hash 
output is 32-bit signed).
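
(For reference, the per-key figure comes from a loop along these lines; the 
harness below is a sketch with hypothetical names, not the exact benchmark:)

```javascript
// Hypothetical benchmark harness: runs fn over every key for the given
// number of iterations and reports the average time per key in ms.
function msPerKey(fn, keys, iterations) {
  var start = Date.now();
  for (var i = 0; i < iterations; i++) {
    for (var j = 0; j < keys.length; j++) fn(keys[j]);
  }
  return (Date.now() - start) / (iterations * keys.length);
}
```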

Ultimately the hash result is being used as an index into a hash table, so 
it needs to be cast to an unsigned integer at some point. If I do a) and b) 
but then cast it, it's about 0.00022ms per key (hash output is 32-bit 
unsigned).

If I use a Uint32Array or a normal array with 31-bit unsigned integers as 
the hash table constants, and then cast the result to unsigned, it's about 
0.000048ms per key (hash output is 31-bit unsigned).
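
(For concreteness, here is a minimal sketch of that 31-bit variant, with 
hypothetical names and a Math.random() fill standing in for the real table 
generation: every table entry is below 2^31, so each one fits in a small 
integer and the final >>> 0 never changes the value.)

```javascript
// Sketch of the 31-bit tabulation hash variant (hypothetical names).
var KEY_LENGTH = 20;
var TABLE31 = new Uint32Array(KEY_LENGTH * 256);
for (var i = 0; i < TABLE31.length; i++) {
  // integer % Math.pow(2,31): every constant stays below 2^31.
  TABLE31[i] = Math.floor(Math.random() * 0x80000000);
}

function hash31(key) { // key: array-like of KEY_LENGTH bytes
  var h = 0;
  for (var i = 0; i < KEY_LENGTH; i++) {
    // XOR of values below 2^31 stays below 2^31, so h never goes negative.
    h = h ^ TABLE31[(i << 8) + key[i]];
  }
  return h >>> 0; // already non-negative; the cast is a no-op here
}
```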

Is there something obvious I'm missing to get an index into a hash table 
bucket using a 32-bit signed integer without slowing things down?
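
(One possibility, sketched under the assumption that the bucket count is a 
power of two: masking a signed 32-bit hash with numBuckets - 1 always 
yields a non-negative int32, so no >>> 0 cast is needed at all. The 
function name is hypothetical.)

```javascript
// Map a signed 32-bit hash to a bucket index without an unsigned cast.
// Requires numBuckets to be a power of two.
function bucketIndex(h, numBuckets) {
  // numBuckets - 1 fits in 31 bits, so "&" always produces a
  // non-negative int32, even when h is negative.
  return h & (numBuckets - 1);
}
```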

I also noticed along the way that if I first run the original 32-bit 
unsigned integer hash a few times, and then the faster 31-bit unsigned 
integer hash, the 31-bit hash instead becomes two orders of magnitude 
slower.

On Wednesday, May 30, 2012 6:25:39 PM UTC+2, Vyacheslav Egorov wrote:
>
> Is it essential that your hash should be uint32 not int32? 
>
> I would assume you can get a good performance and 32 bit of hash if 
> you stay with int32's: 
>
> a) fill your tables with (integer % Math.pow(2,32)) | 0 to force 
> uint32 into int32 and 
> b) make your table a typed array instead of normal array: new 
> Int32Array(256); [not necessary if you are running x64 version of V8] 
> c) stop doing >>> 0 at the end; 
>
> -- 
> Vyacheslav Egorov 
>
> On Wed, May 30, 2012 at 5:21 PM, Joran Greef <jo...@ronomon.com> wrote: 
> > The difference comes through when changing "integer % Math.pow(2,32)" 
> > to "integer % Math.pow(2,31)" when pre-generating the hash tables. 
> > 
> > Hash tables containing 256 random 31-bit unsigned integers are 
> > pre-generated for every byte of key. The hash operates on fixed-length 
> > 20 byte keys. 
> > 
> > Each byte of the key is XOR-ed with one of the integers in the table, 
> > depending on the position of the byte in the key, so the XOR is dealing 
> > with an 8-bit unsigned integer and a 31-bit unsigned integer. 
> > 
> > The result is cast to an unsigned integer by >>> 0. 
> > 
> > On Wednesday, May 30, 2012 5:00:55 PM UTC+2, Vyacheslav Egorov wrote: 
> >> 
> >> Minor correction: 31-bit tagged _signed_ integers are used on ia32, on 
> >> x64 you get 32-bit tagged _signed_ integers. Neither though are wide 
> >> enough to contain all values from unsigned 32-bit integer range. 
> >> 
> >> Thus if you are really using them as 32bit _unsigned_ integers, e.g. 
> >> you are doing something like: 
> >> 
> >> var a = (b ^ c) >>> 0; // force into uint32 and then use in 
> >> non-int32-truncating manner. 
> >> 
> >> then unfortunately even V8's optimizing compiler gets confused. It 
> >> does not have designated uint32 representation and does not try to 
> >> infer when int32 can be safely used instead of uint32 (another example 
> >> of this bug: http://code.google.com/p/v8/issues/detail?id=2097). 
> >> 
> >> I suggest you post your code here if possible so that we could take a 
> >> look. 
> >> 
> >> -- 
> >> Vyacheslav Egorov 
> >> 
> >> On Wed, May 30, 2012 at 4:40 PM, Jakob Kummerow <jkumme...@chromium.org> 
> >> wrote: 
> >> > As long as you're running unoptimized code on a 32-bit V8, this is 
> >> > expected: 31-bit integers are stored directly as "small integers", 
> >> > the 32nd bit is used to tag them as such, whereas 32-bit integers are 
> >> > converted to doubles and stored as objects on the heap, which makes 
> >> > accessing them more expensive. 
> >> > 
> >> > When your code runs long enough for the optimizer to kick in, it 
> >> > should recognize this situation, use untagged 32-bit integer values 
> >> > in optimized code, and the difference between 31-bit and 32-bit 
> >> > values should go away. If it doesn't, please post a reduced test case 
> >> > that exhibits the behavior so that we can investigate. (Running the 
> >> > code for a second or so should be enough to get the full effect of 
> >> > optimization and make the initial difference negligible.) 
> >> > 
> >> > 
> >> > On Wed, May 30, 2012 at 4:31 PM, Joran Greef <jo...@ronomon.com> 
> >> > wrote: 
> >> >> 
> >> >> I am implementing a table hash 
> >> >> (http://en.wikipedia.org/wiki/Tabulation_hashing) and noticed that a 
> >> >> table hash using a table of 31-bit unsigned integers is almost an 
> >> >> order of magnitude faster than a table hash using a table of 32-bit 
> >> >> unsigned integers. 
> >> >> 
> >> >> The former has an average hash time of 0.00007ms per 20 byte key for 
> >> >> a 31-bit hash, and the latter has an average hash time of 0.00034ms 
> >> >> per 20 byte key for a 32-bit hash. 
> >> >> 
> >> >> I figured that XOR on 8-bit integers would be faster than XOR on 
> >> >> 16-bit integers would be faster than XOR on 24-bit integers would be 
> >> >> faster than XOR on 32-bit integers, but did not anticipate such a 
> >> >> difference between 31-bit and 32-bit integers. 
> >> >> 
> >> >> Is there something regarding XOR that I may be missing that could 
> >> >> explain the difference? 
> >> >> 
> >> > -- 
> >> > v8-users mailing list 
> >> > v8-users@googlegroups.com 
> >> > http://groups.google.com/group/v8-users 
> > 
>
