ty, it's just an implementation of the dense hash map described 
by http://sparsehash.googlecode.com/svn/trunk/doc/implementation.html

It stores the keys and values side by side in a single Buffer, using open 
addressing with triangular probing (the number of buckets must be a power 
of two for triangular probing to reach every bucket), and a tabulation 
hash (http://en.wikipedia.org/wiki/Tabulation_hashing) with pre-generated 
random tables (the key length is fixed, so this kind of hash is ideal 
here: very fast, with good distribution). The buffer is doubled in size 
whenever (number of entries + number of deleted entries) / (number of 
buckets) exceeds 0.6, to avoid too much clustering.
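
For anyone who wants to see it in code, here is a minimal sketch of the 
idea (this is not the actual implementation; the 8-byte keys, 4-byte 
values, and the tag byte layout are just assumptions for illustration):

  // Each slot: 1 tag byte (0 = empty, 1 = used, 2 = deleted),
  // then the key, then the value.
  var KEY = 8, VALUE = 4, SLOT = 1 + KEY + VALUE;

  // Tabulation hashing: one pre-generated table of 256 random 32-bit
  // integers per key byte; hashing is then just lookups and xors.
  var TABLES = [];
  for (var i = 0; i < KEY; i++) {
    TABLES[i] = [];
    for (var j = 0; j < 256; j++) {
      TABLES[i][j] = Math.floor(Math.random() * 0x100000000);
    }
  }

  function hash(key) {
    var h = 0;
    for (var i = 0; i < KEY; i++) h ^= TABLES[i][key[i]];
    return h >>> 0;
  }

  function equal(buffer, offset, key) {
    for (var i = 0; i < KEY; i++) {
      if (buffer[offset + i] !== key[i]) return false;
    }
    return true;
  }

  function HashTable(buckets) { // buckets must be a power of two
    this.buckets = buckets;
    this.mask = buckets - 1;
    this.entries = 0;
    this.deleted = 0;
    this.buffer = new Buffer(buckets * SLOT);
    this.buffer.fill(0);
  }

  // (Deleting a key would set its tag byte to 2, this.entries-- and
  // this.deleted++; tombstones keep probe chains intact and are reused
  // or dropped on resize below.)

  HashTable.prototype.get = function(key) {
    var bucket = hash(key) & this.mask;
    for (var step = 1; ; step++) {
      var offset = bucket * SLOT;
      var tag = this.buffer[offset];
      if (tag === 0) return undefined; // empty slot ends the probe chain
      if (tag === 1 && equal(this.buffer, offset + 1, key)) {
        return this.buffer.slice(offset + 1 + KEY, offset + SLOT);
      }
      // Triangular probing: offsets 0, 1, 3, 6, 10, ... from the home
      // bucket visit every bucket when buckets is a power of two.
      bucket = (bucket + step) & this.mask;
    }
  };

  HashTable.prototype.set = function(key, value) {
    // Double whenever (entries + deleted) / buckets exceeds 0.6:
    if (this.entries + this.deleted > 0.6 * this.buckets) this.resize();
    var bucket = hash(key) & this.mask;
    var free = -1; // first tombstone seen, reused on insert
    for (var step = 1; ; step++) {
      var offset = bucket * SLOT;
      var tag = this.buffer[offset];
      if (tag === 2 && free < 0) free = offset;
      if (tag === 1 && equal(this.buffer, offset + 1, key)) {
        value.copy(this.buffer, offset + 1 + KEY); // overwrite in place
        return;
      }
      if (tag === 0) { // key is not in the table: insert it
        if (free < 0) free = offset; else this.deleted--;
        this.buffer[free] = 1;
        key.copy(this.buffer, free + 1);
        value.copy(this.buffer, free + 1 + KEY);
        this.entries++;
        return;
      }
      bucket = (bucket + step) & this.mask;
    }
  };

  HashTable.prototype.resize = function() {
    var old = this.buffer, buckets = this.buckets;
    HashTable.call(this, buckets * 2); // reinitialize at double the size
    for (var b = 0; b < buckets; b++) {
      var offset = b * SLOT;
      if (old[offset] === 1) { // live entries only, tombstones drop out
        this.set(old.slice(offset + 1, offset + 1 + KEY),
                 old.slice(offset + 1 + KEY, offset + SLOT));
      }
    }
  };

Since everything lives in one Buffer, the gc sees a single object no 
matter how many entries there are, which is the whole point.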

Here's the original 
discussion: https://groups.google.com/d/topic/nodejs/yLbuS7YONTI/discussion

Here's a discussion on the hash function in 
v8-users: 
https://groups.google.com/forum/#!msg/v8-users/zGCS_wEMawU/6mConTiBUyMJ
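
As an aside, the gc timing trick mentioned further down in the thread is 
just this (run node with --nouse_idle_notification --expose_gc so that 
global.gc() is available):

  // Rough measurement of a full gc pause; indicative only.
  var start = Date.now();
  global.gc();
  console.log('gc took ' + (Date.now() - start) + 'ms');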

On Friday, July 13, 2012 9:37:46 AM UTC+2, Yi wrote:
>
> Hi Joran,
>
> We are facing a performance problem similar to your million-entry 
> object.
>
> May I ask you to explain in a bit more detail the implementation of the 
> "Buffer backed open addressed hash table"?
>
> Many thanks,
>
> ty
>
>
> 2012/7/13 Joran Greef <jo...@ronomon.com>
>
>> To get a very rough idea of how much time your program spends in gc, you 
>> can expose the gc by running node with the --nouse_idle_notification 
>> --expose_gc flags and then call gc manually with "global.gc()" and time how 
>> long that takes.
>>
>> I have a program where a single JS object was being used as a hash 
>> table with a few million entries. When the gc ran, as far as I am 
>> aware, it would iterate over every one of those entries, and it would 
>> do this every few seconds, pausing for about 500ms each time. I 
>> switched to a Buffer backed open addressed hash table, which is also 
>> more efficient in other respects.
>>
>>
>> On Friday, July 13, 2012 3:10:50 AM UTC+2, Alexey Petrushin wrote:
>>>
>>> There are rumors that current Node.js (or, more exactly, the V8 GC) 
>>> performs badly when there are lots of JS objects and a lot of memory 
>>> in use.
>>>
>>> Can you please explain what exactly the problem is: lots of objects, 
>>> or lots of properties on one object (or array)?
>>>
>>> Maybe there are some benchmarks; it would be interesting to see actual 
>>> code and numbers.
>>>
>>> As far as I know, the main problem is lots of properties on one 
>>> object, not lots of objects themselves (although I'm not sure). If so, 
>>> would an in-memory graph database (a couple of hundred properties per 
>>> node at most) be a good case?
>>>
>>> Also, I heard that the latest versions of V8 have an improved GC that 
>>> solves some of these problems. Is this true, and when will it be 
>>> available in Node.js?
>>>
