Re: I'd like to congratulate you and the team that worked on Judy Arrays

Alan Silverstein Mon, 02 Apr 2012 11:35:18 -0700

Gang,

> Gcc at least has a way to "hint" to the compiler whether a branch is
> likely to be taken.  Felix actually emits these hints.  However I have
> not noticed gcc doing anything with them.


Amusing, it reminds me that in the last year I carefully marked some new
source functions "inline", but later discovered that gcc was basically
ignoring my advice and doing whatever the heck it wanted.  Turns out
"inline" is merely a suggestion, and I don't know of a way to force it
("I know what I'm doing and I want to waste space for performance")
other than reverting to macros.  (Maybe there's some gcc option or
#pragma I didn't look for...)

> A compromise API would be like:  "here is an array size 100, fill it
> with 100 entries starting at key such-and-such".  You could then
> optimally process the 100 entries, then get the next 100.

I vaguely recall (this was > 10 years ago) discussing multi-index
operations like this.  I know I actually WROTE code that batch-created
(inserted) a Judy1/L array rather faster than by simple iteration, and
it worked fine, but I don't know where it is today.  (Given a simple
array, or was it parallel arrays, of sparse indexes and values, libJudy
internally-smart code can create a valid Judy tree pretty quickly, I
think maybe 3X the one-at-a-time mode?)

> Generally, a map of word to "anything" would be good.  This can be
> done by map to a word, which is a pointer.  At the cost of an
> indirection and a memory management problem.

Agreed.  The solution, which we also toyed with but did not implement,
is to have libJudy offer APIs like JudyL but with arbitrary-sized value
areas whose memory is managed by libJudy.  In which case don't just
return a pointer to the caller, have them give you a pointer to the
data, and its size, and the library does the mem-create and copy-in.

> A trivial example:  an IP address is 4 bytes, some of which can be 0.
> Of course this can be done now, with a JudyL instead of S.

Right.  C strings are what I call "length-terminated" by the NUL bytes.
Perl strings for example are "length-associated," and any bit pattern
can appear in the string.  You can easily store them in a JudyL
meta-trie where the top level is the length, but you end up with them
sorted by length first, then lexicographically, not just
lexicographically.  You might not care.

More generally a variable-length key whose size varies in bits, not
bytes, can be stored the same way, just sorting initially by length in
bits, not bytes.  Of course a few bits/bytes are wasted in leaf value
areas.

Cheers,
Alan Silverstein

------------------------------------------------------------------------------
This SF email is sponsosred by:
Try Windows Azure free for 90 days Click Here 
http://p.sf.net/sfu/sfd2d-msazure
_______________________________________________
Judy-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/judy-devel

Re: I'd like to congratulate you and the team that worked on Judy Arrays

Reply via email to