Re: Improve Judy

Alan Silverstein Wed, 28 Mar 2007 14:42:03 -0800

> What can we do to further improve Judy performance?  Does anyone (Doug
> or Alan) have a TODO list?


Hmm...  Interesting question.  As if the chat here about locking wasn't
enough to keep you busy.  :-)

Well, I looked in my old files and, lo and behold, found a "Judy Project
Team Tasks and Issues List" HTML doc.  I'd forgotten about that.  But
it's labeled "HP Company Confidential", and while they probably don't
care any more, I'm not comfortable blasting the whole thing out.  But,
you asked about performance only...  So let me skim it for anything
specifically performance-related...

Nothing for Judy IV.  We had some cute ideas we didn't implement,
partly because they didn't pan out or didn't seem worth exploring.  All
the tasks I see are housekeeping, like resolving TBD (to be determined)
comments in the code.

We did plan for a Judy V:  "As currently defined, Judy V means 'kernel
support'.  The measure of success is that Judy can be used in the HP-UX
kernel."

And item 1 was:  "Support threads via locking.  There are many
interesting issues about how to do fine-grained locking of a portion of
the Judy array (tree), and whether this should be explicit (under caller
control) or implicit (when an array is modified).  Doug observes that it
might take longer to lock a portion of a Judy tree than to perform a
modification, in which case why not just lock the whole tree?"

Hah!
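
For illustration only, here is a rough sketch of the "just lock the
whole tree" option from that note.  Everything in it is an assumption:
locked_array, la_init(), la_insert() and la_get() are hypothetical
names, the fixed 256-slot table is merely a stand-in for a real Judy
array, and POSIX threads are assumed.  None of it is Judy API.

```c
#include <pthread.h>
#include <stddef.h>

#define LA_SLOTS 256  /* stand-in capacity; a real Judy array has no such limit */

/* Hypothetical wrapper: a single mutex guards the whole container. */
typedef struct {
    pthread_mutex_t lock;
    unsigned long   value[LA_SLOTS];
    unsigned char   present[LA_SLOTS];
} locked_array;

/* Returns 0 on success, nonzero if the mutex could not be initialized. */
int la_init(locked_array *la)
{
    for (size_t i = 0; i < LA_SLOTS; i++) {
        la->value[i] = 0;
        la->present[i] = 0;
    }
    return pthread_mutex_init(&la->lock, NULL);
}

/* Insert or overwrite; returns 0 on success, -1 if index is out of range. */
int la_insert(locked_array *la, size_t index, unsigned long value)
{
    if (index >= LA_SLOTS)
        return -1;
    pthread_mutex_lock(&la->lock);    /* whole-array lock, no per-node locking */
    la->value[index] = value;
    la->present[index] = 1;
    pthread_mutex_unlock(&la->lock);
    return 0;
}

/* Look up; returns 1 and stores the value in *out if present, else 0. */
int la_get(locked_array *la, size_t index, unsigned long *out)
{
    int found = 0;
    if (index >= LA_SLOTS)
        return 0;
    pthread_mutex_lock(&la->lock);
    if (la->present[index]) {
        *out = la->value[index];
        found = 1;
    }
    pthread_mutex_unlock(&la->lock);
    return found;
}
```

The point of the sketch is only that one lock/unlock pair brackets each
whole operation, which is the cheap alternative Doug was weighing
against locking a portion of the tree.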

But the rest of the Judy V ideas were not about performance either.

Later down I see:  "Do a performance comparison of JudyNL and Judy2L;
decide whether to ship and support either/both.  If yes, add Judy2L*()
to the manual entry.  Currently those functions are hidden in the
library, present but undocumented."

This was a stab at JudySL-like support for variable-length numeric keys.
You can always build a meta-trie (a Judy array of Judy arrays).  The
interesting question is how to handle variable-length keys rather than
just longer (multi-word) but fixed-length ones.  I observe that Perl
strings carry a length attribute, while C strings have a terminator
char.  Only the latter allows lexical sorting (left to right) rather
than sorting by length first, unless you do something elaborate at each
level to mark the end of a string that can contain any char, including
null.  Maybe no performance gains per se here.
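
The sorting difference can be made concrete with two comparison
functions.  This is a minimal sketch, not anything from the Judy
sources: cmp_lexical() and cmp_length_first() are hypothetical names,
and keys are assumed to be plain byte buffers with an explicit length.

```c
#include <stddef.h>
#include <string.h>

/* Left-to-right (lexical) order: compare the common prefix, and if it
 * matches, the shorter key sorts first.  This is the order a
 * terminator-style key gives you. */
int cmp_lexical(const unsigned char *a, size_t alen,
                const unsigned char *b, size_t blen)
{
    size_t n = alen < blen ? alen : blen;
    int c = (n > 0) ? memcmp(a, b, n) : 0;
    if (c != 0)
        return c < 0 ? -1 : 1;
    return (alen > blen) - (alen < blen);
}

/* Length-first order: what you get when the length is the leading part
 * of the key.  Bytes are only compared between keys of equal length. */
int cmp_length_first(const unsigned char *a, size_t alen,
                     const unsigned char *b, size_t blen)
{
    if (alen != blen)
        return alen < blen ? -1 : 1;
    int c = (alen > 0) ? memcmp(a, b, alen) : 0;
    return (c > 0) - (c < 0);
}
```

For the keys "b" and "ab", cmp_lexical() puts "ab" first while
cmp_length_first() puts "b" first, which is exactly the reordering a
length-prefixed scheme introduces.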

Ah, here's one:  "Support variable-length target value areas so users
need not malloc or free their own memory for objects pointed at by Judy
values.  (Having to free each value area requires walking the whole Judy
array via Judy*First() and Judy*Next() before calling Judy*FreeArray().)
We need a good name for this feature.  It's been called Judy* ('Judy
Star'), as in 'wildcard', but that's awkward; and JudyV, as in 'variable
value area', but that's confusing with Judy V."

And:  "Support an optional callback function to JudySLFreeArray(), maybe
other functions too?  See saved email on the subject.  Not needed if
Judy-Star() arrives, or if search functions are fast enough?"
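
To show what a free-with-callback could save the caller, here is a
small sketch.  It deliberately does not use Judy: toy_node,
toy_insert(), toy_free_array() and toy_free_value() are hypothetical
names, and a linked list stands in for the array.  The only idea
carried over from the notes is that the free routine visits every
value itself, so no separate Judy*First()/Judy*Next() style walk is
needed beforehand.

```c
#include <stdlib.h>

/* Stand-in container (NOT Judy): a linked list of index -> value pointer. */
typedef struct toy_node {
    unsigned long    index;
    void            *value;   /* caller-owned memory, like a pointer kept in a value */
    struct toy_node *next;
} toy_node;

/* Callback type: called once per entry just before its node is freed. */
typedef void (*toy_value_cb)(unsigned long index, void *value, void *ctx);

/* Prepend an entry; returns 0 on success, -1 if out of memory. */
int toy_insert(toy_node **head, unsigned long index, void *value)
{
    toy_node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->index = index;
    n->value = value;
    n->next = *head;
    *head = n;
    return 0;
}

/* Free every node, calling cb (if non-NULL) on each value first.
 * Returns the number of nodes freed and leaves *head NULL. */
size_t toy_free_array(toy_node **head, toy_value_cb cb, void *ctx)
{
    size_t freed = 0;
    toy_node *n = *head;
    while (n != NULL) {
        toy_node *next = n->next;
        if (cb != NULL)
            cb(n->index, n->value, ctx);
        free(n);
        freed++;
        n = next;
    }
    *head = NULL;
    return freed;
}

/* Example callback: release a malloc'd value and count it in *ctx. */
void toy_free_value(unsigned long index, void *value, void *ctx)
{
    (void)index;
    free(value);
    if (ctx != NULL)
        (*(size_t *)ctx)++;
}
```

Passing NULL as the callback gives the plain "free the structure only"
behavior, so one entry point covers both cases.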

Not mentioned here, that I noticed anyway, is the Judy batch-insert code
that I wrote.  I believe it's also in the library but undocumented
(except in source file comments) because we didn't have time to
thoroughly test it.  So if you think you are spending too much time in
Judy*Ins(), go look for the batch insertion code.  Mostly it looks ahead
for the indexes going into/under each tree node so it can build that
node all at once.
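
As a rough picture of that look-ahead idea (and not the actual batch
code, which lives in the library sources), the sketch below takes a
sorted batch of 32-bit indexes and finds the run that would fall under
a single top-level branch.  It assumes one byte of the index is decoded
per tree level; run_under_first_branch() and count_branches() are
hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Given keys sorted ascending, return how many leading keys share the
 * most significant byte of keys[0].  That run belongs under one
 * top-level branch and could be handed to it in one go. */
size_t run_under_first_branch(const uint32_t *keys, size_t count)
{
    if (count == 0)
        return 0;
    uint32_t digit = keys[0] >> 24;
    size_t n = 1;
    while (n < count && (keys[n] >> 24) == digit)
        n++;
    return n;
}

/* Count how many distinct top-level branches a sorted batch touches
 * by consuming it one run at a time. */
size_t count_branches(const uint32_t *keys, size_t count)
{
    size_t branches = 0;
    size_t pos = 0;
    while (pos < count) {
        pos += run_under_first_branch(keys + pos, count - pos);
        branches++;
    }
    return branches;
}
```

The same split would be repeated on the next byte within each run,
which is what lets a whole subtree be built at once instead of
descending from the root for every single insert.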

Is that enough to keep you happy a while?  :-)

Alan

