On Mon, Feb 17, 2014 at 10:59 AM, Alan Silverstein <[email protected]> wrote:
> John et al, hoping Doug Baskins will reply to you as well, but in the
> meantime as a co-developer of libJudy, here's my $0.02 (and not worth
> much more I admit).
>
>> Judy is one of the most useful libraries out there and I use it all
>> the time.
>
> VERY nice to hear that! We busted our butts for a couple of years,
> ultimately a team of four people, trying to make it world-class. But we
> discovered how hard it is to "sell a library," especially without
> academic credentials or writeups. HP ended up canceling the project,
> laying off the team, and open-sourcing the code in 2002.


You shouldn't underestimate how much more exposure and influence the project
has now that it is open sourced. Besides the programs out there that link
against libjudy, there are probably dozen of radix tree implementations out
there than were inspired by reading about judys implementation. The heap
compression techniques I use in my jhc compiler were heavily influenced by
your work on judy and in particular your attention to cache line loads.

I actually originally used judy in the garbage collector, having the
mark/sweep passes keep the used bitmap in a Judy1 tree. This actually ended
up being very efficient, I could sweep just by discarding the judy tree and
not having to traverse the whole heap looking for free bits helped
allocation time a lot. I have since moved to a different system when I
switched to a dedicated allocator, however I still use it for heap
debugging. I use a JudyL array to associate meta-info with each memory
location,this is much better than adding tag words to the blocks themselves
as it doesn't change the layout at all which often can obscure bugs.

>> Something that always bothered me about it is the somewhat unusual
>> calling conventions, the use of macros and silent casting of 'void *'s
>> meaning it is very easy to accidentally use the API incorrectly and
>> it's conventions differ from pretty much every other library out there
>> making combining code problematic.
>
> Yeah, yeah, we know, we know, sorry. I'm personally responsible for not
> envisioning the use of autoconf, etc, for greater portability, which is
> ironic since libJudy actually has minimal (very few) OS/library
> dependencies.

autoconf is great, a pet peeve of mine is when people try to reimplement it
without understanding all it gives you, cross-platform builds are one of
those things you don't appreciate how tricky it is to do portably until you
come accross a non-autoconfed package you need to install.

I always assumed the odd API was to conform to some internal HP coding
standard created by someone who has a history with algol (hence the in/out
arguments) I've seen odder ones forced on programmers out there.

>> I went ahead and used c99's features (inline functions, immediate
>> struct values, defined uintptr_t and bool types) to make a typesafe
>> and more conformant api.
>
> OK, that's fun to watch. My only concern is our experience that it was
> way too easy to naively do something (like thread-safeness) that ruined
> the library's performance. You can't have too much run-time overhead
> going in and out or you can cut it in half, or even much worse.
> Syntactic sugar that doesn't overwhelm the algorithm -- great. Innocent
> layers of type-conversions or other foo, ugh.


Yeah, don't worry. my background is in embedded programming/operating
systems and performance programmimng, I never got over the habit of examining
the assembly output of every code change. With modern compilers there will
be no overhead. Of course, profile,profile,profile is always the key. I
increased the block allocation speed in jhc making every compiled program go
faster by 3-5% (!!) by changing one

if(x) a else b to if(!x) b else a

Actually the changes I made can potentially increase performance in a few
ways, not really relevant to this due to the inlining but useful to know

one is the use of the standard 'bool' type, which in addition to getting rid
of a dreaded cast to bool class of bugs can optimize code, namely, an ABI
can return a bool directly in a condition code rather than a register making
a branch on it trivial. Another is that since it knows the value is zero or
one, branchless optimizations like this can happen

bool b;
z = b ? a : b -> int foo[] = { a, b}; z = foo[b];
or the more direct but a bit more contrived.
z = b ? 43 : 44 -> z = 43 + b

another is by using actual structures the compiler can do alias analysis.
for instance

foo(judyl_t *j, int *x) {
   if (jEmpty(*j))
     blah;
  *x = 0;
  if (jEmpty(*j))
     bar;
}

Since judyl_t and int are different types, the compiler can assume that x
does not point to the same memory location as j, meaning it can combine both
jEmpty calls into one caching the result and removing a load. It would not
be able to do this with Pvoid_t values without explicitly checking if x == j.


>> I figured I post it if anyone else found it handy.
>
> Thanks for sharing it back in the spirit of open source. Let's see what
> others say, people who are more actively using libJudy. Me, I did use
> it successfully for a few projects later (contracting for HP and Avago),
> but retired a year ago.

Yup. Most everything I do is open source, even when I work for a company,
not necessarily directly, I was in finance for a while so obviously wasn't
open sourcing our proprietary trading models, but our infrastructure was
linux and OSS based so improvements I made I'd contribute back.

John

------------------------------------------------------------------------------
Managing the Performance of Cloud-Based Applications
Take advantage of what the Cloud has to offer - Avoid Common Pitfalls.
Read the Whitepaper.
http://pubads.g.doubleclick.net/gampad/clk?id=121054471&iu=/4140/ostg.clktrk
_______________________________________________
Judy-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/judy-devel

Reply via email to