Re: Am I using incorrectly?

john skaller via Judy-devel Thu, 23 Nov 2017 22:46:02 -0800

> On 24 Nov. 2017, at 16:24, Doug Baskins via Judy-devel 
> <[email protected]> wrote:
> 
> Matthew:
> 
> NO,  you are not using Judy correctly.  I think you are being deceived by the
> macros J1S and JLI.


It’s a bit unfortunate, but the rule is DO NOT USE JUDY MACROS.
The macros are very bad because they confuse values with lvalues.

Unfortunately, it is the macros that are documented, rather than
the underlying functions.

You need instead to examine carefully the C interface in the header files.
The interface is precise but also hard to understand for a reason
I’ll explain in a moment.

A Judy array is represented by a single machine address.
When you query it, you give that address to a Judy function.

However, when you wish to *modify* it, you have to instead
put the address in a variable and give the address of the variable.
That’s because a modification may replace the top level node, so in
effect a new Judy array is created and the old one lost, and you
have to tell Judy where to put the new one.
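
A minimal sketch of that read/modify split, using a toy structure rather
than Judy itself. The names toy_node, toy_get and toy_ins are hypothetical,
and a sorted linked list stands in for the real tree. Only the shape of the
two calls is meant to mirror JudyLGet (handle passed by value) and JudyLIns
(address of the handle passed), under that assumption.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a JudyL array: a sorted linked list of
   word -> word pairs. The "array" is just the address of the first
   node, so NULL is a valid, empty array. */
typedef struct toy_node {
    uintptr_t key;
    uintptr_t value;
    struct toy_node *next;
} toy_node;

/* Query: the handle is passed by value, because a lookup never changes it.
   Returns the address of the value slot, or NULL if the key is absent. */
uintptr_t *toy_get(toy_node *array, uintptr_t key) {
    for (toy_node *n = array; n != NULL && n->key <= key; n = n->next)
        if (n->key == key)
            return &n->value;
    return NULL;
}

/* Modify: the caller passes the ADDRESS of its handle variable, because
   an insert may change which node is the first one.
   Returns the value slot for the key, or NULL if malloc fails. */
uintptr_t *toy_ins(toy_node **parray, uintptr_t key) {
    toy_node **link = parray;
    while (*link != NULL && (*link)->key < key)
        link = &(*link)->next;
    if (*link != NULL && (*link)->key == key)
        return &(*link)->value;      /* key already present */
    toy_node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;                 /* out of memory */
    n->key = key;
    n->value = 0;
    n->next = *link;
    *link = n;                       /* may overwrite the caller's handle */
    return &n->value;
}
```

Starting from `toy_node *array = NULL;`, a call such as
`*toy_ins(&array, 42) = 7;` stores a value and may change `array`, while
`toy_get(array, 42)` only reads it.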

What makes this really confusing is the woeful type system of C.
JudyL arrays have keys which are single machine words and associate
with them values which are single machine words. The array itself
is a data structure which is represented by the machine address
of the top level node, so in fact NULL is a valid Judy array: it’s
an empty Judy array.

Judy actually tries hard to use names to distinguish a machine
address (pointer) from a data word (integer) by naming conventions,
but in real life when you use a Judy array you often want your
keys or data to actually be pointers, not pointer-sized integers,
and you have to cast. And so the typing is extremely hard to
get right, because you have to cheat C, and because C has
a very weak type system.
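
As an illustration of the casting involved, here is a small sketch in plain
C with no Judy calls at all. word_t is a stand-in for Judy's Word_t (assumed
here to be pointer sized), and shape, key_of, store_shape and load_shape are
hypothetical names invented for the example.

```c
#include <stdint.h>

typedef uintptr_t word_t;            /* stand-in for Judy's Word_t */

/* Hypothetical type descriptor, the kind of thing stored as a value. */
typedef struct shape {
    const char *name;
} shape;

/* Use an object's address as an integer key. */
word_t key_of(const void *object) {
    return (word_t)(uintptr_t)object;
}

/* Store a pointer in a word-sized value slot. */
void store_shape(word_t *slot, const shape *s) {
    *slot = (word_t)(uintptr_t)(const void *)s;
}

/* Recover the pointer from the slot; the compiler cannot check this cast. */
const shape *load_shape(const word_t *slot) {
    return (const shape *)(const void *)(uintptr_t)*slot;
}
```

Keeping the casts inside a few small helpers like these means the rest of
the program deals only in properly typed pointers.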

The macros tried to make Judy easier to use but they just
add another layer of confusion on top of the problems inherent
in C itself.

Luckily, the actual interface is perfect. Which means, when
you understand what a function actually is intended to do,
you will immediately know the interface.

For example, JudyLGet accepts the array and a key (machine word),
and returns the address *in* the JudyL array where the user
data is stored. So you can then store a new value there.
If the key doesn’t exist, you get a NULL back instead.

On the other hand, JudyLNext finds the next key after
the given one, so instead of giving it the key,
you have to give it the address of a variable containing
the key, so it can replace that key with the next one.
It still returns the location of the user value, not
the key. It’s exactly what you want when scanning
through the array.
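
The scanning pattern can be sketched with a toy stand-in. Everything named
toy_* is hypothetical, and a sorted linked list replaces the real tree. The
calls are only meant to have the same shape as JudyLFirst (first key at or
after the given one) and JudyLNext (first key strictly after it): the key
goes in and out through a pointer, and the return value is the value slot
or NULL.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a JudyL array: a list sorted by key. */
typedef struct toy_node {
    uintptr_t key;
    uintptr_t value;
    struct toy_node *next;
} toy_node;

/* Find the first key >= *pkey. On success, write that key back through
   pkey and return its value slot. Otherwise return NULL, *pkey untouched. */
uintptr_t *toy_first(toy_node *array, uintptr_t *pkey) {
    for (toy_node *n = array; n != NULL; n = n->next)
        if (n->key >= *pkey) {
            *pkey = n->key;
            return &n->value;
        }
    return NULL;
}

/* Same, but for the first key strictly greater than *pkey. */
uintptr_t *toy_next(toy_node *array, uintptr_t *pkey) {
    for (toy_node *n = array; n != NULL; n = n->next)
        if (n->key > *pkey) {
            *pkey = n->key;
            return &n->value;
        }
    return NULL;
}

/* Scan every entry in key order. Running past the last key is the normal
   way the loop ends, not an error. */
uintptr_t toy_sum(toy_node *array) {
    uintptr_t total = 0;
    uintptr_t key = 0;
    for (uintptr_t *slot = toy_first(array, &key);
         slot != NULL;
         slot = toy_next(array, &key))
        total += *slot;
    return total;
}
```

Inside the loop body, `key` holds the current key and `*slot` the value
stored under it, so both are available without a second lookup.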

I didn’t even bother looking up the docs. I know how it works
because I can *reason* what the interface must be.

Perhaps the trickiest thing to understand is errors.
Most Judy functions *cannot* have an error.
If a real error occurs, the error code at the location you gave
the function is updated.

Trying to Get a key that doesn’t exist is not an error!
What about trying to GetNext? Nope, that cannot be
an error, how else would you find the end but to
run past the last key?

So for example from my own code:

    //fprintf(stderr,"Calling Judy for next object\n");
    pshape = (Word_t*)JudyLNext(j_shape,(Word_t*)(void*)&current,&je); // GT

This is in my garbage collector. The key is the machine address of an object
allocated by malloc. The value here is the “shape” of the object, which is
a pointer to a type descriptor.

The function *returns* the pointer to the storage location *inside the Judy
array* where my shape pointer lives, and it finds that location based on
the key stored in the variable current. It updates the key to the next one.
So you have to give Judy the *address* of the variable containing the key,
not just the value of the key. At the end, the function returns NULL
because there is no location *inside* the Judy array for the next value
because there is no next key.

If there’s an actual error, such as corrupted memory, then the error code
in variable je is updated, so I have to pass it a pointer to where my error
variable is. I don’t bother checking it.

If you read the above explanation very carefully you will understand why
that function has *exactly* the correct interface, and that there is
no other possible correct interface that is not more complex.

Which is why I say, the interface is actually perfect. Logical reasoning
plus Occam’s razor is enough to know what the interface HAS to be.
You can rely on it. You cannot rely on the macros.

—
john skaller
[email protected]
http://felix-lang.org


