John:

> Word_t *pshape= (Word_t*)JudyLGet(j_shape,(Word_t)memory,&je);
>   if(pshape==NULL||pshape == (Word_t*)(void*)PPJERR) abort();

I have been considering on getting rid of the PJError_t parameter in the 
Judy routines that do not malloc(2).  It seems that nobody uses it.
Judy returning a -1 in the "pointer to value" word do so because:

1) There was a malloc(2) failure.
2) There was a Judy array corruption.
3) There was a programming mistake.

The NULL return simply means "not found".  In JudyLGet() this is
not an error (unless the programmer does not expect it).

For example, in the above code, the "je" error structure is never used.
If you were lucky and looked at the error code number, it may of said: 
"not a Judy array" or some clue like that.

So, it might as well of been written:

Word_t *pshape= (Word_t*)JudyLGet(j_shape,(Word_t)memory,NULL);

Judy was designed to have a zero (NULL) in the error parameter place.

In Judy routines that do not call malloc(2), then the -1 return means a
(2 Judy array corruption or a (3) programming mistake -- both of which
should be gone with production code. (32 bit Judy1Count() is an exception).

In using Judy for the last 6 years, I have very seldom run into a Judy array
corruption -- usually the result of a (random) RAM failure.

One of the original  (perhaps misguided) reasons for the -1 return was
to avoid the "core dumps" from within the Judy code.  I got tired of 
explaining to programmers that it was their error in the calling parameters.

As for your mistake with JudyLGet(), we all have done it and I have considered
changing interface.  The design of JudyLIns() requires the "&" so that the 
"allocate the array with NULL pointer" property of Judy can be used.
This is very important in "applications" such as JudySL().  However, it is
possible (and perhaps desirable) to do that with an additional code outside
the JudyLIns() code and code that calls malloc(2)/free(2).
Again, this was not done originally because of the additional overhead
of a subroutine call -- no longer significant in a modern processor.
The Judy macros (JLG() etc.) were an attempt to avoid that mistake, but
it seems that programmers don't like using them for reasons I don't understand.

On another subject, you suggest the JudyLNext() return some
stateful information -- for performance.  I studied that very
extensively 6 years ago to improve the performance on Next/Prev.  
Again, on a modern processor, these routines almost always run when the
data is "in the cache" and they almost disappear from the profiles.
(Having stateful information does not help in avoiding cache misses).

I am still actively working on a new Judy.  Improved speed was the
highest objective, but it looks like greatly improved memory efficiency
will be the biggest improvement.

I take your suggestions very seriously, so keep them coming.
Nobody likes the Judy semantics, but I have not found a better
solution or even had suggestions -- with code examples.

The latest wrinkle with C++ not compatible with Judy is troublesome
for me.


> The typing and docs on the interface are not so good.. ouch.

I accept suggestions with examples on that too.  I do not like a
"man" page longer than 1 screen full.

Thanks for your interest and inputs.

Doug
 
Doug Baskins <[EMAIL PROTECTED]>



----- Original Message ----
From: skaller <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: [email protected]; [EMAIL PROTECTED]
Sent: Monday, September 17, 2007 6:53:26 AM
Subject: Re: [Felix-impl] judy bug

On Sun, 2007-09-16 at 22:08 -0600, Alan Silverstein wrote:
> > Ah, found bug in my code -- how I hate that moronically stupid C
> > language.  Was using &a instead of a for JudyLGet.
> 
> All too easy.  I've made recent use of libJudy myself in my contract
> job, and even as one of the authors I find it surprisingly difficult to
> keep the pointer expressions correct.  In particular when *Pvalue return
> by JudyL or JudySL, which it only knows as a word -- no forced
> interpretation of the contents -- is known to the caller to contain a
> pointer.  So you must first check for Pvalue being null, etc:
> 
>     foo_t * Pfoo;
>     JLG(Pvalue, ...);
> 
>     Pfoo = ((Pvalue == NULL) ? NULL : (foo_t *) (*Pvalue));
> 
> It can get confusing, especially when dealing with array-of-array, etc.

I'm not using the macros .. so it is much worse: Ihave this, 
for example:

  Word_t *pshape= (Word_t*)JudyLGet(j_shape,(Word_t)memory,&je);
  if(pshape==NULL||pshape == (Word_t*)(void*)PPJERR) abort();

> But it sure has sped up some slow programs dramatically.

My experience is slightly different. The Felix gc is actually slightly
slower with Judy than the doubly linked list I used before.
Memory consumption is about the same I think .. not sure.

For me the BIG difference was in another area: functionality.

Previously, the Felix gc used memory blocks like:

    [ shape pointer ]
    [ array count ]
    [ flags for gc ]
    [ previous block ]
    [ next block ]
    [ padding for alignement]
====>    [ USER HEAP BLOCK ]

With a 'header' in front of every block on the heap to retain RTTI
for the garbage collector, array repeat count, and flags etc,
as well as the prev and next links, overhead 48 bytes on 64 bit box.
For list nodes of two words (16 bytes) that's a lot of overhead.

Furthermore pointers were like:

    struct { void * heap_block; offset_t offset; }

because the garbage collector could only work with pointers to the
block, not with pointers INTO the block.

Judy changed all that. The gc now has ZERO header, i.e. there is 
no header at all. Instead a JudyL array maps pointers to RTTI
shape data. A separate map for array count is kept for entries
with lengths NOT equal to 1 -- so non-arrays don't have an entry
in the JudyL array (that's 99% of most objects).

And using JudyL I can find the key equal to or just before any
pointer .. so the gc now works with interior pointers.

The ability of the gc is further enhanced, since it can now
detect 'foreign pointers' in O(1) [pointers not being managed
by the gc] 

Furthermore the user could malloc() some data, play about,
and then *register* the object for gc management .. impossible
if there had to be a header block on managed memory.

And Felix pointers are now plain C pointers. This actually
simplified the programming language and standard library
quite significantly (forgetting about the performance
improvement and simplification of the run time library ..)

So the result is: roughly the same amount of memory is wasted
on meta-data. Judy saves here and there on the constant block
overhead, but using separate arrays means the keys use up
space several times. Functionality/capability is vastly 
improved without sacrificing performance, which is critical
for a garbage collector.

Furthermore, it is possible to scan the RTTI of all the
objects in memory without scanning the actual memory.
When following pointers, it is only necessary to examine
objects that actually contain them .. so the gc can 
find all the reachable objects WITHOUT actually paging
in every object (only ones with pointers in them need
to be paged in). This should improve cache/VM locality
a bit, though I have done no measurements.

I read somewhere that a standard Hashtable is as fast
as Judy.. so why bother. Well, I have given one answer.
Hashtable can't give a sorted list of keys. Judy Can.

Hashtbl can't find 'next after' or 'next after or equal to'
or 'Next empty' and Judy can.

The typing and docs on the interface are not so good.. ouch.
But the actual *design* of the interface is close to perfect.


-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
Judy-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/judy-devel




-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
Judy-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/judy-devel

Reply via email to