It turns out I forgot to actually put in the restriction on the 'enum' to a
uint8, therefore that explains the difference in results between my
pre-processed and original Judy sources.
For now, I've let the compiler use it's default packing of 8 and just
adjusted all entries in MallocIF for Judy1 and JudyL to correctly round all
structure sizes upwards to the nearest word as per your suggestion for the
first such usage.
There are no longer any issues with the buffer overruns, at least for my
very limited test scenarios.
---
I do have further questions regarding the Judy1 vs JudyL implementations:
specifically, why JudyL takes 10x longer than Judy1 to run through the
insertion of items (same tests as before: 32K items stepping 32K to 1).
Method,Time,Memory:
Judy1, 20, 4.1K
Std::set, 140, 640K
Std::map, 140, 768K
JudyL, 200, 140K
Stdext::hash_set, 400, 400K (Bucket size=64bytes=cache line)
Stdext::hash_set, 530, 480K (Bucket size =16bytes=default)
Std::hash_map, 550, 608K (all defaults)
Given a straight bitmap for all 32K items would take exactly 4.0K, Judy1 is
an excellent replacement for such set membership tests.
However, I don't understand why JudyL takes 10x longer to perform the same
task - surely the overhead of handling an additional 32-bit value is not
that high?
Notice how the STL implementation (documentation available at
www.dinkumware.com, as they supply this for VS2005) for 'set' takes similar
times to JudyL, and yet neither 'set' nor 'map' have any kind of
cache-oriented optimisations and use way more memory thus increasing the
likelihood of cache issues.
If Judy1 didn't exist, I'd still say JudyL was great since it provides far
more acceptable memory usage than other published methods.
But such a large difference seems odd?
Regards,
Toni.
----
Toni Cassisi
Tovica Ltd
http://www.tovica.com
Tel: +44 (0) 7971 874 054
IM: AOL/Yahoo/MSN: tcassisi
From: Toni Cassisi [mailto:[EMAIL PROTECTED]
Sent: 08 May 2007 12:58
To: '[EMAIL PROTECTED]'; '[email protected]'
Subject: RE: Issue with Bit mask and Buffer Overruns
Have given that a go, however, I still see the same issue. Taking a look at
my pre-processed one, I see I have avoided the VS2005 error (converting to
an enum requires a cast) by changing the
J_UDY1_POPULATION_AND_MEMORY/je_ErrorNo to the JU_Errno_t enum but also used
the extension "enum JU_Errno_t : uint8_t", so I don't see how that could
have any impact on the size.
Looking at all the structure definitions and the way JudyMallocX seems to
always assume they are multiples of 32-bits (in my case), the changes you
mention are needed whenever the size of the structure is divided by
cJU_BYTESPERWORD.
What would be better: try to get everything working with a packaging of 1 or
ensure all structures were padded out to 32-bit multiples? The latter seems
like it is not really necessary on machines that handle unaligned data
access (Intel x86), however was Judy designed on a machine which did?
In that case, it must be the compiler enforcing the alignment rules,
although packing is "implementation dependant".
Regards,
Toni.
----
Toni Cassisi
Tovica Ltd
http://www.tovica.com
Tel: +44 (0) 7971 874 054
IM: AOL/Yahoo/MSN: tcassisi
From: Doug Baskins [mailto:[EMAIL PROTECTED]
Sent: 05 May 2007 01:37
To: Toni Cassisi; [email protected]
Subject: RE: Issue with Bit mask and Buffer Overruns
Toni:
Make the following edit to src/JudyCommon/JudyMallocIF.c
185c185,185
< Word_t Words = sizeof(jpm_t) / cJU_BYTESPERWORD;
---
> Word_t Words = (sizeof(jpm_t) + cJU_BYTESPERWORD - 1) /
cJU_BYTESPERWORD;
and see if your Buffer Overrun disappears.
Thanks,
Doug Baskins
Toni Cassisi <[EMAIL PROTECTED]> wrote:
Keep in mind my main reason for asking was due to the VS2005 fussiness,
something that could perhaps be a deficiency in its checking logic.
It looks like the "-" business first mentioned is to do with the weird
decision by Judy to store the remaining memory as opposed to the used
memory, so provided standard add/subtract are used, then doing a negate with
"-" is correct.
The problem is the comments around the second use in JudyPrivate.h:
#define JU_MASKLOWEREXC( BITPOS) ((BITPOS) - 1)
#define JU_MASKLOWERINC( BITPOS) (JU_MASKLOWEREXC(BITPOS) | (BITPOS))
#define JU_MASKHIGHERINC(BITPOS) (-(BITPOS))
#define JU_MASKHIGHEREXC(BITPOS) (JU_MASKHIGHERINC(BITPOS) ^ (BITPOS))
I'm not sure how to prove what the intention was, but they seem to be used
only for walking the JArray in sorted order OR to count the number of items.
Example: "-x" is equivalent to "~x + 1", so if the bitarray=0100 (base
2) and:
bitpos=3=0011, then, ~bitpos=1100 but -bitpos=1101
bitpos=2=0010, then, ~bitpos=1101 but -bitpos=1110
So far, ignoring the warning doesn't seem to have caused any problems in my
(limited) testing.
* Update on the buffer overrun:
The same thing occurs with the JudyL sources, no surprise really as they are
pretty similar.
However, I setup for Release mode, then ran all the source through the
pre-processor only then post-formatted it automatically with VS and things
are a lot clearer now: for me at any rate.
What is interesting is that I now never have any memory issues flagged up.
Perhaps the problem is due to an unintended side effect buried somewhere in
the DBGCODE() statements, or whatever else is being excluded by undefining
the DEBUG/_DEBUG et al variable?
I'm working to that theory for now.
* Final question
If trying to return-and-remove any item from within the set (managed by a
Judy1 structure), is it quicker to search from the first index or the
highest index?
I presume the JudyL and Judy1 are very similar in performance, although
there must be some differences due to the different data layouts.
Regards,
Toni.
----
Toni Cassisi
Tovica Ltd
http://www.tovica.com
Tel: +44 (0) 7971 874 054
IM: AOL/Yahoo/MSN: tcassisi
> -----Original Message-----
> From: Alan Silverstein [mailto:[EMAIL PROTECTED]
> Sent: 04 May 2007 16:36
> To: [email protected]; [EMAIL PROTECTED]
> Subject: Re: Issue with Bit mask and Buffer Overruns
>
> Toni et al, I know Doug was very comfortable with ~ and - operators and
> sometimes used them purposely without special commenting (where I might
> have explained the assumptions). Beyond that I need to save, study,
> and
> reply at greater length to your email later, unless Doug beats me to
> it.
>
> Alan
-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
Judy-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/judy-devel
Toni Cassisi <[EMAIL PROTECTED]> wrote:
Keep in mind my main reason for asking was due to the VS2005 fussiness,
something that could perhaps be a deficiency in its checking logic.
It looks like the "-" business first mentioned is to do with the weird
decision by Judy to store the remaining memory as opposed to the used
memory, so provided standard add/subtract are used, then doing a negate with
"-" is correct.
The problem is the comments around the second use in JudyPrivate.h:
#define JU_MASKLOWEREXC( BITPOS) ((BITPOS) - 1)
#define JU_MASKLOWERINC( BITPOS) (JU_MASKLOWEREXC(BITPOS) | (BITPOS))
#define JU_MASKHIGHERINC(BITPOS) (-(BITPOS))
#define JU_MASKHIGHEREXC(BITPOS) (JU_MASKHIGHERINC(BITPOS) ^ (BITPOS))
I'm not sure how to prove what the intention was, but they seem to be used
only for walking the JArray in sorted order OR to count the number of items.
Example: "-x" is equivalent to "~x + 1", so if the bitarray=0100 (base
2) and:
bitpos=3=0011, then, ~bitpos=1100 but -bitpos=1101
bitpos=2=0010, then, ~bitpos=1101 but -bitpos=1110
So far, ignoring the warning doesn't seem to have caused any problems in my
(limited) testing.
* Update on the buffer overrun:
The same thing occurs with the JudyL sources, no surprise really as they are
pretty similar.
However, I setup for Release mode, then ran all the source through the
pre-processor only then post-formatted it automatically with VS and things
are a lot clearer now: for me at any rate.
What is interesting is that I now never have any memory issues flagged up.
Perhaps the problem is due to an unintended side effect buried somewhere in
the DBGCODE() statements, or whatever else is being excluded by undefining
the DEBUG/_DEBUG et al variable?
I'm working to that theory for now.
* Final question
If trying to return-and-remove any item from within the set (managed by a
Judy1 structure), is it quicker to search from the first index or the
highest index?
I presume the JudyL and Judy1 are very similar in performance, although
there must be some differences due to the different data layouts.
Regards,
Toni.
----
Toni Cassisi
Tovica Ltd
http://www.tovica.com
Tel: +44 (0) 7971 874 054
IM: AOL/Yahoo/MSN: tcassisi
> -----Original Message-----
> From: Alan Silverstein [mailto:[EMAIL PROTECTED]
> Sent: 04 May 2007 16:36
> To: [email protected]; [EMAIL PROTECTED]
> Subject: Re: Issue with Bit mask and Buffer Overruns
>
> Toni et al, I know Doug was very comfortable with ~ and - operators and
> sometimes used them purposely without special commenting (where I might
> have explained the assumptions). Beyond that I need to save, study,
> and
> reply at greater length to your email later, unless Doug beats me to
> it.
>
> Alan
-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
Judy-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/judy-devel
Doug Baskins <[EMAIL PROTECTED]>
-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
Judy-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/judy-devel