> For me the 2 major improvements to judy arrays are: 1 - Using Judy
> arrays for non null terminated strings (this is currently a big
> drawback)...

I'd like to know if that's because you want basic sorting of those strings as a side effect, or something else. Basic sorting = lexicographically (no knowledge of multibyte chars) versus sort-by-length.

As I've mentioned before, I call this the length-terminated (like a C string) versus length-associated (like a Perl string) problem. In general, any variable-sized key/index (chars, bytes, bits, whatever) is length-associated, that is, there is no magic termination value inside the string. However, we wrote JudySL for C strings only, where NUL termination exists, hence you get lexicographical sorting for free.

You can always save length-associated keys/indexes/strings efficiently by making the first level a JudyL array indexed by length (got that?), but this array-of-arrays sorts/groups keys/strings by length first. "Too bad": if you want lexicographical sorting, you need a different algorithm.

Storing variable-length, length-associated keys sorted lexicographically is actually a fairly hard problem. You give up some speed and space by inserting magic "termination information" between the levels of the tree. I think I wrote a paper about this that might be hiding somewhere on SourceForge.

Cheers,
Alan Silverstein

_______________________________________________
Judy-devel mailing list
https://lists.sourceforge.net/lists/listinfo/judy-devel
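
To illustrate the length-first grouping described above, here is a minimal sketch in Python rather than against the Judy API. The class name LengthFirstIndex is hypothetical; a plain dict stands in for the first-level JudyL array indexed by length, and a sorted list stands in for each per-length array. It only shows how the resulting order differs from plain lexicographic order for keys that may contain embedded NUL bytes, and is not a statement about how Judy does things internally.

```python
from bisect import insort


class LengthFirstIndex:
    """Hypothetical stand-in for an array-of-arrays keyed by length."""

    def __init__(self):
        # Outer level: key length -> sorted list of keys with that length.
        # (Assumption: the dict plays the role of the first-level JudyL
        # array; the sorted list plays the role of the per-length array.)
        self._by_length = {}

    def insert(self, key: bytes) -> None:
        bucket = self._by_length.setdefault(len(key), [])
        if key not in bucket:
            insort(bucket, key)  # keep each bucket in bytewise order

    def __contains__(self, key: bytes) -> bool:
        return key in self._by_length.get(len(key), [])

    def __iter__(self):
        # Walk lengths in ascending order, then keys within each length.
        for length in sorted(self._by_length):
            yield from self._by_length[length]


# Length-associated keys: no terminator, embedded NUL bytes are allowed.
keys = [b"b", b"ab", b"a\x00c", b"abc", b"a"]

index = LengthFirstIndex()
for k in keys:
    index.insert(k)

length_first = list(index)    # grouped by length, then bytewise
lexicographic = sorted(keys)  # plain bytewise order, for comparison

print(length_first)
print(lexicographic)
```

Note that b"b" comes before b"ab" in the length-first walk but after it lexicographically, which is the trade-off in question.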
