Re: Prefix-only search?

john skaller Mon, 19 Nov 2012 14:06:34 -0800

On 20/11/2012, at 3:47 AM, Tim Margheim wrote:

> In the first question, I was asking if anyone can think of a more efficient 
> way to do a prefix search using the current Judy functions--not asking for 
> something more efficient than the current functions.
> 
> It's hard to see how to improve on "JudySLNext() followed by an strncmp() on 
> the returned key", but I thought it couldn't hurt to ask.


You can use a tree of JudyL -- which is what JudySL is anyhow.
So on a 64 bit machine, you'd do successive lookups on 8 bytes
of a string, and you could stop on failure.

> Hmm, I might have to look into that--my current application isn't affected 
> much by the insertion performance, but it could matter for another 
> application that I have in mind.  And I could make do with one cursor.

It's on GitHub somewhere.

> 
> Also, if you modified the Judy insert function to accept a pointer to a bool, 
> then the bool could return whether or not the insertion created a new element 
> or returned an already-existing one.  

Actually, since insert returns a pointer, you can just put the boolean
flag in the low bit of the pointer. Allocated store is always aligned
at least to an even address. You'd just have to remember to mask
out the low bit before dereferencing.

>> A more consistent function would be 
>> 
>>      find-between
> 
> Good point.  And it would be simple enough to write an inline prefix-search 
> function that auto-generates the upper-bound string and wraps a call to 
> find-between.  

Also you could write this function now and the simpler
prefix search now. Neither would be as efficient
as a Judy native one, but it would provide a use case for
performance testing if Doug ever decided it was worth implementing.
Or if you decided to modify Judy yourself.

> On the other hand, you'd also want to carefully compare the performance of a 
> find-between with the performance of a prefix search.  It might be that the 
> find-between requires some additional instructions.

Yes, it probably would. The prefix search would probably be quite
a bit faster.

>  Still, find-between would certainly perform better than running Next and 
> then checking if the returned key is outside your bounds.

You could always do a prefix search on the first 8 bytes first
with a separate JudyL array.  May not be that useful if the misses
are later in the string though. Perhaps an ordinary hash table
as a "precheck" (although if you hash the string you're scanning
the whole string which would be slow).


        
--
john skaller
[email protected]
http://felix-lang.org




------------------------------------------------------------------------------
Monitor your physical, virtual and cloud infrastructure from a single
web console. Get in-depth insight into apps, servers, databases, vmware,
SAP, cloud infrastructure, etc. Download 30-day Free Trial.
Pricing starts from $795 for 25 servers or applications!
http://p.sf.net/sfu/zoho_dev2dev_nov
_______________________________________________
Judy-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/judy-devel

Re: Prefix-only search?

Reply via email to