Re: range lookup using Judy;

Doug Baskins Sat, 12 Feb 2011 17:37:25 -0800

Pradeep:

I am sorry I miss understood what you wanted to do.   But I still
do not understand.  Perhaps knowing the application would help.
Judy has a number of advanced features that may help.

Thanks for your interest

doug

Doug Baskins <[email protected]>

From: "Bisht, Pradeep" <[email protected]>
To: Doug Baskins <[email protected]>
Cc: [email protected]
Sent: Saturday, February 12, 2011 12:38 AM
Subject: Re: range lookup using Judy;

My Judy elements are like this: <begin, count>, so I'm using index=<begin> => 
value=<count>

If JA = <0, 16>, <16, 8>, <32, 24> and range to lookup = <8, 128> then
overlapping ranges = <0, 8>, <16, 8>, <32, 24> and updated JA = <0, 16>, <16, 
8>, <24, 8>, <32, 24>, <56, 72>

If I understand you correctly then I should:
1. do a count of the lookup range <8, 128> i.e, 128 - 8 = 120.
2. do a First on the start of the range i.e, JLF (8) which shall return <16, 8> 
but the previous entry <0, 16> already contains <8> how to determine that?
3. do a Next on (120). - I did not quite follow this, sorry.
4. use Next to continue search.

Correct?

From: Doug Baskins <[email protected]>
To: "Bisht, Pradeep" <[email protected]>
Cc: [email protected]
Sent: Fri, February 11, 2011 11:10:01 PM
Subject: Re: range lookup using Judy;

Pradeep:

I think you are making it a bit more complicated than needed.  I think I would
first do a Count of the range desired and do a First on the start of range and 
then do a Next on the Count - 1.  First is an inclusive search, and Next is
exclusive.  First is used to start a search, and Next is used to continue 
search.
In my unreleased version of Judy, First is simply a Get, followed by a Next if
Get fails. 

Thanks for your interest,
doug

Doug Baskins <[email protected]>

From: "Bisht, Pradeep" <[email protected]>
To: john skaller <[email protected]>
Cc: judy <[email protected]>
Sent: Fri, February 11, 2011 8:18:34 PM
Subject: Re: range lookup using Judy;

okay I think the thing I was doing wrong was to use the next number to look for 
as the input to JLN while I should be using the index returned by JLF or the 
previous JLN call as an input to the next JLN. now looks to work fine. I do 
this:

1. JLL to find starting of my range.
2. JLF to find the next potential overlap.
3. keep doing JLN untill I go past the potential overlap range or the entire 
input range is exhausted.

From: "Bisht, Pradeep" <[email protected]>
To: john skaller <[email protected]>
Cc: judy <[email protected]>
Sent: Fri, February 11, 2011 6:38:51 PM
Subject: Re: range lookup using Judy;

sorry sent too early. The problem is that JudyLNext (JLN) performs an exclusive 
search. so after doing JLF when I'm looking for an index say 40 using JLN it 
returns me 41 even if 40 exists. Now at this point I think I have no way but to 
do JLP/JLL to check the one previous to make sure it is not 40 or contains 40 
(as it is a range). so essentially I have to do a JLP/JLL for every JLN. 
correct?

Now the question: are JLP and JLN faster compared to JLF and JLL? 

Thank you.

From: "Bisht, Pradeep" <[email protected]>
To: john skaller <[email protected]>
Cc: judy <[email protected]>
Sent: Fri, February 11, 2011 6:22:01 PM
Subject: Re: range lookup using Judy;

Thanks John. I was using JudyL and was wondering if there is any clever way of 
avoiding that back and forth search.

From: john skaller <[email protected]>
To: "Bisht, Pradeep" <[email protected]>
Cc: judy <[email protected]>
Sent: Fri, February 11, 2011 4:57:20 PM
Subject: Re: range lookup using Judy;

On 12/02/2011, at 11:27 AM, Bisht, Pradeep wrote:

> I need to perform range lookup on a set which has 256 entries.
> There are millions of such independent 256 entries sets.
> currently I'm having linked list for each such set where each
> node contains begin and end of the range. Each entry in the
> set is disjoint. Now I need to do a lookup of a range and
> find out all the entries that overlaps with the input range and
> also insert the regions which do not exist.
> e.g, set = {(0, 8), (16, 24)}
>      lookup of (0, 32) should return entry index 0, 16 and
>      set = {(0, 8), (8, 8), (16, 24), (24, 8)}
> I 'm thinking of evaluating JudyArray for this purpose. Is it possible
> to use JudyArray to solve this problem faster than a linked list or a range 
> tree? which
> Judy variation will be most suited? And how? Thanks for the suggestions.

I think Judy1 can do this easily. You can use Judy1First and Judy1Next
to find all the extant entries. Since there's only 256 entries per set,
it's not too hard to put all of the entries in a range in. You can find
all the "holes" the same way. After you get these results you can optimise
them to use ranges.

If you want to use ranges directly, I think you can also do it with JudyL.
This is a bit harder, but I do a similar thing with my garbage collector.
You make a map:

lowvalue --> high value

to represent a range. Then when scanning you do this:

JudyLFirst will find the NEXT key after the one you want most
of the time (unless you actually hit the key). So then you use
JudyLPrev to go back one key, and get the previous start/end
range. Now you can check if your input start value is in that range or not:
if so it's the start of the range and you have the end as the minimum
of the stored end and your input end of range. Otherwise
the key you previously found is the start of the range provided it
is greater than your input end of range.

So now you can use JudyLNext to get all the subsequent ranges
until you overshoot.

This is O(1). No way to do it faster. The inverse (holes between the selected
ranges) is trivial.

So yes, I'd say Judy is perfect for this job :)

--
john skaller
[email protected]

Don't be flakey. Get Yahoo! Mail for Mobile and 
always stay connected to friends.

Looking for earth-friendly autos? 
Browse Top Cars by "Green Rating" at Yahoo! Autos' Green Center.

------------------------------------------------------------------------------
The ultimate all-in-one performance toolkit: Intel(R) Parallel Studio XE:
Pinpoint memory and threading errors before they happen.
Find and fix more than 250 security defects in the development cycle.
Locate bottlenecks in serial and parallel code that limit performance.
http://p.sf.net/sfu/intel-dev2devfeb

_______________________________________________
Judy-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/judy-devel

Re: range lookup using Judy;

Reply via email to