On Jun 13, 2:32 pm, Lee Sander <[EMAIL PROTECTED]> wrote: > hi, > I have the following problem which is turning out to be non-trivial. I > realize that this is not > exactly a python problem but more of an algorithm problem -- but I > post it here because > I want to implement this in python. > > I want to write a code that given an interval (integer tuple: > start,stop) will find which other > interval it matches to. Here is an example, > suppose I have the following NON-OVERLAPPING intervals > > 10 - 21 ==> 'a1' > 34 - 51 ==> 'a2' > 77 - 101 ==> 'a3' > etc > > So, suppose I'm give an interval such as (42,50), I should go through > the list and map it to "a2" etc > if the region is a subset of an exiting interval. If the query > interval does not exist in the table or > maps to two or more intervals (eg, (45, 80)) the program return a > None. ...snip... > Many thanks > Lee
OK - I'm going to assume your intervals are inclusive (i.e. 34-51 contains both 34 and 51). If your intervals are all really all non-overlapping, one thing you can try is to put all the endpoints in a single list, and sort it. Then, you can use the bisect module to search for intervals, which will give you a logarithmic time algorithm. Here, I'm going to assume you just need the index of the containing interval. If you really need a name (i.e. 'a1' or 'a2'), you can use a list of names, and index into that. I hope those assumptions are valid! if so, the following should work: from bisect import bisect # assume initial intervals are sorted intervals=[(10,21),(34,51), (77,101)] # get a sorted list of endpoints endpts=sorted([a for a,b in intervals]+[b for a,b in intervals]) def test_interval(ivl,endpts): # Find where the left endpoint of ivl would lie in the list # i.e. the index of the first element greater than ivl[0] l_idx=bisect(endpts,ivl[0]) # If l_idx is even, then it lies between two intervals, and thus # is not contained in any interval. Otherwise it returns the index if l_idx % 2 == 0: return None # if l_idx is out of bounds (i.e. ==len(endpts)), then ivl is # not contained in any interval (too large) if l_idx==len(endpts): return None # Now, all we need to do is check that the right endpoint of ivl is # less than or equal to the corresponding endpoint in the list. # Happily, that endpoint is at index l_idx if ivl[1]<=endpts[l_idx]: # So return the index of the interval: return (l_idx-1)/2 else: return None Then, from a shell: >>> print tst.test_interval((0,13),endpts) None >>> print tst.test_interval((15,21),endpts) 0 >>> print tst.test_interval((35,40),endpts) 1 >>> print tst.test_interval((40,80),endpts) None >>> print tst.test_interval((109,200),endpts) None -- http://mail.python.org/mailman/listinfo/python-list