This may be an algorithmic question, but I'm trying to code it in Python, so...
I have a list of pairwise regions, each with an integer start and end and a float data point. There may be overlaps between the regions. I want to resolve this into an ordered list with no overlapping regions. My initial solution was to sort the list by the start point, and then compare each adjacent region, clipping any overlapping section in half. I've attached code at the bottom. Unfortunately, this does not work well if you have sections that have three or more overlapping regions. A more general solution is to work out where all the overlaps are before I start. Then I can break up the region space based on what regions overlap each section and take averages of all the datapoints that are present in a particular section. Devising an algorithm to do this is making my brain hurt. Any ideas? Peter # also validates the data def clipRanges(regions): for i in range(len(regions) - 1): thispoint = regions[i] nextpoint = regions[i+1] assert thispoint[1] > thispoint[0] and nextpoint[1] > nextpoint[0], "point read not valid" thisend = thispoint[1] nextstart = nextpoint[0] diff = thisend - nextstart # a difference of zero is too close together if diff > -1: if diff % 2 == 1: diff += 1 correction = diff / 2 newend = thisend - correction newstart = newend + 1 assert newend > thispoint[0] and nextpoint[1] > newstart, "new range not valid!" newthispoint = (thispoint[0], newend, thispoint[2]) newnextpoint = (newstart, nextpoint[1], nextpoint[2]) regions[i] = newthispoint regions[i+1] = newnextpoint return regions regions = [ (0,10,2.5), (12,22,3.5), (15,25,1.2), (23, 30,0.01), (27, 37,1.23), (30, 35, 1.45) ] regions2 = [ (0,10,2.5), (1,11,1.1), (2,12,1.2) ] # works fine, produces [(0, 10, 2.5), (12, 18, 3.5), (19, 24, 1.2), (25, 28, 0.01), (29, 33, 1.23), (34, 35, 1.45)] print clipRanges(regions) # violates "new range not valid" assertion print clipRanges(regions2) -- http://mail.python.org/mailman/listinfo/python-list