George Sakkis wrote:
On Aug 27, 3:00 pm, Gerard Flanagan <[EMAIL PROTECTED]> wrote:

[EMAIL PROTECTED] wrote:
I have a list that starts with zeros, has sporadic data, and then has
good data. I define the point at which the data turns good to be the
first index with a non-zero entry that is followed by at least 4
consecutive non-zero data items (i.e. a week's worth of non-zero
data). For example, if my list is [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8,
9], I would define the point at which data turns good to be 4 (1
followed by 2, 3, 4, 5).
I have a simple algorithm to identify this changepoint, but it looks
crude: is there a cleaner, more elegant way to do this?
    flag = True
    i=-1
    j=0
    while flag and i < len(retHist)-1:
        i += 1
        if retHist[i] == 0:
            j = 0
        else:
            j += 1
            if j == 5:
                flag = False
    del retHist[:i-4]
Thanks in advance for your help
Thomas Philips
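
The definition in the question can also be written as a small index-finding function. This is only a sketch in Python 3: the name `first_good_index` and the `run` parameter are made up for illustration, and `run=5` is an assumption taken from the example (1 followed by 2, 3, 4, 5 is five non-zero items in total).

```python
def first_good_index(values, run=5):
    """Return the index where the first run of `run` consecutive
    non-zero items starts, or None if no such run exists."""
    streak = 0
    for i, v in enumerate(values):
        # Extend the current streak on non-zero data, reset it on a zero.
        streak = streak + 1 if v != 0 else 0
        if streak == run:
            return i - run + 1
    return None

ret_hist = [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
start = first_good_index(ret_hist)
if start is not None:
    # Only trim when a changepoint was actually found.
    del ret_hist[:start]
print(start, ret_hist)  # 4 [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Returning None when no run is found keeps the `del` from trimming a list that never turns good.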
data = [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

def itergood(indata):
    indata = iter(indata)
    buf = []
    while len(buf) < 4:
        buf.append(indata.next())
        if buf[-1] == 0:
            buf[:] = []
    for x in buf:
        yield x
    for x in indata:
        yield x

for d in itergood(data):
    print d

This seems the most efficient so far for arbitrary iterables. With a
few micro-optimizations it becomes:

from itertools import chain

def itergood(indata, good_ones=4):
    indata = iter(indata); get_next = indata.next
    buf = []; append = buf.append
    while len(buf) < good_ones:
        next = get_next()
        if next: append(next)
        else: del buf[:]
    return chain(buf, indata)

$ python -m timeit -s "x = 1000*[0, 0, 0, 1, 2, 3] + [1,2,3,4]; from itergood import itergood" "list(itergood(x))"
100 loops, best of 3: 3.09 msec per loop

And with Psyco enabled:
$ python -m timeit -s "x = 1000*[0, 0, 0, 1, 2, 3] + [1,2,3,4]; from itergood import itergood" "list(itergood(x))"
1000 loops, best of 3: 466 usec per loop

George
--

I always forget the 'del slice' method for clearing a list, thanks.

I think that returning a `chain` means that the function is not itself a generator, so its body runs as soon as it is called. If indata is exhausted before good_ones consecutive non-zero items have been collected (for example, when it is shorter than the threshold), an unhandled StopIteration is raised before the return statement is reached.
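
A small sketch of that failure mode, written for Python 3 (so `next(indata)` replaces the `indata.next()` used earlier in the thread). `itergood_chain` mirrors the chain-returning version; `itergood_safe` is a hypothetical variant, not something posted in the thread, that catches the exhaustion and returns an empty iterator instead.

```python
from itertools import chain

def itergood_chain(indata, good_ones=4):
    """Chain-returning version, ported to Python 3 for illustration."""
    indata = iter(indata)
    buf = []
    while len(buf) < good_ones:
        item = next(indata)  # raises StopIteration if indata runs out
        if item:
            buf.append(item)
        else:
            del buf[:]
    return chain(buf, indata)

def itergood_safe(indata, good_ones=4):
    """Hypothetical variant: empty result when no good run exists."""
    indata = iter(indata)
    buf = []
    while len(buf) < good_ones:
        try:
            item = next(indata)
        except StopIteration:
            return iter(())  # input exhausted before a good run was found
        if item:
            buf.append(item)
        else:
            del buf[:]
    return chain(buf, indata)

try:
    itergood_chain([1, 2, 3])
except StopIteration:
    print("StopIteration raised at call time")

print(list(itergood_chain([1, 2, 3, 4])))  # [1, 2, 3, 4]
print(list(itergood_safe([1, 2, 3])))      # []
```

Note that an input whose length exactly equals the threshold goes through without error as long as every item is non-zero, because the buffer fills on the last item.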


G.

--
http://mail.python.org/mailman/listinfo/python-list