[Numpy-discussion] Help to process a large data file

2008-10-01 Thread frank wang
Hi, I have a large data file which contains 2 columns of data. The two columns only hold zeros and ones. Now I want to count how many ones fall in between rows where both columns are one. For example, if my data is:

1 0
0 0
1 1
0 0
0 1x
0 1x
0 0
0 1x
1 1
0 0
0 1x
0 1x
1 1

Then my …
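As a self-contained sketch of the input format described above, the two columns can be read straight into an integer array (here from an in-memory string rather than a file, so the example runs as-is; the sample rows are illustrative):

```python
import io
import numpy as np

# A few rows in the format described above: two space-separated 0/1 columns.
sample = "1 0\n0 0\n1 1\n0 0\n0 1\n"
x = np.loadtxt(io.StringIO(sample), dtype=int)
print(x.shape)  # -> (5, 2)
```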

Re: [Numpy-discussion] Help to process a large data file

2008-10-02 Thread David Huard
Frank, how about this:

x = np.loadtxt('file')
z = x.sum(1)                 # Reduce data to an array of 0, 1, 2
rz = z[z > 0]                # Remove all 0s since you don't want to count those.
loc = np.where(rz == 2)[0]   # The location of the (1,1)s
count = np.diff(loc) - 1     # The spacing between those (1,1)s, i.e., the number of single 1s in between
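As a check, this recipe can be run on the sample data from the original post (assuming the example there parses as the two-column rows shown; `np.array` stands in for `loadtxt` so the sketch is self-contained):

```python
import numpy as np

# Sample data from the original post, one (col1, col2) pair per row.
x = np.array([
    [1, 0], [0, 0], [1, 1], [0, 0], [0, 1], [0, 1], [0, 0],
    [0, 1], [1, 1], [0, 0], [0, 1], [0, 1], [1, 1],
])

z = x.sum(1)                  # row sums: 0, 1 or 2
rz = z[z > 0]                 # drop the all-zero rows
loc = np.where(rz == 2)[0]    # positions of the (1,1) rows
count = np.diff(loc) - 1      # single-1 rows between consecutive (1,1)s
print(count)  # -> [3 2]
```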

Re: [Numpy-discussion] Help to process a large data file

2008-10-02 Thread orionbelt2
Frank, I would imagine that you cannot get much better performance in Python than this, which avoids string conversions:

c = []
count = 0
for line in open('foo'):
    if line == '1 1\n':
        c.append(count)
        count = 0
    else:
        if '1' in line:
            count += 1

One could do some …
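Run against the same sample rows (written to a temporary file here so the sketch is self-contained; the rows are one reading of the original post's example), the loop gives:

```python
import os
import tempfile

# The sample rows from the original post, one "col1 col2" pair per line.
rows = ["1 0", "0 0", "1 1", "0 0", "0 1", "0 1", "0 0",
        "0 1", "1 1", "0 0", "0 1", "0 1", "1 1"]
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w") as f:
    f.write("\n".join(rows) + "\n")

c = []
count = 0
for line in open(path):
    if line == '1 1\n':   # a (1,1) row: record the running count and reset it
        c.append(count)
        count = 0
    elif '1' in line:     # a row containing a single 1
        count += 1
os.remove(path)
print(c)  # -> [1, 3, 2]
```

Note the leading 1: unlike the NumPy version, the loop also records the 1s seen before the first (1,1) row.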

Re: [Numpy-discussion] Help to process a large data file

2008-10-02 Thread frank wang

Re: [Numpy-discussion] Help to process a large data file

2008-10-03 Thread David Huard
> I did not try the second solution from Chris since it is too slow, as Chris stated.
>
> Frank