Re: advice : how do you iterate with an acc ?

2005-12-03 Thread Ben Finney
[EMAIL PROTECTED] wrote:
 acc = []    # accumulator ;)
 for line in fileinput.input():
     if condition(line):
         if acc:    #1
             doSomething(acc)    #1
         acc = []
     else:
         acc.append(line)
 if acc:    #2
     doSomething(acc)    #2

Looks like you'd be better off making an Accumulator that knows what
to do.

    >>> class Accumulator(list):
    ...     def flush(self):
    ...         if len(self):
    ...             print "Flushing items: %s" % self
    ...             del self[:]
    ...
    >>> lines = [
    ...     "spam", "eggs", "FLUSH",
    ...     "beans", "rat", "FLUSH",
    ...     "strawberry",
    ...     ]

    >>> acc = Accumulator()
    >>> for line in lines:
    ...     if line == 'FLUSH':
    ...         acc.flush()
    ...     else:
    ...         acc.append(line)
    ...
    Flushing items: ['spam', 'eggs']
    Flushing items: ['beans', 'rat']
    >>> acc.flush()
    Flushing items: ['strawberry']
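
For anyone reading this with a current interpreter, here is a rough Python 3
sketch of the same idea. The callback argument and the "batches" list are my
own additions for illustration, not part of the session above; the point is
only that the "is there anything to flush?" test lives in one place.

```python
class Accumulator(list):
    """A list that hands its contents to a callback when flushed."""

    def __init__(self, callback):
        super().__init__()
        self.callback = callback  # hypothetical hook, called with each batch

    def flush(self):
        # The emptiness check is written once, here, instead of at every call site.
        if self:
            self.callback(list(self))
            del self[:]


batches = []
acc = Accumulator(batches.append)
for line in ["spam", "eggs", "FLUSH", "FLUSH", "beans"]:
    if line == "FLUSH":
        acc.flush()
    else:
        acc.append(line)
acc.flush()  # final flush for whatever is left over
print(batches)
```

Two FLUSH markers in a row do not produce an empty batch, because flush()
ignores an empty accumulator.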


-- 
 \ [W]e are still the first generation of users, and for all that |
  `\ we may have invented the net, we still don't really get it.  |
_o__) -- Douglas Adams |
Ben Finney
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: advice : how do you iterate with an acc ?

2005-12-03 Thread Bengt Richter
On 2 Dec 2005 18:34:12 -0800, [EMAIL PROTECTED] wrote:


Bengt Richter wrote:
 It looks to me like itertools.groupby could get you close to what you want,
 e.g., (untested)
Ah, groupby. The generic string.split() equivalent. But the doc said
the input needs to be sorted.


 >>> seq = [3,1,4,'t',0,3,4,2,'t',3,1,4]
 >>> import itertools
 >>> def condition(item): return item=='t'
 ...
 >>> def dosomething(it): return 'doing something with %r'%list(it)
 ...
 >>> for condresult, acciter in itertools.groupby(seq, condition):
 ...     if not condresult:
 ...         dosomething(acciter)
 ...
 'doing something with [3, 1, 4]'
 'doing something with [0, 3, 4, 2]'
 'doing something with [3, 1, 4]'

I think the input only needs to be sorted if you are trying to group sorted
subsequences of the input. I.e., you can't get them extracted together
unless the condition is satisfied for a contiguous group, which only happens
if the input is sorted. But AFAIK the grouping logic just scans the input,
applies the key function, and returns iterators for the subsequences that
yield the same key function result, along with that result. So it's a
general subsequence extractor. You just have to supply a key function that
makes the condition value change when a group ends and a new one begins.
And the value can be arbitrary, or just toggle between two values, e.g.

 >>> for condresult, acciter in itertools.groupby(range(20), lambda x:x%3==0 or x==5):
 ...     print '%6s: %r'%(condresult, list(acciter))
 ...
   True: [0]
  False: [1, 2]
   True: [3]
  False: [4]
   True: [5, 6]
  False: [7, 8]
   True: [9]
  False: [10, 11]
   True: [12]
  False: [13, 14]
   True: [15]
  False: [16, 17]
   True: [18]
  False: [19]

or a condresult that stays the same in groups, but every group result is 
different:

 >>> for condresult, acciter in itertools.groupby(range(20), lambda x:x//3):
 ...     print '%6s: %r'%(condresult, list(acciter))
 ...
      0: [0, 1, 2]
      1: [3, 4, 5]
      2: [6, 7, 8]
      3: [9, 10, 11]
      4: [12, 13, 14]
      5: [15, 16, 17]
      6: [18, 19]
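
The x//3 trick only works because the items happen to be consecutive
integers. A sketch of how the same idea might be stretched to arbitrary
items (Python 3 syntax; the name "chunked" is made up for this example) is
to key on the position from enumerate() rather than on the item itself:

```python
from itertools import groupby


def chunked(items, size):
    """Yield lists of up to `size` consecutive items (illustrative helper)."""
    # The key is the item's index divided by the chunk size, so it stays
    # constant within a chunk and changes when the next chunk begins.
    for _, group in groupby(enumerate(items), key=lambda pair: pair[0] // size):
        yield [item for _, item in group]


print(list(chunked("abcdefg", 3)))
```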

Regards,
Bengt Richter


Re: advice : how do you iterate with an acc ?

2005-12-03 Thread bonono

Thanks. So it basically has an internal state storing the last
condition result and if it flips(different), a new group starts.
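
To spell that out, here is a rough pure-Python sketch of the idea in
Python 3 syntax. It is only a model for discussion, not how itertools
actually implements groupby: the real thing hands back lazy iterators,
while this "simple_groupby" (a made-up name) builds a list per group.

```python
def simple_groupby(iterable, key=lambda x: x):
    """Yield (key_value, items) for each run of consecutive equal keys."""
    no_key = object()      # marker meaning "no group has been started yet"
    current_key = no_key   # the remembered state: key of the current group
    group = []
    for item in iterable:
        k = key(item)
        # A new group starts when the key differs (by ==) from the stored one.
        if current_key is no_key or not (k == current_key):
            if group:
                yield current_key, group
            current_key, group = k, []
        group.append(item)
    if group:              # the last group still has to be emitted
        yield current_key, group


print(list(simple_groupby([0, 0.0, 1, 1.0, "t", 2])))
```

Note that the sketch needs its own "emit what is left" step after the loop,
which is exactly the repetition the original question was about.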



Re: advice : how do you iterate with an acc ?

2005-12-03 Thread Scott David Daniels
Jeffrey Schwab wrote:
 [EMAIL PROTECTED] wrote:
 hello,

  i often encounter something like:

 acc = []    # accumulator ;)
 for line in fileinput.input():
     if condition(line):
         if acc:    #1
             doSomething(acc)    #1
         acc = []
     else:
         acc.append(line)
 if acc:    #2
     doSomething(acc)    #2
 
 Could you add a sentry to the end of your input?  E.g.:
 for line in fileinput.input() + line_that_matches_condition:
 This way, you wouldn't need a separate check at the end.

Check itertools for a good way to do this:

 import itertools
 SENTRY = 'something for which condition(SENTRY) is True'

 acc = []
 f = open(filename)
 try:
     for line in itertools.chain(f, [SENTRY]):
         if condition(line):
             if acc:
                 doSomething(acc)
                 acc = []
         else:
             acc.append(line)
     assert acc == []
 finally:
     f.close()
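
A variation on the above, sketched in Python 3 syntax: if it is awkward to
come up with a line for which condition() is true, a unique object can serve
as the sentinel and be tested by identity. The function name "flush_on" and
its parameters are invented for this example.

```python
import itertools

SENTRY = object()  # unique marker; cannot be confused with a real line


def flush_on(lines, condition, do_something):
    """Call do_something on each non-empty run of lines between delimiters."""
    acc = []
    for line in itertools.chain(lines, [SENTRY]):
        # The sentinel is checked first, so condition() never has to cope with it.
        if line is SENTRY or condition(line):
            if acc:
                do_something(acc)
                acc = []
        else:
            acc.append(line)


out = []
flush_on(["a", "b", "--", "--", "c"], lambda line: line == "--", out.append)
print(out)
```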


--Scott David Daniels
[EMAIL PROTECTED]


Re: advice : how do you iterate with an acc ?

2005-12-03 Thread Bengt Richter
On 3 Dec 2005 03:28:19 -0800, [EMAIL PROTECTED] wrote:


Thanks. So it basically has an internal state storing the last
condition result and if it flips(different), a new group starts.

So it appears. But note that flips(different) seems to be based on ==,
and default key function is just passthrough like lambda x:x, so e.g. integers
and floats will group together if their values are equal.
E.g., to elucidate further,

Default key function:
 >>> from itertools import groupby
 >>> for k,g in groupby([0, 0.0, 0j, [], (), None, 1, 1.0, 1j]):
 ...     print k, list(g)
 ...
 0 [0, 0.0, 0j]
 [] [[]]
 () [()]
 None [None]
 1 [1, 1.0]
 1j [1j]

Group by bool value:
 >>> for k,g in groupby([0, 0.0, 0j, [], (), None, 1, 1.0, 1j], key=bool):
 ...     print k, list(g)
 ...
 False [0, 0.0, 0j, [], (), None]
 True [1, 1.0, 1j]

It's not trying to sort, so it doesn't trip on complex
 >>> for k,g in groupby([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j]):
 ...     print k, list(g)
 ...
 0 [0, 0.0, 0j]
 [] [[]]
 () [()]
 None [None]
 1 [1, 1.0]
 1j [1j]
 2j [2j]

But you have to watch out if you try to pre-sort stuff that includes complex
numbers
 >>> for k,g in groupby(sorted([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j])):
 ...     print k, list(g)
 ...
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
 TypeError: cannot compare complex numbers using <, <=, >, >=

And if you do sort using a key function, it doesn't mean groupby inherits
that key function for grouping unless you specify it

 >>> def keyfun(x):
 ...     if isinstance(x, (int, long, float)): return x
 ...     else: return type(x).__name__
 ...
 >>> for k,g in groupby(sorted([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j], key=keyfun)):
 ...     print k, list(g)
 ...
 0 [0, 0.0]
 1 [1, 1.0]
 None [None]
 0j [0j]
 1j [1j]
 2j [2j]
 [] [[]]
 () [()]

Vs giving groupby the same keyfun
 >>> for k,g in groupby(sorted([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j], key=keyfun), keyfun):
 ...     print k, list(g)
 ...
 0 [0, 0.0]
 1 [1, 1.0]
 NoneType [None]
 complex [0j, 1j, 2j]
 list [[]]
 tuple [()]


Example of unsorted vs sorted subgroup extraction:

 >>> for k,g in groupby('this that other thing note order'.split(), key=lambda s:s[0]):
 ...     print k, list(g)
 ...
 t ['this', 'that']
 o ['other']
 t ['thing']
 n ['note']
 o ['order']

vs.

 >>> for k,g in groupby(sorted('this that other thing note order'.split()), key=lambda s:s[0]):
 ...     print k, list(g)
 ...
 n ['note']
 o ['order', 'other']
 t ['that', 'thing', 'this']

Oops, that key would be less brittle as (untested) key=lambda s:s[:1], e.g.,
in case a split with args was used, since that can produce empty strings and
s[0] raises IndexError on those.
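
Putting the last two points together, a small sketch in Python 3 syntax
(the helper name "first_letter" is made up): sort and group with one and the
same key function, and use s[:1] so an empty string would not blow up.

```python
from itertools import groupby


def first_letter(s):
    # s[:1] is '' for an empty string, where s[0] would raise IndexError
    return s[:1]


words = "this that other thing note order".split()

# sorted() is stable, so words with the same first letter keep their
# original relative order; groupby then sees each letter as one run.
by_letter = {
    letter: list(group)
    for letter, group in groupby(sorted(words, key=first_letter), key=first_letter)
}
print(by_letter)
```

Sorting by the first letter only (rather than by the whole word) is a
choice made here for illustration; it keeps "this" ahead of "that".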

Regards,
Bengt Richter


Re: advice : how do you iterate with an acc ?

2005-12-02 Thread Bengt Richter
On 2 Dec 2005 17:08:02 -0800, [EMAIL PROTECTED] wrote:


[EMAIL PROTECTED] wrote:
 hello,

 i'm wondering how people from here handle this, as i often encounter
 something like:

 acc = []    # accumulator ;)
 for line in fileinput.input():
     if condition(line):
         if acc:    #1
             doSomething(acc)    #1
         acc = []
     else:
         acc.append(line)
 if acc:    #2
     doSomething(acc)    #2

 BTW i am particularly annoyed by #1 and #2 as it is a repetition, and i
 think it is quite error prone, how will you do it in a pythonic way ?

It looks to me like itertools.groupby could get you close to what you want,
e.g., (untested)

import fileinput
import itertools
for condresult, acciter in itertools.groupby(fileinput.input(), condition):
    if not condresult:
        dosomething(list(acciter))  # or dosomething(acciter) if an iterator is usable

IOW, groupby collects contiguous lines for which condition evaluates to the
same value. Assuming condition is a function that returns only two distinct
values (for true and false, like True and False), then if I understand your
program's logic, you do nothing with the line(s) that actually satisfy the
condition; you just trigger on them as delimiters and want to process the
nonempty groups of the other lines, so the "if not condresult:" should
select those. Groupby won't return an empty group AFAIK, so you don't need
to test for that. Also, you won't need the list call in list(acciter) if
your dosomething can accept an iterator instead of a list.
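
A quick way to check the "no empty groups" claim, sketched in Python 3
syntax with made-up sample lines and a made-up generator name. The bool()
wrapper is an extra precaution, not something from the snippet above: it
covers a condition that returns assorted truthy values (e.g. regex match
objects), which would otherwise split one run of delimiters into several
groups.

```python
from itertools import groupby


def groups_between(lines, condition):
    """Yield lists of consecutive lines for which condition(line) is false."""
    for is_delimiter, group in groupby(lines, key=lambda line: bool(condition(line))):
        if not is_delimiter:
            yield list(group)


# Leading, trailing and doubled delimiters on purpose.
lines = ["--", "a", "b", "--", "--", "c", "--"]
result = list(groups_between(lines, lambda line: line == "--"))
print(result)
```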

Regards,
Bengt Richter


Re: advice : how do you iterate with an acc ?

2005-12-02 Thread Dan Sommers
On 2 Dec 2005 16:45:38 -0800,
[EMAIL PROTECTED] wrote:

 hello,
 i'm wondering how people from here handle this, as i often encounter
 something like:

 acc = []    # accumulator ;)
 for line in fileinput.input():
     if condition(line):
         if acc:    #1
             doSomething(acc)    #1
         acc = []
     else:
         acc.append(line)
 if acc:    #2
     doSomething(acc)    #2

 BTW i am particularly annoyed by #1 and #2 as it is a repetition, and i
 think it is quite error prone, how will you do it in a pythonic way ?

If doSomething handled an empty list gracefully, then you would have
less repetition:

acc = []
for line in fileinput.input():
    if condition(line):
        doSomething(acc)    #1
        acc = []
    else:
        acc.append(line)
doSomething(acc)    #2

If condition were simple enough and the file(s) small enough, perhaps
you could read the whole file at once and use split to separate the
pieces:

contents = file.read()
for acc in contents.split("this is the delimiter line\n"):
    doSomething(acc.split("\n"))

(There are probably some strange cases of repeated delimiter lines or
delimiter lines occurring at the beginning or end of the file for which
the above code will not work.  Caveat emptor.)
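
To make the caveat concrete, a small sketch in Python 3 syntax. The sample
text and the "--" delimiter line are made up for the demonstration.

```python
# Delimiter lines at the start, doubled in the middle, and at the end.
text = "--\na\nb\n--\n--\nc\n--\n"

raw = text.split("--\n")
print(raw)      # empty strings appear wherever delimiters touch or sit at an edge

# One possible workaround: drop the empty pieces, then split each piece into lines.
chunks = [piece.splitlines() for piece in raw if piece]
print(chunks)
```

Another thing to keep in mind is that split() matches substrings, so a line
such as "x--" would also be treated as ending in a delimiter.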

If condition were a little more complicated, perhaps re.split would
work.
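
For instance, assuming delimiter lines of the form "=== 1 ===" (a format
invented for this example), something along these lines might do, in
Python 3 syntax:

```python
import re

text = "a\nb\n=== 1 ===\nc\n=== 2 ===\nd\n"

# re.MULTILINE makes ^ match at the start of every line, so only whole
# delimiter lines are matched, not delimiters embedded inside other lines.
delimiter = re.compile(r"^=== \d+ ===\n", re.MULTILINE)

pieces = delimiter.split(text)
chunks = [piece.splitlines() for piece in pieces if piece]
print(chunks)
```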

Or maybe you could look at split and see what it does (since your code
is conceptually very similar to it).

OTOH, FWIW, your version is very clean and very readable and fits my
brain perfectly.

HTH,
Dan

-- 
Dan Sommers
http://www.tombstonezero.net/dan/


Re: advice : how do you iterate with an acc ?

2005-12-02 Thread bonono

Bengt Richter wrote:
 It looks to me like itertools.groupby could get you close to what you want,
 e.g., (untested)
Ah, groupby. The generic string.split() equivalent. But the doc said
the input needs to be sorted.



Re: advice : how do you iterate with an acc ?

2005-12-02 Thread Jeffrey Schwab
[EMAIL PROTECTED] wrote:
 hello,
 
 i'm wondering how people from here handle this, as i often encounter
 something like:
 
 acc = []    # accumulator ;)
 for line in fileinput.input():
     if condition(line):
         if acc:    #1
             doSomething(acc)    #1
         acc = []
     else:
         acc.append(line)
 if acc:    #2
     doSomething(acc)    #2

 BTW i am particularly annoyed by #1 and #2 as it is a repetition, and i
 think it is quite error prone, how will you do it in a pythonic way ?

Could you add a sentry to the end of your input?  E.g.:

for line in fileinput.input() + line_that_matches_condition:

This way, you wouldn't need a separate check at the end.