Re: Beginner Question : Iterators and zip

2008-07-14 Thread cokofreedom

 zip(*vec_list) will zip together all entries in vec_list
 Do be aware that zip stops on the shortest iterable.  So if vec[1] is
 shorter than vec[0] and matches otherwise, your output line will be
 truncated.  Or if vec[1] is longer and vec[0] matches as far as it goes,
 there will be no signal either.


Do note that from Python 3.0 there is another form of zip,
itertools.zip_longest (itertools.izip_longest in 2.6), that will
read until all lists are exhausted, with the shorter ones being padded
with a settable fill value. Very useful!
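
As a rough illustration of that padding form, here is a minimal sketch. It assumes Python 3, where the function is itertools.zip_longest; the two sample lists are made up for the example.

```python
from itertools import zip_longest

short = [0, 1, 2]
longer = [10, 11, 12, 13, 14]

# Plain zip stops as soon as the shortest input runs out.
truncated = list(zip(short, longer))

# zip_longest keeps going and pads the exhausted input with fillvalue.
padded = list(zip_longest(short, longer, fillvalue=None))

print(truncated)
print(padded)
```

Picking a fill value that cannot occur in the real data makes it easy to spot where one input ended early.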
--
http://mail.python.org/mailman/listinfo/python-list


Re: Beginner Question : Iterators and zip

2008-07-14 Thread moogyd
On 13 Jul, 19:49, Terry Reedy [EMAIL PROTECTED] wrote:
 [EMAIL PROTECTED] wrote:
  What is this *lists operation called? I am having trouble finding any
  reference to it in the Python docs or the book Learning Python.

 One might call this argument unpacking, but
 Language Manual / Expressions / Primaries / Calls
 simply calls it *expression syntax:
 "If the syntax *expression appears in the function call, expression must
 evaluate to a sequence. Elements from this sequence are treated as if
 they were additional positional arguments; if there are positional
 arguments x1,...,xN, and expression evaluates to a sequence
 y1,...,yM, this is equivalent to a call with M+N positional arguments
 x1,...,xN,y1,...,yM."

 See Compound Statements / Function definitions for the mirror syntax in
 definitions.

 tjr

Thanks,

It's starting to make sense :-)

Steven


Re: Beginner Question : Iterators and zip

2008-07-13 Thread moogyd
On 12 Jul, 21:50, [EMAIL PROTECTED]
[EMAIL PROTECTED] wrote:
 On 12 juil, 20:55, [EMAIL PROTECTED] wrote:



 zip is (mostly) ok. What you're missing is how to use it for any
 arbitrary number of sequences. Try this instead:

 >>> lists = [range(5), range(5,11), range(11, 16)]
 >>> lists
 [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9, 10], [11, 12, 13, 14, 15]]
 >>> for item in zip(*lists):
 ...     print item
 ...
 (0, 5, 11)
 (1, 6, 12)
 (2, 7, 13)
 (3, 8, 14)
 (4, 9, 15)

What is this *lists operation called? I am having trouble finding any
reference to it in the Python docs or the book Learning Python.

  Any other comments/suggestions appreciated.

 There's a difflib package in the standard lib. Did you give it a try ?

I'll check it out, but I am a newbie, so I am writing this as a
(useful) learning exercise.

Thanks for the help

Steven




Re: Beginner Question : Iterators and zip

2008-07-13 Thread Terry Reedy

[EMAIL PROTECTED] wrote:


What is this *lists operation called? I am having trouble finding any
reference to it in the Python docs or the book Learning Python.


One might call this argument unpacking, but
Language Manual / Expressions / Primaries / Calls
simply calls it *expression syntax:
"If the syntax *expression appears in the function call, expression must
evaluate to a sequence. Elements from this sequence are treated as if
they were additional positional arguments; if there are positional
arguments x1,...,xN, and expression evaluates to a sequence
y1,...,yM, this is equivalent to a call with M+N positional arguments
x1,...,xN,y1,...,yM."


See Compound Statements / Function definitions for the mirror syntax in 
definitions.
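
To make the two directions concrete, here is a small sketch. The function name describe and the sample values are hypothetical, chosen only for illustration.

```python
def describe(first, second, *rest):
    # In a definition, *rest collects any extra positional arguments
    # into a tuple.
    return first, second, rest

args = [3, 4, 5]

# In a call, *args spreads the sequence into separate positional
# arguments, so this is the same as describe(1, 2, 3, 4, 5).
result = describe(1, 2, *args)
print(result)
```

This is why zip(*lists) works for any number of sequences: each element of lists becomes one positional argument to zip.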


tjr



Beginner Question : Iterators and zip

2008-07-12 Thread moogyd
Hi group,

I have a basic question on the zip built-in function.

I am writing a simple text file comparison script that compares line
by line and character by character. The output is the original file,
with an X in place of any characters that are different.

I have managed a solution for a fixed number (3) of files, but I want
a solution for any number of input files.

The outline of my solution:

for vec in zip(vec_list[0],vec_list[1],vec_list[2]):
    res = ''
    for entry in zip(vec[0],vec[1],vec[2]):
        if len(set(entry)) > 1:
            res = res+'X'
        else:
            res = res+entry[0]
    outfile.write(res)

So vec is a tuple containing a line from each file, and then entry is
a tuple containing a character from each line.

2 questions
1) What is the general solution? Using zip in this way looks wrong. Is
there another function that does what I want?
2) I am using set to remove any repeated characters. Is there a
better way ?

Any other comments/suggestions appreciated.

Thanks,

Steven







Re: Beginner Question : Iterators and zip

2008-07-12 Thread Larry Bates

[EMAIL PROTECTED] wrote:

Hi group,

I have a basic question on the zip built-in function.

I am writing a simple text file comparison script that compares line
by line and character by character. The output is the original file,
with an X in place of any characters that are different.

I have managed a solution for a fixed number (3) of files, but I want
a solution for any number of input files.

The outline of my solution:

for vec in zip(vec_list[0],vec_list[1],vec_list[2]):
    res = ''
    for entry in zip(vec[0],vec[1],vec[2]):
        if len(set(entry)) > 1:
            res = res+'X'
        else:
            res = res+entry[0]
    outfile.write(res)

So vec is a tuple containing a line from each file, and then entry is
a tuple containing a character from each line.

2 questions
1) What is the general solution? Using zip in this way looks wrong. Is
there another function that does what I want?
2) I am using set to remove any repeated characters. Is there a
better way ?

Any other comments/suggestions appreciated.

Thanks,

Steven






You should take a look at Python's difflib library.  It probably already does
what you are attempting to re-invent.
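
For reference, a minimal sketch of what difflib offers, assuming two versions of a file already read into lists of lines. The file names and contents here are invented for the example.

```python
import difflib

old_lines = ["spam\n", "eggs\n", "ham\n"]
new_lines = ["spam\n", "egg\n", "ham\n"]

# unified_diff yields the diff one line at a time; the file names are
# only labels for the header.
diff = list(difflib.unified_diff(old_lines, new_lines,
                                 fromfile="a.txt", tofile="b.txt"))
print("".join(diff))
```

Note that difflib compares two sequences at a time, so it does not directly produce the character-masked output for an arbitrary number of files described in the question.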

-Larry


Re: Beginner Question : Iterators and zip

2008-07-12 Thread [EMAIL PROTECTED]
On 12 juil, 20:55, [EMAIL PROTECTED] wrote:
 Hi group,

 I have a basic question on the zip built-in function.

 I am writing a simple text file comparison script that compares line
 by line and character by character. The output is the original file,
 with an X in place of any characters that are different.

 I have managed a solution for a fixed number (3) of files, but I want
 a solution for any number of input files.

 The outline of my solution:

 for vec in zip(vec_list[0],vec_list[1],vec_list[2]):
     res = ''
     for entry in zip(vec[0],vec[1],vec[2]):
         if len(set(entry)) > 1:
             res = res+'X'
         else:
             res = res+entry[0]
     outfile.write(res)

 So vec is a tuple containing a line from each file, and then entry is
 a tuple containing a character from each line.

 2 questions
 1) What is the general solution? Using zip in this way looks wrong. Is
 there another function that does what I want?

zip is (mostly) ok. What you're missing is how to use it for any
arbitrary number of sequences. Try this instead:

>>> lists = [range(5), range(5,11), range(11, 16)]
>>> lists
[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9, 10], [11, 12, 13, 14, 15]]
>>> for item in zip(*lists):
...     print item
...
(0, 5, 11)
(1, 6, 12)
(2, 7, 13)
(3, 8, 14)
(4, 9, 15)
>>> lists = [range(5), range(5,11), range(11, 16), range(16, 20)]
>>> for item in zip(*lists):
...     print item
...
(0, 5, 11, 16)
(1, 6, 12, 17)
(2, 7, 13, 18)
(3, 8, 14, 19)


The only caveat with zip() is that it will only use as many items as
there are in your shortest sequence, i.e.:

>>> zip(range(3), range(10))
[(0, 0), (1, 1), (2, 2)]
>>> zip(range(30), range(10))
[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8), (9, 9)]


So you'd better pad your sequences to make them as long as the longest
one. There are idioms for doing this using the itertools module's
chain and repeat iterators, but I'll leave a concrete example as an
exercise to the reader !-)
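
One possible way to do that padding, sketched for Python 3. The helper name pad and the sample data are hypothetical, and islice is used in addition to chain and repeat to cut the endless stream of fill values at the target length.

```python
from itertools import chain, islice, repeat

def pad(seq, length, fill=None):
    # Append an endless stream of fill values, then cut at the target length.
    return list(islice(chain(seq, repeat(fill)), length))

lists = [list(range(3)), list(range(10, 15))]
longest = max(len(seq) for seq in lists)

# Every sequence now has the same length, so zip loses nothing.
padded = [pad(seq, longest) for seq in lists]
rows = list(zip(*padded))
print(rows)
```

The fill value shows up in the zipped tuples wherever a sequence ran out, so a comparison such as len(set(entry)) > 1 flags those positions as differences.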

 2) I am using set to remove any repeated characters. Is there a
 better way ?

That's probably what I'd do too.

 Any other comments/suggestions appreciated.

There's a difflib package in the standard lib. Did you give it a try ?


Re: Beginner Question : Iterators and zip

2008-07-12 Thread Terry Reedy



[EMAIL PROTECTED] wrote:

Hi group,

I have a basic question on the zip built-in function.

I am writing a simple text file comparison script that compares line
by line and character by character. The output is the original file,
with an X in place of any characters that are different.

I have managed a solution for a fixed number (3) of files, but I want
a solution for any number of input files.

The outline of my solution:

for vec in zip(vec_list[0],vec_list[1],vec_list[2]):
    res = ''
    for entry in zip(vec[0],vec[1],vec[2]):
        if len(set(entry)) > 1:
            res = res+'X'
        else:
            res = res+entry[0]
    outfile.write(res)

So vec is a tuple containing a line from each file, and then entry is
a tuple containing a character from each line.

2 questions
1) What is the general solution? Using zip in this way looks wrong. Is
there another function that does what I want?


zip(*vec_list) will zip together all entries in vec_list
Do be aware that zip stops on the shortest iterable.  So if vec[1] is 
shorter than vec[0] and matches otherwise, your output line will be 
truncated.  Or if vec[1] is longer and vec[0] matches as far as it goes, 
there will be no signal either.
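
Put together, the general version might look roughly like the sketch below (Python 3). The function name mask_differences, the marker parameter and the sample data are assumptions made for illustration, not part of the original script, and the truncation caveat above still applies.

```python
def mask_differences(lines_per_file, marker="X"):
    """Yield one output line per row, with marker where the files disagree."""
    for vec in zip(*lines_per_file):      # one line from each file
        res = []
        for entry in zip(*vec):           # one character from each line
            res.append(marker if len(set(entry)) > 1 else entry[0])
        yield "".join(res)

# Three "files", each given as a list of lines (made-up data).
files = [
    ["abc\n", "hello\n"],
    ["abd\n", "hello\n"],
    ["abc\n", "jello\n"],
]
masked = list(mask_differences(files))
print("".join(masked))
```

With real files, the lists of lines could be replaced by open file objects, since zip accepts any iterables.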


res=res+whatever can be written as res+=whatever


2) I am using set to remove any repeated characters. Is there a
better way ?


I might have written a third loop to compare vec[0] to vec[1]..., but 
your set solution is easier and prettier.


If speed is an issue, don't rebuild the output line char by char.  Just 
change what is needed in a mutable copy.  I like this better anyway.


res = list(vec[0]) # if all ascii, in 3.0 use bytearray
for n, entry in enumerate(zip(vec[0],vec[1],vec[2])):
    if len(set(entry)) > 1:
        res[n] = 'X'
outfile.write(''.join(res)) # in 3.0, write(res)

tjr



