Re: [Tutor] line number when reading files using csv module

2006-10-30 Thread Duncan Gibson
Duncan Gibson wrote:
  
  import csv

  class MyReader(object):

      def __init__(self, inputFile):
          self.reader = csv.reader(inputFile, delimiter=' ')
          self.lineNumber = 0

      def __iter__(self):
          self.lineNumber += 1
          return self.reader.__iter__()
 
  Is there some other __special__ method that I need to forward to the
  csv.reader, or have I lost all control once __iter__ has done its job?

Kent Johnson wrote:
 __iter__() should return self, not self.reader.__iter__(), otherwise 
 Python is using the actual csv.reader not your wrapper. And don't 
 increment line number here.
 
 You lost control because you gave it away.


Thanks Kent. The penny has dropped and it makes a lot more sense now.

I was looking at __iter__ as a special function that *created* an
iterator, but all it really does is signal that the returned object
will implement the iterator interface, and the next() method in
particular. 

Cheers
Duncan
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] line number when reading files using csv module

2006-10-30 Thread Kent Johnson
Duncan Gibson wrote:
 Duncan Gibson wrote:
 import csv

 class MyReader(object):

     def __init__(self, inputFile):
         self.reader = csv.reader(inputFile, delimiter=' ')
         self.lineNumber = 0

     def __iter__(self):
         self.lineNumber += 1
         return self.reader.__iter__()

 Is there some other __special__ method that I need to forward to the
 csv.reader, or have I lost all control once __iter__ has done its job?
 
 Kent Johnson wrote:
 __iter__() should return self, not self.reader.__iter__(), otherwise 
 Python is using the actual csv.reader not your wrapper. And don't 
 increment line number here.

 You lost control because you gave it away.
 
 
 Thanks Kent. The penny has dropped and it makes a lot more sense now.
 
 I was looking at __iter__ as a special function that *created* an
 iterator, but all it really does is signal that the returned object
 will implement the iterator interface, and the next() method in
 particular. 

I think you might still be a tiny bit confused. __iter__() is not just a 
marker, it does return an iterator. It helps if you distinguish two 
cases - iterable and iterator.

An iterable object is one for which iter(obj) returns an iterator. An 
iterator is an object with a next() method. As a special case, iterators 
are also iterable - if obj is already an iterator, then iter(obj) is 
just obj.

In modern Python (since 2.2) iter() is implemented by calling the 
special method __iter__() which should return an iterator. For a 
sequence like a list, __iter__() will return a new iterator object. 
Again, as a special case, __iter__() for an iterator returns the 
iterator itself.
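
The distinction can be checked directly. This is a minimal sketch in Python 3 syntax, where the iterator method is spelled __next__() rather than the next() discussed in this thread; the variable names are only illustrative.

```python
# Python 3 sketch; in the Python 2 of this thread the method is next(), not __next__().
numbers = [1, 2, 3]          # a list is iterable but is not itself an iterator

first = iter(numbers)        # each iter() call on a list builds a new iterator
second = iter(numbers)
assert first is not second

assert iter(first) is first  # an iterator is its own iterator

assert next(first) == 1      # the two iterators advance independently
assert next(first) == 2
assert next(second) == 1

assert not hasattr(numbers, '__next__')  # the list has no next-method
assert hasattr(first, '__next__')        # the iterator does
```

Because iter() on an iterator hands back the same object, a for loop works equally well on a list or on an iterator taken from it.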

You might want to read this:
http://www.python.org/doc/2.2.3/whatsnew/node4.html

I hope I am shedding light here :-)
Kent



[Tutor] line number when reading files using csv module

2006-10-27 Thread Duncan Gibson

If I have the following data file, data.csv:
1 2 3
2 3 4 5

then I can read it in Python 2.4 on linux using:

import csv
f = file('data.csv', 'rb')
reader = csv.reader(f)
for data in reader:
    print data

OK, that's all well and good, but I would like to record
the line number in the file. According to the documentation,
each reader object has a public 'line_num' attribute
http://docs.python.org/lib/node265.html and
http://docs.python.org/lib/csv-examples.html supports this.

If I now change the loop to read:

for data in reader:
    print reader.line_num, data

I'm presented with the error:
AttributeError: '_csv.reader' object has no attribute 'line_num'

This has floored me. I've even looked at the source code and I can
see the line_num variable in the underlying _csv.c file. I can even
see the test_csv.py code that checks it!

def test_read_linenum(self):
    r = csv.reader(['line,1', 'line,2', 'line,3'])
    self.assertEqual(r.line_num, 0)

I suspect this is something so obvious that I just can't see the
wood for the trees and I will kick myself.

Any ideas?

Cheers
Duncan


Re: [Tutor] line number when reading files using csv module

2006-10-27 Thread Duncan Gibson
On Fri, 27 Oct 2006 11:35:40 +0200
Duncan Gibson wrote:

 
 If I have the following data file, data.csv:
 1 2 3
 2 3 4 5
 
 then I can read it in Python 2.4 on linux using:
 
 import csv
 f = file('data.csv', 'rb')
 reader = csv.reader(f)
 for data in reader:
     print data

Oops, mixing examples here. I forgot to say that
I'm actually using

reader = csv.reader(f, delimiter=' ')

so it will read the data correctly even if there
isn't a comma in sight in the csv file, but that's
a side issue to the line number problem.
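
The effect of the delimiter argument is easy to see on the sample data. This is a small sketch in Python 3 syntax, with an in-memory string standing in for data.csv (an assumption made so the example is self-contained).

```python
import csv
import io

# Stand-in for the data.csv from the post, held in memory.
sample = "1 2 3\n2 3 4 5\n"

default_rows = list(csv.reader(io.StringIO(sample)))
space_rows = list(csv.reader(io.StringIO(sample), delimiter=' '))

print(default_rows)  # each row is a single field, since there are no commas
print(space_rows)    # each row is split on spaces
```

With the default comma delimiter each line comes back as one unsplit field, which is why the space delimiter is needed for this file.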

Cheers
Duncan


Re: [Tutor] line number when reading files using csv module

2006-10-27 Thread Kent Johnson
Duncan Gibson wrote:
 If I have the following data file, data.csv:
 1 2 3
 2 3 4 5
 
 then I can read it in Python 2.4 on linux using:
 
 import csv
 f = file('data.csv', 'rb')
 reader = csv.reader(f)
 for data in reader:
     print data
 
 OK, that's all well and good, but I would like to record
 the line number in the file. According to the documentation,
 each reader object has a public 'line_num' attribute
 http://docs.python.org/lib/node265.html and
 http://docs.python.org/lib/csv-examples.html supports this.
 
 If I now change the loop to read:
 
 for data in reader:
     print reader.line_num, data
 
 I'm presented with the error:
 AttributeError: '_csv.reader' object has no attribute 'line_num'

The line_num attribute is new in Python 2.5. This is a doc bug, it 
should be noted in the description of line_num.
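
On an interpreter that does have the attribute (2.5 or later), the loop from the question behaves as hoped. The following is a sketch in Python 3 syntax, again using an in-memory stand-in for data.csv; the list named seen exists only to make the result easy to inspect.

```python
import csv
import io

sample = io.StringIO("1 2 3\n2 3 4 5\n")
reader = csv.reader(sample, delimiter=' ')

seen = []
for row in reader:
    # line_num counts source lines read so far, so it is 1 after the first row
    seen.append((reader.line_num, row))
    print(reader.line_num, row)
```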

Kent



Re: [Tutor] line number when reading files using csv module

2006-10-27 Thread Duncan Gibson


Kent Johnson wrote:
 The line_num attribute is new in Python 2.5. This is a doc bug,
 it should be noted in the description of line_num.

Is there some way to create a wrapper around a 2.4 csv.reader to
give me pseudo line number handling? I've been experimenting with:

import csv

class MyReader(object):

    def __init__(self, inputFile):
        self.reader = csv.reader(inputFile, delimiter=' ')
        self.lineNumber = 0

    def __iter__(self):
        self.lineNumber += 1
        return self.reader.__iter__()

    def next(self):
        self.lineNumber += 1  # do I need this one?
        return self.reader.next()

if __name__ == '__main__':
    inputFile = file('data.csv', 'rb')
    reader = MyReader(inputFile)
    for data in reader:
        print reader.lineNumber, data

But that doesn't seem to do what I want. If I add some print statements
to the methods, I can see that it calls __iter__ only once:

__iter__
1 ['1', '2', '3']
1 ['2', '3', '4', '5']
1 ['3', '4', '5', '6', '7']
1 ['4', '5', '6', '7']
1 ['5', '6', '7', '8', '9']

Is there some other __special__ method that I need to forward to the
csv.reader, or have I lost all control once __iter__ has done its job?

Cheers
Duncan


Re: [Tutor] line number when reading files using csv module

2006-10-27 Thread Kent Johnson
Duncan Gibson wrote:
 
 Kent Johnson wrote:
 The line_num attribute is new in Python 2.5. This is a doc bug,
 it should be noted in the description of line_num.
 
 Is there some way to create a wrapper around a 2.4 csv.reader to
 give me pseudo line number handling? I've been experimenting with:
 
 import csv

 class MyReader(object):

     def __init__(self, inputFile):
         self.reader = csv.reader(inputFile, delimiter=' ')
         self.lineNumber = 0

     def __iter__(self):
         self.lineNumber += 1
         return self.reader.__iter__()

__iter__() should return self, not self.reader.__iter__(), otherwise 
Python is using the actual csv.reader not your wrapper. And don't 
increment line number here.

 
     def next(self):
         self.lineNumber += 1  # do I need this one?

Yes.

         return self.reader.next()

An easier way to do this is to use enumerate():
inputFile = file('data.csv', 'rb')
reader = csv.reader(inputFile)
for i, data in enumerate(reader):
   print i, data

Of course this will not necessarily give you the true line number, as a 
record may span multiple file lines and you may be processing header and 
blank lines at the start of the file before you get any records. But it 
is probably the best you can do and maybe it is what you need.
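
The gap between record number and file line number shows up as soon as a quoted field contains a newline. This sketch (Python 3 syntax, made-up sample data) compares the enumerate() counter with the line_num attribute that 2.5 and later provide; the start argument to enumerate() is also newer than Python 2.4.

```python
import csv
import io

# Second record has a quoted field containing a newline, so it spans two lines.
sample = io.StringIO('a,b\n"two\nlines",c\nd,e\n')
reader = csv.reader(sample)

pairs = []
for record_no, row in enumerate(reader, 1):
    # record_no counts records; reader.line_num counts physical lines consumed
    pairs.append((record_no, reader.line_num))

print(pairs)
```

The two counters agree on the first record and drift apart after the multi-line one, so enumerate() is a record counter rather than a true line counter.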

 
 if __name__ == '__main__':
     inputFile = file('data.csv', 'rb')
     reader = MyReader(inputFile)
     for data in reader:
         print reader.lineNumber, data
 
 But that doesn't seem to do what I want. If I add some print statements
 to the methods, I can see that it calls __iter__ only once:

__iter__() will only be called once, it is next() that is called 
multiple times.
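
A small tracing class makes the call pattern visible. This is an illustrative sketch in Python 3 syntax, where the method is __next__() instead of next(); the Tracer class and its attribute names are hypothetical.

```python
class Tracer:
    """Iterator over range(3) that records which special methods get called."""

    def __init__(self):
        self.calls = []
        self._values = iter(range(3))

    def __iter__(self):
        self.calls.append('__iter__')
        return self

    def __next__(self):            # spelled next() in Python 2
        self.calls.append('__next__')
        return next(self._values)  # StopIteration propagates and ends the loop


tracer = Tracer()
items = []
for item in tracer:
    items.append(item)

print(tracer.calls)
```

The for loop asks for the iterator once and then calls the next-method once per item, plus one final call that raises StopIteration.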

 
 __iter__
 1 ['1', '2', '3']
 1 ['2', '3', '4', '5']
 1 ['3', '4', '5', '6', '7']
 1 ['4', '5', '6', '7']
 1 ['5', '6', '7', '8', '9']
 
 Is there some other __special__ method that I need to forward to the
 csv.reader, or have I lost all control once __iter__ has done its job?

You lost control because you gave it away.
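
Putting the corrections together, a wrapper along these lines keeps control of the iteration. It is a sketch only, written in Python 3 syntax (in 2.4 the method would be next() and the file would be opened with file()); the class name CountingReader and its attribute names are hypothetical, and the count is of records returned, not of physical lines.

```python
import csv
import io


class CountingReader:
    """Hypothetical wrapper: csv.reader plus a running count of records read."""

    def __init__(self, input_file, **reader_kwargs):
        self.reader = csv.reader(input_file, **reader_kwargs)
        self.line_number = 0

    def __iter__(self):
        return self                # hand back the wrapper, not the inner reader

    def __next__(self):            # named next() in Python 2.4
        row = next(self.reader)    # raises StopIteration at end of input
        self.line_number += 1      # count only rows actually returned
        return row


reader = CountingReader(io.StringIO("1 2 3\n2 3 4 5\n"), delimiter=' ')
numbered = [(reader.line_number, row) for row in reader]
print(numbered)
```

Incrementing after the inner next() call means the final StopIteration does not bump the counter.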

Kent

 
 Cheers
 Duncan
 
 




Re: [Tutor] line number when reading files using csv module

2006-10-27 Thread Bob Gailer
Duncan Gibson wrote:
 If I have the following data file, data.csv:
 1 2 3
 2 3 4 5

 then I can read it in Python 2.4 on linux using:

 import csv
 f = file('data.csv', 'rb')
 reader = csv.reader(f)
 for data in reader:
     print data

 OK, that's all well and good, but I would like to record
 the line number in the file. According to the documentation,
 each reader object has a public 'line_num' attribute
 http://docs.python.org/lib/node265.html and
 http://docs.python.org/lib/csv-examples.html supports this.

 If I now change the loop to read:

 for data in reader:
     print reader.line_num, data

 I'm presented with the error:
 AttributeError: '_csv.reader' object has no attribute 'line_num'
   
Well, I get the same exception. dir(reader) does not show line_num.

 This has floored me. I've even looked at the source code and I can
 see the line_num variable in the underlying _csv.c file. I can even
 see the test_csv.py code that checks it!

 def test_read_linenum(self):
     r = csv.reader(['line,1', 'line,2', 'line,3'])
     self.assertEqual(r.line_num, 0)

 I suspect this is something so obvious that I just can't see the
 wood for the trees and I will kick myself.

 Any ideas?

 Cheers
 Duncan

   


-- 
Bob Gailer
510-978-4454



Re: [Tutor] line number when reading files using csv module

2006-10-27 Thread Bob Gailer
Duncan Gibson wrote:
 [snip]

 but I would like to record the line number in the file. 
   
How about using enumerate():

 for line_num, data in enumerate(reader):
     print reader.line_num, data

   


-- 
Bob Gailer
510-978-4454



Re: [Tutor] line number when reading files using csv module CORRECTION

2006-10-27 Thread Bob Gailer
Bob Gailer wrote:
 Duncan Gibson wrote:
 [snip]
 but I would like to record the line number in the file.

 How about using enumerate():

 for line_num, data in enumerate(reader):
     # print reader.line_num, data # SHOULD BE:
     print line_num, data


 
   


-- 
Bob Gailer
510-978-4454
