Re: Newby: how to transform text into lines of text

Gabriel Genellina Sun, 25 Jan 2009 20:58:56 -0800

On Mon, 26 Jan 2009 00:23:30 -0200, John Machin <sjmac...@lexicon.net> wrote:

On Jan 26, 1:03 pm, "Gabriel Genellina" <gagsl-...@yahoo.com.ar>
wrote:

It's so easy that not doing it is just inexcusable laziness :)
Your own example, written using the csv module:

import csv

f = csv.reader(open('customer_x.txt','rb'), delimiter='\t')
headers = f.next()
for line in f:
     field1, field2, field3 = line
     do_stuff()


And where in all of that do you recommend that .decode(some_encoding)
be inserted?

For encodings that don't use embedded NUL bytes (latin1, utf8) I'd decode the fields right when extracting them:


    field1, field2, field3 = (field.decode('utf8') for field in line)

For encodings that allow NUL bytes, I'd use any of the recipes in the csv module documentation.

(That is, if I care about the encoding at all. Perhaps the file contains only numbers. Perhaps it contains only ASCII characters. Perhaps I'm only interested in some fields for which the encoding is irrelevant. Perhaps it is an internally generated file and it doesn't matter as long as I use the same encoding on output.)

But I admit that in general, "decode input early when reading, work in unicode, encode output late when writing" is the best practice.
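
For what it's worth, here is a minimal sketch of that practice. It assumes Python 3, where the csv module works on text rather than bytes (the snippets earlier in this message are Python 2), and the names read_rows and write_rows are made up for illustration; they are not part of any library.

```python
import csv
import io


def read_rows(raw_bytes, encoding='utf8'):
    """Decode early: turn the raw bytes into text before parsing them."""
    text = raw_bytes.decode(encoding)
    # newline='' leaves line endings alone so csv can handle them itself.
    return list(csv.reader(io.StringIO(text, newline=''), delimiter='\t'))


def write_rows(rows, encoding='utf8'):
    """Encode late: build the output as text, convert to bytes at the end."""
    buf = io.StringIO(newline='')
    writer = csv.writer(buf, delimiter='\t', lineterminator='\n')
    writer.writerows(rows)
    return buf.getvalue().encode(encoding)


# Hypothetical input: UTF-8 bytes, tab-separated, one header line.
raw = 'name\tcity\nJos\u00e9\tM\u00e1laga\n'.encode('utf8')

rows = read_rows(raw)                                   # text from here on
shouted = [[field.upper() for field in row] for row in rows]
out = write_rows(shouted, encoding='latin1')            # bytes again
```

Everything between read_rows and write_rows deals only with text, so the input and output encodings can differ without the processing code having to know about either one.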


--
Gabriel Genellina

--
http://mail.python.org/mailman/listinfo/python-list
