Bret wrote:
i have a csv file like so:
row1,field1,[field2][text in field2 "quote, quote"],field3,field
row2,field1,[field2]text in field2 "quote, quote",field3,field

using csv.reader to read the file, the first row is broken into two
fields:
[field2][text in field2 "quote
and
 quote"

while the second row is read correctly with:
[field2]text in field2 "quote, quote"
being one field.

any ideas how to make csv.reader work correctly for the first case?
the problem is the comma inside the quote inside the brackets, ie:
[","]

When posting, give the version, the minimum code that shows the problem, and the actual output. Cut and paste the latter two. Reports are less credible otherwise.

Using 3.1rc1

txt = [
'''row1,field1,[field2][text in field2 "quote, quote"],field3,field''',
'''row2,field1,[field2] text in field2 "quote, quote", field3,field''',
'''row2,field1, field2  text in field2 "quote, quote", field3,field''',
]
import csv
for row in csv.reader(txt): print(len(row),row)

produces

6 ['row1', 'field1', '[field2][text in field2 "quote', ' quote"]', 'field3', 'field']
6 ['row2', 'field1', '[field2] text in field2 "quote', ' quote"', ' field3', 'field']
6 ['row2', 'field1', ' field2  text in field2 "quote', ' quote"', ' field3', 'field']

In 3.1 at least, the presence or absence of brackets is irrelevant, as I expected it to be. For double quotes to protect the comma delimiter, the *entire field* must be quoted, not just part of it.
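To illustrate quoting the entire field, here is a minimal sketch. It assumes you can change how the file is written, and it relies on the default dialect, in which a literal double quote inside a quoted field is written doubled (""). The sample line is my own adaptation of row1, not your original data.

```python
import csv

# The whole third field is wrapped in double quotes, and the quotes that
# belong to the text itself are doubled (default dialect: doublequote=True).
line = 'row1,field1,"[field2][text in field2 ""quote, quote""]",field3,field'

row = next(csv.reader([line]))
print(len(row), row)
```

With the field quoted this way, the comma between the inner quotes stays inside the third field instead of splitting it.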

If you want to escape the delimiter without quoting entire fields, use an escape char and change the dialect. For example

txt = [
'''row1,field1,[field2][text in field2 "quote`, quote"],field3,field''',
'''row2,field1,[field2] text in field2 "quote`, quote", field3,field''',
'''row2,field1, field2  text in field2 "quote`, quote", field3,field''',
]
import csv
for row in csv.reader(txt, quoting=csv.QUOTE_NONE, escapechar = '`'):
    print(len(row),row)

produces what you desire

5 ['row1', 'field1', '[field2][text in field2 "quote, quote"]', 'field3', 'field']
5 ['row2', 'field1', '[field2] text in field2 "quote, quote"', ' field3', 'field']
5 ['row2', 'field1', ' field2  text in field2 "quote, quote"', ' field3', 'field']
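If you also control the program that writes the file, the same two settings can be given to csv.writer so that it inserts the escape character for you. This is only a sketch under that assumption; the variable names are mine. Note that the writer may escape the double quotes as well as the comma, so the text on disk can differ from the hand-written lines above, but the reader undoes whatever the writer escaped.

```python
import csv
import io

fields = ['row1', 'field1', '[field2][text in field2 "quote, quote"]',
          'field3', 'field']

# Write with no quoting at all; special characters get the escape char.
buf = io.StringIO(newline='')
csv.writer(buf, quoting=csv.QUOTE_NONE, escapechar='`').writerow(fields)
written = buf.getvalue()
print(written)

# Read it back with the same settings.
back = next(csv.reader(written.splitlines(),
                       quoting=csv.QUOTE_NONE, escapechar='`'))
print(len(back), back)
```

The row read back should have the same five fields that were written.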


Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list
