Bret wrote:
i have a csv file like so:
row1,field1,[field2][text in field2 "quote, quote"],field3,field
row2,field1,[field2]text in field2 "quote, quote",field3,field
using csv.reader to read the file, the first row is broken into two
fields:
[field2][text in field2 "quote
and
quote"
while the second row is read correctly with:
[field2]text in field2 "quote, quote"
being one field.
Any ideas how to make csv.reader work correctly for the first case?
The problem is the comma inside the quote inside the brackets, i.e.:
[","]
When posting, give the version, the minimum code that shows the problem,
and the actual output. Cut and paste the latter two; reports are less
credible otherwise.
Using Python 3.1rc1:
txt = [
'''row1,field1,[field2][text in field2 "quote, quote"],field3,field''',
'''row2,field1,[field2] text in field2 "quote, quote", field3,field''',
'''row2,field1, field2 text in field2 "quote, quote", field3,field''',
]
import csv
for row in csv.reader(txt): print(len(row),row)
produces
6 ['row1', 'field1', '[field2][text in field2 "quote', ' quote"]', 'field3', 'field']
6 ['row2', 'field1', '[field2] text in field2 "quote', ' quote"', ' field3', 'field']
6 ['row2', 'field1', ' field2 text in field2 "quote', ' quote"', ' field3', 'field']
In 3.1 at least, the presence or absence of brackets is irrelevant, as I
expected it to be. For double quotes to protect the comma delimiter,
the *entire field* must be quoted, not just part of it.
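To illustrate whole-field quoting, here is a minimal sketch. The sample line is hypothetical (it is not the original poster's file): the third field is wrapped in double quotes and each literal double quote inside it is doubled, which is what the default dialect expects.

```python
import csv

# Hypothetical sample data: the entire third field is quoted, and the
# embedded double quotes are doubled so the reader keeps them as literals.
lines = [
    'row1,field1,"[field2][text in field2 ""quote, quote""]",field3,field',
]

rows = list(csv.reader(lines))
for row in rows:
    print(len(row), row)
```

With the default dialect this should give five fields, with the comma kept inside the third one. It only helps if you control how the file is written.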
If you want to escape the delimiter without quoting entire fields, use
an escape char and change the dialect. For example
txt = [
'''row1,field1,[field2][text in field2 "quote`, quote"],field3,field''',
'''row2,field1,[field2] text in field2 "quote`, quote", field3,field''',
'''row2,field1, field2 text in field2 "quote`, quote", field3,field''',
]
import csv
for row in csv.reader(txt, quoting=csv.QUOTE_NONE, escapechar='`'):
    print(len(row), row)
produces what you desire
5 ['row1', 'field1', '[field2][text in field2 "quote, quote"]', 'field3', 'field']
5 ['row2', 'field1', '[field2] text in field2 "quote, quote"', ' field3', 'field']
5 ['row2', 'field1', ' field2 text in field2 "quote, quote"', ' field3', 'field']
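If the same settings are needed in several places, they can be registered once as a named dialect. This is only a sketch: the dialect name "backtick_escaped" is made up for illustration, and the sample line is one of the rows above.

```python
import csv

# "backtick_escaped" is a hypothetical name chosen for this example.
# No quote processing at all; a backtick escapes the following character.
csv.register_dialect("backtick_escaped",
                     quoting=csv.QUOTE_NONE, escapechar="`")

lines = [
    'row1,field1,[field2][text in field2 "quote`, quote"],field3,field',
]

rows = list(csv.reader(lines, dialect="backtick_escaped"))
for row in rows:
    print(len(row), row)
```

The reader then behaves the same as when quoting and escapechar are passed directly to csv.reader.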
Terry Jan Reedy