On Aug 4, 6:15 pm, Ryan Rosario <[EMAIL PROTECTED]> wrote:
> On Aug 4, 1:01 am, John Machin <[EMAIL PROTECTED]> wrote:
>
> > On Aug 4, 5:49 pm, Ryan Rosario <[EMAIL PROTECTED]> wrote:
>
> > > Thanks Emile! Works almost perfectly, but is there some way I can
> > > adapt this to quote fields that contain a comma in them?
>
> > You originally said "I have a very large CSV file that contains double
> > quoted fields (since they contain commas)". Are you now saying that
> > if a field contained a comma, you didn't wrap the field in quotes? Or
> > is this a separate question unrelated to your original problem?
>
> I enclosed all text fields within quotes. The problem is that I have
> quotes embedded inside those text fields as well and I did not
> double/escape them. Emile's snippet takes care of the escaping but it
> strips the outer quotes from the text fields and if there are commas
> inside the text field, the field is split into multiple fields. Of
> course, it is possible that I am not using the snippet correctly I
> suppose.

Without you actually showing how you are using it, I can only surmise:

Emile's snippet is pushing each fixed line through the csv reading process, to demonstrate that his series of replaces works (on your *sole* example, at least). Note carefully that his output for one line is a *list* of fields. The repr() of that list looks superficially like a line of csv input. It looks like you are csv-reading it a second time, using quotechar="'", after stripping off the enclosing []. If this guess is not correct, please show what you are actually doing.

If (as you said) you require a fixed csv file, you need to read the bad file line by line, use Emile's chain of replaces, and write each fixed line out to the new file.
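
Something along these lines is what I mean by "read, fix, write". This is only a rough sketch in present-day Python 3, and it is *not* Emile's snippet: fix_line() is a hypothetical stand-in chain of replaces, and fix_file() and SENTINEL are names I made up for illustration. It assumes that every field on a line is enclosed in double quotes, that fields are separated by exactly "," (quote, comma, quote), that this three-character sequence never occurs inside a field, and that no field contains a newline or a NUL character.

```python
SENTINEL = "\x00"  # assumption: this character never occurs in the data


def fix_line(line):
    """Hypothetical stand-in for a chain of replaces (not Emile's code).

    Doubles the quotes embedded inside fields while leaving the quotes
    that enclose each field alone.
    """
    body = line.rstrip("\r\n")
    ending = line[len(body):]          # keep the original line ending
    if not body:
        return line                    # pass blank lines through untouched
    body = body.replace('","', SENTINEL)   # protect the field separators
    body = body.replace('"', '""')         # double every remaining quote
    body = body.replace(SENTINEL, '","')   # put the separators back
    # The two outermost quotes were doubled as well; drop one from each end.
    return body[1:-1] + ending


def fix_file(src_path, dst_path):
    """Read the bad file line by line and write each fixed line out."""
    with open(src_path, newline="") as src, \
         open(dst_path, "w", newline="") as dst:
        for line in src:
            # Write the fixed *text line*, not the repr() of a list of fields.
            dst.write(fix_line(line))
```

The csv module is used only afterwards, to read the new file once with the default quotechar; it plays no part in producing the fixed file.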