Re: Trying to fix Invalid CSV File

John Machin Mon, 04 Aug 2008 02:36:29 -0700

On Aug 4, 6:15 pm, Ryan Rosario <[EMAIL PROTECTED]> wrote:
> On Aug 4, 1:01 am, John Machin <[EMAIL PROTECTED]> wrote:
>
> > On Aug 4, 5:49 pm, Ryan Rosario <[EMAIL PROTECTED]> wrote:
>
> > > Thanks Emile! Works almost perfectly, but is there some way I can
> > > adapt this to quote fields that contain a comma in them?
>
> > You originally said "I have a very large CSV file that contains double
> > quoted fields (since they contain commas)". Are you now saying that
> > if a field contained a comma, you didn't wrap the field in quotes? Or
> > is this a separate question unrelated to your original problem?
>
> I enclosed all text fields within quotes. The problem is that I have
> quotes embedded inside those text fields as well and I did not
> double/escape them. Emile's snippet takes care of the escaping but it strips
> the outer quotes from the text fields and if there are commas inside
> the text field, the field is split into multiple fields. Of course, it
> is possible that I am not using the snippet correctly I suppose.


Without you actually showing how you are using it, I can only surmise:

Emile's snippet is pushing it through the csv reading process, to
demonstrate that his series of replaces works (on your *sole* example,
at least). Note carefully his output for one line is a *list* of
fields. The repr() of that list looks superficially like a line of csv
input. It looks like you are csv-reading it a second time, using
quotechar="'", after stripping off the enclosing []. If this guess is
not correct, please show what you are actually doing.
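
To illustrate the difference between the repr() of a list and a real csv
line, here is a small sketch. The sample row is made up for illustration
and is not taken from your data; the code is written for Python 3.

```python
import csv
import io

# Hypothetical sample row (an assumption, not data from this thread).
fields = ['1', 'He said "hi", then left', 'x']

# repr() of the list is Python syntax. It only looks like a csv line.
as_repr = repr(fields)

# csv.writer quotes the field that contains a comma and doubles the
# quotes embedded in it.
buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerow(fields)
as_csv = buf.getvalue()

# Reading the genuine csv line back recovers the original fields.
round_trip = next(csv.reader(io.StringIO(as_csv)))

print(as_repr)
print(as_csv, end="")
print(round_trip == fields)
```

Feeding as_repr (minus the brackets) back into a csv reader is the kind of
second pass I suspect you are doing; as_csv is what a csv line should
actually look like.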

If (as you said) you require a fixed csv file, you need to read the
bad file line by line, use Emile's chain of replaces, and write each
fixed line out to the new file.
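
A rough sketch of that loop follows. Note that fix_line() below is a
hypothetical stand-in, NOT Emile's actual chain of replaces. It assumes
that every field on a line is wrapped in double quotes and that no field
contains the three-character sequence "," itself. The file names in the
usage note are also made up.

```python
def fix_line(line):
    """Double the quotes embedded inside fully quoted fields.

    Hypothetical stand-in for the chain of replaces; assumes every field
    is quoted and no field contains the sequence '","'.
    """
    body = line.rstrip("\r\n")
    if len(body) < 2 or not (body.startswith('"') and body.endswith('"')):
        return line  # not in the assumed shape: leave it untouched
    inner = body[1:-1]                     # drop the outermost quotes
    inner = inner.replace('","', "\x00")   # protect the field separators
    inner = inner.replace('"', '""')       # double the embedded quotes
    inner = inner.replace("\x00", '","')   # restore the field separators
    return '"' + inner + '"\n'


def fix_file(src, dst):
    """Read src line by line and write each fixed line to dst."""
    for line in src:
        dst.write(fix_line(line))
```

It could be used along these lines:

    with open("bad.csv", newline="") as src, \
         open("fixed.csv", "w", newline="") as dst:
        fix_file(src, dst)

The output is written as plain text, not via csv.writer, so the outer
quotes stay on every field and nothing gets re-split on embedded commas.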
--
http://mail.python.org/mailman/listinfo/python-list
