Sam Roberts <[EMAIL PROTECTED]> wrote:
> Quoting Scott Taylor <[EMAIL PROTECTED]>, who wrote:
> > Does anyone have a routine to convert from a csv file to an sql database,
> > that works. I'm having a hard time with it.
>
> Below is what I've done, don't know how it would work with other
> database types, but works well for me so far.

> # Strip whitespace around the "," separators.
>
> $_ =~ s/\s*,\s*/,/g;
>
> local @_ = split(/,/, $_);

You appear to be assuming that there are no commas inside the data
itself.

CSV is in my experience a very annoying "standard". It looks so simple
that everyone just hacks their own conversions, and consequently there
are minor variations floating around... e.g. I had to deal with files
that had whitespace after the comma separators, and Text::CSV_XS does
not seem to allow that.

DBD::CSV just uses Text::CSV_XS, by the way, so you don't gain anything
by switching to the DBD version, except possibly a syntax consistent
with what you're already familiar with as a database programmer.

Going back to Scott's original questions:

> What I can't get it to do
> is recognize a multi-line field, ie:
> field1_text, field2_text, "field3_line1^M
> field3_line2^M
> field3_line3",field4_text,...
> I even tried stripping out the doze <CR> (^M). However, I'm not so
> concerned about that part, just in case someone has some insight on that.

The following is from http://www.perldoc.com/cpan/Text/CSV_XS.html

    binary
        If this attribute is TRUE, you may use binary characters in
        quoted fields, including line feeds, carriage returns and NUL
        bytes. (The latter must be escaped as "0.) By default this
        feature is off.

    [...]

    To sum it up,

        $csv = Text::CSV_XS->new();

    is equivalent to

        $csv = Text::CSV_XS->new({
            'quote_char'  => '"',
            'escape_char' => '"',
            'sep_char'    => ',',
            'binary'      => 0
        });

So just doing *this* is Scott's mistake:

    my $csv = Text::CSV_XS->new;

You should probably be doing something like this:

    my $csv = Text::CSV_XS->new({ 'binary' => 1 });

Note that "binary" is also necessary if you've got text with iso8859
extended characters in it.
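
To make the difference concrete, here is a minimal sketch, not Scott's actual script. The sample record is made up (modeled on his example, plus a quoted field with an embedded comma that I added), and it assumes a reasonably recent Text::CSV_XS that accepts "\r\n" line endings by default. It compares the naive split with a parser created with binary => 1 and read through getline(), which keeps pulling physical lines until the record is complete:

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Hypothetical sample record: field 3 spans three physical lines with
# DOS line endings, and field 4 contains an embedded comma.
my $data = qq{field1_text,field2_text,"field3_line1\r\nfield3_line2\r\n}
         . qq{field3_line3","field4, with comma"\r\n};

# Naive approach: take one physical line and split on commas.
# It only ever sees the first line, so the quoted field is cut short.
my ($first_line) = split /\r\n/, $data;
my @naive = split /,/, $first_line;

# binary => 1 permits line feeds and carriage returns inside quoted fields.
my $csv = Text::CSV_XS->new({ binary => 1 })
    or die "Cannot create CSV parser: " . Text::CSV_XS->error_diag;

# Read from an in-memory filehandle so the sketch needs no external file.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @rows;
while (my $row = $csv->getline($fh)) {
    push @rows, $row;    # $row is an array ref holding one complete record
}
close $fh;

printf "naive split saw %d fields on the first physical line\n", scalar @naive;
printf "Text::CSV_XS saw %d record(s), %d fields in the first\n",
    scalar @rows, scalar @{ $rows[0] };
```

With a real file you would open it the usual way and hand that filehandle to getline() instead. Reading line by line yourself and calling parse() on each line would not help with multi-line fields, since each call only sees one physical line.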