Jay...

Your post neatly articulates virtually every facet of this issue.
Thank you.  I wish we could get everyone to stop using csv.  I hate to
look at xml but I often wish everyone would use it instead of csv.  I
would hate to see any of the sqlite core devs  waste time on csv.

Carlos


On 9/25/09, Jay A. Kreibich <j...@kreibi.ch> wrote:
> On Fri, Sep 25, 2009 at 10:24:15AM +1000, BareFeet scratched on the wall:
>
>
>> In reality, in the thousands of CSV files I've dealt with
>> over the years, they all follow the same standard:
>>
>> 1. Rows are delimited by a new line (return and/or line feed).
>> 2. Columns are delimited by a comma.
>> 3. "Quoted text" is treated as one value item, including any commas or
>> new lines within it.
>> 4. A double quote "" is used to put a quote within quotes.
>>
>> That's it.
>
>   This is more or less the standard put forth by RFC 4180.  And if this
>   is all you've encountered, you're not using very many different
>   applications or you're primarily dealing with numbers and simple
>   strings that don't contain quotes or commas.  CSV works very very
>   well if you never get into the question of escapes, but details count.
>
>   Reading the RFC only proves my point.  SQLite v3 is older than that
>   doc, and it pretty much admits the cat was out of the bag a long time
>   ago.  There are a ton of optional and might/may/could sections that
>   event the format they define has a lot of holes in it (i.e. headers,
>   or no headers?).
>
>> Everything I've seen uses this.
>
>   According to the RFC Excel doesn't use double-quotes for anything.
>   You might not care about Excel, but I'm willing to bet it is one of
>   the most-- if not the most-- common exporters of CSV.  The question
>   of getting data from Excel into SQLite shows up on the list every now
>   and then.
>
>> Some don't need delimiters
>> in values, so they don't need quotes, but the encompassing
>> specification works for all cases.
>
>   No, it doesn't.  Working on a large website that provided CSV exports
>   for a number of data sources, I've seen plenty of examples that don't
>   work.  Finding a common format that could be exported into a handful
>   of common desktop apps was so pointless we seriously considered
>   getting rid of CSV all together, because we got tired of our users
>   telling us how simple CSV was, and why couldn't we just do this one
>   thing differently so it would work on *their* application.
>
>> It's not that big a deal for SQLite to support it, so it should.
>
>   If it is so simple, and you know where the code is...
>
>   This is, perhaps, the biggest fallacy of CSV... people think it
>   is a "simple" format (it isn't), and assume that code support to
>   "correctly" (whatever that is) read it is simple.  It isn't.  The RFC
>   has a formal grammar that requires over a dozen elements to define!
>
>   Most people setting out to build a CSV reader never think to use a
>   full grammar and parser-- after all, it is such a "simple" format--
>   and find themselves in a mess of code soon enough.  Seriously, give
>   it a try.
>
>   Carlos's Python script (nice!) is a great example.  His comment "I am
>   so grateful I did not have to write a parser for CSV" is dead on.
>   And, as he points out, the reason the Python module is so good is
>   that it is adaptive, and really reads five or six different variants
>   of CSV (something a reader can do but a writer cannot).  He was
>   also able to clobber it all together in a few hours or less (because
>   someone else spent a few hundred hours on the CSV module), further
>   proving that advanced support of this kind of thing is really outside
>   of the scope of SQLite3.  After all, the .import command is part of
>   the shell, not part of the core library.
>
>
>
>
>   CSV is a great quick and dirty format to move data.  But it isn't
>   "simple" and it isn't nearly as universal as many assume.  It works
>   great if you're just moving simple numbers and strings that don't
>   include commas, but becomes a mess when you get into exceptions.
>
>   Personally, I'd rather have the SQLite team working on core database
>   features than trying to build a better CSV parser.  The problem
>   is non-trivial and borders on unobtainable and, as Carlos
>   proved so clearly, there are better, easier, faster ways.
>
>    -j
>
>
> --
> Jay A. Kreibich < J A Y  @  K R E I B I.C H >
>
> "Our opponent is an alien starship packed with atomic bombs.  We have
>  a protractor."   "I'll go home and see if I can scrounge up a ruler
>  and a piece of string."  --from Anathem by Neal Stephenson
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to