On Mon, Feb 15, 2010 at 09:11:31PM +0000, Simon Slavin scratched on the wall:
> 
> On 15 Feb 2010, at 8:35pm, Jay A. Kreibich wrote:
> > On Mon, Feb 15, 2010 at 07:31:43PM +0000, Simon Slavin scratched on the 
> > wall:
> >> So the command-line tool cannot correctly read the CSV files 
> >> it output itself ?  Okay, that's messed up. Something should be done.
> > 
> >  Yes, and always will be.  Different versions of Excel have similar
> >  issues.  The topic of what the "correct" format of a CSV file is has
> >  been beaten to death on this list in the past. [snip]
> > 
> >  The simple fact is that everyone has a different idea of what makes
> >  up "proper" CSV, so the format is pretty worthless for general use.
> 
> I think this is more extreme than that.  What Phil has found here is
> that the command-line tool -- one single program -- has two different
> ideas about what CSV format means.

  Yes, I understand.  *yawn*   You sound surprised when you say that.

> It uses one when it outputs, but
> it won't accept the same format when it inputs.  So the program is
> itself inconsistent: however you define 'csv format', either its
> output or input function is broken.

  Depends on what the program is for.  Others might argue that it is
  doing its job.  The import/export features are, well... import/export
  features.  It isn't a native file format, and-- if you can accept the
  idea that it is an inconsistent format-- I see no real reason why
  you might expect it to "round trip."  As others have pointed out, you
  can't export HTML and then re-import it.  If you accept the idea that
  "CSV isn't CSV isn't CSV", then there should be little reason to
  expect you can import an export.
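
  The "CSV isn't CSV isn't CSV" point can be made concrete with a
  small Python sketch.  This is an illustration only, not the code in
  the SQLite shell: QUOTED and ESCAPED are two made-up conventions for
  protecting a comma inside a field, and write_csv/read_csv are helper
  names invented for the example.  Each convention round-trips with
  itself, and neither reads the other's output correctly.

```python
import csv
import io


def write_csv(row, **fmt):
    """Serialize one row to CSV text using the given formatting options."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n", **fmt).writerow(row)
    return buf.getvalue()


def read_csv(text, **fmt):
    """Parse the first row of CSV text using the given formatting options."""
    return next(csv.reader(io.StringIO(text), **fmt))


# Two hypothetical conventions, both commonly called "CSV":
QUOTED = dict(quoting=csv.QUOTE_MINIMAL)                 # wrap the field in quotes
ESCAPED = dict(quoting=csv.QUOTE_NONE, escapechar="\\")  # backslash-escape the comma

row = ["1", "Smith, John", "5.00"]

quoted_text = write_csv(row, **QUOTED)
escaped_text = write_csv(row, **ESCAPED)

# Same convention on both sides: the row survives.
print(read_csv(quoted_text, **QUOTED) == row)
print(read_csv(escaped_text, **ESCAPED) == row)

# Mixed conventions: the embedded comma splits the field in two.
crossed_a = read_csv(escaped_text, **QUOTED)
crossed_b = read_csv(quoted_text, **ESCAPED)
print(len(row), len(crossed_a), len(crossed_b))
```

  Neither side is "wrong" here; they just disagree about the details.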

  If faced with the choice of making the import and export features
  work with popular programs such as Excel, Access, and other SQL
  database systems, I think it would be a much higher priority that the
  importer imports correctly from the most popular data sources, and
  that the exporter exports to a format understood by the most popular
  data consumers.  And that's it.  In both cases, "SQLite" is not likely
  to be on either the "most popular sources" or "most popular consumer"
  list.  There are much better ways to move data between two instances
  of SQLite.  
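
  One such way, sketched in Python purely as an illustration: dump
  the source database as SQL text and replay it into the destination.
  Connection.iterdump() and executescript() are part of Python's
  standard sqlite3 module; the table "t" and its rows are made up for
  the example, and in-memory databases stand in for real files.

```python
import sqlite3

# Source database with values that tend to trip up CSV handling.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, note TEXT)")
src.executemany(
    "INSERT INTO t (note) VALUES (?)",
    [("plain",), ("has, comma",), ('has "quotes"',)],
)
src.commit()

# Replay the SQL dump of the source into an empty destination.
dst = sqlite3.connect(":memory:")
dst.executescript("\n".join(src.iterdump()))

query = "SELECT id, note FROM t ORDER BY id"
rows_src = src.execute(query).fetchall()
rows_dst = dst.execute(query).fetchall()
print(rows_src == rows_dst)
```

  The dump is plain SQL, so the quoting rules are the database's own
  and there is no second format to disagree with.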

  Again, if you can truly accept the idea that CSV is a non-format, and
  that nearly every library and every application out there has a
  different idea of how it should work in the nitty-gritty details,
  then if the goal is a solid import and a solid export to and from
  different sources, you should very well expect that the two systems
  are not compatible with each other, any more than you would expect
  an HTML export to work as an SQL import.  And if your response is
  "But they're both CSV!", then you're not really getting it.



  But, perhaps more to the point, you still seem to be operating under
  the impression that "if we just fix this one thing..." everyone will
  be happy.  Or, at least, fewer people will be unhappy.

  And I'm trying to tell you it isn't true, and never will be true.
  You can trust me on this, you can go read the archives (where this
  has been discussed again and again and again and now once more), or you
  can go out and try to write your own general-purpose CSV importer and
  watch the wave of "this importer is stupid, it doesn't understand CSV
  from ABC application!" responses.  Ask the folks that wrote the Python
  module.

  And in the end, even if you can't import and export, it doesn't matter.
  Making it so that you can round-trip by "fixing that one thing" is
  very unlikely to solve any of the larger problems.  For every person
  that likes the new way better, you're going to find someone that hates
  you for breaking what was working.

  Which puts the development team in a very difficult position.  They can
  spend a lot of time and effort moving the CSV implementation around,
  but not really fixing anything or making any substantial number of
  customers happier.  Or they can invest a huge amount of time
  writing an amazing adaptive parser that reads about 95% of the files
  out there.  This is likely to be several thousand lines of code that
  need to be written, debugged, and maintained.  Yes, really.  Again,
  ask the Python folks.
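
  For a taste of what format guessing looks like, here is a minimal
  sketch using csv.Sniffer from Python's standard library.  It is a
  small heuristic, not the adaptive parser described above, and it can
  guess wrong on messier input.  The sample data and the list of
  candidate delimiters are assumptions made up for the example.

```python
import csv
import io

# A made-up file that uses semicolons, as some spreadsheet locales do.
sample = "id;name;price\n1;widget;5.00\n2;gadget;7.50\n"

# Ask the sniffer to guess, restricted to a few candidate delimiters.
dialect = csv.Sniffer().sniff(sample, delimiters=";,\t|")

# Parse the same text with whatever dialect was guessed.
rows = list(csv.reader(io.StringIO(sample), dialect))
print(repr(dialect.delimiter), rows[1])
```

  Even this toy case needs a hint about which delimiters to consider.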
  
  Or they can ignore the whole mess and go do creative and amazing
  database things.  You can guess what I would vote for.

  Personally, if I was in their position I would have completely ripped
  out CSV support long ago.  Reduces support effort, keeps the code
  base small and neat, and makes everyone equally (un)happy.  If you
  want CSV, then write your own, download one of the modules out there,
  or use a different pointy-clicky shell tool.

> I would have thought that this would mean it failed at least a unit test.

  The massive amount of testing is focused on the core library.  I
  suspect there is (relatively) little testing for the shell.c file.
  It isn't really why you download SQLite.

   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Our opponent is an alien starship packed with atomic bombs.  We have
 a protractor."   "I'll go home and see if I can scrounge up a ruler
 and a piece of string."  --from Anathem by Neal Stephenson
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
