Re: [sqlite] Writing and reading a csv using sqlite3

2010-02-16 Thread Jay A. Kreibich
On Tue, Feb 16, 2010 at 10:24:58AM +, Phil Hibbs scratched on the wall:
> Thanks, Jay, you're absolutely right. I've encountered a few csv
> incompatibilities myself, even within the same tool (one case used \"
> whereas the other used "").
> 
> Well, except for your last point, I only wanted sqlite3.exe to script
> some unit tests, so it really is why I downloaded SQLite. I can cope
> though.

  Fair enough, but I think you understand the point I'm getting at. 


  I'm actually very encouraged by effort to build a robust virtual table
  module that deals with CSV files.  I *would* like to see solid,
  quality CSV support.  Although it might be somewhat ancillary to
  the database system itself, solid import and export functions
  are important to lowering the barrier-of-entry, and making the overall
  product more valuable as a general-purpose tool.
  
  My main concern has always been that support of a top notch CSV 
  library is a *much* larger effort that most people think and when
  faced with a limited amount of development resources I would
  consider it less important than more central issues**.  If it was a
  simple fix, that's one thing, but this is something much larger.

  However, building the CSV library as a module addresses most of these
  concerns.  First and foremost, if a significant effort is going to be
  made to "do it right," the payback is much larger.  Building a module
  makes the functionality available to every user and every application
  that uses SQLite, not just folks that use the command-line shell.
  It also puts a nice clear line between module development and core
  database development.  This helps with development, testing, upkeep,
  and support, even if there are some developers helping out on both
  sides of the fence.

  I hope we see great things from this effort.


  ** on the other hand, now that we've got checkpoints, recursive trigger
 support, and foreign keys, what's left?  Better FTS would be nice.
 right/full outer joins?  More ALTER TABLE support?  The development
 team has really been on top of most of the major missing features.


-j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Our opponent is an alien starship packed with atomic bombs.  We have
 a protractor."   "I'll go home and see if I can scrounge up a ruler
 and a piece of string."  --from Anathem by Neal Stephenson
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Writing and reading a csv using sqlite3

2010-02-16 Thread Phil Hibbs
Thanks, Jay, you're absolutely right. I've encountered a few csv
incompatibilities myself, even within the same tool (one case used \"
whereas the other used "").

Well, except for your last point, I only wanted sqlite3.exe to script
some unit tests, so it really is why I downloaded SQLite. I can cope
though.

Phil Hibbs.
-- 
Don't you just hate self-referential sigs?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Writing and reading a csv using sqlite3

2010-02-15 Thread Jay A. Kreibich
On Mon, Feb 15, 2010 at 09:11:31PM +, Simon Slavin scratched on the wall:
> 
> On 15 Feb 2010, at 8:35pm, Jay A. Kreibich wrote:
> > On Mon, Feb 15, 2010 at 07:31:43PM +, Simon Slavin scratched on the 
> > wall:
> >> So the command-line tool cannot correctly read the CSV files 
> >> it output itself ?  Okay, that's messed up. Something should be done.
> > 
> >  Yes, and always will be.  Different version of Excel have similar
> >  issues.  The topic of what the "correct" format of a CSV file has 
> >  been beaten to death on this list in the past. [snip]
> > 
> >  The simple fact is that everyone has a different idea of what makes
> >  up "proper" CSV, so the format is pretty worthless for general use.
> 
> I think this is more extreme than that.  What Phil has found here is
> that the command-line tool -- one single program -- has two different
> ideas about what CSV format means.

  Yes, I understand.  *yawn*   You sound surprised when you say that.

> It uses one when it outputs, but
> it won't accept the same format when it inputs.  So the program is
> itself inconsistent: however you define 'csv format', either its
> output or input function is broken.

  Depends on what the program is for.  Others might argue that it is
  doing its job.  The import/export features are, well... import/export
  features.  It isn't a native file format, and-- if you can accept the
  idea that it is an inconsistent format-- I see no real reason why
  you might expect it to "round trip."  As others have pointed out, you
  can't export HTML and then re-import it.  If you accept the idea that
  "CSV isn't CSV isn't CSV", then there should be little reason to
  expect you can import an export.

  If faced with the choice of making the import and export features
  work with popular programs such as Excel, Access, and other SQL
  database systems, I think it would be a much higher priority that the
  importer imports correctly from the most popular data sources, and
  that the exporter exports to a format understood by the most popular
  data consumers.  And that's it.  In both cases, "SQLite" is not likely
  to be on either the "most popular sources" or "most popular consumer"
  list.  There are much better ways to move data between two instances
  of SQLite.  

  Again, if you can truly accept the idea that CSV is a non-format, and
  nearly every library and every application out there have different
  ideas of how it should work in the nitty-gritty details, then if the
  goal is a solid import and a solid export to and from different
  sources, you should very well expect the two systems are not
  compatible with each other, any more than you would accept an HTML
  export does not work as an SQL import.  And if you're response is
  "But they're both CSV!", then you're not really getting it.



  But, perhaps more to the point, you still seem to be operating under
  the impression that "if we just fix this one thing..." everyone will
  be happy.  Or, at least, fewer people will be unhappy.

  And I'm trying to tell you it isn't true, and never will be true.
  You can trust me on this, you can go read the archives (where this
  has been disucssed again and again and again and now once more), or you
  can go out and try to write your own general-purpose CSV importer and
  watch the wave of "this importer is stupid, it doesn't understand CSV
  from ABC application!" responses.  Ask the folks that wrote the Python
  module.

  And in the tend, even if you can't import and export, it doesn't matter.
  Making it so that you can round-trip by "fixing that one thing" is
  very unlikely to solve any of the larger problems.  For every person
  that likes the new way better, you're going to find someone that hates
  you for breaking what was working.

  Which puts the development team in a very difficult position.  They can
  spend a lot of time and effort moving the CSV implementation around,
  but not really fixing anything or making any substantial number of
  customers more happy.  Or they can invest a huge amount of time
  writing an amazing adaptive parser that reads about 95% of the files
  out there.  This is likely to be several thousand lines of code that
  needs to be written, debugged, and maintained.  Yes really.  Again,
  ask the Python folks.
  
  Or they can ignore the whole mess go do creative and amazing
  database things.  You can guess what I would vote for.

  Personally, if I was in their position I would have completely ripped
  out CSV support long ago.  Reduces support effort, keeps the code
  base small and neat, and makes everyone equally (un)happy.  If you
  want CSV, then write your own, download one of the modules out there,
  or use a different pointy-clicky shell tool.

> I would have thought that this would mean it failed at least a unit test.

  The massive amount of testing is focused on the core library.  I
  suspect there is (relatively) little testing for the shell.c file.
  It isn't 

Re: [sqlite] Writing and reading a csv using sqlite3

2010-02-15 Thread Shane Harrelson
On Mon, Feb 15, 2010 at 4:31 PM, Simon Slavin  wrote:

>
> On 15 Feb 2010, at 9:28pm, Roger Binns wrote:
>
> > Simon Slavin wrote:
> >> It uses one when it outputs, but it won't accept the same format when it
> inputs.  So the program is itself inconsistent: however you define 'csv
> format', either its output or input function is broken.
> >
> > There is a ticket covering this:
> >
> >  http://www.sqlite.org/src/tktview?name=c25aab7e7e
>
> Ahha.  And that points to
>
> http://www.sqlite.org/cvstrac/tktview?tn=3276
>
> which explains and fixes the problem in one way.  Okay.
>
> Simon.
>
>

In the past, the shell tool (CLI) was never "officially" supported.   There
were no testscripts to verify correct operation, or guarantees of included
features or compatibility from one version to the next, unlike with the
"core" SQLite library.

We've taken steps to more formally support this, as is indicated by the
addition of some test scripts for it here:
http://www.sqlite.org/src/dir?name=tool   However, there are still many
features of the CLI that we've yet to write tests for, including tests for
CSV support.

Work has begun to "revamp" the CSV support entirely, by implementing it with
a virtual table interface.   You can see the beginnings of it here in this
branch:  http://www.sqlite.org/src/timeline?t=csv_ext   There are a couple
more third party CSV extensions for SQLite that are much more advanced, and
it's unlikely that this implementation will ever reach that level of
sophistication or support much more than the current import/export needs of
the CLI.

HTH.
-Shane
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Writing and reading a csv using sqlite3

2010-02-15 Thread Simon Slavin

On 15 Feb 2010, at 9:28pm, Roger Binns wrote:

> Simon Slavin wrote:
>> It uses one when it outputs, but it won't accept the same format when it 
>> inputs.  So the program is itself inconsistent: however you define 'csv 
>> format', either its output or input function is broken.
> 
> There is a ticket covering this:
> 
>  http://www.sqlite.org/src/tktview?name=c25aab7e7e

Ahha.  And that points to

http://www.sqlite.org/cvstrac/tktview?tn=3276

which explains and fixes the problem in one way.  Okay.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Writing and reading a csv using sqlite3

2010-02-15 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Simon Slavin wrote:
> It uses one when it outputs, but it won't accept the same format when it 
> inputs.  So the program is itself inconsistent: however you define 'csv 
> format', either its output or input function is broken.

There is a ticket covering this:

  http://www.sqlite.org/src/tktview?name=c25aab7e7e

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkt5vJIACgkQmOOfHg372QQ5xwCeKJu6+I+ZRfiNLpH8HfDtGapR
kXIAniEY7gCzfUXVnc7vZnH4KyRmEmbT
=l7Wm
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Writing and reading a csv using sqlite3

2010-02-15 Thread Simon Slavin

On 15 Feb 2010, at 8:35pm, Jay A. Kreibich wrote:

> On Mon, Feb 15, 2010 at 07:31:43PM +, Simon Slavin scratched on the wall:
>> 
>> So the command-line tool cannot correctly read the CSV files 
>> it output itself ?  Okay, that's messed up. Something should be done.
> 
>  Yes, and always will be.  Different version of Excel have similar
>  issues.  The topic of what the "correct" format of a CSV file has 
>  been beaten to death on this list in the past. [snip]
> 
>  The simple fact is that everyone has a different idea of what makes
>  up "proper" CSV, so the format is pretty worthless for general use.

I think this is more extreme than that.  What Phil has found here is that the 
command-line tool -- one single program -- has two different ideas about what 
CSV format means.  It uses one when it outputs, but it won't accept the same 
format when it inputs.  So the program is itself inconsistent: however you 
define 'csv format', either its output or input function is broken.

Among other things this means that you cannot test the command-line tool with a 
file generated using the command-line tool.  Generate a file, read it back in, 
and see if you get the same data and you find it doesn't.  I would have thought 
that this would mean it failed at least a unit test.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Writing and reading a csv using sqlite3

2010-02-15 Thread Phil Hibbs
> So the command-line tool cannot correctly read the CSV files it output
> itself ?  Okay, that's messed up. Something should be done.

On the other hand, should it read HTML output back in again? At
present, the .import only supports separators, no other processing is
done. It wouldn't be too hard to respect the MODE_Csv setting on
import, but the others might be a little more tricky.

Phil Hibbs.
-- 
Don't you just hate self-referential sigs?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Writing and reading a csv using sqlite3

2010-02-15 Thread Jay A. Kreibich
On Mon, Feb 15, 2010 at 07:31:43PM +, Simon Slavin scratched on the wall:
> 
> So the command-line tool cannot correctly read the CSV files 
> it output itself ?  Okay, that's messed up. Something should be done.

  Yes, and always will be.  Different version of Excel have similar
  issues.  The topic of what the "correct" format of a CSV file has 
  been beaten to death on this list in the past.  I suggest that if
  you have the need to import a large number of specific CSV files, you
  write your own importer and/or exporter, either via your own code
  that matches your needs or by using one of the more popular Perl
  or Python parsers.

  The simple fact is that everyone has a different idea of what makes
  up "proper" CSV, so the format is pretty worthless for general use.
  Now, I know that sounds funny, but go read the archives.  If you
  think CSV is simple, I invite you to have a look at the massive
  Python module and all the odd special-case bits it has.  I suspect
  the SQLite team has just dismissed the whole issue out of hand, and
  personally I agree with that.

   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Our opponent is an alien starship packed with atomic bombs.  We have
 a protractor."   "I'll go home and see if I can scrounge up a ruler
 and a piece of string."  --from Anathem by Neal Stephenson
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Writing and reading a csv using sqlite3

2010-02-15 Thread Simon Slavin

On 15 Feb 2010, at 7:23pm, Phil Hibbs wrote:

> Simon Slavin:
>> The command-line tool is correct as far as it goes in producing CSV
>> files: if you use quotes around text fields, it is correct to double quotes
>> inside those fields.  Take a look at the CSVs you're trying to import.
>> Are the text fields delimited by quotes ?  If quotes appear in the field,
>> are they doubled ?
> 
> In this case, the csv file that I'm trying to import was created by
> SQLite. And yes, fields containing quotes or commas are quoted with
> the internal quotes doubled up. It's SQLite3's import that does not
> correctly handle the doubled-up quotes that it correctly wrote out.

So the command-line tool cannot correctly read the CSV files it output itself ? 
 Okay, that's messed up. Something should be done.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Writing and reading a csv using sqlite3

2010-02-15 Thread Phil Hibbs
Simon Slavin:
> The command-line tool is correct as far as it goes in producing CSV
> files: if you use quotes around text fields, it is correct to double quotes
> inside those fields.  Take a look at the CSVs you're trying to import.
> Are the text fields delimited by quotes ?  If quotes appear in the field,
> are they doubled ?

In this case, the csv file that I'm trying to import was created by
SQLite. And yes, fields containing quotes or commas are quoted with
the internal quotes doubled up. It's SQLite3's import that does not
correctly handle the doubled-up quotes that it correctly wrote out.

Phil Hibbs.
-- 
Don't you just hate self-referential sigs?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Writing and reading a csv using sqlite3

2010-02-15 Thread Simon Slavin

On 15 Feb 2010, at 4:33pm, Phil Hibbs wrote:

> Paul Corke:
>> Do you need to read the csv data back in to a sqlite database?  If
>> not, then the file you've got should be ok.
> 
> Well, what I need to do is import csv files from other sources, this
> was just a test to see if there were any gotchas. So I do need to fix
> the importing.

The command-line tool is correct as far as it goes in producing CSV files: if 
you use quotes around text fields, it is correct to double quotes inside those 
fields.  Take a look at the CSVs you're trying to import.  Are the text fields 
delimited by quotes ?  If quotes appear in the field, are they doubled ?

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Writing and reading a csv using sqlite3

2010-02-15 Thread Phil Hibbs
Paul Corke:
> Not doing the ".mode csv" before the ".import" makes things better
> but doesn't fix it.

That breaks when I have more than one column though, "line 1: expected
3 columns of data but found 1".

> Do you need to read the csv data back in to a sqlite database?  If
> not, then the file you've got should be ok.

Well, what I need to do is import csv files from other sources, this
was just a test to see if there were any gotchas. So I do need to fix
the importing.

> If you do then I think
> you will need to either alter shell.c or create your own program
> to read a csv file and insert it.

I don't have a C compiler here at work, I'll look into installing one
at home. I really should, it's been a long time since I did any C.

Thanks for the replies. Is this something that should be fed back into
the master source for shell.c? Should I post to the devs list?

Phil Hibbs.
-- 
Don't you just hate self-referential sigs?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Writing and reading a csv using sqlite3

2010-02-15 Thread Simon Slavin

On 15 Feb 2010, at 2:50pm, Phil Hibbs wrote:

> If I write out data in csv format using SQLite3 shell, it doubles up
> all the quote characters. Any ideas how can I stop it doing this?
> 
> C:\sqlite>sqlite3 test.db
> SQLite version 3.6.22
> Enter ".help" for instructions
> Enter SQL statements terminated with a ";"
> sqlite> create table test ( f1 varchar(100) );
> sqlite> insert into test values('test 1');
> sqlite> insert into test values('test "2"');
> sqlite> .mode csv
> sqlite> .output test.csv
> sqlite> select * from test;
> sqlite> .quit
> 
> C:\sqlite>type test.csv
> "test 1"
> "test ""2"""

Good problem description and debug text.

It's doubling quotes up because they're inside quoted text strings.  I don't 
see an easy way to change it.  Unfortunately, I don't think any global change 
command will help you solve the problem.  You can't change all occurences of 
three quotes in a row because you don't know whether they're at the beginning 
or end of a field.

Depending on how complicated your output is, you might instead use a different 
.mode .  Perhaps 'list' or 'tabs' might give you something you can clear up 
more easily.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users