Re: [sqlite] Writing and reading a csv using sqlite3
On Tue, Feb 16, 2010 at 10:24:58AM +, Phil Hibbs scratched on the wall: > Thanks, Jay, you're absolutely right. I've encountered a few csv > incompatibilities myself, even within the same tool (one case used \" > whereas the other used ""). > > Well, except for your last point, I only wanted sqlite3.exe to script > some unit tests, so it really is why I downloaded SQLite. I can cope > though. Fair enough, but I think you understand the point I'm getting at. I'm actually very encouraged by effort to build a robust virtual table module that deals with CSV files. I *would* like to see solid, quality CSV support. Although it might be somewhat ancillary to the database system itself, solid import and export functions are important to lowering the barrier-of-entry, and making the overall product more valuable as a general-purpose tool. My main concern has always been that support of a top notch CSV library is a *much* larger effort that most people think and when faced with a limited amount of development resources I would consider it less important than more central issues**. If it was a simple fix, that's one thing, but this is something much larger. However, building the CSV library as a module addresses most of these concerns. First and foremost, if a significant effort is going to be made to "do it right," the payback is much larger. Building a module makes the functionality available to every user and every application that uses SQLite, not just folks that use the command-line shell. It also puts a nice clear line between module development and core database development. This helps with development, testing, upkeep, and support, even if there are some developers helping out on both sides of the fence. I hope we see great things from this effort. ** on the other hand, now that we've got checkpoints, recursive trigger support, and foreign keys, what's left? Better FTS would be nice. right/full outer joins? More ALTER TABLE support? The development team has really been on top of most of the major missing features. -j -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "Our opponent is an alien starship packed with atomic bombs. We have a protractor." "I'll go home and see if I can scrounge up a ruler and a piece of string." --from Anathem by Neal Stephenson ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Writing and reading a csv using sqlite3
Thanks, Jay, you're absolutely right. I've encountered a few csv incompatibilities myself, even within the same tool (one case used \" whereas the other used ""). Well, except for your last point, I only wanted sqlite3.exe to script some unit tests, so it really is why I downloaded SQLite. I can cope though. Phil Hibbs. -- Don't you just hate self-referential sigs? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Writing and reading a csv using sqlite3
On Mon, Feb 15, 2010 at 09:11:31PM +, Simon Slavin scratched on the wall: > > On 15 Feb 2010, at 8:35pm, Jay A. Kreibich wrote: > > On Mon, Feb 15, 2010 at 07:31:43PM +, Simon Slavin scratched on the > > wall: > >> So the command-line tool cannot correctly read the CSV files > >> it output itself ? Okay, that's messed up. Something should be done. > > > > Yes, and always will be. Different version of Excel have similar > > issues. The topic of what the "correct" format of a CSV file has > > been beaten to death on this list in the past. [snip] > > > > The simple fact is that everyone has a different idea of what makes > > up "proper" CSV, so the format is pretty worthless for general use. > > I think this is more extreme than that. What Phil has found here is > that the command-line tool -- one single program -- has two different > ideas about what CSV format means. Yes, I understand. *yawn* You sound surprised when you say that. > It uses one when it outputs, but > it won't accept the same format when it inputs. So the program is > itself inconsistent: however you define 'csv format', either its > output or input function is broken. Depends on what the program is for. Others might argue that it is doing its job. The import/export features are, well... import/export features. It isn't a native file format, and-- if you can accept the idea that it is an inconsistent format-- I see no real reason why you might expect it to "round trip." As others have pointed out, you can't export HTML and then re-import it. If you accept the idea that "CSV isn't CSV isn't CSV", then there should be little reason to expect you can import an export. If faced with the choice of making the import and export features work with popular programs such as Excel, Access, and other SQL database systems, I think it would be a much higher priority that the importer imports correctly from the most popular data sources, and that the exporter exports to a format understood by the most popular data consumers. And that's it. In both cases, "SQLite" is not likely to be on either the "most popular sources" or "most popular consumer" list. There are much better ways to move data between two instances of SQLite. Again, if you can truly accept the idea that CSV is a non-format, and nearly every library and every application out there have different ideas of how it should work in the nitty-gritty details, then if the goal is a solid import and a solid export to and from different sources, you should very well expect the two systems are not compatible with each other, any more than you would accept an HTML export does not work as an SQL import. And if you're response is "But they're both CSV!", then you're not really getting it. But, perhaps more to the point, you still seem to be operating under the impression that "if we just fix this one thing..." everyone will be happy. Or, at least, fewer people will be unhappy. And I'm trying to tell you it isn't true, and never will be true. You can trust me on this, you can go read the archives (where this has been disucssed again and again and again and now once more), or you can go out and try to write your own general-purpose CSV importer and watch the wave of "this importer is stupid, it doesn't understand CSV from ABC application!" responses. Ask the folks that wrote the Python module. And in the tend, even if you can't import and export, it doesn't matter. Making it so that you can round-trip by "fixing that one thing" is very unlikely to solve any of the larger problems. For every person that likes the new way better, you're going to find someone that hates you for breaking what was working. Which puts the development team in a very difficult position. They can spend a lot of time and effort moving the CSV implementation around, but not really fixing anything or making any substantial number of customers more happy. Or they can invest a huge amount of time writing an amazing adaptive parser that reads about 95% of the files out there. This is likely to be several thousand lines of code that needs to be written, debugged, and maintained. Yes really. Again, ask the Python folks. Or they can ignore the whole mess go do creative and amazing database things. You can guess what I would vote for. Personally, if I was in their position I would have completely ripped out CSV support long ago. Reduces support effort, keeps the code base small and neat, and makes everyone equally (un)happy. If you want CSV, then write your own, download one of the modules out there, or use a different pointy-clicky shell tool. > I would have thought that this would mean it failed at least a unit test. The massive amount of testing is focused on the core library. I suspect there is (relatively) little testing for the shell.c file. It isn't
Re: [sqlite] Writing and reading a csv using sqlite3
On Mon, Feb 15, 2010 at 4:31 PM, Simon Slavinwrote: > > On 15 Feb 2010, at 9:28pm, Roger Binns wrote: > > > Simon Slavin wrote: > >> It uses one when it outputs, but it won't accept the same format when it > inputs. So the program is itself inconsistent: however you define 'csv > format', either its output or input function is broken. > > > > There is a ticket covering this: > > > > http://www.sqlite.org/src/tktview?name=c25aab7e7e > > Ahha. And that points to > > http://www.sqlite.org/cvstrac/tktview?tn=3276 > > which explains and fixes the problem in one way. Okay. > > Simon. > > In the past, the shell tool (CLI) was never "officially" supported. There were no testscripts to verify correct operation, or guarantees of included features or compatibility from one version to the next, unlike with the "core" SQLite library. We've taken steps to more formally support this, as is indicated by the addition of some test scripts for it here: http://www.sqlite.org/src/dir?name=tool However, there are still many features of the CLI that we've yet to write tests for, including tests for CSV support. Work has begun to "revamp" the CSV support entirely, by implementing it with a virtual table interface. You can see the beginnings of it here in this branch: http://www.sqlite.org/src/timeline?t=csv_ext There are a couple more third party CSV extensions for SQLite that are much more advanced, and it's unlikely that this implementation will ever reach that level of sophistication or support much more than the current import/export needs of the CLI. HTH. -Shane ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Writing and reading a csv using sqlite3
On 15 Feb 2010, at 9:28pm, Roger Binns wrote: > Simon Slavin wrote: >> It uses one when it outputs, but it won't accept the same format when it >> inputs. So the program is itself inconsistent: however you define 'csv >> format', either its output or input function is broken. > > There is a ticket covering this: > > http://www.sqlite.org/src/tktview?name=c25aab7e7e Ahha. And that points to http://www.sqlite.org/cvstrac/tktview?tn=3276 which explains and fixes the problem in one way. Okay. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Writing and reading a csv using sqlite3
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Simon Slavin wrote: > It uses one when it outputs, but it won't accept the same format when it > inputs. So the program is itself inconsistent: however you define 'csv > format', either its output or input function is broken. There is a ticket covering this: http://www.sqlite.org/src/tktview?name=c25aab7e7e Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.9 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iEYEARECAAYFAkt5vJIACgkQmOOfHg372QQ5xwCeKJu6+I+ZRfiNLpH8HfDtGapR kXIAniEY7gCzfUXVnc7vZnH4KyRmEmbT =l7Wm -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Writing and reading a csv using sqlite3
On 15 Feb 2010, at 8:35pm, Jay A. Kreibich wrote: > On Mon, Feb 15, 2010 at 07:31:43PM +, Simon Slavin scratched on the wall: >> >> So the command-line tool cannot correctly read the CSV files >> it output itself ? Okay, that's messed up. Something should be done. > > Yes, and always will be. Different version of Excel have similar > issues. The topic of what the "correct" format of a CSV file has > been beaten to death on this list in the past. [snip] > > The simple fact is that everyone has a different idea of what makes > up "proper" CSV, so the format is pretty worthless for general use. I think this is more extreme than that. What Phil has found here is that the command-line tool -- one single program -- has two different ideas about what CSV format means. It uses one when it outputs, but it won't accept the same format when it inputs. So the program is itself inconsistent: however you define 'csv format', either its output or input function is broken. Among other things this means that you cannot test the command-line tool with a file generated using the command-line tool. Generate a file, read it back in, and see if you get the same data and you find it doesn't. I would have thought that this would mean it failed at least a unit test. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Writing and reading a csv using sqlite3
> So the command-line tool cannot correctly read the CSV files it output > itself ? Okay, that's messed up. Something should be done. On the other hand, should it read HTML output back in again? At present, the .import only supports separators, no other processing is done. It wouldn't be too hard to respect the MODE_Csv setting on import, but the others might be a little more tricky. Phil Hibbs. -- Don't you just hate self-referential sigs? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Writing and reading a csv using sqlite3
On Mon, Feb 15, 2010 at 07:31:43PM +, Simon Slavin scratched on the wall: > > So the command-line tool cannot correctly read the CSV files > it output itself ? Okay, that's messed up. Something should be done. Yes, and always will be. Different version of Excel have similar issues. The topic of what the "correct" format of a CSV file has been beaten to death on this list in the past. I suggest that if you have the need to import a large number of specific CSV files, you write your own importer and/or exporter, either via your own code that matches your needs or by using one of the more popular Perl or Python parsers. The simple fact is that everyone has a different idea of what makes up "proper" CSV, so the format is pretty worthless for general use. Now, I know that sounds funny, but go read the archives. If you think CSV is simple, I invite you to have a look at the massive Python module and all the odd special-case bits it has. I suspect the SQLite team has just dismissed the whole issue out of hand, and personally I agree with that. -j -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "Our opponent is an alien starship packed with atomic bombs. We have a protractor." "I'll go home and see if I can scrounge up a ruler and a piece of string." --from Anathem by Neal Stephenson ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Writing and reading a csv using sqlite3
On 15 Feb 2010, at 7:23pm, Phil Hibbs wrote: > Simon Slavin: >> The command-line tool is correct as far as it goes in producing CSV >> files: if you use quotes around text fields, it is correct to double quotes >> inside those fields. Take a look at the CSVs you're trying to import. >> Are the text fields delimited by quotes ? If quotes appear in the field, >> are they doubled ? > > In this case, the csv file that I'm trying to import was created by > SQLite. And yes, fields containing quotes or commas are quoted with > the internal quotes doubled up. It's SQLite3's import that does not > correctly handle the doubled-up quotes that it correctly wrote out. So the command-line tool cannot correctly read the CSV files it output itself ? Okay, that's messed up. Something should be done. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Writing and reading a csv using sqlite3
Simon Slavin: > The command-line tool is correct as far as it goes in producing CSV > files: if you use quotes around text fields, it is correct to double quotes > inside those fields. Take a look at the CSVs you're trying to import. > Are the text fields delimited by quotes ? If quotes appear in the field, > are they doubled ? In this case, the csv file that I'm trying to import was created by SQLite. And yes, fields containing quotes or commas are quoted with the internal quotes doubled up. It's SQLite3's import that does not correctly handle the doubled-up quotes that it correctly wrote out. Phil Hibbs. -- Don't you just hate self-referential sigs? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Writing and reading a csv using sqlite3
On 15 Feb 2010, at 4:33pm, Phil Hibbs wrote: > Paul Corke: >> Do you need to read the csv data back in to a sqlite database? If >> not, then the file you've got should be ok. > > Well, what I need to do is import csv files from other sources, this > was just a test to see if there were any gotchas. So I do need to fix > the importing. The command-line tool is correct as far as it goes in producing CSV files: if you use quotes around text fields, it is correct to double quotes inside those fields. Take a look at the CSVs you're trying to import. Are the text fields delimited by quotes ? If quotes appear in the field, are they doubled ? Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Writing and reading a csv using sqlite3
Paul Corke: > Not doing the ".mode csv" before the ".import" makes things better > but doesn't fix it. That breaks when I have more than one column though, "line 1: expected 3 columns of data but found 1". > Do you need to read the csv data back in to a sqlite database? If > not, then the file you've got should be ok. Well, what I need to do is import csv files from other sources, this was just a test to see if there were any gotchas. So I do need to fix the importing. > If you do then I think > you will need to either alter shell.c or create your own program > to read a csv file and insert it. I don't have a C compiler here at work, I'll look into installing one at home. I really should, it's been a long time since I did any C. Thanks for the replies. Is this something that should be fed back into the master source for shell.c? Should I post to the devs list? Phil Hibbs. -- Don't you just hate self-referential sigs? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Writing and reading a csv using sqlite3
On 15 Feb 2010, at 2:50pm, Phil Hibbs wrote: > If I write out data in csv format using SQLite3 shell, it doubles up > all the quote characters. Any ideas how can I stop it doing this? > > C:\sqlite>sqlite3 test.db > SQLite version 3.6.22 > Enter ".help" for instructions > Enter SQL statements terminated with a ";" > sqlite> create table test ( f1 varchar(100) ); > sqlite> insert into test values('test 1'); > sqlite> insert into test values('test "2"'); > sqlite> .mode csv > sqlite> .output test.csv > sqlite> select * from test; > sqlite> .quit > > C:\sqlite>type test.csv > "test 1" > "test ""2""" Good problem description and debug text. It's doubling quotes up because they're inside quoted text strings. I don't see an easy way to change it. Unfortunately, I don't think any global change command will help you solve the problem. You can't change all occurences of three quotes in a row because you don't know whether they're at the beginning or end of a field. Depending on how complicated your output is, you might instead use a different .mode . Perhaps 'list' or 'tabs' might give you something you can clear up more easily. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users