First, Dr. Hipp, thanks for the .import fix from 1/16 baked into 3.7.11: it now recognizes commas embedded within character strings in a CSV and strips the double quotes. This makes loading much smoother.
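For reference, the quoting behavior described above matches ordinary quoted-field CSV parsing, which can be sketched with Python's standard csv module (the sample data here is made up for illustration):

```python
import csv
import io

# A CSV with commas embedded inside double-quoted fields, the case
# the 3.7.11 .import fix now handles.
raw = 'id,name\n1,"Smith, John"\n2,"Brien, Pat"\n'

rows = list(csv.reader(io.StringIO(raw)))

# The embedded comma stays inside the field and the surrounding
# double quotes are stripped.
print(rows[1])  # ['1', 'Smith, John']
```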
==============
Now for three strange events I've been running into. Sorry, the data volumes are large, so it's very time consuming to run many tests to pinpoint these further. Just raising the flag in case any of this rings a bell with anyone.
==============
(1) I import data into a table in one DB. (2) I then create an identical table in a second DB. (3) I then copy the table from DB1 to DB2, with some SQL ETL cleanups. Surprisingly, the data gets appended to the initial table in DB1 instead of the new table in DB2. I tried to replicate it with a couple of rows, but then it works fine. However, with the real 100-million-row table it happens. My workaround for the moment is to give the two tables in the two DBs different names and ALTER TABLE ... RENAME when all is said and done. That works fine.

sqlite3 orig.db3
  create table xyz (.....);
  .import xyz.csv xyz
  .exit
sqlite3 final.db3
  create table xyz (.....);
  attach database 'orig.db3' as indb;
  insert into xyz select * from indb.xyz;
  .exit

Look at the size of orig.db3: it has doubled, with twice as many rows. Look at the size of final.db3: 0K.
==============
The reason I use the two-DB approach in lieu of just UPDATEing the original DB is that on UPDATE, SQLite seems to try to read the table first. At least it seems to do that on large tables. Apparently, if the table is large enough, memory usage slowly rises, and then I get an "out of memory" message. So UPDATE works on smaller tables (up to 10-20 GB), but for larger ones I need to create a whole new copy of the table for every ETL iteration. Since I would like each large table to end up in its own DB, and I would like to avoid putting multiple intermediate tables in the same DB, then deleting the unnecessary ones, then finally running VACUUM, I use multiple physical DBs; hence the above-mentioned ATTACH issue.
==============
If I run an .import and mistype something, like the path of the .csv file or the wrong .separator, the command obviously fails.
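The intended behavior of the ATTACH workflow above can be sketched with Python's built-in sqlite3 module (the schema and file names are placeholders standing in for the real ones). On a small data set the rows land in final.db3 as expected, which matches the "works fine with a couple of rows" observation:

```python
import os
import sqlite3
import tempfile

tmp = tempfile.mkdtemp()
orig = os.path.join(tmp, "orig.db3")
final = os.path.join(tmp, "final.db3")

# Build the source table in orig.db3 (stands in for the .import step).
con = sqlite3.connect(orig)
con.execute("CREATE TABLE xyz (a INTEGER, b TEXT)")
con.executemany("INSERT INTO xyz VALUES (?, ?)", [(1, "x"), (2, "y")])
con.commit()
con.close()

# Copy into final.db3 via ATTACH, mirroring the shell session above.
con = sqlite3.connect(final)
con.execute("CREATE TABLE xyz (a INTEGER, b TEXT)")
con.execute("ATTACH DATABASE ? AS indb", (orig,))
con.execute("INSERT INTO xyz SELECT * FROM indb.xyz")
con.commit()

# Expected outcome: rows land in final.db3 and orig.db3 is unchanged.
n_final = con.execute("SELECT count(*) FROM main.xyz").fetchone()[0]
n_orig = con.execute("SELECT count(*) FROM indb.xyz").fetchone()[0]
print(n_final, n_orig)  # 2 2
con.close()
```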
I correct the syntax and run it again, and I sometimes get an "expected x columns but found y columns ... stopping at row nn" type of message. Initially I very carefully looked at the data and tried to load nn+1 rows. Everything worked. I reran the .import and all was fine. Then I learned to just close and reopen the DB, run the .import again without changing anything, and all works fine. It's as if, when .import runs into a problem, it doesn't clean something up, which leads to the other messages. Starting from scratch and making sure the syntax is correct the first time works fine.
==============
Again, sorry these sound so fuzzy. I'll try to provide more detail as I run across it, but just in case these are simple to figure out. Thanks!
_______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
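One way to guard against a half-done load leaving state behind is to wrap the whole import in a single transaction that rolls back on the first bad row. A minimal sketch with Python's sqlite3 module (load() here is a hypothetical helper, not part of the SQLite shell):

```python
import csv
import io
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a INTEGER, b TEXT)")

good = "1,x\n2,y\n"
bad = "3,z\n4,q,EXTRA\n"  # wrong column count on the second row


def load(con, text):
    """Insert CSV rows inside one transaction; roll back on any error."""
    try:
        with con:  # commits on success, rolls back on exception
            for row in csv.reader(io.StringIO(text)):
                if len(row) != 2:
                    raise ValueError(
                        "expected 2 columns but found %d" % len(row))
                con.execute("INSERT INTO t VALUES (?, ?)", row)
        return True
    except ValueError:
        return False


ok_good = load(con, good)
ok_bad = load(con, bad)
n = con.execute("SELECT count(*) FROM t").fetchone()[0]
# The failed load leaves no partial rows behind: only the two good rows remain.
print(ok_good, ok_bad, n)  # True False 2
```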