Hi, I am currently using sqlite3 version:
3.8.1 2013-10-17 12:57:35 c78be6d786c19073b3a6730dfe3fb1be54f5657a I found an edge case bug in the sqlite3 shell when importing csv data with fields containing embedded quotes, CRs and LFs: When a field contains an embedded quote, and that embedded quote is immediately followed by an EOL then the current csv parser will consider the end of field to be reached and fail to import the row correctly. For example the following csv, a single data row with column headers will fail to import correctly. column1,column2,column3,column4 fielddata1,fielddata2,"field "" data"" 3",fielddata4 The offending code is in shell.c in function static char *csv_read_one_field(CSVReader *p); An example fix could be something like the following -- adding a counter flag 'cQuoteComplete' to track quotes properly. static char *csv_read_one_field(CSVReader *p){ int c, pc; int cSep = p->cSeparator; int cQuoteComplete = 0; p->n = 0; c = fgetc(p->in); if( c==EOF || seenInterrupt ){ p->cTerm = EOF; return 0; } if( c=='"' ){ int startLine = p->nLine; int cQuote = c; cQuoteComplete+=1; pc = 0; while( 1 ){ c = fgetc(p->in); if( c=='\n' ) p->nLine++; if( c==cQuote ){ cQuoteComplete+=1; if( pc==cQuote ){ pc = 0; continue; } } if( (c==cSep && pc==cQuote) || (c=='\n' && pc==cQuote) || (c=='\n' && pc=='\r' && p->n>=2 && p->z[p->n-2]==cQuote) || (c==EOF && pc==cQuote) ){ if (cQuoteComplete%2 == 0) { do{ p->n--; }while( p->z[p->n]!=cQuote ); p->cTerm = c; break; } } if( c==EOF ){ fprintf(stderr, "%s:%d: unterminated %c-quoted field\n", p->zFile, startLine, cQuote); p->cTerm = EOF; break; } csv_append_char(p, c); pc = c; } if( cQuoteComplete%2 != 0 ){ fprintf(stderr, "%s:%d: unescaped %c character\n", p->zFile, p->nLine, cQuote); } }else{ while( c!=EOF && c!=cSep && c!='\n' ){ csv_append_char(p, c); c = fgetc(p->in); } if( c=='\n' ){ p->nLine++; if( p->n>1 && p->z[p->n-1]=='\r' ) p->n--; } p->cTerm = c; } if( p->z ) p->z[p->n] = 0; return p->z; } I built the sqlite3 shell from the almagamation source and tested the above change using my import data; about 1Gb of messy, but rfc4180 compliant, CSV. It all imported cleanly. sqlite3 is a wonderful bit of software. I have been using it for some time now to munge and query multi-gigabyte size data sets and am very impressed with its performance and capabilities. Best Regards, Lindsay _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users