"Simon Slavin" <[email protected]> wrote in message news:[email protected]...
> Alok is writing in a language which doesn't compile very well.

Nah, even with VB.NET it should be possible to bring the import
"up to speed"... at least a factor-of-20 improvement should be
achievable even there.

> And he has another system which splits his data up
> into smaller chunks (presumably the 10.5 rows)
> before it gets to the one which uses SQLite.

As I read his replies, he was talking about the import time for
10500 test rows (tab-delimited records). Do you mean the
"sub-splitting" of a single import row into column values?

> But it might be worth trying larger chunks, since much of
> the 35 seconds could be overhead and doubling the
> amount of data might not take much more time.

Yep, that's what I was trying to point out (indirectly)... ;-)

Garry's advice (to read in the entire file beforehand and perform
the splitting in memory) is not that bad in terms of performance,
as long as the input files come in sizes below, let's say, 4-8 MB
(given the amount of RAM available on current desktop machines,
which I assume is the target hardware Alok's solution will finally
run on).

This would also clearly separate the file interaction: first the
file read on the import file, then the indirect file writes caused
by the SQLite inserts. If one does this on a single-row basis, the
(file) system has to switch between small read and write operations
at a pretty high frequency, reading and writing only small buffers,
and that's not good for performance.

So in case the import files are larger, you're entirely right: one
should read larger buffers from the import files (so that, for
example, 1000-4000 rows/records fit in), then perform the row/column
splitting for this batch of records in memory, and "put all of them
out" in one single SQLite transaction. This way the filesystem
switches between reads and writes far less often, and it also
operates on larger buffers (or "SQLite journal pages").
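To make the buffered variant a bit more concrete, here is a rough
sketch in Python (not VB.NET, simply because the sqlite3 module
ships with it). The function name import_tsv, the table name
import_rows, the column count and the buffer size are all made up
for illustration, and the table is assumed to exist already. It
reads the file in large buffers, carries a half-read row over to
the next buffer, and writes each buffer's rows in one transaction:

```python
import sqlite3

def import_tsv(path, conn, table="import_rows", ncols=3, buf_size=1 << 20):
    """Import a tab-delimited file, one SQLite transaction per read buffer.

    Assumptions (hypothetical, for illustration only): `table` already
    exists, has `ncols` columns, and every row has exactly `ncols` fields.
    Returns the number of rows inserted.
    """
    sql = "INSERT INTO {} VALUES ({})".format(table, ",".join("?" * ncols))
    total = 0
    leftover = ""  # half-read row carried over from the previous buffer
    # newline="" keeps line endings untouched so we split on "\n" ourselves;
    # in text mode buf_size counts characters, not bytes.
    with open(path, "r", encoding="utf-8", newline="") as f:
        while True:
            buf = f.read(buf_size)
            if not buf:
                break
            lines = (leftover + buf).split("\n")
            leftover = lines.pop()  # last piece may be an incomplete row
            stripped = (line.rstrip("\r") for line in lines)
            rows = [line.split("\t") for line in stripped if line]
            with conn:  # commits once per buffer, rolls back on error
                conn.executemany(sql, rows)
            total += len(rows)
    # The file may end without a trailing newline: flush the last row.
    last = leftover.rstrip("\r")
    if last:
        with conn:
            conn.execute(sql, last.split("\t"))
        total += 1
    return total
```

Calling it would look like import_tsv("data.txt", conn) on an open
sqlite3 connection. The buffer size is the knob to experiment with;
the point is only that the number of commits drops from one per row
to one per buffer.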
This latter approach takes a bit more coding effort, since you have
to take care of half-read rows between input buffers, but this is
possible to handle and worthwhile for the overall performance.

On embedded systems your "single-row reading" recommendation makes
a whole lot more sense, of course (with regard to memory consumption
during the import process). Maybe Alok can shed some light on his
target hardware, since VB.NET can also be used on the Compact
Framework (on embedded hardware), although I think he's targeting
desktop environments.

Olaf

