"Simon Slavin" <[email protected]> wrote in
message news:[email protected]...

> Alok is writing in a language which doesn't compile very well.
Nah, even with VB.NET it should be possible
to bring the import "up to speed"; at least a factor-of-20
improvement should be achievable even there.

> And he has another system which splits his data up
> into smaller chunks (presumably the 10.5 rows)
> before it gets to the one which uses SQLite.
As I read his replies, he was talking about the import
time for 10500 test rows (tab-delimited records).
Do you mean the sub-splitting of a single import row
into column values?

> But it might be worth trying larger chunks, since much of
> the 35 seconds could be overhead and doubling the
> amount of data might not take much more time.
Yep, that's what I was trying to point out (indirectly)... ;-)

Garry's advice (to read the entire file in beforehand and
perform the splitting in memory) is not bad at all in terms
of performance, as long as the input files come in sizes
below, let's say, 4-8 MB (given the amount of RAM
available on current desktop machines, which I assume
are the target hardware Alok's solution will finally run on).

This would also clearly separate the file interaction:
first the file read on the import file, then the indirect
file writes caused by the SQLite inserts.
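Just to illustrate that separation, here is a minimal sketch in Python (not Alok's VB.NET, only because its bundled sqlite3 module keeps the example short). The table name "import_rows", the three-column layout and the function name are my assumptions, not anything from Alok's code:

```python
import sqlite3

def import_whole_file(path, conn, table="import_rows", ncols=3):
    """Read the whole file, split it in memory, insert in one transaction.

    'table' and 'ncols' are hypothetical; adapt them to the real schema.
    """
    # Phase 1: a single sequential read of the import file.
    with open(path, "r", encoding="utf-8", newline="") as f:
        text = f.read()

    # Phase 2: row and column splitting, purely in memory.
    rows = [line.split("\t") for line in text.splitlines() if line]

    # Phase 3: all inserts inside one transaction, so the file writes
    # happen only after the read has completed.
    placeholders = ",".join("?" * ncols)
    with conn:  # commits on success, rolls back on an exception
        conn.executemany(
            f"INSERT INTO {table} VALUES ({placeholders})", rows
        )
    return len(rows)
```

This only makes sense while the whole file fits comfortably in RAM, as said above.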

If one does this on a single-row basis instead, the
(file) system has to switch between small read and
write operations at a pretty high frequency,
reading and writing only small buffers, and that's not
good for performance.

So in case the import files are larger, you're entirely right:
one should read larger buffers from the import files (so
that, for example, 1000-4000 rows/records fit in), then
perform the row/column splitting for that batch of records
in memory, and write all of them out in one single SQLite
transaction. This way the filesystem switches between
reads and writes far less often, and it also
operates on larger buffers (or "SQLite journal pages").
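A rough sketch of the batching idea, again in Python for brevity. It batches by row count rather than by raw byte buffer, which is a simplification on my part. The table name "import_rows", the three columns and the default of 2000 rows per batch are assumptions picked from the range mentioned above:

```python
import sqlite3
from itertools import islice

def import_in_batches(lines, conn, batch_rows=2000,
                      table="import_rows", ncols=3):
    """Insert tab-delimited lines in batches, one transaction per batch.

    'lines' is any iterable of text lines, e.g. an open text file.
    'table' and 'ncols' are hypothetical; adapt them to the real schema.
    """
    sql = f"INSERT INTO {table} VALUES ({','.join('?' * ncols)})"
    it = iter(lines)
    total = 0
    while True:
        # Pull the next batch of raw lines from the input.
        chunk = list(islice(it, batch_rows))
        if not chunk:
            break
        # Split rows into column values in memory, skipping blank lines.
        batch = [ln.rstrip("\r\n").split("\t")
                 for ln in chunk if ln.strip("\r\n")]
        # One transaction per batch instead of one per row.
        with conn:
            conn.executemany(sql, batch)
        total += len(batch)
    return total
```

Called as import_in_batches(open("import.txt", encoding="utf-8"), conn), where the file name is just a placeholder, it would commit once per 2000 rows.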

This latter approach requires a bit more coding
effort, since you have to take care of half-read
rows at the boundaries between input buffers, but this is
manageable and worthwhile for the overall performance.
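The half-read-row handling could look roughly like this (a Python sketch under my own assumptions: rows end in a newline, columns are tab-delimited, and the name iter_rows and the 1 MB default buffer are made up for illustration):

```python
def iter_rows(f, buf_size=1 << 20):
    """Yield lists of column values from a text file read in large buffers.

    A row cut off at the end of one buffer is carried over and
    completed with the start of the next buffer.
    """
    carry = ""  # the half-read row left over from the previous buffer
    while True:
        buf = f.read(buf_size)
        if not buf:
            break
        pieces = (carry + buf).split("\n")
        # The last piece has no terminating newline yet, so it may be
        # incomplete; keep it for the next round.
        carry = pieces.pop()
        for line in pieces:
            line = line.rstrip("\r")
            if line:
                yield line.split("\t")
    # Whatever remains after the last buffer is a final row without
    # a trailing newline.
    carry = carry.rstrip("\r")
    if carry:
        yield carry.split("\t")
```

The rows coming out of such a generator can then be collected into batches and inserted inside a transaction.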

On embedded systems your single-row-reading
recommendation makes a whole lot more sense, of
course (with regard to memory consumption during
the import process).

Maybe Alok can shed some light on his target hardware,
since VB.NET can also be used on the Compact
Framework (on embedded hardware), although I
think that he's targeting desktop environments.

Olaf

_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users