> On Mon, 21 Jul 2014 13:05:13 +0000, Markus Schaber <m.scha...@codesys.com> said:

> Hi,
>
>> From: David Canterbrie
>>
>> I've been tasked with trying to understand how much of a performance hit one
>> would get if one had to scan a table in its entirety versus reading the same
>> data stored as newline-separated records (or something of that sort) from a
>> file.
>>
>> The hypothesis I suppose we're trying to test is that reading sequentially
>> from SQLite (without indices) should be comparable to reading from a file
>> that has the same data, +/- 1-2%.
>>
>> My first question is: does that sound reasonable, and has someone ever done
>> such a test?
>
> I guess this highly depends on your data format and parser code.
>
> The author of
> http://sebastianraschka.com/Articles/sqlite3_database.html#results
> claims a factor ~20 speed advantage for SQLite.

I read that page when it was first posted here a few days ago and found that
claim extremely hard to believe (with those particular programs and that
data), not least because query_sqlite_db.py executes a query that never
fetches anything from the database: the rows were inserted with feature1=Yes,
but the SELECT query binds the parameter to "YES".

Just for fun... read_lines.py takes just over 1 second on a somewhat slower
Xeon, for a 121MiB file containing 6M lines of the format
``this is line %07d\n''. So does query_sqlite_db.py when it's looking for
"YES" and returning nothing. With the correct binding it takes about 10s.
Warm cache in all cases.

And a CPython program that simply increments a counter 6M times takes about
0.7s on the same machine.

Regards,
--
Stelios.

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
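[Editor's note: a minimal self-contained sketch of the binding mismatch
described above. The table and column names here are hypothetical stand-ins
for the article's actual schema; the point is only that SQLite's = operator
compares text case-sensitively by default (BINARY collation), so binding
"YES" against rows stored as "Yes" fetches zero rows, and the benchmark ends
up timing an empty scan.]

```python
import sqlite3

# In-memory database with a made-up schema standing in for the article's.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE my_table (id INTEGER, feature1 TEXT)")
cur.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [(i, "Yes") for i in range(1000)],  # rows are stored as 'Yes'
)
con.commit()

# Binding the wrong case: default BINARY collation means 'YES' != 'Yes',
# so this returns no rows at all.
wrong = cur.execute(
    "SELECT * FROM my_table WHERE feature1 = ?", ("YES",)
).fetchall()

# Binding the correct case actually fetches every matching row.
right = cur.execute(
    "SELECT * FROM my_table WHERE feature1 = ?", ("Yes",)
).fetchall()

print(len(wrong), len(right))  # 0 1000
```

That difference is why the wrong-binding run finishes in roughly the same
time as the plain line-reading script, while the correct binding costs ~10x
more: only the latter pays for materializing the rows.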