Re: [sqlite] Is it possible to optimize this query on a very large database table.

Frank Chang Mon, 10 Oct 2011 09:49:13 -0700

Igor Tandetnik,
The fieldname groups in our BlobLastNameTable consist of
multiple rows where each pair of columns [FieldName, BLOB[Vertices]] is unique.
Therefore, every fieldname group does not just have a single row but instead
1000's or 10000's rows. So that is why we use a group by/order by and subselect
clause to locate the first/minimum row id row in each fieldname group.
Once we know the first/minimum row id of each unique fieldname
group, we would lke to write a sqlite UPDATE Statement to accumulate all the
BLOB vertices of all the rows with that unique fieldname into the first(i.e
MIN(ROWID)) row's BLOB(Vertices)column for each unique fieldname group. Then
we would like to discard all the rows in each fieldname group of rows that
have an rowid different from the first row(i.e MIN(rowid)).
Because we using a C++ WINDOWS/LINUX/Solaris UNIX multithreaded
program where each concurrent worker thread has its own sqlite database and
sqlite table and sqlite index , we would like each concurrence worker thread to
run as fast as possible on a multicore CPU. We have profiled the worker threads
and we have found that the sqlite statements are the bottleneck. So, that is
why we would like our queries to run as fast as possible by avoiding full index
scans as Florian Weimer pointed out in the post about Postgres' DISTINCT ON
feature which we are trying to simulate on Sqlite.
I hope I have provided you more information. I did try your
suggestion: select FieldName, rowid from BlobLastNameTest. However, it
generates 5.5 million rows of output which would force our C++ Worker threads
to do the BLOB(Vertices) aggregation. Unfortunately , that would would take too
long so we were hoping that we could devise a nice query to let the SQLite
query processor do all the Blob(Vertices) aggregation in a few minutes or less
. Please let me know if you have any questions or suggestions. Thank you for
your help.
--

--------------------------------------------------------------------------------------------------------------------------
I take it back - the behavior of your query is well-defined, for the simple
reason that FieldName is unique,
so every group only has a single row in it (as someone else has kindly pointed
out - I missed this detail on the
first reading). For that same reason, the whole GROUP BY and sub-select dance
is completely pointless.
Your query is just a very elaborate and wasteful way to write


select FieldName, rowid from BlobLastNameTest; 

----------------------------------------------------------------------------------------------------------------------------
                                      
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Is it possible to optimize this query on a very large database table.

Reply via email to