Re: [sqlite] SQLite JDBC - org.sqlite.NativeDB.step taking up too much time

2009-09-27 Thread George Bills
This makes a lot of sense now - I haven't read up on the C interface and 
assumed that "step" was a JDBC thing. Thanks for clearing that up.

George.

On 09/26/2009 01:01 AM, Pavel Ivanov wrote:
>> Does anyone know why this method might be so expensive?
>>  
> Because it's main method to execute your queries.
> I believe I can safely assume that org.sqlite.NativeDB.step maps
> directly to C call sqlite3_step() and it's the only function where
> query is executed. It doesn't just iterate through resultset it
> obtains this resultset it too.
>
> Pavel
>
> On Thu, Sep 24, 2009 at 11:24 PM, George Bills<gbi...@funnelback.com>  wrote:
>
>> Hi everyone - this might be a general JDBC question rather than an
>> SQLite specific question, but I'm not sure.
>>
>> I've got an app that's writing to an SQLite database using the latest
>> SQLite JDBC driver ("v056, based on SQLite 3.6.14.2") from
>> http://www.zentus.com/sqlitejdbc/.
>>
>> I've done some basic profiling using hprof, and initially things are
>> fine. But then my tables get quite large (e.g. 26 million / 7 million /
>> 6 million rows), and after this point, I find that a huge amount of my
>> apps time is being spent in "org.sqlite.NativeDB.step".
>>
>> Does anyone know why this method might be so expensive? I probably need
>> to dig deeper with my profiling and find out exactly which methods are
>> calling the expensive "step"s, but I'm fairly new to working with JDBC
>> and any hints or suggestions would be helpful.
>>
>> During the applications run, I'll be running a certain amount of selects
>> and inserts, but in all cases, I'll be selecting or inserting 1 row at a
>> time, so I wouldn't expect that iterating over my ResultSet (I assume
>> this is what org.sqlite.NativeDB.step is doing) would be very expensive.
>>
>> The code that does the selection of a row, given an SQL string and an
>> array of string parameters is like:
>>
>>  selectStatement = databaseConnection.prepareStatement(SQL);
>>  setStatementParameters(selectStatement, parameters);
>>  resultSet = selectStatement.executeQuery();
>>  singleResult = parseResultForSingleElement(resultSet,
>> columnName);
>>
>> The code that does the insertion of a row, given an SQL string and an
>> array of string parameters is like:
>>
>>  insertStatement = databaseConnection.prepareStatement(SQL);
>>  setStatementParameters(insertStatement, parameters);
>>  insertStatement.executeUpdate();
>>  resultSet = insertStatement.getGeneratedKeys();
>>  autoincrementPK = parseResultForSingleElement(resultSet, null);
>>
>> parseResultForSingleElement is like:
>>
>>  if (numColumns == 1&&  res.next()) {
>>  E val;
>>  if (columnName != null) {
>>  val = (E) res.getObject(columnName);
>>  } else {
>>  val = (E) res.getObject(1);
>>  }
>>  if (res.wasNull()) {
>>  ret = new SQLVal(ResultStatus.RESULT_WAS_NULL);
>>  } else {
>>  ret = new SQLVal(val);
>>  }
>>  }
>>
>> Thanks for any help.
>> George.
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
>>  
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] SQLite JDBC - org.sqlite.NativeDB.step taking up too much time

2009-09-24 Thread George Bills
Hi everyone - this might be a general JDBC question rather than an 
SQLite specific question, but I'm not sure.

I've got an app that's writing to an SQLite database using the latest 
SQLite JDBC driver ("v056, based on SQLite 3.6.14.2") from 
http://www.zentus.com/sqlitejdbc/.

I've done some basic profiling using hprof, and initially things are 
fine. But then my tables get quite large (e.g. 26 million / 7 million / 
6 million rows), and after this point, I find that a huge amount of my 
apps time is being spent in "org.sqlite.NativeDB.step".

Does anyone know why this method might be so expensive? I probably need 
to dig deeper with my profiling and find out exactly which methods are 
calling the expensive "step"s, but I'm fairly new to working with JDBC 
and any hints or suggestions would be helpful.

During the applications run, I'll be running a certain amount of selects 
and inserts, but in all cases, I'll be selecting or inserting 1 row at a 
time, so I wouldn't expect that iterating over my ResultSet (I assume 
this is what org.sqlite.NativeDB.step is doing) would be very expensive.

The code that does the selection of a row, given an SQL string and an 
array of string parameters is like:

 selectStatement = databaseConnection.prepareStatement(SQL);
 setStatementParameters(selectStatement, parameters);
 resultSet = selectStatement.executeQuery();
 singleResult = parseResultForSingleElement(resultSet, 
columnName);

The code that does the insertion of a row, given an SQL string and an 
array of string parameters is like:

 insertStatement = databaseConnection.prepareStatement(SQL);
 setStatementParameters(insertStatement, parameters);
 insertStatement.executeUpdate();
 resultSet = insertStatement.getGeneratedKeys();
 autoincrementPK = parseResultForSingleElement(resultSet, null);

parseResultForSingleElement is like:

 if (numColumns == 1 && res.next()) {
 E val;
 if (columnName != null) {
 val = (E) res.getObject(columnName);
 } else {
 val = (E) res.getObject(1);
 }
 if (res.wasNull()) {
 ret = new SQLVal(ResultStatus.RESULT_WAS_NULL);
 } else {
 ret = new SQLVal(val);
 }
 }

Thanks for any help.
George.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Sqlite profiler

2009-09-24 Thread George Bills
On 09/23/2009 11:16 PM, mcnamaragio wrote:
> Hello,
>
> Would anyone be interested in sqlite profiler? If yes what features would
> you expect from it?
>
> Thank you.
>
Yes, I would be interested. A high level breakdown of what SQLite is 
doing, and the time it's taking to do it would be useful for me - e.g. 
"finding appropriate rows from index, 3.123 seconds, grouping by rows 
and summing column, 2.456 seconds, ordering column by that sum, 2.987 
seconds, taking top 10 rows, 0.123 seconds, joining to table B, 0.927 
seconds" would be kind of what I'd be hoping for.

As for features, I wouldn't want much beyond the above. It wouldn't have 
to be graphical, or need to attach to a running query / SQLite session 
or anything like that for it to be useful to me.

I can probably find most of the "what is SQLite doing" info by looking 
at the "EXPLAIN" output, but so far I haven't put any effort into 
learning the opcodes. It seems like that output is a little more 
low-level than what I'd really want, and it doesn't give me the running 
time of each phase. Not to say that it isn't useful, but a profiler that 
fits somewhere in between "EXPLAIN" and "EXPLAIN QUERY PLAN" would be 
great for my needs.

George.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Undesired query plan when using GROUP BY

2009-09-13 Thread George Bills
Hi all, I was wondering if someone here could help me with understanding 
/ solving an sqlite problem I'm having. Any help would be appreciated.

=SCHEMA=

CREATE TABLE pop_words_wpk (
word_id INTEGER PRIMARY KEY,
occurrences INTEGER
);
CREATE INDEX pop_words_wpk_oidx ON pop_words_wpk(occurrences);

=QUERY ONE - FAST=

Query: "SELECT word_id, occurrences FROM pop_words_wpk ORDER BY 
occurrences DESC LIMIT 10;"
Time: 0.005s

Query plan is:
0|0|TABLE pop_words_wpk WITH INDEX pop_words_wpk_oidx ORDER BY

=QUERY TWO - SLOW=

Query: "SELECT word_id, SUM(occurrences) FROM pop_words_wpk GROUP 
BY(word_id) ORDER BY SUM(occurrences) DESC LIMIT 10;"
Time: 7s (1400 times slower)

Query plan is:
0|0|TABLE pop_words_wpk USING PRIMARY KEY ORDER BY



Both of these queries return exactly the same result, but the first 
query is much, much faster.

My assumption is that for the first query, sqlite realizes that it can 
just grab the rows with the 10 highest occurrences in the occurrences 
index. In the second query, it doesn't realize that, and grabs every 
single row, iterating over the word_id PRIMARY KEY. My guess is that 
sqlite isn't built to understand that word_id is unique (i.e. GROUP BY 
is guaranteed to do absolutely nothing in this case).

Is there any way of tricking sqlite into running the same query plan as 
in the first query? I've regenerated my database and ran the queries 
again using the newest sqlite (3.6.18), and that didn't change either of 
the plans. Running ANALYZE didn't improve things either. I also get the 
same behavior when using a unique index instead of a primary key.

I could just change my application so that it understands when it can 
get away without using GROUP BY and SUM (in the real application, there 
are some slightly different tables that might actually need a GROUP BY 
in some circumstances), but it seems a little messy to me, especially if 
someone here knows of a much "neater" solution.

On a slightly related note, the EXPLAIN documentation states that the 
"EXPLAIN QUERY PLAN['s] behavior is undocumented, unspecified, and 
variable" - should I just ignore the numbers shown in the query plan, or 
could they be useful for my understanding of what sqlite is doing?

Thanks,
George.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users