Re: [sqlite] How do I update multiple rows in a single sql statement

2014-12-11 Thread James K. Lowden
On Tue, 09 Dec 2014 10:46:23 -0500
Igor Tandetnik  wrote:

> On 12/9/2014 10:38 AM, James K. Lowden wrote:
> > If the subquery to the right of the SET clause produces
> > more than one row, the statement fails.
> 
> Are you sure? Normally, a scalar subquery doesn't fail when the 
> resultset contains more than one row - it just silently produces the 
> value from the first row of the first column. 

Well, I *was* sure.  I don't know about "normally", but you're right
that SQLite gets it wrong, see below.  I'm pretty sure the standard
calls for a diagnostic anywhere a scalar is required and not provided.  

There is a workaround worth knowing: if you add

group by k having count(*) = 1

to the UPDATE statement below, it works correctly in the sense that
it becomes deterministic.  A separate check is of course required to
determine whether any group had count(*) > 1.

[snip]
create table T ( k int primary key, v string );
create table S ( k int, v string, primary key( k,v) );
insert into T values (1, 'a'), (2, 'b');
insert into S values (1, 'y'), (1, 'z');
select * from T;
k   v 
--  --
1   a 
2   b 
select * from S;
k   v 
--  --
1   y 
1   z 
select * from T join S on T.k = S.k;
k   v   k   v 
--  --  --  --
1   a   1   y 
1   a   1   z 
update T 
set v = (select v from S where k = T.k)
where exists (
  select 1 
  from S where k = T.k
);
select * from T;
k   v 
--  --
1   y 
2   b 
[pins]
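For concreteness, the workaround can be checked from Python's standard sqlite3 module. This is a minimal sketch reproducing the transcript above with the HAVING guard added to both the SET subquery and the WHERE clause:

```python
import sqlite3

# Reproduce the tables from the transcript above, then run the UPDATE
# with the GROUP BY k HAVING count(*) = 1 guard: rows of T with more
# than one match in S are left alone instead of being updated
# nondeterministically.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE T (k INT PRIMARY KEY, v TEXT);
    CREATE TABLE S (k INT, v TEXT, PRIMARY KEY (k, v));
    INSERT INTO T VALUES (1, 'a'), (2, 'b');
    INSERT INTO S VALUES (1, 'y'), (1, 'z');
""")

con.execute("""
    UPDATE T
       SET v = (SELECT v FROM S WHERE k = T.k
                GROUP BY k HAVING count(*) = 1)
     WHERE EXISTS (SELECT 1 FROM S WHERE k = T.k
                   GROUP BY k HAVING count(*) = 1)
""")

rows = con.execute("SELECT * FROM T ORDER BY k").fetchall()
print(rows)  # [(1, 'a'), (2, 'b')] -- k=1 untouched: two matches in S
```

The separate check mentioned above would then flag k = 1 as having count(*) > 1.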

--jkl
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bugreport - slowdown in sqlite after the ANALYZE statement

2014-12-11 Thread James K. Lowden
On Wed, 10 Dec 2014 08:49:21 +0100
Eduardo Morras  wrote:

> > Why ORDER BY on INSERT?  Does it work better?  I would expect the
> > unnecessary sort to be pure overhead.  
> 
> If you insert in correct index order, the index update phase is
> faster because it doesn't need to rebalance the b-tree so often after
> each insert.

OK, but at the cost of sorting the input first.  Rebalancing a tree
requires diddling a few pointers.  Sorting in the average case is 
O(N log N), and possibly worse, plus the attendant I/O.  

--jkl



Re: [sqlite] Online/Hot backup of WAL journalling mode database

2014-12-11 Thread Nick

On 11 Dec 2014, at 20:39, David King wrote:

> Why are you trying so hard to avoid using the backup API? It sounds like it 
> does exactly what you want

The backup API works great if you have periods with no writes. However, if a 
process writes to the database during the backup, the API stops and starts over 
again. So with sufficiently frequent writes, the backup might never complete. 

In an ideal world the backup API would only copy pages altered during the write 
rather than start over.
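For reference, Python's standard sqlite3 module exposes this backup API as Connection.backup (Python 3.7+); a minimal sketch of an incremental backup, using in-memory databases for illustration:

```python
import sqlite3

# Sketch of the online backup API via Python's sqlite3 wrapper.
# Copying a few pages per step (pages=5) lets other connections work
# between steps; note that a write to the source while the backup runs
# makes SQLite restart the copy from the beginning, which is exactly
# the limitation discussed above.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE t (x)")
src.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])
src.commit()

dst = sqlite3.connect(":memory:")  # in practice, a file path
src.backup(dst, pages=5)  # pages=-1 (the default) copies all at once

print(dst.execute("SELECT count(*) FROM t").fetchone()[0])  # 1000
```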

Regards
Nick


Re: [sqlite] Online/Hot backup of WAL journalling mode database

2014-12-11 Thread David King
Why are you trying so hard to avoid using the backup API? It sounds like it 
does exactly what you want



On 11 Dec 2014, at 12:36, Nick  wrote:

> 
> On 11 Dec 2014, at 10:43, Simon Slavin wrote:
> 
>> 
>> I don't know enough about the internals of SQLite to be sure, but various 
>> parts of me are concerned that this is a bad idea.  I don't know what WAL 
>> mode would be like without checkpointing but there has to be a reason for 
>> checkpointing and disabling it between backups sounds bad.
>> 
> 
> I would say the docs (https://www.sqlite.org/wal.html#how_to_checkpoint) do 
> not suggest that application-initiated checkpoints are a bad idea.
> 
> Regards
> Nick
> 





Re: [sqlite] Online/Hot backup of WAL journalling mode database

2014-12-11 Thread Nick

On 11 Dec 2014, at 10:43, Simon Slavin wrote:

> 
> I don't know enough about the internals of SQLite to be sure, but various 
> parts of me are concerned that this is a bad idea.  I don't know what WAL 
> mode would be like without checkpointing but there has to be a reason for 
> checkpointing and disabling it between backups sounds bad.
> 

I would say the docs (https://www.sqlite.org/wal.html#how_to_checkpoint) do not 
suggest that application-initiated checkpoints are a bad idea.

Regards
Nick


Re: [sqlite] Online/Hot backup of WAL journalling mode database

2014-12-11 Thread Nick

On 11 Dec 2014, at 10:08, Dan Kennedy wrote:

> On 12/11/2014 05:49 AM, Nick wrote:
>> On 10 Dec 2014, at 07:35, Dan Kennedy wrote:
>> 
>>> Strictly speaking the database file may not be well-formed even if there is 
>>> no ongoing checkpoint. If:
>>> 
>>>  a) process A opens a read transaction,
>>>  b) process B opens and commits a write transaction to the database,
>>>  c) process C checkpoints the db,
>>> 
>>> then the db file considered without the *-wal file may be corrupt. The 
>>> problem comes about because process C can only checkpoint frames up until 
>>> the start of B's transaction. And there is an optimization that will 
>>> prevent it from copying any earlier frames for which there exists a frame 
>>> in B's transaction that corresponds to the same database page. So it 
>>> effectively copies only a subset of the modifications made by earlier 
>>> transactions into the db file - not necessarily creating a valid db file.
>> Can this corruption be detected by running PRAGMA quick_check / 
>> integrity_check? Having the occasional backup db corrupted would be 
>> tolerable.
> 
> In many cases, but not generally. There would exist cases where a part of a 
> committed transaction was lost, or the values in unindexed columns where 
> replaced, that sort of thing.

Ok. Presumably a SQLITE_CHECKPOINT_FULL or SQLITE_CHECKPOINT_RESTART 
checkpoint mode would ensure the db file is valid?
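Those checkpoint modes are also reachable from SQL as PRAGMA wal_checkpoint(FULL) and wal_checkpoint(RESTART); a quick sketch via Python's sqlite3 module (the temp-file path is illustrative, since WAL mode needs a real file):

```python
import os
import sqlite3
import tempfile

# WAL mode requires an on-disk database (illustrative temp path).
path = os.path.join(tempfile.mkdtemp(), "demo.db")
con = sqlite3.connect(path)
con.execute("PRAGMA journal_mode=WAL")
con.execute("CREATE TABLE t (x)")
con.execute("INSERT INTO t VALUES (1)")
con.commit()

# The pragma returns a (busy, log_frames, checkpointed_frames) row.
# FULL waits for writers and copies every frame of the *-wal file into
# the main database; RESTART additionally makes the next writer start
# overwriting the log from the beginning.
busy, log_frames, ckpt_frames = con.execute(
    "PRAGMA wal_checkpoint(FULL)").fetchone()
print(busy, log_frames, ckpt_frames)  # busy == 0 on success
```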

Regards
Nick


Re: [sqlite] How to speed up database open

2014-12-11 Thread Paul

> 
> On 2014/12/11 17:58, Paul wrote:
> >
> >> On 2014/12/11 13:51, Paul wrote:
> >> I have yet to try and test if dropping the stat tables is worth the
> >> effort. Some databases in fact can grow pretty big, up to a few
> >> hundred megabytes.
> 
> In that case maybe keep the Stat1 tables and there is also the option of 
> using stuff like "USING" clauses and "LIKELY" or "UNLIKELY" 
> planner directives in your queries to force a tried and tested query plan on 
> the QP in lieu of using the stat tables - but now you 
> are getting very hands-on with your data and no longer leaving it up to the 
> internal wisdom of SQLite - something I don't usually 
> advocate, but as you rightly observed - your case is quite special.
> 

I am going to disable stat4 and keep only stat1. Yes, I could have used USING 
and co. But as you correctly pointed out, SQLite does a very good job of 
optimizing queries. I trust SQLite more than I trust my intuition.
From my personal experience, intuition does not perform very well in complex 
systems with a lot of variables.

Best regards,
Paul


Re: [sqlite] Counting rows

2014-12-11 Thread Adam Devita
From previous reading (years ago on this list)
I normally do

select count(1) from tableName;
to count the rows in a table.

As an alternative, select count(primary_key_or_SomeIndexName) from tableName
when trying to get an actual count.

Beware:
select count(someField) from table; will not count rows where someField is
NULL.
select count(1) from table; will.
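A quick demonstration of that NULL behaviour with Python's standard sqlite3 module:

```python
import sqlite3

# count(1) counts every row; count(someField) counts only rows where
# someField is not NULL -- the caveat described above.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (someField)")
con.executemany("INSERT INTO t VALUES (?)", [("a",), (None,), ("b",)])

n_all = con.execute("SELECT count(1) FROM t").fetchone()[0]
n_field = con.execute("SELECT count(someField) FROM t").fetchone()[0]
print(n_all, n_field)  # 3 2
```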

Adam





On Thu, Dec 11, 2014 at 11:39 AM, Dominique Devienne 
wrote:

> On Thu, Dec 11, 2014 at 4:19 PM, Simon Slavin 
> wrote:
>
> > In my table which had about 300 million (sic.) rows I did this
> > SELECT count(*) FROM myTable;
> > to count the number of rows.  After half an hour it was still processing
> > and I had to kill it.
> >
>
> I have a little utility that connects to Oracle, and does a big UNION ALL
> query to get the counts of all my tables (82 currently):
>
> TOTAL: 1,900,343 rows in 20 tables (out of 82)
> 0.129u 0.051s 0:00.66 25.7% 0+0k 32+32io 0pf+0w  (COLD)
> 0.128u 0.045s 0:00.34 47.0% 0+0k 0+32io 0pf+0w (HOT)
>
> Granted, it's not 300M rows, just 1.9M, but that's 660ms (340ms when in
> cache), and that's counting the startup and connect time (~ 170ms).
>
> The plan is INDEX (FAST FULL SCAN) of the PK (a raw(16) for a GUID). FWIW,
> for context/comparison. --DD
>
> PS: I was actually surprised it was that cheap.
>



-- 
--
VerifEye Technologies Inc.
151 Whitehall Dr. Unit 2
Markham, ON
L3R 9T1


Re: [sqlite] Counting rows

2014-12-11 Thread Simon Slavin

On 11 Dec 2014, at 4:39pm, Dominique Devienne  wrote:

> I have a little utility that connects to Oracle, and does a big UNION ALL
> query to get the counts of all my tables (82 currently):

Yeah, it's easy in Oracle.  The problem is that SQLite3 uses a tree to store 
lists, and it does not store the total number of entries separately.  So to 
count the number of rows in a table SQLite has to walk the entire tree: go up 
and down all the branches to find which rows exist, whether any have been 
deleted, etc.

SQLite4 uses a different file format and I understand it does not have this 
problem.  Which doesn't help me at all right now.

Simon.


Re: [sqlite] Counting rows

2014-12-11 Thread Dominique Devienne
On Thu, Dec 11, 2014 at 4:19 PM, Simon Slavin  wrote:

> In my table which had about 300 million (sic.) rows I did this
> SELECT count(*) FROM myTable;
> to count the number of rows.  After half an hour it was still processing
> and I had to kill it.
>

I have a little utility that connects to Oracle, and does a big UNION ALL
query to get the counts of all my tables (82 currently):

TOTAL: 1,900,343 rows in 20 tables (out of 82)
0.129u 0.051s 0:00.66 25.7% 0+0k 32+32io 0pf+0w  (COLD)
0.128u 0.045s 0:00.34 47.0% 0+0k 0+32io 0pf+0w (HOT)

Granted, it's not 300M rows, just 1.9M, but that's 660ms (340ms when in
cache), and that's counting the startup and connect time (~ 170ms).

The plan is INDEX (FAST FULL SCAN) of the PK (a raw(16) for a GUID). FWIW,
for context/comparison. --DD

PS: I was actually surprised it was that cheap.


Re: [sqlite] Counting rows

2014-12-11 Thread Simon Slavin

On 11 Dec 2014, at 3:58pm, Paul Sanderson  wrote:

> would count _rowid_ from mytable be quicker

Hmm.  Given that these tables have the normal use of rowid, and that rows in 
this table are only inserted, never deleted, I wonder whether

SELECT max(rowid) FROM myTable

would have given the right result, almost instantly.  Can't check it now, but 
thanks for the idea, Paul.
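A sketch of the idea with Python's sqlite3 module, including the caveat that the shortcut breaks once rows are deleted:

```python
import sqlite3

# max(rowid) is answered from one edge of the table's btree, so it is
# effectively instant -- but it only equals the row count for a rowid
# table with no deletes and no explicit rowid inserts.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (v)")
con.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(100)])

approx = con.execute("SELECT max(rowid) FROM t").fetchone()[0]
print(approx)  # 100, same as count(*) so far

con.execute("DELETE FROM t WHERE rowid = 50")
pair = con.execute("SELECT max(rowid), count(*) FROM t").fetchone()
print(pair)  # (100, 99): after a delete the two diverge
```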

Simon.


Re: [sqlite] Select count(*)

2014-12-11 Thread Richard Hipp
On Thu, Dec 11, 2014 at 11:27 AM, Dominique Devienne 
wrote:

>
> So why couldn't sqlite use the PK index to reduce the IO when doing a
> "select count(*) from t_with_non_int_pk", to avoid scanning the table?
>

It does.
-- 
D. Richard Hipp
d...@sqlite.org


Re: [sqlite] How to speed up database open

2014-12-11 Thread RSmith


On 2014/12/11 17:58, Paul wrote:



On 2014/12/11 13:51, Paul wrote:
I have yet to try and test if dropping stat tables worth the effort. Some databases in fact can grow pretty big, up to few 
hundred of megabytes// 


In that case maybe keep the Stat1 tables and there is also the option of using stuff like "USING" clauses and "LIKELY" or "UNLIKELY" 
planner directives in your queries to force a tried and tested query plan on the QP in lieu of using the stat tables - but now you 
are getting very hands-on with your data and no longer leaving it up to the internal wisdom of SQLite - something I don't usually 
advocate, but as you rightly observed - your case is quite special.
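As an illustration of those hand-tuning hooks: SQLite's unlikely()/likely() functions return their argument unchanged but hint to the planner that the filter is rarely/usually true, and INDEXED BY pins a specific index (presumably what the "USING" advice above is pointing at). A sketch in Python, where the orders table and orders_status index are made-up names:

```python
import sqlite3

# Illustrative schema; 'orders' and 'orders_status' are invented here.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY,"
            " status TEXT, total REAL)")
con.execute("CREATE INDEX orders_status ON orders(status)")

# unlikely(X) returns X but hints that the filter passes few rows;
# INDEXED BY forces the named index, overriding the planner's choice.
plan = con.execute("""
    EXPLAIN QUERY PLAN
    SELECT * FROM orders INDEXED BY orders_status
    WHERE status = 'open' AND unlikely(total > 1000)
""").fetchall()
print(plan)  # plan detail mentions index orders_status
```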




Re: [sqlite] Select count(*)

2014-12-11 Thread Marc L. Allen
I believe that when NULLs are allowed as PKs, they are all distinct.  So, you 
can have multiple rows with a NULL value as the PK.

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On Behalf Of Dominique Devienne
Sent: Thursday, December 11, 2014 11:27 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Select count(*)

On Thu, Dec 11, 2014 at 5:03 PM, Gabriel Corneanu  wrote:

> I asked a similar question some time ago...
> See here:
> https://www.mail-archive.com/sqlite-users@sqlite.org/msg77488.html



>> not a problem for rowid/pk (which are not allowed to be NULL), but it 
>> matters a lot in the general case.

> DRH write:

> PRIMARY KEYs (except for INTEGER PRIMARY KEYs) are allowed to be NULL 
> in SQLite.  This goes back to a bug in the code from many years ago.  
> By the time the bug was discovered, SQLite was already in wide-spread 
> use and so the decision was made to not fix the bug since doing so 
> would cause

> compatibility problems.

It's a bit sad to carry "fundamental" baggage like this, in all cases.

I can see why it would stay the default, but surely a pragma could remedy the 
situation to get the best possible conformance?
I know this was discussed before on list, but I for one would welcome this new 
pragma.

To come back on the subject, even if SQLite allows a null PK, there
(hopefully) can be a single row using it, and it (this null row) is still 
indexed, no?
So why couldn't sqlite use the PK index to reduce the IO when doing a "select 
count(*) from t_with_non_int_pk", to avoid scanning the table?

Perhaps it's a naive question, but I don't see why ATM, and would welcome an 
explanation. Thanks, --DD




Re: [sqlite] How to speed up database open

2014-12-11 Thread Paul
> On Thu, Dec 11, 2014 at 10:58 AM, Paul  wrote:
> 
> >
> > I have yet to try and test if dropping stat tables worth the effort.
> >
> 
> Most of the work is involved in loading sqlite_stat4. On the other hand,
> most of the benefit comes from sqlite_stat1. So consider compiling without
> SQLITE_ENABLE_STAT4. You will still probably get good query plans, but the
> startup time should be reduced.
> 

Thank you for the analysis of the data, Richard. 
I am going to try your advice and post the results.

Best regards,
Paul



Re: [sqlite] Select count(*)

2014-12-11 Thread Dominique Devienne
On Thu, Dec 11, 2014 at 5:03 PM, Gabriel Corneanu  wrote:

> I asked a similar question some time ago...
> See here:
> https://www.mail-archive.com/sqlite-users@sqlite.org/msg77488.html



>> not a problem for rowid/pk (which are not allowed to be NULL), but it
>> matters a lot in the general case.

> DRH write:

> PRIMARY KEYs (except for INTEGER PRIMARY KEYs) are allowed to be NULL in
> SQLite.  This goes back to a bug in the code from many years ago.  By the
> time the bug was discovered, SQLite was already in wide-spread use and so
> the decision was made to not fix the bug since doing so would cause

> compatibility problems.

It's a bit sad to carry "fundamental" baggage like this, in all cases.

I can see why it would stay the default, but surely a pragma could remedy
the situation to get the best possible conformance?
I know this was discussed before on list, but I for one would welcome this
new pragma.

To come back on the subject, even if SQLite allows a null PK, there
(hopefully) can be a single row using it, and it (this null row) is still
indexed, no?
So why couldn't sqlite use the PK index to reduce the IO when doing a
"select count(*) from t_with_non_int_pk", to avoid scanning the table?

Perhaps it's a naive question, but I don't see why ATM, and would welcome
an explanation. Thanks, --DD


Re: [sqlite] Counting rows

2014-12-11 Thread Nelson, Erik - 2
Simon Slavin wrote on Thursday, December 11, 2014 10:19 AM
> I know that the internal structure of a table means that this number
> isn't simple to produce.  But is there really no faster way ?  This
> table is going to have about six times that amount soon.  I really
> can't count the rows in less than a few hours ?
> 

Might it be possible to write a trigger that keeps track of 
insertions/deletions and updates a counter in another table?
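That trigger-maintained counter might look like this (a sketch via Python's sqlite3 module; the table names are made up):

```python
import sqlite3

# Triggers keep a running row count in a one-row side table, turning
# "how many rows?" into an O(1) lookup, at the cost of a small extra
# write on every insert/delete. Table names are illustrative.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE big (v);
    CREATE TABLE row_count (n INTEGER NOT NULL);
    INSERT INTO row_count VALUES (0);
    CREATE TRIGGER big_ins AFTER INSERT ON big
        BEGIN UPDATE row_count SET n = n + 1; END;
    CREATE TRIGGER big_del AFTER DELETE ON big
        BEGIN UPDATE row_count SET n = n - 1; END;
""")
con.executemany("INSERT INTO big VALUES (?)", [(i,) for i in range(500)])
con.execute("DELETE FROM big WHERE v < 100")

n = con.execute("SELECT n FROM row_count").fetchone()[0]
print(n)  # 400
```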

As another data point, I tested an 8 GB database on a SAN with about 14 
million rows; count(*) takes about 7 seconds.

Erik



Re: [sqlite] Counting rows

2014-12-11 Thread RSmith



On 2014/12/11 17:19, Simon Slavin wrote:

In my table which had about 300 million (sic.) rows I did this

SELECT count(*) FROM myTable;

to count the number of rows.  After half an hour it was still processing and I 
had to kill it.

I know that the internal structure of a table means that this number isn't 
simple to produce.  But is there really no faster way ?  This table is going to 
have about six times that amount soon.  I really can't count the rows in less 
than a few hours ?



Not so strange I think...

The highest number of rows I have tried to do maintenance of that sort on was only about 50 million though DB size was around 150GB, 
and the row-count on that took some time to establish, but in the order of minutes, not hours and certainly not days. I have here 
and now only a 10-million row DB to run some quick tests on a machine without SSD or anything good - seems to take around 1 min 20s 
on the first attempt to do a count() and around 33s on the next attempts (I'm assuming caching doing its bit here).


Scaling that up - it becomes apparent that a ~300-mil row DB, with 30x the rows of what I am testing, should take around 30x the 
time: 1m20 x 30, around 40 minutes assuming similar hardware.


You probably stopped it just shy of the goal. Either way, 30 minutes and 45 minutes are /exactly/ the same amounts of time when 
measured in impatience base units.


I have no idea how to make it faster or in any way how to improve the speed on 
a query that simple.

Knowing you (a bit) I already know you have thought about all of this a lot and you won't be asking if solutions were easy to come 
by, so my usual advice probably won't help much, other than to ask - do you really need to know the row-count? Is knowing it is 
around 300mil not enough? Any other query you might envision on this data-set will probably take in the order of hours for simple 
ones and days for anything joined. Best is to have a lone-standing machine churn through it over time and getting some results from 
time to time as a long-term project. (I know you are already well-aware of this).


Cheers and best of luck!
Ryan

(PS: I know the above isn't really helpful or qualified as an "answer", sorry 
about that - you are dealing with a special beast indeed).





Re: [sqlite] How to speed up database open

2014-12-11 Thread Richard Hipp
On Thu, Dec 11, 2014 at 10:58 AM, Paul  wrote:

>
> I have yet to try and test if dropping stat tables worth the effort.
>

Most of the work is involved in loading sqlite_stat4.  On the other hand,
most of the benefit comes from sqlite_stat1.  So consider compiling without
SQLITE_ENABLE_STAT4.  You will still probably get good query plans, but the
startup time should be reduced.


-- 
D. Richard Hipp
d...@sqlite.org


[sqlite] Select count(*)

2014-12-11 Thread Gabriel Corneanu
I asked a similar question some time ago...
See here:
https://www.mail-archive.com/sqlite-users@sqlite.org/msg77488.html

Gabriel


Re: [sqlite] Counting rows

2014-12-11 Thread Paul Sanderson
would count _rowid_ from mytable be quicker
Paul
www.sandersonforensics.com
skype: r3scue193
twitter: @sandersonforens
Tel +44 (0)1326 572786
http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit
-Forensic Toolkit for SQLite
http://sandersonforensics.com/forum/content.php?168-Reconnoitre - VSC
processing made easy



On 11 December 2014 at 15:19, Simon Slavin  wrote:
> In my table which had about 300 million (sic.) rows I did this
>
> SELECT count(*) FROM myTable;
>
> to count the number of rows.  After half an hour it was still processing and 
> I had to kill it.
>
> I know that the internal structure of a table means that this number isn't 
> simple to produce.  But is there really no faster way ?  This table is going 
> to have about six times that amount soon.  I really can't count the rows in 
> less than a few hours ?
>
> Simon.


Re: [sqlite] How to speed up database open

2014-12-11 Thread Paul
Hi Ryan, thanks for reply.

> 
> On 2014/12/11 13:51, Paul wrote:
> > In my specific case I need to open database as fast as possible.
> > Usual working cycle: open -> select small data set -> close.
> > It is irrelevant how much time it takes to open database when
> > data is being added or updated, since it happens not too often.
> 
> /Snipped for brevity/
> 
> Hi Paul,
> 
> You seem to know your way around systems so I will not waste time on details. 
> The core of the problem is that it takes time to open 
> a database file and extra cpu cycles because upon opening an SQLite 
> database much cpu and I/O time is spent reading and 
> interpreting the entire schema (which sounds complex even though the actual 
> data might be small) and then checking for hot-journals, 
> opening accompanying file objects (possibly WAL journals etc.) and some basic 
> maintenance. It then loads the stat tables and not 
> only reads the data but uses it to set up certain Query planner adjustments 
> which eats a few more cycles (this is something I think 
> happens on startup, but I might be wrong). Also, I think start-up routines are 
> surely not an area of waste but probably not a great 
> focus of fine-tuning performance and optimizations (Richard or Dan might shed 
> more light if I am wrong about this). Either way, once 
> it is open, the speed is lightning quick, as you have noticed. The point 
> being: It is the opening that eats the cpu time.
> 
> Next point is that you cannot do much about that in terms of your explanation 
> of how you access data, and I won't try to dissuade 
> you from the way you use it. One thing that I notice as a definitive 
> possibility is simply dropping all stat tables from your system 
> and at least saving those reading and adjusting steps. The reason why I 
> suggest this is that you have already done the research and 
> noticed the time degradation due to it, but more importantly, there is no 
> need.
> 
> The stat tables help the Query planner (NGQP as it is officially known these 
> days) to make decisions on making queries on large 
> datasets a bit faster (sometimes a LOT faster). You however do not have any 
> large datasets or performance-bending queries, you have 
> no need for this, your need is more to save those cycles at file-open time. 
> You need something we do not often see in database 
> designs: Quantity over quality - and dropping the stat tables (and the 
> maintenance routines causing them to exist) should do you a 
> favour.
> 
> Best of luck with the implementation!
> Ryan
> 

Thanks for confirming my mental model of SQLite's internals :)
SQLite does all the preparation needed for optimal performance, and I agree 
that this is the best choice in the overwhelming number of cases. My case, 
though, is very specific.
I have yet to test whether dropping the stat tables is worth the effort. Some 
databases can in fact grow pretty big, up to a few hundred megabytes, and it 
is not yet known how well they will perform.
But so far, ignoring the CPU overhead, we have a big gain in disk performance: 
SQLite performs much better than the original storage.


Best regards,
Paul


[sqlite] Counting rows

2014-12-11 Thread Simon Slavin
In my table which had about 300 million (sic.) rows I did this

SELECT count(*) FROM myTable;

to count the number of rows.  After half an hour it was still processing and I 
had to kill it.

I know that the internal structure of a table means that this number isn't 
simple to produce.  But is there really no faster way ?  This table is going to 
have about six times that amount soon.  I really can't count the rows in less 
than a few hours ?

Simon.


Re: [sqlite] How to speed up database open

2014-12-11 Thread RSmith


On 2014/12/11 13:51, Paul wrote:

In my specific case I need to open database as fast as possible.
Usual working cycle: open -> select small data set -> close.
It is irrelevant how much time it takes to open database when
data is being added or updated, since it happens not too often.


/Snipped for brevity/

Hi Paul,

You seem to know your way around systems so I will not waste time on details. The core of the problem is that it takes time to open 
a database file and extra cpu cycles because upon opening an SQLite database much cpu and I/O time is spent reading and 
interpreting the entire schema (which sounds complex even though the actual data might be small) and then checking for hot-journals, 
opening accompanying file objects (possibly WAL journals etc.) and some basic maintenance. It then loads the stat tables and not 
only reads the data but uses it to set up certain Query planner adjustments which eats a few more cycles (this is something I think 
happens on startup, but I might be wrong).  Also, I think start-up routines are surely not an area of waste but probably not a great 
focus of fine-tuning performance and optimizations (Richard or Dan might shed more light if I am wrong about this).  Either way, once 
it is open, the speed is lightning quick, as you have noticed. The point being: It is the opening that eats the cpu time.


Next point is that you cannot do much about that in terms of your explanation of how you access data, and I won't try to dissuade 
you from the way you use it. One thing that I notice as a definitive possibility is simply dropping all stat tables from your system 
and at least saving those reading and adjusting steps. The reason why I suggest this is that you have already done the research and 
noticed the time degradation due to it, but more importantly, there is no need.


The stat tables help the Query planner (NGQP as it is officially known these days) to make decisions on making queries on large 
datasets a bit faster (sometimes a LOT faster). You however do not have any large datasets or performance-bending queries, you have 
no need for this, your need is more to save those cycles at file-open time. You need something we do not often see in database 
designs: Quantity over quality - and dropping the stat tables (and the maintenance routines causing them to exist) should do you a 
favour.


Best of luck with the implementation!
Ryan






Re: [sqlite] How to speed up database open

2014-12-11 Thread Paul
PART 2

INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','676 1 1','0 330 330','0 327 330',X'0408040253be558403a9');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','676 1 1','0 661 661','0 655 661',X'04080402547bf6900b13');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','1093 5 1','676 720 723','1 691 723',X'0409040253aaffe802ac');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','1093 5 1','676 915 917','1 805 917',X'0409040253bfe67d03d2');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','1093 5 1','676 955 956','1 824 956',X'0409040253bff7c203fa');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','1093 5 1','676 962 966','1 826 966',X'0409040253bff8040404');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','1093 5 1','676 974 975','1 831 975',X'0409040253c00551040d');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','1093 6 1','676 979 982','1 832 982',X'0409040253c005520414');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','1093 3 1','676 992 992','1 836 992',X'0409040253c0116e041f');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','1093 5 1','676 999 999','1 840 999',X'0409040253c02e080426');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','1093 4 1','676 1320 1323','1 1069 1323',X'04090402541ace35076c');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','1093 6 1','676 1375 1379','1 1101 1379',X'04090402542c0ede0817');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','1093 5 1','676 1419 1421','1 1121 1421',X'040904025432eca90867');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','1093 5 1','676 1428 1432','1 1124 1432',X'04090402543309980873');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','1093 5 1','676 1438 1438','1 1129 1438',X'040904025433fb84087d');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','1093 5 1','676 1622 1623','1 1246 1623',X'040904025451099009d5');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','1093 5 1','676 1635 1637','1 1251 1637',X'04090402545763c60a07');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','1093 1 1','676 1654 1654','1 1260 1654',X'04090402545786690a1c');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','1093 7 1','676 1668 1671','1 1273 1671',X'04090402546614990a75');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','1002 1 1','1769 1985 1985','2 1559 1985',X'040104020353980b10020e');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','1002 1 1','1769 2316 2316','2 1887 2316',X'040104020354047722065a');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','1002 1 1','1769 2647 2647','2 2218 2647',X'040104020354597abf0a2c');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','158 1 1','2771 2802 2802','3 2373 2802',X'04020401271152e64c8f7b');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL_idx','44 37 1','2929 2932 2956','4 2503 2956',X'0402040227145487fdf10a65');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','2 1','17 17','13 17',X'0301080e');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','2 1','110 110','99 110',X'03010864');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','1 1','330 330','316 330',X'030201013d03');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','1 1','661 661','643 661',X'0302090284');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','1 1','992 992','974 992',X'03020903cf');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','1 1','1323 1323','1300 1323',X'0302090515');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','2 1','1549 1549','1526 1549',X'03020805f7');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','2 1','1652 1652','1624 1652',X'0302080659');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','1 1','1654 1654','1625 1654',X'030201065a03');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','2 1','1738 1738','1704 1738',X'03020806a9');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','2 1','1982 1982','1944 1982',X'030201079903');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','1 1','1985 1985','1946 1985',X'030202079b2714');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','2 1','1991 1992','1952 
1992',X'03020207a12711');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','2 1','2015 2015','1974 
2015',X'03020107b703');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','2 1','2023 2024','1981 
2024',X'03020207be2711');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','2 1','2025 2025','1982 
2025',X'03020107bf03');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','1 1','2316 2316','2269 
2316',X'03020908de');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','2 1','2555 2555','2508 
2555',X'03020809cd');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','1 1','2647 2647','2598 
2647',X'0302010a2703');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','2 1','2703 2703','2654 
2703',X'0302080a5f');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','2 1','2850 2850','2800 
2850',X'0302080af1');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','2 1','2856 2856','2805 
2856',X'0302010af603');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','2 1','2870 2870','2817 
2870',X'0302010b0203');
INSERT INTO sqlite_stat4 VALUES('LLL','LLL','2 1','2956 2957','2900 
2957',X'0302020b582711');
INSERT INTO sqlite_stat4 VALUES('MMM','MMM_idx1','43

Re: [sqlite] How to speed up database open

2014-12-11 Thread Paul
Hello, Richard, thanks for the quick reply.

Unfortunately, no, there is no way. On our servers we have a big number
of entities that represent client data. Data for different clients can
be read at any given point in time by clients and by a bunch of daemons
responsible for different maintenance jobs. Recently we switched
our central index file from a hand-made, transaction-less format to sqlite.
We have a daemon that provides an API both for clients and for maintenance daemons.
So, there are roughly 500 requests per second to the daemon. Most requests
require only simple data to be extracted, but since it now takes more time to read
that data, we observe considerable CPU usage growth. This growth is not
severe for us at all. But I am a performance-paranoid type of person :D
So that's why I am curious whether anything can be tuned inside of sqlite.
And seeing how removing the stat tables reduces the time 4x, I thought
maybe there are other ways around it.

Unfortunately the database structure is a commercial secret.
But I hope the modified content of sqlite_stat* will shed some light.

PART 1

INSERT INTO sqlite_stat1 VALUES('AAA','AAA','4711 2 2 1');
INSERT INTO sqlite_stat1 VALUES('BBB','BBB_idx','18249 3');
INSERT INTO sqlite_stat1 VALUES('BBB','BBB','18249 22 1');
INSERT INTO sqlite_stat1 VALUES('CCC','CCC_idx','54 3 1');
INSERT INTO sqlite_stat1 VALUES('DDD',NULL,'478');
INSERT INTO sqlite_stat1 VALUES('EEE','EEE_idx','836 1');
INSERT INTO sqlite_stat1 VALUES('FFF',NULL,'16');
INSERT INTO sqlite_stat1 VALUES('GGG','GGG','7 1');
INSERT INTO sqlite_stat1 VALUES('HHH',NULL,'13');
INSERT INTO sqlite_stat1 VALUES('III','III_idx','2918 2 1');
INSERT INTO sqlite_stat1 VALUES('JJJ','JJJ','2915 1 1');
INSERT INTO sqlite_stat1 VALUES('KKK','KKK_idx','7070 3');
INSERT INTO sqlite_stat1 VALUES('KKK','KKK','7070 1');
INSERT INTO sqlite_stat1 VALUES('LLL','LLL_idx','2973 595 2 1');
INSERT INTO sqlite_stat1 VALUES('LLL','LLL','2973 2 1');
INSERT INTO sqlite_stat1 VALUES('MMM','MMM_idx1','7942 3 3');
INSERT INTO sqlite_stat1 VALUES('MMM','MMM_idx2','7942 1324 1');
INSERT INTO sqlite_stat1 VALUES('MMM','MMM_idx3','7942 3 1');
INSERT INTO sqlite_stat1 VALUES('NNN','NNN','223 2 1');
INSERT INTO sqlite_stat1 VALUES('OOO',NULL,'6');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','14 14 1','306 306 315','216 216 
315',X'0402010200d9020098');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','12 12 1','445 445 449','286 286 
449',X'04020101011f0270');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','11 11 1','520 520 523','313 313 
523',X'04020101013a0267');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','9 9 1','600 600 602','341 341 
602',X'0402010101560261');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','11 11 1','702 702 709','386 386 
709',X'040201020183020096');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','11 11 1','797 797 802','432 432 
802',X'0402010201b1020094');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','1 1 1','1047 1047 1047','557 557 
1047',X'04020101022e0209');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','2 2 1','1571 1571 1571','946 946 
1571',X'0402010103b30206');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','12 12 1','1733 1733 1739','1090 
1090 1739',X'040201020443020094');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','15 15 1','1882 1882 1889','1168 
1168 1889',X'0402010104910277');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','9 9 1','1916 1916 1917','1183 1183 
1917',X'0402010104a00261');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','11 11 1','2040 2040 2042','1250 
1250 2042',X'0402010104e30276');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','4 4 1','2092 2092 2095','1277 1277 
2095',X'0402010204fe020223');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','11 11 1','2210 2210 2220','1363 
1363 2220',X'040201020554020234');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','9 9 1','2606 2606 2611','1621 1621 
2611',X'0402010206560200a3');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','4 4 1','2618 2618 2619','1625 1625 
2619',X'04020102065a020093');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','9 9 1','2932 2932 2935','1801 1801 
2935',X'04020101070b0270');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','1 1 1','3143 3143 3143','1930 1930 
3143',X'04020102078f020163');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','11 11 1','3276 3276 3285','1992 
1992 3285',X'0402010207d302024d');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','1 1 1','3667 3667 3667','2262 2262 
3667',X'0402010208e4020163');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','11 11 1','3727 3727 3731','2292 
2292 3731',X'0402010109020275');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','11 11 1','3950 3950 3955','2436 
2436 3955',X'0402010209970200f4');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','4 4 1','4189 4189 4191','2578 2578 
4191',X'040201020a2b020089');
INSERT INTO sqlite_stat4 VALUES('AAA','AAA','11 11 1','4649 4649 4659','2836 
2836 4659',X'040201020b46020340');
INSERT INTO sqlite_stat4 VALUES('BBB','BBB_idx','15 1','204 216','88 
216',X'030601526e1cde35d6764d25');
INSERT INTO sqlite_stat4 V

Re: [sqlite] backup fails despite large timeout

2014-12-11 Thread Stephen Chrzanowski
That'd depend on the size of the database, the speed of the media you're writing
to, and what kind of actions are happening during the backup.

If your file size is in the GB range and you're transferring over a
100mbit switch, then even a SINGLE write per minute means your backup is going
to restart.  Reading is one thing and shouldn't restart the backup, but any
writes will cause the backup to start over from byte #1 and repeat the process.

On Wed, Dec 10, 2014 at 11:58 AM, Greg Janée  wrote:

> Hi, I'm using the following code to backup a 300MB database nightly:
>
> #!/bin/bash
> sqlite3 {dbfile} <<EOF
> .timeout 600000
> .backup {backupfile}
> EOF
>
> This works most of the time, but about one out of 10 attempts results in a
> "database is locked" error.  When I look at the server logs, I see only
> very light activity at the time the backup was attempted--- maybe only one
> transaction every 5-10s for example.  And these transactions are all very
> short, in the millisecond range.  Shouldn't a 10min timeout be way more
> than sufficient for the backup to succeed?
>
> Thanks,
> -Greg
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] replace many rows with one

2014-12-11 Thread Simon Slavin

On 10 Dec 2014, at 3:40pm, RSmith  wrote:

> INSERT INTO s2merged SELECT a, b, sum(theCount) FROM s2 GROUP BY a,b;

Thanks to Martin, Hick and R for this solution.  It was just what I was looking 
for.

> Not sure if your theCount field already contains totals or if it just has 
> 1's...  how did duplication happen? 

The existing rows contain totals.  Or maybe I should call them subtotals.  The 
data is being massaged from one format to another.  I did a bunch of stuff when 
it was text files, then imported it into SQLite and did a bunch more on it as 
rows and columns.  Eventually it'll end up in SQLite.

Simon.


[sqlite] backup fails despite large timeout

2014-12-11 Thread Greg Janée
Hi, I'm using the following code to backup a 300MB database nightly:

#!/bin/bash
sqlite3 {dbfile} 

Re: [sqlite] How to speed up database open

2014-12-11 Thread Paul
Hello, Simon.

> 
> On 11 Dec 2014, at 11:51am, Paul  wrote:
> 
> > I understand, that having them is a must for a decent performance. 
> > In my specific case I have millions of individual database files. 
> > This is one, among other reasons that I can't keep them open all the time. 
> > Just too many of them. These databases are being opened frequently. 
> > Let's say 500 times per second. In most cases, just to query a single row. 
> > Ironically, querying takes only a handful of microseconds, and most 
> > CPU time is spent reading same database structure over and over again.
> > 
> > Can you please offer some advice on what can be done to reduce this overhead?
> 
> The problem with this is that it cannot be solved by SQLite's programmers 
> because most of the time is taken by operating system calls. Merely opening a 
> file (which you no doubt know is not done by sqlite3_open() but delayed until 
> the first access) is a time-consuming procedure. Once SQLite has access to 
> the data it is, as you have shown, very fast.

But my test shows that opening is actually stunningly fast.
On the other hand, parsing the database structure is slow and CPU bound;
75% of the parsing time is dedicated to reading the stat tables.
System calls take only a fraction of the CPU time, 10% to be specific.

> 
> You explain that you have millions of individual database files. Is that the 
> only reason you can't open the database and keep it open, or are there others 
> ? Also, do all these separate databases have the same tables with the same 
> columns in ?

This is not the only reason, but it is the most important one.
Fitting all the database connections in RAM would be impossible.
Yes, the database files all have the same structure.

> 
> My normal advice would be that before you start querying you merge your 
> millions of separate database files into one big one. Judging by the degree 
> of technical information in your question you don't need me to suggest ways 
> of programming or scripting this, or of keeping a merged central copy 
> up-to-date. The only question is whether it is appropriate to your 
> circumstances.

Unfortunately, it is not possible in our case. We have separate directories as 
a means of encapsulating an entity whose central index is based on sqlite.
Also, having one single index would reduce concurrency dramatically.

Thanks,
Paul



Re: [sqlite] How to speed up database open

2014-12-11 Thread Richard Hipp
On Thu, Dec 11, 2014 at 6:51 AM, Paul  wrote:

>
> Hello.
>
> In my specific case I need to open database as fast as possible.
> Usual working cycle: open -> select small data set -> close.
> It is irrelevant how much time it takes to open database when
> data is being added or updated, since it happens not too often.
> But for selects it's a different story, selects are frequent.
> Database structure is pretty complex, but I would say not too much.
> 21 tables (including sqlite_stat1 and sqlite_stat4) and 11 indices.
>
> For me, it takes 1.2 ms to read database structure. And the process
> of reading database structure is strictly CPU bound. When I run
> profiler I see 90% of time is spent in sqlite3InitOne() function and
> 64% inside of loadStatTbl() that is being called by sqlite3InitOne().
> Times are measured performing hundred thousands of cycles:
> OPEN -> SELECT -> CLOSE
> This procedure is not I/O bound; the database is small (2MiB) and easily
> fits in the file system cache. I have double-checked with iostat.
>
> Weird thing is that when I drop either sqlite_stat1 or sqlite_stat4
> or both of them, average time drops from 1.2 ms to 0.4 ms. Why is this
> so expensive to read those tables?
>

Can you send us a copy of your schema and the STAT tables?  You can
generate the content we need by running "sqlite3 YOURDATABASE .fullschema"
and capturing the output.  With that, we can reproduce your problems and
perhaps offer some hope of (minor) improvement.

But really, the process of opening the database connection does require
reading and parsing the entire database schema and then running "SELECT *
FROM sqlite_stat1; SELECT * FROM sqlite_stat4;" queries to extract the STAT
data.  That will all take *some* time.  Is there no way that you can open
the database connections in advance and have them standing by?
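
When there are far too many databases to keep every connection open, one way to approximate connections standing by is a small LRU cache keyed by path. A sketch with Python's built-in sqlite3 (the class name, capacity, and the single pragma are illustrative assumptions, not anything from the thread):

```python
import sqlite3
from collections import OrderedDict

class ConnectionCache:
    """Keep the most recently used connections open, so repeated
    hits on a hot database skip the schema re-parse entirely."""

    def __init__(self, capacity=64):
        self.capacity = capacity
        self._cache = OrderedDict()   # path -> open connection

    def get(self, path):
        con = self._cache.pop(path, None)
        if con is None:
            con = sqlite3.connect(path)
            con.execute("PRAGMA foreign_keys = ON")
        self._cache[path] = con       # mark as most recently used
        while len(self._cache) > self.capacity:
            _, old = self._cache.popitem(last=False)
            old.close()               # evict the least recently used
        return con
```

The cache trades memory for parse time: only the hottest databases stay open, and the long tail still pays the open-and-parse cost.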


>
> I understand, that having them is a must for a decent performance.
> In my specific case I have millions of individual database files.
> This is one, among other reasons that I can't keep them open all the time.
> Just too many of them. These databases are being opened frequently.
> Let's say 500 times per second. In most cases, just to query a single row.
> Ironically, querying takes only a handful of microseconds, and most
> CPU time is spent reading same database structure over and over again.
>
> Can you please offer some advice on what can be done to reduce this overhead?
> Even reducing it by deleting the stat tables, which I do not
> consider an option at all, is not enough. 400 microseconds for database
> initialization is a lot if it takes only 5 microseconds to query the data.
>
> What specifically I mean by saying initialization:
> 1) sqlite3_open(...);
> 2) sqlite3_exec(..." PRAGMA page_size = 4096; "...);
> 3) sqlite3_exec(..." PRAGMA temp_store = MEMORY; "...);
> 4) sqlite3_exec(..." PRAGMA cache_size = 1; "...);
> 5) sqlite3_exec(..." PRAGMA foreign_keys = ON; "...);
> 6) sqlite3_exec(..." PRAGMA synchronous = NORMAL; "...);
>
> Sequence is actually irrelevant. And if I skip all the pragmas,
> initialization time will be delayed until the first query, so I guess
> there is nothing specific about these pragmas.
>
> Thanks,
> Paul



-- 
D. Richard Hipp
d...@sqlite.org


Re: [sqlite] How to speed up database open

2014-12-11 Thread Simon Slavin

On 11 Dec 2014, at 11:51am, Paul  wrote:

> I understand, that having them is a must for a decent performance. 
> In my specific case I have millions of individual database files. 
> This is one, among other reasons that I can't keep them open all the time. 
> Just too many of them. These databases are being opened frequently. 
> Let's say 500 times per second. In most cases, just to query a single row. 
> Ironically, querying takes only a handful of microseconds, and most 
> CPU time is spent reading same database structure over and over again.
> 
> Can you please offer some advice on what can be done to reduce this overhead?

The problem with this is that it cannot be solved by SQLite's programmers 
because most of the time is taken by operating system calls.  Merely opening a 
file (which you no doubt know is not done by sqlite3_open() but delayed until 
the first access) is a time-consuming procedure.  Once SQLite has access to the 
data it is, as you have shown, very fast.

You explain that you have millions of individual database files.  Is that the 
only reason you can't open the database and keep it open, or are there others ? 
 Also, do all these separate databases have the same tables with the same 
columns in ?

My normal advice would be that before you start querying you merge your 
millions of separate database files into one big one.  Judging by the degree of 
technical information in your question you don't need me to suggest ways of 
programming or scripting this, or of keeping a merged central copy up-to-date.  
The only question is whether it is appropriate to your circumstances.

Simon.


[sqlite] How to speed up database open

2014-12-11 Thread Paul

Hello.

In my specific case I need to open the database as fast as possible.
The usual working cycle is: open -> select a small data set -> close.
It is irrelevant how much time it takes to open the database when
data is being added or updated, since that happens infrequently.
But for selects it's a different story; selects are frequent.
The database structure is fairly complex, but I would say not overly so:
21 tables (including sqlite_stat1 and sqlite_stat4) and 11 indices.

For me, it takes 1.2 ms to read the database structure, and the process
of reading the database structure is strictly CPU bound. When I run a
profiler I see 90% of the time is spent in the sqlite3InitOne() function and
64% inside loadStatTbl(), which is called by sqlite3InitOne().
Times are measured by performing hundreds of thousands of cycles:
OPEN -> SELECT -> CLOSE
This procedure is not I/O bound; the database is small (2MiB) and easily
fits in the file system cache. I have double-checked with iostat.

The weird thing is that when I drop either sqlite_stat1 or sqlite_stat4,
or both of them, the average time drops from 1.2 ms to 0.4 ms. Why is it
so expensive to read those tables?

I understand that having them is a must for decent performance.
In my specific case I have millions of individual database files.
This is one reason, among others, that I can't keep them open all the time:
there are just too many of them. These databases are opened frequently,
let's say 500 times per second. In most cases, just to query a single row.
Ironically, querying takes only a handful of microseconds, and most
CPU time is spent reading the same database structure over and over again.

Can you please offer some advice on what can be done to reduce this overhead?
Even reducing it by deleting the stat tables, which I do not
consider an option at all, is not enough. 400 microseconds for database
initialization is a lot if it takes only 5 microseconds to query the data.

What specifically I mean by saying initialization:
1) sqlite3_open(...);
2) sqlite3_exec(..." PRAGMA page_size = 4096; "...);
3) sqlite3_exec(..." PRAGMA temp_store = MEMORY; "...);
4) sqlite3_exec(..." PRAGMA cache_size = 1; "...);
5) sqlite3_exec(..." PRAGMA foreign_keys = ON; "...);
6) sqlite3_exec(..." PRAGMA synchronous = NORMAL; "...);

The sequence is actually irrelevant. And if I skip all the pragmas,
initialization is simply delayed until the first query, so I guess
there is nothing specific about these pragmas.

Thanks,
Paul


Re: [sqlite] Online/Hot backup of WAL journalling mode database

2014-12-11 Thread Simon Slavin

On 10 Dec 2014, at 10:40pm, Nick  wrote:

> All the processes would have automatic checkpointing disabled. Just the 
> backup process would perform the checkpoint.

I don't know enough about the internals of SQLite to be sure, but various parts 
of me are concerned that this is a bad idea.  I don't know what WAL mode would 
be like without checkpointing, but there has to be a reason for checkpointing, 
and disabling it between backups sounds bad.

Simon.


Re: [sqlite] Online/Hot backup of WAL journalling mode database

2014-12-11 Thread Dan Kennedy

On 12/11/2014 05:49 AM, Nick wrote:

On 10 Dec 2014, at 07:35, Dan Kennedy wrote:


Strictly speaking the database file may not be well-formed even if there is no 
ongoing checkpoint. If:

  a) process A opens a read transaction,
  b) process B opens and commits a write transaction to the database,
  c) process C checkpoints the db,

then the db file considered without the *-wal file may be corrupt. The problem 
comes about because process C can only checkpoint frames up until the start of 
B's transaction. And there is an optimization that will prevent it from copying 
any earlier frames for which there exists a frame in B's transaction that 
corresponds to the same database page. So it effectively copis only a subset of 
the modifications made by earlier transactions into the db file - not 
necessarily creating a valid db file.

Can this corruption be detected by running PRAGMA quick_check / 
integrity_check? Having the occasional backup db corrupted would be tolerable.


In many cases, but not generally. There would be cases where part 
of a committed transaction was lost, or the values in unindexed columns 
were replaced, that sort of thing.
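
For reference, Nick's arrangement (autocheckpointing disabled everywhere, one dedicated process checkpointing) looks roughly like this in outline. A sketch with Python's built-in sqlite3; it shows the mechanics only and does not attempt to reproduce the three-process race described above:

```python
import sqlite3
import tempfile, os

path = os.path.join(tempfile.mkdtemp(), "wal.db")

# An ordinary process: WAL mode, automatic checkpoints disabled.
writer = sqlite3.connect(path)
writer.execute("PRAGMA journal_mode = WAL")
writer.execute("PRAGMA wal_autocheckpoint = 0")
writer.execute("CREATE TABLE t(x INTEGER)")
writer.execute("INSERT INTO t VALUES (1)")
writer.commit()

# Only the dedicated maintenance connection checkpoints, explicitly.
maint = sqlite3.connect(path)
busy, log_frames, ckpt_frames = maint.execute(
    "PRAGMA wal_checkpoint(TRUNCATE)").fetchone()
# busy == 0 means no reader blocked the checkpoint from completing
print(busy, log_frames, ckpt_frames)
maint.close()
writer.close()
```

The TRUNCATE variant also resets the *-wal file to zero length once everything has been transferred, which is the state a backup process would want before copying the main db file.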

Regards
Nick