[sqlite] WAL file growth concern

2010-10-25 Thread Bob Smith
On Sat Oct 23 20:57:56 GMT 2010, H. Phil Duby wrote:

> Give the described conditions, I do not think you need to worry about the
> WAL file growing without bound.  I think that each increase should be
> smaller than the previous one, and will stop growing all together [for your
> stress test] when the WAL file has increased to the point that all of the
> [WAL] writes [during your forced lock conditions time frame] fit in the
> expanded file

After thinking about that and also after some testing, I concur. I ran
my read/write stress test again inserting 2 million rows over 9 hours
while doing my reads. The WAL file occasionally kept growing even in
the 8th hour, but the growth really slowed. It grew up through 12MB
during the first 2 hours or so and then only grew another 700K during
the next 2.5 hours and only grew another 400K over the next 4.5 hours
to a total of 13.1MB through the 9 hour test. Every once in a while a
read is going to take an extra long time from the norm or the writes
are going to go faster than the norm (this a somewhat controlled test,
but on a a general purpose machine - other processes do exist who get
time slices...) and the file grows a little bit every once in a while
after it initially settles down. I do believe on a longer run it would
pretty much stop growing all together. Very good.

The reason I looked into this in the first place and put together my
more controlled test is my colleague was doing some testing with my
full system and reported that his WAL file had grown to 306MB and his
system had become sluggish. After my initial results, I thought the
culprit causing the huge WAL file was the simultaneous reads and
writes However, it turns out this was not the culprit. We also
periodically delete the oldest X rows from the database in a purge. He
had deleted 750K rows in one shot. That is definitely what grew the
WAL file so large.

I have not yet determined why his system got sluggish afterwards. It
may not have anything to do with sqlite at all. So, I won't dwell on
it here. I will ask one question, however. I know that reads are
expected to slow down as the WAL file gets larger.  Is that just in
terms of real data/pages in the WAL file or does the actual size of
the file have an impact? For example, I have a 300MB WAL file due to a
previous operation. However, everything has been checkpointed, and we
are back to really only having ~1MB of actual data being written into
the WAL file and then checkpointed. The rest (vast majority) of the
file is not being used. Would we expect performance degradation here
as compared to a WAL file with a real physical size of ~1MB at this
point?

Considering things like bulk deletes (and updates) potentially really
growing a WAL file to be quite large and having a system that is
constantly running and inserting data into the database over a long
period of time with reads also coming in,  I wonder about the value of
adding an optional feature to sqlite. What if there was an API to to
specify a max desired limit to the size of the WAL file? Whenever a
checkpoint was 100% successful and it was determined that the entire
WAL has been transferred into the database and synced and  no readers
are making use of the WAL, then in addition to the writer rewinding
the WAL back the beginning, the WAL file was truncated IF this option
was configured and the size of the WAL file was greater than the
specified value? This seems like it would be simple to implement
without costing anything by default to those who don't configure it...
If I was to use such a feature and was to do the default 1000 page
checkpoint (which seems to correspond to a little over a 1MB WAL file
size), I would make the limit something like 50MB. Under normal
conditions, the limit would never be reached anyways. But, in the case
where a large WAL file did get created at some point, this would be
used to get it truncated. Thoughts?

Best Regards,

Bob
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] WAL file growth concern

2010-10-25 Thread Nicolas Williams
On Fri, Oct 22, 2010 at 09:56:22PM -0400, Richard Hipp wrote:
> On many (most?) filesystems, it is faster to overwrite an existing area of a
> file than it is to extend the file by writing past the end.  That's why
> SQLite doesn't truncate the WAL file on each checkpoint - so that subsequent
> writes will be overwriting an existing file region and thus go faster.

I think that assumption is getting harder to make.  For one thing,
filesystems nowadays aggregate transactions into large writes, which
means that old blocks aren't overwritten, but replaced -- COW goes
hand-in-hand with such aggregation.

For ZFS your assumption is wrong because of ZFS' variable block size
support[*].  I don't know if there are filesystems other than ZFS where
file data block size varies.  But for filesystems that aggregate writes
I think you'd find that overwriting performs about as well as appending
(assuming there's no O_APPEND synchronization going on).

Does the WAL store modified pages?

Nico

[*] In ZFS files have a single data block until the file size exceeds
the host filesystem's "recordsize", from which point the file will
consist of two or more data blocks, all of that size.  Block sizes
are all the powers of two between nine and seventeen (512 bytes to
128KB).  Thus overwriting a 1KB SQLite3 page in the middle of the
file with 128KB recordsize will result in a read-modify-write of the
modified block.  Though likely the application will have already
caused that block to be in memory, in which case there's no RMW, but
the disparity between application "page size" and ZFS recordsize
obviously has a significant cost.  SQLite3 users should set the
SQLite3 page size and host ZFS dataset recordsize so they match.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] WAL file growth concern

2010-10-23 Thread H. Phil Duby
On Sat, Oct 23, 2010 at 9:26 AM, Bob Smith  wrote:

> On Sat, Oct 23, 2010 at 5:56 AM, Richard Hipp  wrote:
> >
> > Is that what you are seeing?
> >
> > Which filesystem are you using?
> >
> >
>
> Thanks Richard, this does seem to be what I am seeing. I am definitely
> seeing  more writes during the subsequent cycles and an explanation
> that this is likely due to the writes being able to go faster to the
> existing parts of the WAL definitely makes sense. I think that is it.
>
> I have run and seen this scenario on two machines. One has a linux
> ext4 filesystem and the other has a linux ext3 filesystem.
>
> The WAL mode is a great addition to sqlite. Being able to do
> simultaneous reads while writing is fantastic. I have also seen nice
> improvements with writing speed. WAL mode has lots of value and the
> sqlite developers  have done a fantastic job.
>
> In my scenario of having to support a constant stream of steady writes
> (granted, I would hope not with the frequency of my stress test) along
> with an unknown frequency of reads that will come in, I do have
> concerns about the WAL file consistently growing over a long period of
> time (days and weeks). I would love to be able to continue using the
> WAL mode, so I need to see what I can do to limit the growth or  to
> periodically make sure all connections to the database are closed at
> the same time to allow the WAL file to periodically truncate. Any
> other ideas?
>
> Best Regards,
>
> Bob
>
Give the described conditions, I do not think you need to worry about the
WAL file growing without bound.  I think that each increase should be
smaller than the previous one, and will stop growing all together [for your
stress test] when the WAL file has increased to the point that all of the
[WAL] writes [during your forced lock conditions time frame] fit in the
expanded file.  How big, and how long to get there, will depend on the
relative speeds of [over] writing the existing file, versus expanding and
writing.  Of course there are always edge cases.  If a read / lock lasts
long enough for *another* read to get a lock, then a progressive serious can
keep a lock indefinitely, and the WAL file will be forced to continue to
expand.  The only way to *prevent* that expansion, is to stop writing.
Other solutions get outside of SQLITE, like having application logic that
starts rejecting new read requests at some point, until all existing read
connection have closed, and SQLITE has finished the [long delayed]
checkpoint.
--
Phil
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] WAL file growth concern

2010-10-23 Thread Bob Smith
On Sat, Oct 23, 2010 at 5:56 AM, Richard Hipp  wrote:
>
> Is that what you are seeing?
>
> Which filesystem are you using?
>
>

Thanks Richard, this does seem to be what I am seeing. I am definitely
seeingĀ  more writes during the subsequent cycles and an explanation
that this is likely due to the writes being able to go faster to the
existing parts of the WAL definitely makes sense. I think that is it.

I have run and seen this scenario on two machines. One has a linux
ext4 filesystem and the other has a linux ext3 filesystem.

The WAL mode is a great addition to sqlite. Being able to do
simultaneous reads while writing is fantastic. I have also seen nice
improvements with writing speed. WAL mode has lots of value and the
sqlite developers  have done a fantastic job.

In my scenario of having to support a constant stream of steady writes
(granted, I would hope not with the frequency of my stress test) along
with an unknown frequency of reads that will come in, I do have
concerns about the WAL file consistently growing over a long period of
time (days and weeks). I would love to be able to continue using the
WAL mode, so I need to see what I can do to limit the growth orĀ  to
periodically make sure all connections to the database are closed at
the same time to allow the WAL file to periodically truncate. Any
other ideas?

Best Regards,

Bob
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] WAL file growth concern

2010-10-23 Thread Max Vlasov
On Sat, Oct 23, 2010 at 5:56 AM, Richard Hipp  wrote:

>
> Is that what you are seeing?
>
> Which filesystem are you using?
>
>
>
>
I think the misfeature even easier to reproduce than with the scenario Bob
wrote. If your admin has a "repeated query" feature, you even don't need
writing a single line of code.

sqlite 3.7.3 dll, windows 7 64bit

CREATE TABLE [TestTable] ([Id] INTEGER PRIMARY KEY AUTOINCREMENT)

1. One client repeated query, non-transacted loop, so takes several minutes
for 10,000 repeats

INSERT INTO TestTable DEFAULT VALUES

as sqlite itself does checkpointing from time to time and threshold is about
1000 pages, we have -wal growing to about 1,024K and stay at this size.

2. Another client,
minimize cache if you want to repeat the test afterwards: PRAGMA
cache_size=10;
do cross join after some inserts in the first client

SELECT Count(Null) FROM TestTable, TestTable

if this query is long enough (5 seconds and more) we will see -wal file
growing steadily until client 2 finishes. In my tests I con confirm growing
to 8M, 23M

Max Vlasov
maxerist.net
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] WAL file growth concern

2010-10-22 Thread Richard Hipp
On Fri, Oct 22, 2010 at 2:54 PM, Bob Smith  wrote:

> What I have determined (via using sqlite3_wal_hook, catching some
> statistics
> in the callback as well as calling sqlite3_wal_checkpoint from the callback
> myself) is that each time I am writing during a read operation (so a
> checkpoint can't complete), I am able to write more records than the last
> time on average during this period. So, more records written during the
> time
> a checkpoint can't happen is going to relate to the WAL file needing to
> grow.
> 
> I am not sure why more writes are progressively able to occur each time
> these reads are happening, but this is the cause of my continued WAL file
> growth in this scenario.
>

On many (most?) filesystems, it is faster to overwrite an existing area of a
file than it is to extend the file by writing past the end.  That's why
SQLite doesn't truncate the WAL file on each checkpoint - so that subsequent
writes will be overwriting an existing file region and thus go faster.

So apparently, as long as you are overwriting existing parts of the WAL, the
writes go faster and you quickly get to the end of the WAL where writes
start to slow down.  But the WAL got a little bigger so more writes go fast
on the next cycle, and the WAL grows a little more.  And so forth.

Is that what you are seeing?

Which filesystem are you using?


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] WAL file growth concern

2010-10-22 Thread Andrey Cherezov
From: Bob Smith 
Date: 2010-10-21 22:55:05 GMT

> I might possibly have found an issue/concern with the way sqlite handles
> doing new writes to the WAL file during a time that checkpoints are unable
> to checkpoint data from the WAL file back into the database due to a reader
> being in the process of reading data from the WAL file. The result is the
> WAL file growing unbounded.

Having same issue with the SQLite (all vesions since July) under Windows 
(XP/Vista/7). My server process opens SQLite database on startup, one thread 
sometimes UPDATEs some records, other threads sometimes SELECTs (using one 
shared handle), there no any INSERTs. WAL growing up to 400Mb! 
(expecting <4Mb = 4Kb page *1000).

sqlite3_wal_checkpoint() or "pragma wal_checkpoint" helps only if I close 
DB connection explicit after this operation.

--
Andrey Cherezov
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] WAL file growth concern

2010-10-22 Thread Bob Smith
I have been able to do some more investigation here. What I was seeing
originally made me suspicious that a writer was skipping over parts of the
WAL file during the time that a checkpoint is unable to proceed due to a
reader. After further investigation, however, I have determined that this is
NOT the case in my scenario.

What I am seeing may be of interest, however, so I will explain what I have
determined is going on in my scenario.

This is a stress test scenario. I am inserting rows as quick as my C code
allows. I am writing an average of around 80 rows a second. This writing is
going on constantly.  A checkpoint occurs / wants to occur every few seconds
on average with this amount of activity and the size of the rows.

Then every 30 seconds from another process, I do a read query of the last
10,000 rows and I have it such that this query takes roughly 15 seconds (I
put some delay in the processing to mimic a longer operation such as from
ruby code from a web server). This read process opens a db connection, does
the query, closes the connection, and exits the process.

The WAL file grows just about every time the read process kicks in. It does
not grow at all in between invocations of the read process.

What I have determined (via using sqlite3_wal_hook, catching some statistics
in the callback as well as calling sqlite3_wal_checkpoint from the callback
myself) is that each time I am writing during a read operation (so a
checkpoint can't complete), I am able to write more records than the last
time on average during this period. So, more records written during the time
a checkpoint can't happen is going to relate to the WAL file needing to
grow.

I am periodically doing the same type of read query each time I do one and
it is taking about the same amount of time as measured in my calling
process. I really don't see an increase in the time it takes to do the reads
(for example in my last run reads took 18sec, 16sec, 16sec, 16sec, 16sec,
15sec, 16sec, 17sec, 16sec, 16sec, 14sec). However, I am measuring more
writes being able to complete from my writer during these times
progressively. The current read query is reading the last 10,000 rows based
on the internal rowid primary key.  Since I am getting more writes every
time during this period, that is why the WAL file grows every time during
this period.

I am not sure why more writes are progressively able to occur each time
these reads are happening, but this is the cause of my continued WAL file
growth in this scenario.

In a production system, I would not expect to be constantly pounding so many
events into the system non stop all the time.   Still, I would like to be
able to handle the stress case so that I know things are solid.

I can also adjust the threshold to checkpoint in larger chunks than the
default. This actually won't solve this (and will make the standard size WAL
file larger), but I do think it can  reduce the impacts of the phenomenon in
my scenario.

Is this type of scenario just not right for WAL mode?

Best Regards,

Bob
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] WAL file growth concern

2010-10-21 Thread Bob Smith
> Are you *sure* you aren't accidentally holding  a read transaction open 
> somewhere?
> Do you have any other clues on how we can isolate the problem?  A test case 
> that will we can run here, perhaps?

A am pretty certain that I am not accidentally holding a read
transaction open. I have my writing process that stays running. I
start up independent read processes
which open a connection, do a short sequence of reads, close the
connection, and then exit the process altogether. For this test, I
wait a bit in between kicking off
read processes to ensure the writer will kick off a checkpoint (with
my writing frequency, the checkpoints kick off pretty quickly).

I will triple check this and then look into sending you a stand alone
test case that you can run. I will target this for tomorrow. Where
should I send it?

I am running on Ubuntu 9.10, Linux 2.6.31-22-generic, i686

In order to make use of order by and limit support for delete which I
use for efficient purging of my oldest X records (which is not even
coming into play in my simple
read/write test discussed here), I had to build sqlite 3.7.3 from the
full source using -DSQLITE_ENABLE_UPDATE_DELETE_LIMIT=1

Best Regards,

Bob
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] WAL file growth concern

2010-10-21 Thread Richard Hipp
On Thu, Oct 21, 2010 at 6:55 PM, Bob Smith  wrote:

> I have been using WAL mode for a few months and have been quite happy with
> the write performance increases.
>
> I might possibly have found an issue/concern with the way sqlite handles
> doing new writes to the WAL file  during a time that checkpoints are unable
> to checkpoint data from the WAL file back into the database due to a reader
> being in the process of reading data from the WAL file. The result is the
> WAL file growing unbounded.
>

I followed the procedure you outlined with the exception that I did 20K
inserts rather than 1K inserts (so that fewer transactions were required).
I ran this for 100,000 transactions.  the resulting database file reached
1.536GB but the WAL never went above 1.542MB.

This is on SuSE 10.1, Linux 2.6.16.13-4.smp on i686 athlon.

It is also hard to imagine how the WAL could skip over a block of frames in
the middle and still continue to work.  If the writer really were skipping
over parts of the WAL file, the next checkpoint would go haywire when it
reached the part of the WAL that is skipped, and the later transactions
would never make it into the database.

Are you *sure* you aren't accidentally holding  a read transaction open
somewhere?

Do you have any other clues on how we can isolate the problem?  A test case
that will we can run here, perhaps?

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] WAL file growth concern

2010-10-21 Thread Bob Smith
I have been using WAL mode for a few months and have been quite happy with
the write performance increases.

I might possibly have found an issue/concern with the way sqlite handles
doing new writes to the WAL file  during a time that checkpoints are unable
to checkpoint data from the WAL file back into the database due to a reader
being in the process of reading data from the WAL file. The result is the
WAL file growing unbounded.

I have an application running on Linux which is an event logger. I am
storing the events in a table in a sqlite database. I have been using
version 2.7.2 until this week when I moved to 2.7.3  This application stays
up and running for long periods of time. The database is opened at startup
and not closed until the application is shutdown. This application is the
only writer to the database. The transaction sizes when writing to the
database are quite small. A single transaction writes only a single row to
the database. The size of a given row is pretty small (less than 1 KB).
Other processes/applications read from the database while writes are
ongoing. In the testing I discuss below, I do reads by starting a process
that opens a database connection,  reads a set amount of data, gracefully
closes the database connection making sure all statements are finalized, and
then exits.

I am currently using the default checkpointing and it works well when I am
only writing. My WAL file never grows above 1MB when I am only writing.

I startup the app and start writing to the DB. The WAL file grows up to
around 1MB. During one of the subsequent writes, a checkpoint is done. All
data is checkpointed / written to the main DB. The next write will start
writing to the beginning of the WAL file again.  (well, I believe this is
what is happening based on my understanding of
http://www.sqlite.org/wal.html and based on what I am seeing while testing)

This process occurs over and over again during the course of many writers.
The WAL file remains at right around 1MB.

Then a read operation comes in. If it is still going on at the time a
checkpoint needs to occur, the checkpoint won't be able to do  it's job. We
expect the WAL file to now grow larger here based on writes that are done
while the read is still in progress.  Let's say it grows to 1.5MB while the
read transaction is ongoing and the writing process is concurrently writing.

The read operation now completes. A subsequent write will kick off a
checkpoint and all data will make it to the main db.

Subsequent writes now start writing to the WAL file at the beginning of the
file. Writes can go on and on, kicking off checkpoints over time. The WAL
file will stay at 1.5 MB over many more writes and checkpoints over time.

All of the above matches the understanding I got from reading
http://www.sqlite.org/wal.html and matches what I have seen in testing.

So far so good.

Now consider another read operation that comes in while we are still doing
writes. The read is still going on when  a checkpoint is kicked off (we are
up to 1MB of live data in the WAL file in my case). The checkpoint once
again can't do  its thing. OK, so the WAL file is going to get written to
beyond the 1MB "mark".

At this point what I would have expected is that the WAL file gets first
written to between the 1MB and 1.5MB "mark"  while new writes are being
written while the read transaction is in progress. If less than .5 MB is
written, then the  WAL file would not grow. What I have seen in my testing,
however, leads me to believe that at this point, new writes actually start
at after the 1.5 MB "mark".  The WAL file now grows again. Let's say the
write frequency and read transaction duration are exactly as they were the
last time. Instead of the WAL file remaining at 1.5MB, the WAL file now
grows to 2MB. This seems to be a concern/issue with how these writes to the
WAL file are being done under this condition.

If I repeat this over and over again over time while we are constantly
writing and periodically doing reads (I spread  the reads out far enough so
that I know a checkpoint will successfully complete in between two sets of
reads),  the WAL file will continue to grow indefinitely. During this test,
my write frequency remains constant (or even slows over  time). I do
periodic reads, spread well out, all reading the same amount of data and
taking about the same amount of time. The reads all complete in a timely
fashion and clean up / close their connection afterwards.  As such, I would
not expect my WAL file to keep growing indefinitely, but it does.

My only solution has been to make sure that my writing application
periodically closes and re-opens the database while making certain all
readers close the database after their read queries. Then, If I happen to
close the database from my writer at a point in time when no readers have
the database open, then the WAL file gets truncated and is back to size 0
when my writer reopens the DB.  This solution is not perfect