Re: Worried about single-db performance

2010-09-07 Thread Matthew Bentham

On 06/09/2010 15:18, Bert Huijben wrote:




-Original Message-
From: Matthew Bentham [mailto:mj...@artvps.com]
Sent: maandag 6 september 2010 15:07
To: Justin Erenkrantz
Cc: Bert Huijben; Greg Stein; Johan Corveleyn; Subversion Development
Subject: Re: Worried about single-db performance

On 04/09/2010 17:33, Justin Erenkrantz wrote:

Aha.  Adding exclusive locking into our pragma
[http://www.sqlite.org/pragma.html] calls in svn_sqlite__open:

PRAGMA locking_mode=exclusive;

brings the time for svn st down from 0.680 to 0.310 seconds.  And,
yes, the I/O percentages drop dramatically:

~/Applications/svn-trunk/bin/svn st  0.37s user 0.31s system 99% cpu 0.684 total
~/Applications/svn-trunk/bin/svn st  0.26s user 0.05s system 98% cpu 0.308 total


I *think* we'd be okay with having Sqlite holding its read/write locks
for the duration of our database connection rather than constantly
giving them up and refetching them between every read and write
operation.  As I read the sqlite docs, we should still be able to have
shared readers in this model - but it'd create the case where idle
shared readers (due to network I/O?) would block an attempted writer.
With a normal locking mode, a writer could intercept a reader if it
were idle and not active.  However, I'd think our other locks would
interfere with this anyway...so I think it'd be safe.

Thoughts?  -- justin


I think it's essential to use exclusive locking for performance reasons,
without it we will get just as many individual file ops as in 1.6 (and
it's the number of file ops which causes the performance problem on
Windows).


Did you actually try a shared lock before suggesting this?


I haven't re-run those tests on single-db.  The results I linked to 
compare locking_mode=NORMAL and locking_mode=EXCLUSIVE on otherwise 
identical code.



Getting a shared lock actually gives me better performance on this read-only
operation than an exclusive lock, and it doesn't block out other clients
(which would be a breaking change from 1.6).


I understand locking_mode=EXCLUSIVE to allow shared read-only access. 
Don't we block write access when other clients are reading already?  Or 
are you worried about where we're releasing the database connection?


I'm surprised locking_mode=NORMAL could ever have better performance 
given that the number of lock operations must be strictly greater and 
everything else is the same.



Getting an exclusive lock on every operation would completely disable
Subversion's most popular client: TortoiseSVN.


I didn't realise this, you are of course right that that would make it 
unacceptable.  I don't really understand why it would break TortoiseSVN, 
does it take write access and then not release it somehow?


Matthew


Re: Worried about single-db performance

2010-09-07 Thread Matthew Bentham

On 07/09/2010 13:02, Bert Huijben wrote:




-Original Message-
From: Matthew Bentham [mailto:mj...@artvps.com]
Sent: dinsdag 7 september 2010 13:48
To: Bert Huijben
Cc: 'Justin Erenkrantz'; 'Greg Stein'; 'Johan Corveleyn'; 'Subversion
Development'
Subject: Re: Worried about single-db performance




I didn't realise this, you are of course right that that would make it
unacceptable.  I don't really understand why it would break TortoiseSVN,
does it take write access and then not release it somehow?


SQLite needs a shared *read* lock to *read*. See
http://www.sqlite.org/atomiccommit.html.
(Invoking 'svn status' never obtains a write lock; see that document)

SQLite only upgrades that read lock to a write (or actually reserved) lock
when you perform a db operation that has to change the database. Later on
(e.g. after too many changes; see the documentation for more reasons) this
is upgraded to an exclusive lock that blocks all other readers and writers
out of the db, but SQLite tries to keep this time as short as possible.

Your original suggestion would make any *reader* block any other *reader*,
which breaks the Subversion world. (Just running svn update in 1.6 has about
5 simultaneous independent readers in some phases of update. Most GUI
Subversion clients I know use multiple client instances at the same time,
so they would all have to be rewritten if we obtained an exclusive lock for
reading.)
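
The lock levels described above can be watched from three connections in a single process. What follows is only a minimal sketch, assuming Python's standard sqlite3 module (not Subversion's own SQLite wrappers), SQLite's default rollback-journal mode, and a hypothetical table `t`; the helper names are made up for illustration.

```python
import os
import sqlite3
import tempfile


def connect(path):
    # timeout=0: raise "database is locked" at once instead of retrying.
    # isolation_level=None: no implicit BEGIN; transactions are explicit.
    return sqlite3.connect(path, timeout=0, isolation_level=None)


def demo_lock_escalation():
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    writer, reader, second_writer = connect(path), connect(path), connect(path)
    results = {}
    try:
        writer.execute("CREATE TABLE t (x INTEGER)")

        # BEGIN IMMEDIATE takes the RESERVED lock: an intent to write.
        writer.execute("BEGIN IMMEDIATE")
        writer.execute("INSERT INTO t VALUES (1)")

        # Readers are still admitted while another connection holds RESERVED.
        results["rows_seen_by_reader"] = reader.execute(
            "SELECT count(*) FROM t").fetchall()[0][0]

        # Only one connection may hold RESERVED at a time.
        try:
            second_writer.execute("BEGIN IMMEDIATE")
            results["second_writer_blocked"] = False
        except sqlite3.OperationalError:
            results["second_writer_blocked"] = True

        # COMMIT upgrades to EXCLUSIVE for a moment, then drops all locks.
        writer.execute("COMMIT")
        results["rows_after_commit"] = reader.execute(
            "SELECT count(*) FROM t").fetchall()[0][0]
    finally:
        for conn in (writer, reader, second_writer):
            conn.close()
    return results
```

Calling `demo_lock_escalation()` returns a small dict; the point is that a plain reader is never shut out by another reader or by a pending (reserved) writer in NORMAL locking mode.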


Sorry, I didn't mean we should take exclusive locks for every 
transaction, just that we should use PRAGMA locking_mode=EXCLUSIVE. 
According to the documentation (http://www.sqlite.org/pragma.html) that 
makes transactions obtain a shared lock for reading which is upgraded to 
an exclusive lock for writing, and not released until the database 
connection is closed.  I've tried it a couple of times in svn.exe and it 
always improves performance (over locking_mode=NORMAL) and hasn't caused 
me problems.  Admittedly I haven't tried within the last couple of weeks 
and I'm afraid I don't have time right now.


Am I misreading the documentation?  It says "The first time the database
is read in EXCLUSIVE mode, a shared lock is obtained and held. The first
time the database is written, an exclusive lock is obtained and held."
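
The read half of that documentation sentence can be checked with a small experiment. This is a sketch under assumptions: Python's standard sqlite3 module, the default rollback-journal mode, and a hypothetical table `t`; it is not how svn.exe opens its database.

```python
import os
import sqlite3
import tempfile


def connect(path, locking_mode="NORMAL"):
    conn = sqlite3.connect(path, timeout=0, isolation_level=None)
    # fetchall() drains the pragma's one-row result so no statement stays open.
    conn.execute(f"PRAGMA locking_mode={locking_mode}").fetchall()
    return conn


def demo_exclusive_mode_reader():
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    setup = connect(path)
    setup.execute("CREATE TABLE t (x INTEGER)")
    setup.close()

    holder = connect(path, "EXCLUSIVE")
    other = connect(path)
    results = {}
    try:
        # First read in EXCLUSIVE mode: a SHARED lock is obtained and kept.
        holder.execute("SELECT count(*) FROM t").fetchall()

        # A second reader is still admitted next to the held SHARED lock.
        rows = other.execute("SELECT count(*) FROM t").fetchall()
        results["other_can_read"] = rows == [(0,)]

        # A writer cannot commit: it needs EXCLUSIVE, which the held
        # SHARED lock prevents for as long as `holder` stays open.
        try:
            other.execute("INSERT INTO t VALUES (1)")
            results["other_can_write"] = True
        except sqlite3.OperationalError:
            results["other_can_write"] = False
    finally:
        other.close()
        holder.close()
    return results
```

In this sketch other readers keep working, but a writer is blocked until the EXCLUSIVE-mode connection is closed, even if that connection is idle.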


Matthew


Re: Worried about single-db performance

2010-09-07 Thread Branko Čibej
 On 07.09.2010 14:29, Matthew Bentham wrote:
 On 07/09/2010 13:02, Bert Huijben wrote:


 -Original Message-
 From: Matthew Bentham [mailto:mj...@artvps.com]
 Sent: dinsdag 7 september 2010 13:48
 To: Bert Huijben
 Cc: 'Justin Erenkrantz'; 'Greg Stein'; 'Johan Corveleyn'; 'Subversion
 Development'
 Subject: Re: Worried about single-db performance


 I didn't realise this, you are of course right that that would make it
 unacceptable.  I don't really understand why it would break
 TortoiseSVN,
 does it take write access and then not release it somehow?

 SQLite needs a shared *read* lock to *read*. See
 http://www.sqlite.org/atomiccommit.html.
 (Invoking 'svn status' never obtains a write lock; see that document)

 SQLite only upgrades that read lock to a write (or actually reserved) lock
 when you perform a db operation that has to change the database. Later on
 (e.g. after too many changes; see the documentation for more reasons) this
 is upgraded to an exclusive lock that blocks all other readers and writers
 out of the db, but SQLite tries to keep this time as short as possible.

 Your original suggestion would make any *reader* block any other *reader*,
 which breaks the Subversion world. (Just running svn update in 1.6 has about
 5 simultaneous independent readers in some phases of update. Most GUI
 Subversion clients I know use multiple client instances at the same time,
 so they would all have to be rewritten if we obtained an exclusive lock for
 reading.)

 Sorry, I didn't mean we should take exclusive locks for every
 transaction, just that we should use PRAGMA locking_mode=EXCLUSIVE.
 According to the documentation (http://www.sqlite.org/pragma.html)
 that makes transactions obtain a shared lock for reading which is
 upgraded to an exclusive lock for writing, and not released until the
 database connection is closed.  I've tried it a couple of times in
 svn.exe and it always improves performance (over locking_mode=NORMAL)
 and hasn't caused me problems.  Admittedly I haven't tried within the
 last couple of weeks and I'm afraid I don't have time right now.

 Am I misreading the documentation?  It says "The first time the
 database is read in EXCLUSIVE mode, a shared lock is obtained and
 held. The first time the database is written, an exclusive lock is
 obtained and held."

That "and held" is the problem, IMO. A long-term connection that mostly
reads but just happens to write something once will not drop the
exclusive lock until the database connection is closed.
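
A minimal sketch of that scenario, assuming Python's standard sqlite3 module, the default rollback-journal mode, and a hypothetical table `t`: one connection in EXCLUSIVE locking mode writes once, then only reads, and a second connection tries to read.

```python
import os
import sqlite3
import tempfile


def demo_exclusive_lock_is_held():
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    # timeout=0 makes a blocked connection fail at once instead of waiting.
    holder = sqlite3.connect(path, timeout=0, isolation_level=None)
    holder.execute("PRAGMA locking_mode=EXCLUSIVE").fetchall()
    other = sqlite3.connect(path, timeout=0, isolation_level=None)
    results = {}
    try:
        # The only writes this connection ever makes.
        holder.execute("CREATE TABLE t (x INTEGER)")
        holder.execute("INSERT INTO t VALUES (1)")

        # From here on the holder just reads, yet the EXCLUSIVE lock stays.
        holder.execute("SELECT count(*) FROM t").fetchall()
        try:
            other.execute("SELECT count(*) FROM t").fetchall()
            results["read_while_open"] = True
        except sqlite3.OperationalError:
            results["read_while_open"] = False
    finally:
        # Closing the connection is what finally releases the lock.
        holder.close()
    try:
        results["rows_after_close"] = other.execute(
            "SELECT count(*) FROM t").fetchall()[0][0]
    finally:
        other.close()
    return results
```

The second connection cannot even read until the first one is closed, which is the behaviour a long-lived GUI client would run into.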

-- Brane


Re: Worried about single-db performance

2010-09-06 Thread Matthew Bentham

On 04/09/2010 17:33, Justin Erenkrantz wrote:

Aha.  Adding exclusive locking into our pragma
[http://www.sqlite.org/pragma.html] calls in svn_sqlite__open:

PRAGMA locking_mode=exclusive;

brings the time for svn st down from 0.680 to 0.310 seconds.  And,
yes, the I/O percentages drop dramatically:

~/Applications/svn-trunk/bin/svn st  0.37s user 0.31s system 99% cpu 0.684 total
~/Applications/svn-trunk/bin/svn st  0.26s user 0.05s system 98% cpu 0.308 total

I *think* we'd be okay with having Sqlite holding its read/write locks
for the duration of our database connection rather than constantly
giving them up and refetching them between every read and write
operation.  As I read the sqlite docs, we should still be able to have
shared readers in this model - but it'd create the case where idle
shared readers (due to network I/O?) would block an attempted writer.
With a normal locking mode, a writer could intercept a reader if it
were idle and not active.  However, I'd think our other locks would
interfere with this anyway...so I think it'd be safe.

Thoughts?  -- justin


I think it's essential to use exclusive locking for performance reasons, 
without it we will get just as many individual file ops as in 1.6 (and 
it's the number of file ops which causes the performance problem on
Windows).


I got the same results as you pre single-db in this message (near the 
end of it), and the other message from me in that thread:


http://svn.haxx.se/dev/archive-2010-02/0239.shtml

For fun you can try 'journal_mode=MEMORY' which makes it go really
really fast (but unsafe wrt atomic operations under certain termination
conditions).
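
A minimal sketch of the in-memory rollback journal, assuming the setting in question is SQLite's `PRAGMA journal_mode=MEMORY` (the `locking_mode` pragma itself only accepts NORMAL and EXCLUSIVE). It uses Python's standard sqlite3 module and a hypothetical table `t`, and simply checks whether a journal file sits next to the database during an open write transaction.

```python
import os
import sqlite3
import tempfile


def journal_file_seen(journal_mode):
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    conn = sqlite3.connect(path, isolation_level=None)
    try:
        # The pragma reports the journal mode that is now in effect.
        effective = conn.execute(
            f"PRAGMA journal_mode={journal_mode}").fetchall()[0][0]
        conn.execute("CREATE TABLE t (x INTEGER)")
        conn.execute("BEGIN")
        conn.execute("INSERT INTO t VALUES (1)")
        # With an on-disk journal, "<db>-journal" exists while the write
        # transaction is open; with MEMORY it never reaches the disk.
        seen = os.path.exists(path + "-journal")
        conn.execute("COMMIT")
    finally:
        conn.close()
    return effective, seen
```

Skipping the journal file saves file operations, but if the process dies in the middle of a write there is nothing on disk to roll back from, which is the safety trade-off mentioned above.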


Matthew


RE: Worried about single-db performance

2010-09-06 Thread Bert Huijben


 -Original Message-
 From: Matthew Bentham [mailto:mj...@artvps.com]
 Sent: maandag 6 september 2010 15:07
 To: Justin Erenkrantz
 Cc: Bert Huijben; Greg Stein; Johan Corveleyn; Subversion Development
 Subject: Re: Worried about single-db performance
 
 On 04/09/2010 17:33, Justin Erenkrantz wrote:
  Aha.  Adding exclusive locking into our pragma
  [http://www.sqlite.org/pragma.html] calls in svn_sqlite__open:
 
  PRAGMA locking_mode=exclusive;
 
  brings the time for svn st down from 0.680 to 0.310 seconds.  And,
  yes, the I/O percentages drop dramatically:
 
  ~/Applications/svn-trunk/bin/svn st  0.37s user 0.31s system 99% cpu 0.684 total
  ~/Applications/svn-trunk/bin/svn st  0.26s user 0.05s system 98% cpu 0.308 total
 
  I *think* we'd be okay with having Sqlite holding its read/write locks
  for the duration of our database connection rather than constantly
  giving them up and refetching them between every read and write
  operation.  As I read the sqlite docs, we should still be able to have
  shared readers in this model - but it'd create the case where idle
  shared readers (due to network I/O?) would block an attempted writer.
  With a normal locking mode, a writer could intercept a reader if it
  were idle and not active.  However, I'd think our other locks would
  interfere with this anyway...so I think it'd be safe.
 
  Thoughts?  -- justin
 
 I think it's essential to use exclusive locking for performance reasons,
 without it we will get just as many individual file ops as in 1.6 (and
 it's the number of file ops which causes the performance problem on
 Windows).

Did you actually try a shared lock before suggesting this?

Getting a shared lock actually gives me better performance on this read-only
operation than an exclusive lock, and it doesn't block out other clients
(which would be a breaking change from 1.6).

Getting an exclusive lock on every operation would completely disable
Subversion's most popular client: TortoiseSVN.

That is not something we can decide in just a few mails.

Bert



RE: Worried about single-db performance

2010-09-04 Thread Bert Huijben


 -Original Message-
 From: justin.erenkra...@gmail.com [mailto:justin.erenkra...@gmail.com]
 On Behalf Of Justin Erenkrantz
 Sent: zaterdag 4 september 2010 8:33
 To: Greg Stein
 Cc: Johan Corveleyn; Subversion Development
 Subject: Re: Worried about single-db performance
 
 On Fri, Sep 3, 2010 at 8:39 AM, Greg Stein gst...@gmail.com wrote:
  It should already be faster. Obviously, that's not the case.
 
 I just spent a little bit of time with Shark and gdb.  A cold run of 'svn
 st' against Subversion trunk checkouts for 1.6 yields 0.402 seconds
 and 1.7 is 0.919 seconds.  Hot runs for 1.6 are about 0.055 seconds
 with 1.7 at 0.750 seconds.
 
 One striking difference in the perf profile between 1.6 and trunk is
 that we seem to do a larger number of stat() calls in 1.7.
 
 From looking at the traces and code, I *think*
 svn_wc__db_pdh_parse_local_abspath's call to svn_io_check_special_path
 may be in play here:

SQLite also does a stat call per statement, unless there is already a shared
lock open, just to check if there is no other process that opened a
transaction.
(On Windows this specific stat to check for other processes operating on the
same db is the performance killer for svn status: Just this stat takes more
than 50% of the total processing).
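
One way to keep a shared lock open across several statements without touching the locking mode is an explicit read transaction. The sketch below assumes Python's standard sqlite3 module, the default rollback-journal mode, and a hypothetical table `t`. It does not measure stat calls; it only shows whether the shared lock is retained between two SELECTs, by checking whether another connection's write can commit in between.

```python
import os
import sqlite3
import tempfile


def connect(path):
    # timeout=0: fail immediately when locked; isolation_level=None:
    # autocommit unless we issue BEGIN ourselves.
    return sqlite3.connect(path, timeout=0, isolation_level=None)


def writer_can_commit_between_reads(use_transaction):
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    reader, writer = connect(path), connect(path)
    try:
        writer.execute("CREATE TABLE t (x INTEGER)")

        if use_transaction:
            # Deferred BEGIN: the SHARED lock is taken by the first SELECT
            # and kept until COMMIT.
            reader.execute("BEGIN")
        reader.execute("SELECT count(*) FROM t").fetchall()

        # In autocommit mode the reader has already dropped its lock here.
        try:
            writer.execute("INSERT INTO t VALUES (1)")
            committed = True
        except sqlite3.OperationalError:
            committed = False

        reader.execute("SELECT count(*) FROM t").fetchall()
        if use_transaction:
            reader.execute("COMMIT")
        return committed
    finally:
        reader.close()
        writer.close()
```

With `use_transaction=False` the lock is released and re-acquired per statement, so the writer gets in; with `use_transaction=True` the lock is held for the whole batch of reads and released at COMMIT, unlike EXCLUSIVE locking mode, which keeps it until the connection closes.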

Bert



Re: Worried about single-db performance

2010-09-04 Thread Branko Čibej
 On 04.09.2010 11:23, Bert Huijben wrote:

 -Original Message-
 From: justin.erenkra...@gmail.com [mailto:justin.erenkra...@gmail.com]
 On Behalf Of Justin Erenkrantz
 Sent: zaterdag 4 september 2010 8:33
 To: Greg Stein
 Cc: Johan Corveleyn; Subversion Development
 Subject: Re: Worried about single-db performance

 On Fri, Sep 3, 2010 at 8:39 AM, Greg Stein gst...@gmail.com wrote:
 It should already be faster. Obviously, that's not the case.
 I just spent a little bit of time with Shark and gdb.  A cold run of 'svn
 st' against Subversion trunk checkouts for 1.6 yields 0.402 seconds
 and 1.7 is 0.919 seconds.  Hot runs for 1.6 are about 0.055 seconds
 with 1.7 at 0.750 seconds.

 One striking difference in the perf profile between 1.6 and trunk is
 that we seem to do a larger number of stat() calls in 1.7.

 From looking at the traces and code, I *think*
 svn_wc__db_pdh_parse_local_abspath's call to svn_io_check_special_path
 may be in play here:
 SQLite also does a stat call per statement, unless there is already a shared
 lock open, just to check if there is no other process that opened a
 transaction.
 (On Windows this specific stat to check for other processes operating on the
 same db is the performance killer for svn status: Just this stat takes more
 than 50% of the total processing).

Hmmm ... easy solution then, just fork off a process that opens the
database and these stats should magically vanish ... :)

-- Brane



Re: Worried about single-db performance

2010-09-04 Thread Justin Erenkrantz
On Sat, Sep 4, 2010 at 2:23 AM, Bert Huijben b...@qqmail.nl wrote:
 SQLite also does a stat call per statement, unless there is already a shared
 lock open, just to check if there is no other process that opened a
 transaction.
 (On Windows this specific stat to check for other processes operating on the
 same db is the performance killer for svn status: Just this stat takes more
 than 50% of the total processing).

Aha.  Adding exclusive locking into our pragma
[http://www.sqlite.org/pragma.html] calls in svn_sqlite__open:

PRAGMA locking_mode=exclusive;

brings the time for svn st down from 0.680 to 0.310 seconds.  And,
yes, the I/O percentages drop dramatically:

~/Applications/svn-trunk/bin/svn st  0.37s user 0.31s system 99% cpu 0.684 total
~/Applications/svn-trunk/bin/svn st  0.26s user 0.05s system 98% cpu 0.308 total
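
For readers who want to try the pragma outside of Subversion, here is a minimal sketch of applying it at open time. It assumes Python's standard sqlite3 module; `open_db` is a hypothetical helper and only loosely mirrors the idea of issuing the pragma from svn_sqlite__open.

```python
import os
import sqlite3
import tempfile


def open_db(path, exclusive=True):
    """Open a database and set the locking mode before any other statement."""
    conn = sqlite3.connect(path)
    mode = "EXCLUSIVE" if exclusive else "NORMAL"
    # The pragma returns a single row naming the mode now in effect.
    effective = conn.execute(f"PRAGMA locking_mode={mode}").fetchall()[0][0]
    return conn, effective


def demo_open():
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    conn, effective = open_db(path)
    conn.close()
    return effective
```

The pragma applies per connection, so every connection that should hold its locks has to issue it after opening.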

I *think* we'd be okay with having Sqlite holding its read/write locks
for the duration of our database connection rather than constantly
giving them up and refetching them between every read and write
operation.  As I read the sqlite docs, we should still be able to have
shared readers in this model - but it'd create the case where idle
shared readers (due to network I/O?) would block an attempted writer.
With a normal locking mode, a writer could intercept a reader if it
were idle and not active.  However, I'd think our other locks would
interfere with this anyway...so I think it'd be safe.

Thoughts?  -- justin


Worried about single-db performance

2010-09-03 Thread Johan Corveleyn
Hi devs,

From what I understand about the performance problems of WC-1 vs.
WC-NG, and what I'm reading on this list, I expect(ed) a huge
performance boost from WC-NG for certain client operations (especially
on Windows, where the locking of WC-1 is quite problematic). Also, I
knew I had to wait for single-db to see any real performance benefits.
So after you guys switched on single-db, I eagerly gave trunk a
spin... Now I'm a little worried, because I don't see any great speed
increases (quite the contrary). Some details below.

Maybe it's just too early to be looking at this (maybe it's a simple
matter of optimizing the data model, adding indexes, optimizing some
code paths, ...). So it's fine if you guys say: chill, just wait a
couple more weeks. I just need to know whether I should be worried or
not :-).

Some details ...

Setup:
- Win XP 32-bit client machine, with antivirus switched off.
- Single-db client: tr...@992042 (yesterday), release build with VCE2008
- 1.6 client: 1.6.5 binary from tigris.org that I still had lying around.
- Medium size working copy (944 dirs, 10286 files), once checked out
with the 1.6 client (WC-1), once checked out with the trunk-single-db
client.
- 1st run means after reboot, 2nd run means immediately after 1st run.

Numbers:

1) Status (svn st)

1.6 client 1st run:
real    0m41.593s
user    0m0.015s
sys     0m0.015s

1.6 client 2nd run:
real    0m1.203s
user    0m0.015s
sys     0m0.031s

Single-db client 1st run:
real    0m34.984s
user    0m0.015s
sys     0m0.031s

Single-db client 2nd run:
real    0m6.938s
user    0m0.015s
sys     0m0.031s


2) Update (no changes, wc already up to date) (svn up)

1.6 client 1st run:
real    0m38.484s
user    0m0.015s
sys     0m0.015s

1.6 client 2nd run:
real    0m1.141s
user    0m0.015s
sys     0m0.015s

Single-db client 1st run:
real    0m31.375s
user    0m0.015s
sys     0m0.031s

Single-db client 2nd run:
real    0m5.468s
user    0m0.031s
sys     0m0.015s
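
To put the numbers above side by side, the ratio of single-db to 1.6 wall-clock ("real") time can be worked out as follows. The dictionaries simply copy the measurements listed above; the names are made up for this sketch.

```python
# "real" seconds copied from the measurements above.
COLD_RUNS = {  # 1st run, after reboot
    "status": {"1.6": 41.593, "single-db": 34.984},
    "update": {"1.6": 38.484, "single-db": 31.375},
}
WARM_RUNS = {  # 2nd run, immediately after the 1st
    "status": {"1.6": 1.203, "single-db": 6.938},
    "update": {"1.6": 1.141, "single-db": 5.468},
}


def slowdown(runs):
    # Ratio > 1 means single-db is slower than 1.6 for that operation.
    return {op: round(t["single-db"] / t["1.6"], 2) for op, t in runs.items()}


print("cold:", slowdown(COLD_RUNS))
print("warm:", slowdown(WARM_RUNS))
```

On warm runs single-db comes out roughly 5.8 times slower for status and 4.8 times slower for update, while on cold runs it is somewhat faster (ratios of about 0.84 and 0.82).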


Anyone able to take away my worries :-) ?

Cheers,
-- 
Johan


Re: Worried about single-db performance

2010-09-03 Thread Greg Stein
On Fri, Sep 3, 2010 at 06:09, Johan Corveleyn jcor...@gmail.com wrote:
 Hi devs,

 From what I understand about the performance problems of WC-1 vs.
 WC-NG, and what I'm reading on this list, I expect(ed) a huge
 performance boost from WC-NG for certain client operations (especially
 on Windows, where the locking of WC-1 is quite problematic). Also, I
 knew I had to wait for single-db to see any real performance benefits.
 So after you guys switched on single-db, I eagerly gave trunk a
 spin... Now I'm a little worried, because I don't see any great speed
 increases (quite the contrary). Some details below.

 Maybe it's just too early to be looking at this (maybe it's a simple
 matter of optimizing the data model, adding indexes, optimizing some
 code paths, ...). So it's fine if you guys say: chill, just wait a
 couple more weeks. I just need to know whether I should be worried or
 not :-).

It should already be faster. Obviously, that's not the case.

My expectation is that it would be faster, and then we'd do some perf
improvements to make it even faster. Sounds like we definitely have to
do some of those other improvements.

We have a schema change to make, and once that is done, then we can
start looking at the performance. There could be lots of SQL queries
that need to be optimized. I also *know* that we issue way too many
queries. There should be ways to avoid a lot of those queries.

I'd like to avoid any caching, and rely on SQLite to maintain
in-memory caches. My gut says we just need to reduce the number of
queries.
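
As an illustration of what "reducing the number of queries" can mean, the sketch below fetches the same information once with one statement per path and once with a single statement for a whole directory. It uses Python's standard sqlite3 module and a made-up `nodes` table with hypothetical columns; it is not the actual WC-NG schema or query set.

```python
import sqlite3


def build_db(n_files=200):
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE nodes (local_relpath TEXT PRIMARY KEY, kind TEXT)")
    conn.executemany(
        "INSERT INTO nodes VALUES (?, ?)",
        [(f"dir/file{i}", "file") for i in range(n_files)])
    conn.commit()
    return conn


def kinds_query_per_path(conn, paths):
    # One statement per path: len(paths) trips through SQLite.
    sql = "SELECT kind FROM nodes WHERE local_relpath = ?"
    return {p: conn.execute(sql, (p,)).fetchone()[0] for p in paths}


def kinds_single_query(conn, parent):
    # One statement for the whole directory. `parent` must not contain
    # the LIKE wildcards % or _ in this simplified version.
    sql = "SELECT local_relpath, kind FROM nodes WHERE local_relpath LIKE ?"
    return dict(conn.execute(sql, (parent + "/%",)))


def count_statements(conn, fn, *args):
    # The trace callback fires once for each statement SQLite runs.
    executed = []
    conn.set_trace_callback(executed.append)
    try:
        result = fn(conn, *args)
    finally:
        conn.set_trace_callback(None)
    return result, len(executed)
```

Both functions return the same mapping; `count_statements` makes the difference in statement count visible, and in locking_mode=NORMAL each of those statements may also pay its own lock acquisition.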

Cheers,
-g