Re: Some more benchmarks

2014-07-02 Thread Stefan Guggisberg
On Tue, Jul 1, 2014 at 8:37 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
 Hi,

 On Tue, Jul 1, 2014 at 9:38 AM, Jukka Zitting jukka.zitt...@gmail.com wrote:
 I also tried including MongoMK results, but the benchmark got stuck in
 ConcurrentReadTest. I'll re-try today and will file a bug if I can
 reproduce the problem.

 I guess it was a transient problem. Here are the results with
 Oak-Mongo included:

 Summary (90%, lower is better)

 Benchmark                 Jackrabbit  Oak-Mongo  Oak-Tar
 --------------------------------------------------------
 ReadPropertyTest                  45          4        4
 SetPropertyTest                 1179       2398      119
 SmallFileReadTest                 47          9        7
 SmallFileWriteTest               182        530       43
 ConcurrentReadTest              1201       1247      710
 ConcurrentReadWriteTest         1900       2321      775
 ConcurrentWriteReadTest         1009        354      108
 ConcurrentWriteTest              532        553      101

wow, very impressive, congrats!

cheers
stefan


 I updated the gist at
 https://gist.github.com/jukka/078bd524aa0ba36b184b with full details.

 The general message here is to use TarMK for maximum single-node
 performance and MongoMK for scalability and throughput across multiple
 cluster nodes.

 BR,

 Jukka Zitting


Re: Some more benchmarks

2014-07-01 Thread Jukka Zitting
Hi,

I'm resurrecting this thread with some new findings. I re-ran many of
the benchmarks we've been following, pitting Jackrabbit 2.8.0 against
Oak 1.0.1 with TarMK. The results look pretty nice:

Summary (90%, lower is better)

Benchmark                 Jackrabbit  Oak-Tar
---------------------------------------------
ReadPropertyTest                  45        4
SetPropertyTest                 1179      119
SmallFileReadTest                 47        7
SmallFileWriteTest               182       43
ConcurrentReadTest              1201      710
ConcurrentReadWriteTest         1900      775
ConcurrentWriteReadTest         1009      108
ConcurrentWriteTest              532      101

See https://gist.github.com/jukka/078bd524aa0ba36b184b for details.

I also tried including MongoMK results, but the benchmark got stuck in
ConcurrentReadTest. I'll re-try today and will file a bug if I can
reproduce the problem.

BR,

Jukka Zitting


Re: Some more benchmarks

2014-07-01 Thread Jukka Zitting
Hi,

On Tue, Jul 1, 2014 at 9:38 AM, Jukka Zitting jukka.zitt...@gmail.com wrote:
 I also tried including MongoMK results, but the benchmark got stuck in
 ConcurrentReadTest. I'll re-try today and will file a bug if I can
 reproduce the problem.

I guess it was a transient problem. Here are the results with
Oak-Mongo included:

Summary (90%, lower is better)

Benchmark                 Jackrabbit  Oak-Mongo  Oak-Tar
--------------------------------------------------------
ReadPropertyTest                  45          4        4
SetPropertyTest                 1179       2398      119
SmallFileReadTest                 47          9        7
SmallFileWriteTest               182        530       43
ConcurrentReadTest              1201       1247      710
ConcurrentReadWriteTest         1900       2321      775
ConcurrentWriteReadTest         1009        354      108
ConcurrentWriteTest              532        553      101

I updated the gist at
https://gist.github.com/jukka/078bd524aa0ba36b184b with full details.

The general message here is to use TarMK for maximum single-node
performance and MongoMK for scalability and throughput across multiple
cluster nodes.

BR,

Jukka Zitting


Re: Some more benchmarks

2013-09-28 Thread Jukka Zitting
Hi,

On Tue, Sep 24, 2013 at 11:19 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
 On Tue, Sep 24, 2013 at 10:47 PM, Jukka Zitting jukka.zitt...@gmail.com 
 wrote:
 The concurrent read and read/write test cases look like more attention
 is needed on the test code, as it's currently hard to interpret the
 results. I'll see what I can do there.

 It turns out most of the reported time was going to login() calls (see
 OAK-634). I refactored the tests in revision 1526092 so that the login
 calls won't affect the performance measurements.

There were a few other systemic issues with the concurrency benchmarks
(like the background threads running at lower priority and doing less
work). I made some further improvements, and the numbers now look like
this:

# ConcurrentReadTest         90%
Jackrabbit                  4132
Oak-Default                 2031
Oak-Mongo                   2116
Oak-Segment                 2258
Oak-Tar                     2580

# ConcurrentReadWriteTest    90%
Jackrabbit                  3192
Oak-Default                 3600
Oak-Mongo                   4596
Oak-Segment                 2605
Oak-Tar                     2876

# ConcurrentWriteReadTest    90%
Jackrabbit                  2770
Oak-Default                  875
Oak-Mongo                   1243
Oak-Segment                  565
Oak-Tar                      405

# ConcurrentWriteTest        90%
Jackrabbit                   597
Oak-Default                 2141
Oak-Mongo                   1166
Oak-Segment                  558
Oak-Tar                      348

Full details in https://gist.github.com/jukka/6748243.

BR,

Jukka Zitting


Re: Some more benchmarks

2013-09-24 Thread Jukka Zitting
Hi,

Updating this thread with the latest numbers. No major changes on
these benchmarks:

# ReadPropertyTest           90%
Jackrabbit                    48
Oak-Default                   39
Oak-Mongo                     41
Oak-Segment                   41
Oak-Tar                       42

# SmallFileReadTest          90%
Jackrabbit                    91
Oak-Default                   19
Oak-Mongo                     19
Oak-Segment                   18
Oak-Tar                       18

# SmallFileWriteTest         90%
Jackrabbit                   386
Oak-Default                  425
Oak-Mongo                    963
Oak-Segment                  180
Oak-Tar                       88

Full details in https://gist.github.com/jukka/6693063.

I'll add a few more benchmarks to my test script.

BR,

Jukka Zitting


Re: Some more benchmarks

2013-09-24 Thread Jukka Zitting
Hi,

On Tue, Sep 24, 2013 at 8:08 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
 I'll add a few more benchmarks to my test script.

Here are results for three more benchmarks:

# SetPropertyTest            90%
Jackrabbit                   740
Oak-Default                  916
Oak-Mongo                   5973
Oak-Segment                 2386
Oak-Tar                      728

# ConcurrentReadTest         90%
Jackrabbit                 25656
Oak-Default                32840
Oak-Mongo                  33928
Oak-Segment                35522
Oak-Tar                    35354

# ConcurrentReadWriteTest    90%
Jackrabbit                 19280
Oak-Default                39289
Oak-Mongo                  48078
Oak-Segment                37384
Oak-Tar                    37165

The SetPropertyTest measures extremely small transactions (single
property change), which makes the networking overhead in Oak-Mongo and
Oak-Segment more prominent. Other than that the results are fairly
good.

The concurrent read and read/write test cases look like more attention
is needed on the test code, as it's currently hard to interpret the
results. I'll see what I can do there.

BR,

Jukka Zitting


Re: Some more benchmarks

2013-09-24 Thread Jukka Zitting
Hi,

On Tue, Sep 24, 2013 at 10:47 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
 The concurrent read and read/write test cases look like more attention
 is needed on the test code, as it's currently hard to interpret the
 results. I'll see what I can do there.

It turns out most of the reported time was going to login() calls (see
OAK-634). I refactored the tests in revision 1526092 so that the login
calls won't affect the performance measurements. As a result the
numbers look much better:

# ConcurrentReadTest         90%
Jackrabbit                   447
Oak-Default                  286
Oak-Mongo                    240
Oak-Segment                  245
Oak-Tar                      252

# ConcurrentReadWriteTest    90%
Jackrabbit                   383
Oak-Default                  263
Oak-Mongo                    270
Oak-Segment                  280
Oak-Tar                      268
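
For illustration, keeping login() out of the measured region can be sketched roughly as below. This is a minimal sketch with hypothetical names (TimedRegionSketch, Step, measure), not the actual oak-run test code; the clock is injected so the example is deterministic.

```java
import java.util.function.LongSupplier;

// Hypothetical harness for illustration only; not the oak-run benchmark API.
final class TimedRegionSketch {

    /** One unit of work, e.g. a login() call or the operation under test. */
    interface Step {
        void run();
    }

    /** Runs setup outside the timed region and returns only the time spent in body. */
    static long measure(LongSupplier clock, Step setup, Step body) {
        setup.run();                       // e.g. login(): not measured
        long start = clock.getAsLong();
        body.run();                        // operation under test: measured
        return clock.getAsLong() - start;
    }

    public static void main(String[] args) {
        long[] now = {0};                  // fake clock, advanced by the steps themselves
        long measured = measure(() -> now[0], () -> now[0] += 50, () -> now[0] += 5);
        System.out.println(measured);      // prints 5: the 50 ticks of setup are excluded
    }
}
```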

BR,

Jukka Zitting


Re: Some more benchmarks

2013-07-04 Thread Jukka Zitting
Hi,

On Wed, Jul 3, 2013 at 5:34 PM, Thomas Mueller muel...@adobe.com wrote:
 I think it's because all binaries are loaded from the backend (no
 caching). I bumped the blob cache size from 8 MB to 16 MB, let's see if
 this helps.

Yes, that could be it (I'll run the tests again soon to confirm). The
working set of the test is about 10MB in size and scanned linearly, so
each iteration would end up flushing a cache that's less than 10MB in
size.
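
To illustrate the effect, here is a small simulation of a linear scan through a cache that is smaller than the working set. It assumes LRU eviction and 1 MB blocks, neither of which is stated anywhere in this thread, so treat it as a sketch of the reasoning rather than a model of the actual blob cache.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative simulation; LRU eviction and the block granularity are assumptions.
final class LinearScanCacheSketch {

    /** Counts hits when `blocks` distinct blocks are scanned in order, `passes` times. */
    static int hits(int capacity, int blocks, int passes) {
        // Access-ordered LinkedHashMap that evicts the least recently used entry.
        Map<Integer, Boolean> lru = new LinkedHashMap<Integer, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
                return size() > capacity;
            }
        };
        int hits = 0;
        for (int p = 0; p < passes; p++) {
            for (int b = 0; b < blocks; b++) {
                if (lru.get(b) != null) {
                    hits++;
                } else {
                    lru.put(b, Boolean.TRUE);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(hits(8, 10, 5));    // 8 MB cache, 10 MB working set: prints 0
        System.out.println(hits(16, 10, 5));   // 16 MB cache: prints 40 (all passes after the first)
    }
}
```

With the smaller cache every block is evicted just before it is needed again, so the hit count stays at zero no matter how many passes are made.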

BR,

Jukka Zitting


Re: Some more benchmarks

2013-07-04 Thread Jukka Zitting
Hi,

On Thu, Jul 4, 2013 at 1:12 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
 On Wed, Jul 3, 2013 at 5:34 PM, Thomas Mueller muel...@adobe.com wrote:
 I think it's because all binaries are loaded from the backend (no
 caching). I bumped the blob cache size from 8 MB to 16 MB, let's see if
 this helps.

 Yes, that could be it (I'll run the tests again soon to confirm).

Indeed the numbers look great now:

  # SmallFileReadTest        min   10%   50%   90%   max     N
  Oak-Mongo                   14    15    16    17    56  3790

BR,

Jukka Zitting


Re: Some more benchmarks

2013-07-03 Thread Thomas Mueller
Hi,

Thanks a lot!

 I've only included the 90th percentile

I usually look at N first :-)

There is one strange result: SmallFileWriteTest; Oak-Segment: 90%=257,
max=14763 - Maybe the warmup phase is too short, or the test isn't that
great?

As for SmallFileReadTest and SmallFileWriteTest with Oak-Mongo: I think I
know what the problem is; it doesn't seem to be related to BLOB handling
at all (actually performance is the same without the BLOB), but partially
related to the split documents that should be added in the near future.
Also, it seems to be partially related to what the test does (repeatedly
adding and removing the same nodes).

Regards,
Thomas




Re: Some more benchmarks

2013-07-03 Thread Jukka Zitting
Hi,

On Wed, Jul 3, 2013 at 11:22 AM, Thomas Mueller muel...@adobe.com wrote:
 I've only included the 90th percentile

 I usually look at N first :-)

It's also a good measure. I like the 90th percentile better as it
filters out outliers that may otherwise weigh pretty heavily on the
total or average execution time. Of course, as you note below, it's
good to pay attention also to such cases.
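
As a small worked example of the difference: the sketch below computes a nearest-rank 90th percentile and a mean over ten made-up timings. Only the values 257 and 14763 echo figures from this thread; the other samples, and the nearest-rank method itself, are assumptions for illustration and not necessarily how the benchmark computes its statistics.

```java
import java.util.Arrays;

// Illustrative only; sample data is invented and the percentile method is an assumption.
final class PercentileSketch {

    /** Nearest-rank percentile: the smallest sample with at least p percent of samples at or below it. */
    static long percentile(long[] samples, int p) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (p * sorted.length + 99) / 100;   // integer ceiling of p% of n
        return sorted[Math.max(rank, 1) - 1];
    }

    static double mean(long[] samples) {
        return Arrays.stream(samples).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // Nine ordinary iterations plus one extreme outlier, in milliseconds.
        long[] ms = {200, 210, 220, 225, 230, 240, 245, 250, 257, 14763};
        System.out.println(percentile(ms, 90));   // prints 257
        System.out.println(mean(ms));             // prints 1684.0
    }
}
```

The single outlier moves the mean by well over a second while leaving the 90th percentile untouched.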

 There is one strange result: SmallFileWriteTest; Oak-Segment: 90%=257,
 max=14763 - Maybe the warmup phase is too short, or the test isn't that
 great?

Good catch. I ran the benchmark a few more times, and the max time is
always pretty high. It shouldn't be about the warmup phase, as the
time in this test should be governed by the blob I/O. I'll try to
figure out what's causing such worst case behavior.

 As for SmallFileReadTest and SmallFileWriteTest with Oak-Mongo: I think I
 know what the problem is; it doesn't seem to be related to BLOB handling
 at all (actually performance is the same without the BLOB), but partially
 related to the split documents that should be added in the near future.
 Also, it seems to be partially related to what the test does (repeatedly
 adding and removing the same nodes).

Right. The semantics of the SmallFileWriteTest should be the same if
the test root name was different for each test iteration, which should
avoid the split document edge case. I adjusted the test (see patch
below), and the numbers do look a bit better but not radically so:

  # SmallFileWriteTest       min   10%   50%   90%   max     N
  Oak-Mongo                  577   591   708  1198  1585    33

I don't know what's dragging the performance in the SmallFileReadTest,
as there the nodes are created just once at the beginning of the
benchmark.

BR,

Jukka Zitting


diff --git a/oak-run/src/main/java/org/apache/jackrabbit/oak/benchmark/SmallFileWriteTest.java b/oak-run/src/main/java/org/apache/jackrabbit/oak/benchmark/SmallFileWriteTest.java
index 7d15b00..c5f2ec8 100644
--- a/oak-run/src/main/java/org/apache/jackrabbit/oak/benchmark/SmallFileWriteTest.java
+++ b/oak-run/src/main/java/org/apache/jackrabbit/oak/benchmark/SmallFileWriteTest.java
@@ -32,6 +32,8 @@ public class SmallFileWriteTest extends AbstractTest {
 
     private Node root;
 
+    private long count = 0;
+
     @Override
     public void beforeSuite() throws RepositoryException {
         session = loginWriter();
@@ -39,7 +41,7 @@ public class SmallFileWriteTest extends AbstractTest {
 
     @Override
     public void beforeTest() throws RepositoryException {
-        root = session.getRootNode().addNode("SmallFileWriteTest", "nt:folder");
+        root = session.getRootNode().addNode("SmallFileWriteTest" + count++, "nt:folder");
         session.save();
     }


Re: Some more benchmarks

2013-07-03 Thread Jukka Zitting
Hi,

On Wed, Jul 3, 2013 at 11:54 AM, Jukka Zitting jukka.zitt...@gmail.com wrote:
 On Wed, Jul 3, 2013 at 11:22 AM, Thomas Mueller muel...@adobe.com wrote:
 I usually look at N first :-)

 It's also a good measure.

Actually not that good, as only the lower limit on the amount of time
over which those N iterations happen is defined, so it's for example
not possible to compute an accurate mean execution time from the
reported N. The N figure also covers the before/afterTest()
methods, which are not included in the other statistics and which
aren't really within the scope of the functionality that a benchmark
intends to measure. The reason I originally included N in the output
was to give an idea about the statistical significance of the other
figures.

Perhaps we should replace the median (50%) or the 10th percentile (not
a very useful figure) with a more exactly calculated mean execution
time, as that would better represent the information for which N
currently only acts as a rough proxy.
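
A rough sketch of the difference between the two figures, using invented numbers and hypothetical names (MeanFromCountSketch, exactMean, proxyMean); it only illustrates the arithmetic, not the benchmark's actual bookkeeping.

```java
// Illustrative only; class and method names are hypothetical.
final class MeanFromCountSketch {

    private long measuredMs;   // time spent inside the measured test body only
    private int iterations;    // N

    void record(long bodyMs) {
        measuredMs += bodyMs;
        iterations++;
    }

    /** Exact mean of the measured body time. */
    double exactMean() {
        return (double) measuredMs / iterations;
    }

    /** Rough proxy: a nominal run length divided by N, so unmeasured setup time leaks in. */
    double proxyMean(long nominalRunMs) {
        return (double) nominalRunMs / iterations;
    }

    public static void main(String[] args) {
        MeanFromCountSketch stats = new MeanFromCountSketch();
        for (int i = 0; i < 4; i++) {
            stats.record(100);   // 100 ms measured per iteration; setup and teardown not recorded
        }
        System.out.println(stats.exactMean());      // prints 100.0
        System.out.println(stats.proxyMean(500));   // prints 125.0
    }
}
```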

BR,

Jukka Zitting


Re: Some more benchmarks

2013-07-03 Thread Thomas Mueller
Hi,

I don't know what's dragging the performance in the SmallFileReadTest,

I think it's because all binaries are loaded from the backend (no
caching). I bumped the blob cache size from 8 MB to 16 MB, let's see if
this helps.

Regards,
Thomas



Re: Some more benchmarks

2013-07-02 Thread Jukka Zitting
Hi,

On Fri, May 31, 2013 at 3:14 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
 On Fri, Apr 26, 2013 at 2:12 PM, Jukka Zitting jukka.zitt...@gmail.com 
 wrote:
 On Wed, Mar 27, 2013 at 11:41 AM, Jukka Zitting jukka.zitt...@gmail.com 
 wrote:
 Here's a few more simple benchmark results to show where we are:

 Updated numbers with latest Oak:

 And another one:

Here we go again:

# ReadPropertyTest           90%
Jackrabbit                    48
Oak-Default                   38
Oak-Mongo                     39
Oak-Segment                   41
Oak-Tar                       40

# SmallFileReadTest          90%
Jackrabbit                    94
Oak-Default                  258
Oak-Mongo                    421
Oak-Segment                   23
Oak-Tar                       20

# SmallFileWriteTest         90%
Jackrabbit                   424
Oak-Default                  349
Oak-Mongo                   1376
Oak-Segment                  257
Oak-Tar                      116

For simplicity I've only included the 90th percentile figure (smaller is
better). See https://gist.github.com/jukka/5912460 for the full
details.

The ReadPropertyTest figures were again lagging behind those of
Jackrabbit, but my changes earlier today got us back to the same
range. However, we've still regressed somewhat from the level we
reached in early June.

BR,

Jukka Zitting


Re: Some more benchmarks

2013-06-06 Thread Jukka Zitting
Hi,

On Fri, May 31, 2013 at 3:14 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
 It looks like we have a performance regression in ReadPropertyTest.
 Quick profiling shows a lot of the time seems to be going to
 MemoryNodeBuilder$ConnectedHead.update(), which is weird since we're
 only reading and thus the related MNB head should be unconnected. I'll
 investigate.

Revision 1490258 fixed the issue. The updated results are:

Apache Jackrabbit Oak 0.9-SNAPSHOT
# ReadPropertyTest         min   10%   50%   90%   max     N
Jackrabbit                  40    41    41    42    97  1448
Oak-Default                 11    12    12    14    19  4804
Oak-Mongo                   17    18    18    20   123  3128
Oak-Segment                 94    94    96    98   136   622
Oak-Tar                     11    11    12    13    17  5121

BR,

Jukka Zitting


Re: Some more benchmarks

2013-06-03 Thread Thomas Mueller
Hi,

It's a bit weird: when I run the tests separately I get different numbers:

java -mx1g -jar target/oak-run-0.9-SNAPSHOT.jar benchmark SmallFileReadTest Oak-Tar
# SmallFileReadTest        min   10%   50%   90%   max     N
Oak-Tar                     53    54    55    57    72  1085

java -mx1g -jar target/oak-run-0.9-SNAPSHOT.jar benchmark SmallFileReadTest Oak-Mongo
# SmallFileReadTest        min   10%   50%   90%   max     N
Oak-Mongo                  102   104   113   122   310   528

In your case, the N was 304 versus 3574 (more than 10 times different), in
my case it was 528 versus 1085 (factor 2).

How did you run the test? I will try the same command line and post my
results.

Regards,
Thomas





On 5/31/13 2:14 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:

Hi,

On Fri, Apr 26, 2013 at 2:12 PM, Jukka Zitting jukka.zitt...@gmail.com
wrote:
 On Wed, Mar 27, 2013 at 11:41 AM, Jukka Zitting
jukka.zitt...@gmail.com wrote:
 Here's a few more simple benchmark results to show where we are:

 Updated numbers with latest Oak:

And another one:

Apache Jackrabbit Oak 0.9-SNAPSHOT
# ReadPropertyTest         min   10%   50%   90%   max     N
Jackrabbit                  41    41    42    43    90  1428
Oak-Default                 58    58    59    60    69  1018
Oak-Mongo                   66    67    67    68    74   889
Oak-Segment                278   279   281   285   321   213
Oak-Tar                    114   114   115   117   136   520
# SmallFileReadTest        min   10%   50%   90%   max     N
Jackrabbit                  56    57    61    84   194   895
Oak-Default                 57    57    59   304   353   594
Oak-Mongo                  148   148   158   406   479   304
Oak-Segment                 33    33    36    37    73  1701
Oak-Tar                     15    15    16    18    31  3574
# SmallFileWriteTest       min   10%   50%   90%   max     N
Jackrabbit                 184   196   248   444  2084   115
Oak-Default                136   138   181   433  1789   162
Oak-Mongo                  595   617   795  1020  1075    31
Oak-Segment                156   161   172   225   660   100
Oak-Tar                    101   102   108   116   270   167

(also available at https://gist.github.com/jukka/5684506 if the above
gets mangled with a variable-width font)

It looks like we have a performance regression in ReadPropertyTest.
Quick profiling shows a lot of the time seems to be going to
MemoryNodeBuilder$ConnectedHead.update(), which is weird since we're
only reading and thus the related MNB head should be unconnected. I'll
investigate.

BR,

Jukka Zitting



Re: Some more benchmarks

2013-06-03 Thread Thomas Mueller
Hi,

I was not talking about differences in hardware. I know using different
hardware will result in different numbers.

I was worried that results would be different if you run one test alone
versus if you run all tests. That would indicate a problem in the
benchmark (framework) itself.

But luckily, that doesn't seem to be the case. I updated the code and ran
the tests again, and now the results are different. It seems there were
changes recently that improved SmallFileWriteTest for Oak-Tar about 5
times; that's great. The results I got now are:

https://gist.github.com/anonymous/5697099


Especially the SmallFileWriteTest seems slow with Oak. The problem doesn't
seem to be actual blob handling; the profiling result shows the bottleneck
is with the (few) nodes. If I change the blob size to 0 (that is, 100
nodes with the same zero-length blob each, instead of 100 nodes with 10 KB
each), I get basically the same result, with both MongoMK and the Oak-Tar.
Maybe it's a first sign that many child nodes are slow?

Regards,
Thomas






On 6/3/13 10:27 AM, Jukka Zitting jukka.zitt...@gmail.com wrote:

Hi,

On Mon, Jun 3, 2013 at 11:09 AM, Thomas Mueller muel...@adobe.com wrote:
 A bit weird is, when I run the tests separately I get different numbers:

The results depend on the hardware you're using, so in general numbers
from two different environments are not directly comparable.

 In your case, the N was 304 versus 3574 (more than 10 times different),
in
 my case it was 528 versus 1085 (factor 2).

Even relative numbers across fixtures can be different depending on
the varying IO/CPU/memory access costs on different environments. For
example an SSD disk will reduce the relative advantage of the TarMK
that cheats by mmapping the entire repository to memory.

 How did you run the test? I will try the same command line and post my
 results.

I'm using an EC2 m1.medium instance to keep the environment stable over
time. It would be nice to keep track of results also on different
hardware.

The command line I've used so far is simply:

$ java -jar oak-run-*.jar benchmark \
  ReadPropertyTest SmallFileReadTest SmallFileWriteTest \
  Jackrabbit Oak-Default Oak-Mongo Oak-Segment Oak-Tar

BR,

Jukka Zitting



Re: Some more benchmarks

2013-06-03 Thread Jukka Zitting
Hi,

On Mon, Jun 3, 2013 at 12:51 PM, Thomas Mueller muel...@adobe.com wrote:
 I was not talking about differences in hardware. I know using different
 hardware will result in different numbers.

 I was worried about results would be different if you run one test alone
 versus if you run all tests. That would indicate a problem in the
 benchmark (framework) itself.

 But luckily, that doesn't seem to be the case.

OK, good. The fixture code tries to make sure that the previous
repository instance is fully shut down before starting a new one, and
the warm-up period built into the test suite should take care of any
remaining startup artifacts.

 Especially the SmallFileWriteTest seems slow with Oak. The problem doesn't
 seem to be actual blob handling; the profiling result shows the bottleneck
 is with the (few) nodes. If I change the blob size to 0 (that is, 100
 nodes with the same zero-length blob each, instead of 100 nodes with 10 KB
 each), I get basically the same result, with both MongoMK and the Oak-Tar.
 Maybe it's a first sign of slow many child nodes?

At least the TarMK should have no problems with the 100 child nodes
(see the Wikipedia import test results :-). Instead I assume (though
haven't profiled in detail) that much of the time is going to the
still unoptimized getEffectiveNodeType() calls in
NodeImpl.internalSetProperty(). Optimizing that is on my TODO.

BR,

Jukka Zitting


Re: Some more benchmarks

2013-06-03 Thread Thomas Mueller
Hi,

At least the TarMK should have no problems with the 100 child nodes
(see the Wikipedia import test results :-).

Yes, I also thought 100 child nodes shouldn't be a problem. The profiling
data I have so far doesn't show a clear bottleneck. I really wonder what
the problem is in this case.

Regards,
Thomas



Re: Some more benchmarks

2013-05-31 Thread Jukka Zitting
Hi,

On Fri, May 31, 2013 at 3:52 PM, Michael C. Moore mo...@adobe.com wrote:
 Can you briefly explain the test results or point me to a wiki or link that 
 has the explanation?

I just committed the description to a README file, see
https://github.com/apache/jackrabbit-oak/blob/trunk/oak-run/README.md

BR,

Jukka Zitting


RE: Some more benchmarks

2013-05-31 Thread Michael C. Moore
Great, thanks!


Re: Some more benchmarks

2013-04-29 Thread Lukas Eder
Hello,

I'm interested in estimating performance (and load) impacts of ACL
checking on read access. I'm specifically interested in a comparison where
paths like /a, /a/b, /a/b/c, /a/.../y/z are accessed, and ACL has to be
evaluated upwards on the path. Since such a test is more high-level and
may suffer from many side-effects, it's probably more of a load test than
a performance test.

Are there any test results available with respect to ACL, comparing
Jackrabbit with Oak?
Are there any load test results available comparing Jackrabbit with Oak?
Can you point me to the code of these benchmarks?

Cheers
Lukas



Re: Some more benchmarks

2013-04-29 Thread Jukka Zitting
Hi,

On Mon, Apr 29, 2013 at 10:17 AM, Lukas Eder mar09...@adobe.com wrote:
 Are there any test results available with respect to ACL, comparing
 Jackrabbit with Oak?

Not yet. See the o.a.j.oak.benchmark package in oak-run for some of
the existing benchmarks I've been using so far. It should be fairly
straightforward to use one of the existing classes as a baseline for
building a simple ACL benchmark.

BR,

Jukka Zitting


Re: Some more benchmarks

2013-04-26 Thread Jukka Zitting
Hi,

On Wed, Mar 27, 2013 at 11:41 AM, Jukka Zitting jukka.zitt...@gmail.com wrote:
 Here's a few more simple benchmark results to show where we are:

Updated numbers with latest Oak:

# ReadPropertyTest         min   10%   50%   90%   max     N
Jackrabbit                  34    35    37    60   110  1333
Oak-Default                  8     9     9    20    76  4972
Oak-Mongo                   10    10    11    34    38  4501
Oak-Segment                 13    13    14    37    44  3482
# SmallFileReadTest        min   10%   50%   90%   max     N
Jackrabbit                  50    52    76   117   622   764
Oak-Default                 51    53    77   390   496   483
Oak-Mongo                  159   160   184   517   657   259
Oak-Segment                 15    16    17    40    86  2813
# SmallFileWriteTest       min   10%   50%   90%   max     N
Jackrabbit                 181   200   250   469  1088   105
Oak-Default                169   180   232   429   923   107
Oak-Mongo                  698   727   886  1051  1066    26
Oak-Segment                221   247   262   337   651    77

Overall that's pretty nice progress. Apart from a few exceptions,
we're now better (sometimes significantly so) or on par with
Jackrabbit 2.x in these benchmarks.

BR,

Jukka Zitting


Re: Some more benchmarks

2013-04-02 Thread Michael Dürig



On 28.3.13 15:19, Angela Schreiber wrote:

hi michael


With the resolution of OAK-690, I made tree instances stable across save
and refresh operations.


does that mean that the AuthorizableImpl could hold a Tree instance
instead of re-accessing it over and over again using a lookup by id?
if that was the case, could you please create an issue asking for
that refactoring such that we don't forget it? the fix should be
fairly trivial as the Tree gets passed to the constructor for
validation but is currently not kept as field.


https://issues.apache.org/jira/browse/OAK-733

Michael



regards
angela


Re: Some more benchmarks

2013-04-02 Thread Angela Schreiber

oh... already fixed it before lunch and committed modifications
before re-checking mails (using OAK-690 and an appropriate comment).

thanks anyway... test passed and i assume it's fine. the internal method
AuthorizableImpl#getTree will now return the Tree associated
with the authorizable as long as it has not been disconnected and
throw IllegalStateException otherwise to avoid odd behavior.
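
roughly, the behavior described above could look like the sketch below. the TreeHandle interface and the Authorizable class here are hypothetical stand-ins for illustration; they are not the real Oak Tree or AuthorizableImpl types.

```java
// Illustrative sketch; TreeHandle and Authorizable are hypothetical, not Oak classes.
final class CachedTreeSketch {

    /** Minimal stand-in for a tree handle that can become disconnected. */
    interface TreeHandle {
        boolean isConnected();
    }

    static final class Authorizable {
        private final TreeHandle tree;   // kept as a field instead of being looked up by id each time

        Authorizable(TreeHandle tree) {
            this.tree = tree;
        }

        /** Returns the cached handle, or fails fast once it has been disconnected. */
        TreeHandle getTree() {
            if (!tree.isConnected()) {
                throw new IllegalStateException("tree is disconnected");
            }
            return tree;
        }
    }

    public static void main(String[] args) {
        TreeHandle live = () -> true;
        System.out.println(new Authorizable(live).getTree() == live);   // prints true
    }
}
```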

angela


Re: Some more benchmarks

2013-03-28 Thread Michael Dürig



On 27.3.13 14:41, Jukka Zitting wrote:

Profiling the getProperty calls shows the following distribution of time spent:

 NodeImpl.getProperty()
   61% NodeDelegate.getProperty() via perform()
   31% ItemImpl.isStale() via checkStatus()
    8% other stuff

The status check would be an obvious area of improvement, especially
since we're dealing with a read-only session that's never refreshed.


With the resolution of OAK-690, I made tree instances stable across save 
and refresh operations. There is thus no need any more for trees to be 
re-loaded in ItemDelegate and I removed the respective logic already.


These changes improve the situation somewhat and might open some 
additional room for optimizing the status checks (especially in the case 
of read only sessions).


# ReadPropertyTest         min   10%   50%   90%   max     N
Jackrabbit                   8     8     9    10   131  6623
Oak-Default                 22    22    23    24    42  2559
Oak-Default (OAK-690)       15    15    16    17    43  3654

The second to last line is without the changes done for OAK-690 while 
the last line includes those changes.


Michael


Re: Some more benchmarks

2013-03-28 Thread Michael Dürig



On 27.3.13 14:41, Jukka Zitting wrote:

Drilling down to the NodeDelegate.getProperty() method, we have the
following distribution of time:

 NodeDelegate.getProperty()
   95% NodeDelegate.getChildLocation()
    5% TreeImpl.internalGetProperty() via NodeLocation.getProperty()

See why I haven't been too excited about the Location concept...


This is caused by the getChildLocation taking a relative path instead of 
relying on the client to navigate the hierarchy as necessary. This 
effectively duplicates the effort of interpreting paths: once in 
NodeDelegate.getChildLocation() and once in TreeLocation.getChild(). See 
OAK-426 for some discussion.


I did a quick check and changed TreeLocation.getChild() to only take a
name instead of a relative path:


# ReadPropertyTest         min   10%   50%   90%   max     N
Oak-Default (before)        12    13    14    15   128  4166
Oak-Default (after)          8     8     8    10    99  6896

As said earlier, I suggest to change TreeLocation.getChild() to only 
take names, not relative paths.
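
A minimal sketch of that suggestion, using a hypothetical Node class rather than the real TreeLocation: getChild() accepts a single name only, and the caller interprets the relative path exactly once.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch; Node and resolve() are hypothetical, not the Oak TreeLocation API.
final class NameOnlyChildSketch {

    static final class Node {
        private final Map<String, Node> children = new HashMap<>();

        Node addChild(String name) {
            return children.computeIfAbsent(name, n -> new Node());
        }

        /** Takes a single name, never a path, so no path parsing happens here. */
        Node getChild(String name) {
            return children.get(name);
        }
    }

    /** The caller splits the relative path once and navigates one name at a time. */
    static Node resolve(Node start, String relPath) {
        Node current = start;
        for (String name : relPath.split("/")) {
            if (current == null) {
                return null;
            }
            current = current.getChild(name);
        }
        return current;
    }

    public static void main(String[] args) {
        Node root = new Node();
        Node b = root.addChild("a").addChild("b");
        System.out.println(resolve(root, "a/b") == b);      // prints true
        System.out.println(resolve(root, "a/x") == null);   // prints true
    }
}
```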


Michael


Re: Some more benchmarks

2013-03-27 Thread Michael Dürig



On 27.3.13 11:54, Jukka Zitting wrote:

Hi,




Do we need to explicitly validate all paths that get passed to us?
Especially in cases like getProperty(), where in the vast majority of
the cases the given path matches an existing property (whose path by
definition is valid), it would make more sense to skip such validation
entirely, or at least postpone it to the rare cases where a matching
property was not found.


FWIW, the relevant discussion is here: 
https://issues.apache.org/jira/browse/OAK-108


IIUC you propose to not validate paths in the read case but rely on the 
downstream code to fail. Might be worth a try. However we'd need 
different path parsing then for the read and the write case since
circumventing path validation for the write case is most certainly not 
the right thing to do.
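
To make the idea concrete, here is a small sketch under stated assumptions: a hypothetical in-memory store where writes are always validated up front, while reads are validated only when the lookup misses. The validity rule (non-empty, no "//") is invented for the example and is not the JCR path grammar.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch; the store and its validation rule are hypothetical.
final class LazyValidationSketch {

    private final Map<String, String> properties = new HashMap<>();
    int validations = 0;   // counts how often validate() ran

    void put(String path, String value) {
        validate(path);                      // write case: always validated
        properties.put(path, value);
    }

    String get(String path) {
        String value = properties.get(path);
        if (value == null) {
            validate(path);                  // read case: only reached on a miss
        }
        return value;
    }

    private void validate(String path) {
        validations++;
        if (path.isEmpty() || path.contains("//")) {
            throw new IllegalArgumentException("invalid path: " + path);
        }
    }

    public static void main(String[] args) {
        LazyValidationSketch store = new LazyValidationSketch();
        store.put("a/b", "v");
        System.out.println(store.get("a/b"));     // prints v
        System.out.println(store.validations);    // prints 1: the successful read skipped validation
    }
}
```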


Michael



BR,

Jukka Zitting



Re: Some more benchmarks

2013-03-27 Thread Jukka Zitting
Hi,

On Wed, Mar 27, 2013 at 2:12 PM, Michael Dürig mdue...@apache.org wrote:
 IIUC you propose to not validate paths in the read case but rely on the
 downstream code to fail. Might be worth a try. However we'd need different
 path parsing then for the read and the write case since circumventing path
 validation for the write case is most certainly not the right thing to do.

We already have the NameValidator that ensures that all (non-hidden)
names stored in the repository are valid. As a consequence also all
existing repository paths are valid.

BR,

Jukka Zitting


Re: Some more benchmarks

2013-03-27 Thread Michael Dürig



On 27.3.13 12:21, Jukka Zitting wrote:

Hi,

On Wed, Mar 27, 2013 at 2:12 PM, Michael Dürig mdue...@apache.org wrote:

IIUC you propose to not validate paths in the read case but rely on the
downstream code to fail. Might be worth a try. However we'd need different
path parsing then for the read and the write case since circumventing path
validation for the write case is most certainly not the right thing to do.


We already have the NameValidator that ensures that all (non-hidden)
names stored in the repository are valid. As a consequence also all
existing repository paths are valid.


That's right. The easiest thing is to try it out, remove pre-emptive 
path validation and see what breaks. I have the vague memory that there 
were some overly picky TCK tests which required us to put this upfront 
validation in. However, a lot of time has passed since then so it might be
a good idea to have another look.


Michael


Re: Some more benchmarks

2013-03-27 Thread Jukka Zitting
Hi,

On Wed, Mar 27, 2013 at 1:54 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
 Quick benchmarking of the Oak-Default run shows
 NamePathMapperImpl.getOakPath() calling JcrPathParser.validate()
 taking about 20% of the time in this test.

Updated numbers after the latest OAK-108 change:

# ReadPropertyTest         min   10%   50%   90%   max     N
before                      56    58    61   120   132   802
after                       53    54    55    56    72  1089

Profiling the getProperty calls shows the following distribution of time spent:

NodeImpl.getProperty()
  61% NodeDelegate.getProperty() via perform()
  31% ItemImpl.isStale() via checkStatus()
   8% other stuff

The status check would be an obvious area of improvement, especially
since we're dealing with a read-only session that's never refreshed.

Drilling down to the NodeDelegate.getProperty() method, we have the
following distribution of time:

NodeDelegate.getProperty()
  95% NodeDelegate.getChildLocation()
   5% TreeImpl.internalGetProperty() via NodeLocation.getProperty()

See why I haven't been too excited about the Location concept...

BR,

Jukka Zitting