Re: Some more benchmarks
On Tue, Jul 1, 2014 at 8:37 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
> Hi,
>
> On Tue, Jul 1, 2014 at 9:38 AM, Jukka Zitting jukka.zitt...@gmail.com wrote:
>> I also tried including MongoMK results, but the benchmark got stuck in ConcurrentReadTest. I'll re-try today and will file a bug if I can reproduce the problem.
>
> I guess it was a transient problem. Here are the results with Oak-Mongo included:
>
> Summary (90%, lower is better)
>
> Benchmark                Jackrabbit  Oak-Mongo  Oak-Tar
> -------------------------------------------------------
> ReadPropertyTest                 45          4        4
> SetPropertyTest                1179       2398      119
> SmallFileReadTest                47          9        7
> SmallFileWriteTest              182        530       43
> ConcurrentReadTest             1201       1247      710
> ConcurrentReadWriteTest        1900       2321      775
> ConcurrentWriteReadTest        1009        354      108
> ConcurrentWriteTest             532        553      101

wow, very impressive, congrats!

cheers
stefan

> I updated the gist at https://gist.github.com/jukka/078bd524aa0ba36b184b with full details.
>
> The general message here is to use TarMK for maximum single-node performance and MongoMK for scalability and throughput across multiple cluster nodes.
>
> BR,
>
> Jukka Zitting
Re: Some more benchmarks
Hi,

I'm resurrecting this thread with some new findings. I re-ran many of the benchmarks we've been following, pitting Jackrabbit 2.8.0 against Oak 1.0.1 with TarMK. The results look pretty nice:

Summary (90%, lower is better)

Benchmark                Jackrabbit  Oak-Tar
--------------------------------------------
ReadPropertyTest                 45        4
SetPropertyTest                1179      119
SmallFileReadTest                47        7
SmallFileWriteTest              182       43
ConcurrentReadTest             1201      710
ConcurrentReadWriteTest        1900      775
ConcurrentWriteReadTest        1009      108
ConcurrentWriteTest             532      101

See https://gist.github.com/jukka/078bd524aa0ba36b184b for details.

I also tried including MongoMK results, but the benchmark got stuck in ConcurrentReadTest. I'll re-try today and will file a bug if I can reproduce the problem.

BR,

Jukka Zitting
Re: Some more benchmarks
Hi,

On Tue, Jul 1, 2014 at 9:38 AM, Jukka Zitting jukka.zitt...@gmail.com wrote:
> I also tried including MongoMK results, but the benchmark got stuck in ConcurrentReadTest. I'll re-try today and will file a bug if I can reproduce the problem.

I guess it was a transient problem. Here are the results with Oak-Mongo included:

Summary (90%, lower is better)

Benchmark                Jackrabbit  Oak-Mongo  Oak-Tar
-------------------------------------------------------
ReadPropertyTest                 45          4        4
SetPropertyTest                1179       2398      119
SmallFileReadTest                47          9        7
SmallFileWriteTest              182        530       43
ConcurrentReadTest             1201       1247      710
ConcurrentReadWriteTest        1900       2321      775
ConcurrentWriteReadTest        1009        354      108
ConcurrentWriteTest             532        553      101

I updated the gist at https://gist.github.com/jukka/078bd524aa0ba36b184b with full details.

The general message here is to use TarMK for maximum single-node performance and MongoMK for scalability and throughput across multiple cluster nodes.

BR,

Jukka Zitting
Re: Some more benchmarks
Hi,

On Tue, Sep 24, 2013 at 11:19 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
> On Tue, Sep 24, 2013 at 10:47 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
>> The concurrent read and read/write test cases look like more attention is needed on the test code, as it's currently hard to interpret the results. I'll see what I can do there.
>
> It turns out most of the reported time was going to login() calls (see OAK-634). I refactored the tests in revision 1526092 so that the login calls won't affect the performance measurements.

There were a few other systemic issues with the concurrency benchmarks (like the background threads running at lower priority and doing less work). I made some further improvements, and the numbers now look like this:

# ConcurrentReadTest         90%
Jackrabbit                  4132
Oak-Default                 2031
Oak-Mongo                   2116
Oak-Segment                 2258
Oak-Tar                     2580

# ConcurrentReadWriteTest    90%
Jackrabbit                  3192
Oak-Default                 3600
Oak-Mongo                   4596
Oak-Segment                 2605
Oak-Tar                     2876

# ConcurrentWriteReadTest    90%
Jackrabbit                  2770
Oak-Default                  875
Oak-Mongo                   1243
Oak-Segment                  565
Oak-Tar                      405

# ConcurrentWriteTest        90%
Jackrabbit                   597
Oak-Default                 2141
Oak-Mongo                   1166
Oak-Segment                  558
Oak-Tar                      348

Full details in https://gist.github.com/jukka/6748243.

BR,

Jukka Zitting
Re: Some more benchmarks
Hi,

Updating this thread with the latest numbers. No major changes on these benchmarks:

# ReadPropertyTest           90%
Jackrabbit                    48
Oak-Default                   39
Oak-Mongo                     41
Oak-Segment                   41
Oak-Tar                       42

# SmallFileReadTest          90%
Jackrabbit                    91
Oak-Default                   19
Oak-Mongo                     19
Oak-Segment                   18
Oak-Tar                       18

# SmallFileWriteTest         90%
Jackrabbit                   386
Oak-Default                  425
Oak-Mongo                    963
Oak-Segment                  180
Oak-Tar                       88

Full details in https://gist.github.com/jukka/6693063.

I'll add a few more benchmarks to my test script.

BR,

Jukka Zitting
Re: Some more benchmarks
Hi,

On Tue, Sep 24, 2013 at 8:08 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
> I'll add a few more benchmarks to my test script.

Here are results for three more benchmarks:

# SetPropertyTest            90%
Jackrabbit                   740
Oak-Default                  916
Oak-Mongo                   5973
Oak-Segment                 2386
Oak-Tar                      728

# ConcurrentReadTest         90%
Jackrabbit                 25656
Oak-Default                32840
Oak-Mongo                  33928
Oak-Segment                35522
Oak-Tar                    35354

# ConcurrentReadWriteTest    90%
Jackrabbit                 19280
Oak-Default                39289
Oak-Mongo                  48078
Oak-Segment                37384
Oak-Tar                    37165

The SetPropertyTest measures extremely small transactions (single property change), which makes the networking overhead in Oak-Mongo and Oak-Segment more prominent. Other than that the results are fairly good.

The concurrent read and read/write test cases look like more attention is needed on the test code, as it's currently hard to interpret the results. I'll see what I can do there.

BR,

Jukka Zitting
Re: Some more benchmarks
Hi,

On Tue, Sep 24, 2013 at 10:47 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
> The concurrent read and read/write test cases look like more attention is needed on the test code, as it's currently hard to interpret the results. I'll see what I can do there.

It turns out most of the reported time was going to login() calls (see OAK-634). I refactored the tests in revision 1526092 so that the login calls won't affect the performance measurements. As a result the numbers look much better:

# ConcurrentReadTest         90%
Jackrabbit                   447
Oak-Default                  286
Oak-Mongo                    240
Oak-Segment                  245
Oak-Tar                      252

# ConcurrentReadWriteTest    90%
Jackrabbit                   383
Oak-Default                  263
Oak-Mongo                    270
Oak-Segment                  280
Oak-Tar                      268

BR,

Jukka Zitting
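
The refactoring described in this message keeps the login out of the timed region. A minimal sketch of that idea follows. The `MiniBenchmark` and `LoginOnceDemo` classes are hypothetical stand-ins written for illustration, not the actual oak-run benchmark classes, and the login is simulated with a counter rather than a real repository call.

```java
// Hypothetical harness for illustration; not the oak-run AbstractTest class.
abstract class MiniBenchmark {
    /** Untimed one-off setup, e.g. logging in. */
    void beforeSuite() {}

    /** The only code whose execution time is recorded. */
    abstract void runTest();

    /** Runs the test repeatedly and returns per-iteration times in nanoseconds. */
    long[] run(int iterations) {
        beforeSuite(); // happens once, outside the timed region
        long[] nanos = new long[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            runTest();
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }
}

// Simulated workload: counts "logins" and "reads" instead of touching a repository.
class LoginOnceDemo extends MiniBenchmark {
    int logins = 0;
    int reads = 0;

    @Override
    void beforeSuite() {
        logins++; // stands in for an expensive login() call
    }

    @Override
    void runTest() {
        reads++; // stands in for the reads being measured
    }

    public static void main(String[] args) {
        LoginOnceDemo demo = new LoginOnceDemo();
        long[] times = demo.run(5);
        System.out.println(demo.logins + " login, " + demo.reads + " timed reads, "
                + times.length + " samples");
    }
}
```

With this split, a slow login affects total wall-clock time but none of the recorded samples, which is the property the refactored tests were after.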
Re: Some more benchmarks
Hi,

On Wed, Jul 3, 2013 at 5:34 PM, Thomas Mueller muel...@adobe.com wrote:
> I think it's because all binaries are loaded from the backend (no caching). I bumped the blob cache size from 8 MB to 16 MB, let's see if this helps.

Yes, that could be it (I'll run the tests again soon to confirm). The working set of the test is about 10MB in size and scanned linearly, so each iteration would end up flushing a cache that's less than 10MB in size.

BR,

Jukka Zitting
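
The cache-flushing effect described in this message can be illustrated with a small sketch. It assumes a plain LRU cache built on `java.util.LinkedHashMap`, which is not Oak's actual blob cache, and hypothetical sizes: a working set of 10 equally sized entries scanned linearly against a cache that holds either 8 or 16 of them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a tiny LRU cache, not the blob cache used by Oak.
class LruScanDemo {

    /** Counts cache hits when keys 0..workingSet-1 are scanned linearly, `passes` times. */
    static int hitsForLinearScan(final int capacity, int workingSet, int passes) {
        // accessOrder=true makes the map evict the least recently used entry first
        Map<Integer, Boolean> cache = new LinkedHashMap<Integer, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
                return size() > capacity;
            }
        };
        int hits = 0;
        for (int pass = 0; pass < passes; pass++) {
            for (int key = 0; key < workingSet; key++) {
                if (cache.get(key) != null) {
                    hits++;                       // served from the cache
                } else {
                    cache.put(key, Boolean.TRUE); // "loaded from the backend"
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println("capacity 8:  " + hitsForLinearScan(8, 10, 5) + " hits");
        System.out.println("capacity 16: " + hitsForLinearScan(16, 10, 5) + " hits");
    }
}
```

In this model a cache slightly smaller than the working set never produces a hit, because each entry is evicted shortly before the scan comes back to it, while a cache larger than the working set hits on every access after the first pass.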
Re: Some more benchmarks
Hi,

On Thu, Jul 4, 2013 at 1:12 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
> On Wed, Jul 3, 2013 at 5:34 PM, Thomas Mueller muel...@adobe.com wrote:
>> I think it's because all binaries are loaded from the backend (no caching). I bumped the blob cache size from 8 MB to 16 MB, let's see if this helps.
>
> Yes, that could be it (I'll run the tests again soon to confirm).

Indeed the numbers look great now:

# SmallFileReadTest    min   10%   50%   90%   max      N
Oak-Mongo               14    15    16    17    56   3790

BR,

Jukka Zitting
Re: Some more benchmarks
Hi,

Thanks a lot!

> I've only included the 90th percentile

I usually look at N first :-)

There is one strange result: SmallFileWriteTest; Oak-Segment: 90%=257, max=14763 - Maybe the warmup phase is too short, or the test isn't that great?

As for SmallFileReadTest and SmallFileWriteTest with Oak-Mongo: I think I know what the problem is; it doesn't seem to be related to BLOB handling at all (actually performance is the same without the BLOB), but partially related to the split documents that should be added in the near future. Also, it seems to be partially related to what the test does (repeatedly adding and removing the same nodes).

Regards,
Thomas

On 7/2/13 10:11 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
> Hi,
>
> On Fri, May 31, 2013 at 3:14 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
>> On Fri, Apr 26, 2013 at 2:12 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
>>> On Wed, Mar 27, 2013 at 11:41 AM, Jukka Zitting jukka.zitt...@gmail.com wrote:
>>>> Here's a few more simple benchmark results to show where we are:
>>>
>>> Updated numbers with latest Oak:
>>
>> And another one:
>
> Here we go again:
>
> # ReadPropertyTest           90%
> Jackrabbit                    48
> Oak-Default                   38
> Oak-Mongo                     39
> Oak-Segment                   41
> Oak-Tar                       40
>
> # SmallFileReadTest          90%
> Jackrabbit                    94
> Oak-Default                  258
> Oak-Mongo                    421
> Oak-Segment                   23
> Oak-Tar                       20
>
> # SmallFileWriteTest         90%
> Jackrabbit                   424
> Oak-Default                  349
> Oak-Mongo                   1376
> Oak-Segment                  257
> Oak-Tar                      116
>
> For simplicity I've only included the 90th percentile figure (smaller is better). See https://gist.github.com/jukka/5912460 for the full details.
>
> The ReadPropertyTest figures were again lagging behind those of Jackrabbit, but my changes earlier today got us back to the same range. However, we've still regressed somewhat from the level we reached in early June.
>
> BR,
>
> Jukka Zitting
Re: Some more benchmarks
Hi,

On Wed, Jul 3, 2013 at 11:22 AM, Thomas Mueller muel...@adobe.com wrote:
>> I've only included the 90th percentile
>
> I usually look at N first :-)

It's also a good measure. I like the 90th percentile better as it filters out outliers that may otherwise weigh pretty heavily on the total or average execution time. Of course, as you note below, it's good to pay attention also to such cases.

> There is one strange result: SmallFileWriteTest; Oak-Segment: 90%=257, max=14763 - Maybe the warmup phase is too short, or the test isn't that great?

Good catch. I ran the benchmark a few more times, and the max time is always pretty high. It shouldn't be about the warmup phase, as the time in this test should be governed by the blob I/O. I'll try to figure out what's causing such worst case behavior.

> As for SmallFileReadTest and SmallFileWriteTest with Oak-Mongo: I think I know what the problem is; it doesn't seem to be related to BLOB handling at all (actually performance is the same without the BLOB), but partially related to the split documents that should be added in the near future. Also, it seems to be partially related to what the test does (repeatedly adding and removing the same nodes).

Right. The semantics of the SmallFileWriteTest should be the same if the test root name was different for each test iteration, which should avoid the split document edge case. I adjusted the test (see patch below), and the numbers do look a bit better but not radically so:

# SmallFileWriteTest   min   10%   50%   90%   max      N
Oak-Mongo              577   591   708  1198  1585     33

I don't know what's dragging the performance in the SmallFileReadTest, as there the nodes are created just once at the beginning of the benchmark.

BR,

Jukka Zitting

diff --git a/oak-run/src/main/java/org/apache/jackrabbit/oak/benchmark/SmallFileWriteTest.java b/oak-run/src/main/java/org/apache/jackrabbit/oak/benchmark/SmallFileWriteTest.java
index 7d15b00..c5f2ec8 100644
--- a/oak-run/src/main/java/org/apache/jackrabbit/oak/benchmark/SmallFileWriteTest.java
+++ b/oak-run/src/main/java/org/apache/jackrabbit/oak/benchmark/SmallFileWriteTest.java
@@ -32,6 +32,8 @@ public class SmallFileWriteTest extends AbstractTest {

     private Node root;

+    private long count = 0;
+
     @Override
     public void beforeSuite() throws RepositoryException {
         session = loginWriter();
@@ -39,7 +41,7 @@ public class SmallFileWriteTest extends AbstractTest {

     @Override
     public void beforeTest() throws RepositoryException {
-        root = session.getRootNode().addNode("SmallFileWriteTest", "nt:folder");
+        root = session.getRootNode().addNode("SmallFileWriteTest" + count++, "nt:folder");
         session.save();
     }
Re: Some more benchmarks
Hi,

On Wed, Jul 3, 2013 at 11:54 AM, Jukka Zitting jukka.zitt...@gmail.com wrote:
> On Wed, Jul 3, 2013 at 11:22 AM, Thomas Mueller muel...@adobe.com wrote:
>> I usually look at N first :-)
>
> It's also a good measure.

Actually not that good, as only the lower limit on the amount of time over which those N iterations happen is defined, so it's for example not possible to compute an accurate mean execution time from the reported N. Also, the N figure covers the before/afterTest() methods, which are not included in the other statistics and which aren't really within the scope of the functionality that a benchmark intends to measure.

The reason I originally included N in the output was to give an idea about the statistical significance of the other figures. Perhaps we should replace the median (50%) or the 10th percentile (not a very useful figure) with a more exactly calculated mean execution time, as that would better represent the information for which N currently only acts as a rough proxy.

BR,

Jukka Zitting
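
The trade-off between percentile figures and a mean discussed in this part of the thread can be made concrete with a short sketch. The `TimingStats` class is a hypothetical helper, not part of oak-run; it assumes the nearest-rank definition of a percentile, and the sample timings in `main` are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; oak-run may compute its statistics differently.
class TimingStats {

    /** Nearest-rank percentile (p between 0 and 100) of the given samples. */
    static long percentile(long[] samples, int p) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Arithmetic mean of the given samples. */
    static double mean(long[] samples) {
        long sum = 0;
        for (long sample : samples) {
            sum += sample;
        }
        return (double) sum / samples.length;
    }

    public static void main(String[] args) {
        // Made-up timings in milliseconds: nine ordinary iterations and one outlier.
        long[] millis = {10, 11, 12, 11, 10, 12, 11, 10, 11, 1000};
        System.out.println("50%  = " + percentile(millis, 50));
        System.out.println("90%  = " + percentile(millis, 90));
        System.out.println("mean = " + mean(millis));
    }
}
```

With these made-up samples the single outlier leaves the median and the 90th percentile close to the typical iteration time while it dominates the mean, which is why the two kinds of figure answer different questions.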
Re: Some more benchmarks
Hi,

> I don't know what's dragging the performance in the SmallFileReadTest,

I think it's because all binaries are loaded from the backend (no caching). I bumped the blob cache size from 8 MB to 16 MB, let's see if this helps.

Regards,
Thomas
Re: Some more benchmarks
Hi,

On Fri, May 31, 2013 at 3:14 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
> On Fri, Apr 26, 2013 at 2:12 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
>> On Wed, Mar 27, 2013 at 11:41 AM, Jukka Zitting jukka.zitt...@gmail.com wrote:
>>> Here's a few more simple benchmark results to show where we are:
>>
>> Updated numbers with latest Oak:
>
> And another one:

Here we go again:

# ReadPropertyTest           90%
Jackrabbit                    48
Oak-Default                   38
Oak-Mongo                     39
Oak-Segment                   41
Oak-Tar                       40

# SmallFileReadTest          90%
Jackrabbit                    94
Oak-Default                  258
Oak-Mongo                    421
Oak-Segment                   23
Oak-Tar                       20

# SmallFileWriteTest         90%
Jackrabbit                   424
Oak-Default                  349
Oak-Mongo                   1376
Oak-Segment                  257
Oak-Tar                      116

For simplicity I've only included the 90th percentile figure (smaller is better). See https://gist.github.com/jukka/5912460 for the full details.

The ReadPropertyTest figures were again lagging behind those of Jackrabbit, but my changes earlier today got us back to the same range. However, we've still regressed somewhat from the level we reached in early June.

BR,

Jukka Zitting
Re: Some more benchmarks
Hi,

On Fri, May 31, 2013 at 3:14 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
> It looks like we have a performance regression in ReadPropertyTest. Quick profiling shows a lot of the time seems to be going to MemoryNodeBuilder$ConnectedHead.update(), which is weird since we're only reading and thus the related MNB head should be unconnected. I'll investigate.

Revision 1490258 fixed the issue. The updated results are:

Apache Jackrabbit Oak 0.9-SNAPSHOT

# ReadPropertyTest     min   10%   50%   90%   max      N
Jackrabbit              40    41    41    42    97   1448
Oak-Default             11    12    12    14    19   4804
Oak-Mongo               17    18    18    20   123   3128
Oak-Segment             94    94    96    98   136    622
Oak-Tar                 11    11    12    13    17   5121

BR,

Jukka Zitting
Re: Some more benchmarks
Hi,

A bit weird is, when I run the tests separately I get different numbers:

java -mx1g -jar target/oak-run-0.9-SNAPSHOT.jar benchmark SmallFileReadTest Oak-Tar

# SmallFileReadTest    min   10%   50%   90%   max      N
Oak-Tar                 53    54    55    57    72   1085

java -mx1g -jar target/oak-run-0.9-SNAPSHOT.jar benchmark SmallFileReadTest Oak-Mongo

# SmallFileReadTest    min   10%   50%   90%   max      N
Oak-Mongo              102   104   113   122   310    528

In your case, the N was 304 versus 3574 (more than 10 times different), in my case it was 528 versus 1085 (factor 2).

How did you run the test? I will try the same command line and post my results.

Regards,
Thomas

On 5/31/13 2:14 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
> Hi,
>
> On Fri, Apr 26, 2013 at 2:12 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
>> On Wed, Mar 27, 2013 at 11:41 AM, Jukka Zitting jukka.zitt...@gmail.com wrote:
>>> Here's a few more simple benchmark results to show where we are:
>>
>> Updated numbers with latest Oak:
>
> And another one:
>
> Apache Jackrabbit Oak 0.9-SNAPSHOT
>
> # ReadPropertyTest     min   10%   50%   90%   max      N
> Jackrabbit              41    41    42    43    90   1428
> Oak-Default             58    58    59    60    69   1018
> Oak-Mongo               66    67    67    68    74    889
> Oak-Segment            278   279   281   285   321    213
> Oak-Tar                114   114   115   117   136    520
>
> # SmallFileReadTest    min   10%   50%   90%   max      N
> Jackrabbit              56    57    61    84   194    895
> Oak-Default             57    57    59   304   353    594
> Oak-Mongo              148   148   158   406   479    304
> Oak-Segment             33    33    36    37    73   1701
> Oak-Tar                 15    15    16    18    31   3574
>
> # SmallFileWriteTest   min   10%   50%   90%   max      N
> Jackrabbit             184   196   248   444  2084    115
> Oak-Default            136   138   181   433  1789    162
> Oak-Mongo              595   617   795  1020  1075     31
> Oak-Segment            156   161   172   225   660    100
> Oak-Tar                101   102   108   116   270    167
>
> (also available at https://gist.github.com/jukka/5684506 if the above gets mangled with a variable-width font)
>
> It looks like we have a performance regression in ReadPropertyTest. Quick profiling shows a lot of the time seems to be going to MemoryNodeBuilder$ConnectedHead.update(), which is weird since we're only reading and thus the related MNB head should be unconnected. I'll investigate.
>
> BR,
>
> Jukka Zitting
Re: Some more benchmarks
Hi,

I was not talking about differences in hardware. I know using different hardware will result in different numbers. I was worried that results would be different if you run one test alone versus if you run all tests. That would indicate a problem in the benchmark (framework) itself. But luckily, that doesn't seem to be the case.

I updated the code and ran the tests again, and now the results are different. It seems there were changes recently that improved SmallFileWriteTest for Oak-Tar about 5 times; that's great. The results I got now are: https://gist.github.com/anonymous/5697099

Especially the SmallFileWriteTest seems slow with Oak. The problem doesn't seem to be actual blob handling; the profiling result shows the bottleneck is with the (few) nodes. If I change the blob size to 0 (that is, 100 nodes with the same zero-length blob each, instead of 100 nodes with 10 KB each), I get basically the same result, with both MongoMK and the Oak-Tar. Maybe it's a first sign of slowness with many child nodes?

Regards,
Thomas

On 6/3/13 10:27 AM, Jukka Zitting jukka.zitt...@gmail.com wrote:
> Hi,
>
> On Mon, Jun 3, 2013 at 11:09 AM, Thomas Mueller muel...@adobe.com wrote:
>> A bit weird is, when I run the tests separately I get different numbers:
>
> The results depend on the hardware you're using, so in general numbers from two different environments are not directly comparable.
>
>> In your case, the N was 304 versus 3574 (more than 10 times different), in my case it was 528 versus 1085 (factor 2).
>
> Even relative numbers across fixtures can be different depending on the varying IO/CPU/memory access costs on different environments. For example an SSD disk will reduce the relative advantage of the TarMK that cheats by mmapping the entire repository to memory.
>
>> How did you run the test? I will try the same command line and post my results.
>
> I'm using an ec2 m1.medium instance to keep the environment stable over time. It would be nice to keep track of results also on different hardware.
>
> The command line I've used so far is simply:
>
> $ java -jar oak-run-*.jar benchmark \
>       ReadPropertyTest SmallFileReadTest SmallFileWriteTest \
>       Jackrabbit Oak-Default Oak-Mongo Oak-Segment Oak-Tar
>
> BR,
>
> Jukka Zitting
Re: Some more benchmarks
Hi,

On Mon, Jun 3, 2013 at 12:51 PM, Thomas Mueller muel...@adobe.com wrote:
> I was not talking about differences in hardware. I know using different hardware will result in different numbers. I was worried that results would be different if you run one test alone versus if you run all tests. That would indicate a problem in the benchmark (framework) itself. But luckily, that doesn't seem to be the case.

OK, good. The fixture code tries to make sure that the previous repository instance is fully shut down before starting a new one, and the warm-up period built into the test suite should take care of any remaining startup artifacts.

> Especially the SmallFileWriteTest seems slow with Oak. The problem doesn't seem to be actual blob handling; the profiling result shows the bottleneck is with the (few) nodes. If I change the blob size to 0 (that is, 100 nodes with the same zero-length blob each, instead of 100 nodes with 10 KB each), I get basically the same result, with both MongoMK and the Oak-Tar. Maybe it's a first sign of slowness with many child nodes?

At least the TarMK should have no problems with the 100 child nodes (see the Wikipedia import test results :-). Instead I assume (though haven't profiled in detail) that much of the time is going to the still unoptimized getEffectiveNodeType() calls in NodeImpl.internalSetProperty(). Optimizing that is on my TODO.

BR,

Jukka Zitting
Re: Some more benchmarks
Hi,

> At least the TarMK should have no problems with the 100 child nodes (see the Wikipedia import test results :-).

Yes, I also thought 100 child nodes shouldn't be a problem. The profiling data I have so far doesn't show a clear bottleneck. I really wonder what the problem is in this case.

Regards,
Thomas
Re: Some more benchmarks
Hi,

On Fri, May 31, 2013 at 3:52 PM, Michael C. Moore mo...@adobe.com wrote:
> Can you briefly explain the test results or point me to a wiki or link that has the explanation?

I just committed the description to a README file, see https://github.com/apache/jackrabbit-oak/blob/trunk/oak-run/README.md

BR,

Jukka Zitting
RE: Some more benchmarks
Great, thanks!

-----Original Message-----
From: Jukka Zitting [mailto:jukka.zitt...@gmail.com]
Sent: Friday, May 31, 2013 9:28 AM
To: Oak devs
Subject: Re: Some more benchmarks

Hi,

On Fri, May 31, 2013 at 3:52 PM, Michael C. Moore mo...@adobe.com wrote:
> Can you briefly explain the test results or point me to a wiki or link that has the explanation?

I just committed the description to a README file, see https://github.com/apache/jackrabbit-oak/blob/trunk/oak-run/README.md

BR,

Jukka Zitting
Re: Some more benchmarks
Hello,

I'm interested in estimating performance (and load) impacts of ACL checking on read access. I'm specifically interested in a comparison where paths like /a, /a/b, /a/b/c, /a/.../y/z are accessed, and ACL has to be evaluated upwards on the path. Since such a test is more high-level and may suffer from many side-effects, it's probably more of a load test than a performance test.

Are there any test results available with respect to ACL, comparing Jackrabbit with Oak?
Are there any load test results available comparing Jackrabbit with Oak?
Can you point me to the code of these benchmarks?

Cheers
Lukas

On 4/26/13 1:12 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
> Hi,
>
> On Wed, Mar 27, 2013 at 11:41 AM, Jukka Zitting jukka.zitt...@gmail.com wrote:
>> Here's a few more simple benchmark results to show where we are:
>
> Updated numbers with latest Oak:
>
> # ReadPropertyTest     min   10%   50%   90%   max      N
> Jackrabbit              34    35    37    60   110   1333
> Oak-Default              8     9     9    20    76   4972
> Oak-Mongo               10    10    11    34    38   4501
> Oak-Segment             13    13    14    37    44   3482
>
> # SmallFileReadTest    min   10%   50%   90%   max      N
> Jackrabbit              50    52    76   117   622    764
> Oak-Default             51    53    77   390   496    483
> Oak-Mongo              159   160   184   517   657    259
> Oak-Segment             15    16    17    40    86   2813
>
> # SmallFileWriteTest   min   10%   50%   90%   max      N
> Jackrabbit             181   200   250   469  1088    105
> Oak-Default            169   180   232   429   923    107
> Oak-Mongo              698   727   886  1051  1066     26
> Oak-Segment            221   247   262   337   651     77
>
> Overall that's pretty nice progress. Apart from a few exceptions, we're now better (sometimes significantly so) or on par with Jackrabbit 2.x in these benchmarks.
>
> BR,
>
> Jukka Zitting
Re: Some more benchmarks
Hi,

On Mon, Apr 29, 2013 at 10:17 AM, Lukas Eder mar09...@adobe.com wrote:
> Are there any test results available with respect to ACL, comparing Jackrabbit with Oak?

Not yet. See the o.a.j.oak.benchmark package in oak-run for some of the existing benchmarks I've been using so far. It should be fairly straightforward to use one of the existing classes as a baseline for building a simple ACL benchmark.

BR,

Jukka Zitting
Re: Some more benchmarks
Hi,

On Wed, Mar 27, 2013 at 11:41 AM, Jukka Zitting jukka.zitt...@gmail.com wrote:
> Here's a few more simple benchmark results to show where we are:

Updated numbers with latest Oak:

# ReadPropertyTest     min   10%   50%   90%   max      N
Jackrabbit              34    35    37    60   110   1333
Oak-Default              8     9     9    20    76   4972
Oak-Mongo               10    10    11    34    38   4501
Oak-Segment             13    13    14    37    44   3482

# SmallFileReadTest    min   10%   50%   90%   max      N
Jackrabbit              50    52    76   117   622    764
Oak-Default             51    53    77   390   496    483
Oak-Mongo              159   160   184   517   657    259
Oak-Segment             15    16    17    40    86   2813

# SmallFileWriteTest   min   10%   50%   90%   max      N
Jackrabbit             181   200   250   469  1088    105
Oak-Default            169   180   232   429   923    107
Oak-Mongo              698   727   886  1051  1066     26
Oak-Segment            221   247   262   337   651     77

Overall that's pretty nice progress. Apart from a few exceptions, we're now better (sometimes significantly so) or on par with Jackrabbit 2.x in these benchmarks.

BR,

Jukka Zitting
Re: Some more benchmarks
On 28.3.13 15:19, Angela Schreiber wrote:
> hi michael
>
>> With the resolution of OAK-690, I made tree instances stable across save and refresh operations.
>
> does that mean that the AuthorizableImpl could hold a Tree instance instead of re-accessing it over and over again using a lookup by id?
>
> if that was the case, could you please create an issue asking for that refactoring such that we don't forget it? the fix should be fairly trivial as the Tree gets passed to the constructor for validation but is currently not kept as field.

https://issues.apache.org/jira/browse/OAK-733

Michael

> regards
> angela
Re: Some more benchmarks
oh... already fixed it before lunch and committed modifications before re-checking mails (using OAK-690 and an appropriate comment).

thanks anyway... test passed and i assume it's fine. the internal method AuthorizableImpl#getTree will now return the Tree associated with the authorizable as long as it has not been disconnected, and throw IllegalStateException otherwise, to avoid odd behavior.

angela

On 4/2/13 12:46 PM, Michael Dürig wrote:
> On 28.3.13 15:19, Angela Schreiber wrote:
>> hi michael
>>
>>> With the resolution of OAK-690, I made tree instances stable across save and refresh operations.
>>
>> does that mean that the AuthorizableImpl could hold a Tree instance instead of re-accessing it over and over again using a lookup by id?
>>
>> if that was the case, could you please create an issue asking for that refactoring such that we don't forget it? the fix should be fairly trivial as the Tree gets passed to the constructor for validation but is currently not kept as field.
>
> https://issues.apache.org/jira/browse/OAK-733
>
> Michael
>
>> regards
>> angela
Re: Some more benchmarks
On 27.3.13 14:41, Jukka Zitting wrote:
> Profiling the getProperty calls shows the following distribution of time spent:
>
> NodeImpl.getProperty()
>   61% NodeDelegate.getProperty() via perform()
>   31% ItemImpl.isStale() via checkStatus()
>    8% other stuff
>
> The status check would be an obvious area of improvement, especially since we're dealing with a read-only session that's never refreshed.

With the resolution of OAK-690, I made tree instances stable across save and refresh operations. There is thus no need any more for trees to be re-loaded in ItemDelegate and I removed the respective logic already. These changes improve the situation somewhat and might open some additional room for optimizing the status checks (especially in the case of read only sessions).

# ReadPropertyTest       min   10%   50%   90%   max      N
Jackrabbit                 8     8     9    10   131   6623
Oak-Default               22    22    23    24    42   2559
Oak-Default (OAK-690)     15    15    16    17    43   3654

The second to last line is without the changes done for OAK-690 while the last line includes those changes.

Michael
Re: Some more benchmarks
On 27.3.13 14:41, Jukka Zitting wrote:
> Drilling down to the NodeDelegate.getProperty() method, we have the following distribution of time:
>
> NodeDelegate.getProperty()
>   95% NodeDelegate.getChildLocation()
>    5% TreeImpl.internalGetProperty() via NodeLocation.getProperty()
>
> See why I haven't been too excited about the Location concept...

This is caused by getChildLocation() taking a relative path instead of relying on the client to navigate the hierarchy as necessary. This effectively duplicates the effort of interpreting paths: once in NodeDelegate.getChildLocation() and once in TreeLocation.getChild(). See OAK-426 for some discussion.

I did a quick check and changed TreeLocation.getChild() to only take a name instead of a relative path:

# ReadPropertyTest       min   10%   50%   90%   max      N
Oak-Default (before)      12    13    14    15   128   4166
Oak-Default (after)        8     8     8    10    99   6896

As said earlier, I suggest changing TreeLocation.getChild() to only take names, not relative paths.

Michael
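
The suggestion in this message, a child lookup that accepts only a name while the caller walks the hierarchy, can be sketched as follows. `SimpleLocation` is a hypothetical stand-in written for illustration and does not reflect the real TreeLocation or NodeDelegate code; the point is only that the relative path gets split in one place.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not Oak's TreeLocation.
class SimpleLocation {
    private final Map<String, SimpleLocation> children = new HashMap<>();

    /** Adds and returns a child with the given name. */
    SimpleLocation addChild(String name) {
        SimpleLocation child = new SimpleLocation();
        children.put(name, child);
        return child;
    }

    /** Name-only lookup: a single map access, no path parsing. */
    SimpleLocation getChild(String name) {
        return children.get(name);
    }

    /** Caller-side navigation: the relative path is split exactly once, here. */
    static SimpleLocation resolve(SimpleLocation start, String relPath) {
        SimpleLocation current = start;
        for (String name : relPath.split("/")) {
            if (current == null) {
                return null; // an intermediate node is missing
            }
            current = current.getChild(name);
        }
        return current;
    }

    public static void main(String[] args) {
        SimpleLocation root = new SimpleLocation();
        root.addChild("a").addChild("b");
        System.out.println("resolved a/b: " + (resolve(root, "a/b") != null));
        System.out.println("getChild(\"a/b\") found: " + (root.getChild("a/b") != null));
    }
}
```

In this sketch `getChild` never interprets slashes, so a path handed to it is treated as one (nonexistent) name; any path handling lives in the caller.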
Re: Some more benchmarks
On 27.3.13 11:54, Jukka Zitting wrote:
> Hi,
>
> Do we need to explicitly validate all paths that get passed to us? Especially in cases like getProperty(), where in the vast majority of the cases the given path matches an existing property (whose path by definition is valid), it would make more sense to skip such validation entirely, or at least postpone it to the rare cases where a matching property was not found.

FWIW, the relevant discussion is here: https://issues.apache.org/jira/browse/OAK-108

IIUC you propose to not validate paths in the read case but rely on the downstream code to fail. Might be worth a try. However we'd need different path parsing then for the read and the write case since circumventing path validation for the write case is most certainly not the right thing to do.

Michael

> BR,
>
> Jukka Zitting
Re: Some more benchmarks
Hi,

On Wed, Mar 27, 2013 at 2:12 PM, Michael Dürig mdue...@apache.org wrote:
> IIUC you propose to not validate paths in the read case but rely on the downstream code to fail. Might be worth a try. However we'd need different path parsing then for the read and the write case since circumventing path validation for the write case is most certainly not the right thing to do.

We already have the NameValidator that ensures that all (non-hidden) names stored in the repository are valid. As a consequence also all existing repository paths are valid.

BR,

Jukka Zitting
Re: Some more benchmarks
On 27.3.13 12:21, Jukka Zitting wrote:
> Hi,
>
> On Wed, Mar 27, 2013 at 2:12 PM, Michael Dürig mdue...@apache.org wrote:
>> IIUC you propose to not validate paths in the read case but rely on the downstream code to fail. Might be worth a try. However we'd need different path parsing then for the read and the write case since circumventing path validation for the write case is most certainly not the right thing to do.
>
> We already have the NameValidator that ensures that all (non-hidden) names stored in the repository are valid. As a consequence also all existing repository paths are valid.

That's right. The easiest thing is to try it out, remove pre-emptive path validation and see what breaks. I have the vague memory that there were some overly picky TCK tests which required us to put this upfront validation in. However, a lot of time has passed since then so it might be a good idea to have another look.

Michael
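
The idea discussed in the last few messages, validating on every write but postponing validation on reads until a lookup misses, can be sketched like this. `LazyValidatingLookup` and its validation rule are hypothetical and written for illustration only; they are not the NamePathMapperImpl, JcrPathParser or NameValidator logic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; the validation rule below is a made-up placeholder.
class LazyValidatingLookup {
    private final Map<String, String> properties = new HashMap<>();

    /** Number of times validate() has run, exposed to make the effect visible. */
    int validations = 0;

    /** Writes always validate, so every stored path is known to be valid. */
    void put(String path, String value) {
        validate(path);
        properties.put(path, value);
    }

    /** Reads skip validation on a hit and validate only when nothing was found. */
    String get(String path) {
        String value = properties.get(path);
        if (value != null) {
            return value;   // common case: the path exists, hence it is valid
        }
        validate(path);     // rare case: distinguish "invalid" from "not found"
        return null;
    }

    private void validate(String path) {
        validations++;
        if (path.isEmpty() || path.contains("//")) {
            throw new IllegalArgumentException("Invalid path: " + path);
        }
    }

    public static void main(String[] args) {
        LazyValidatingLookup store = new LazyValidatingLookup();
        store.put("/a/b", "value");
        store.get("/a/b"); // hit: no validation
        store.get("/a/c"); // miss: validated, then reported as not found
        System.out.println("validations: " + store.validations);
    }
}
```

The sketch relies on the same invariant stated in the thread: because writes are always validated, any path that matches stored content is valid by construction, so only misses need the extra check.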
Re: Some more benchmarks
Hi,

On Wed, Mar 27, 2013 at 1:54 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
> Quick benchmarking of the Oak-Default run shows NamePathMapperImpl.getOakPath() calling JcrPathParser.validate() taking about 20% of the time in this test.

Updated numbers after the latest OAK-108 change:

# ReadPropertyTest     min   10%   50%   90%   max      N
before                  56    58    61   120   132    802
after                   53    54    55    56    72   1089

Profiling the getProperty calls shows the following distribution of time spent:

NodeImpl.getProperty()
  61% NodeDelegate.getProperty() via perform()
  31% ItemImpl.isStale() via checkStatus()
   8% other stuff

The status check would be an obvious area of improvement, especially since we're dealing with a read-only session that's never refreshed.

Drilling down to the NodeDelegate.getProperty() method, we have the following distribution of time:

NodeDelegate.getProperty()
  95% NodeDelegate.getChildLocation()
   5% TreeImpl.internalGetProperty() via NodeLocation.getProperty()

See why I haven't been too excited about the Location concept...

BR,

Jukka Zitting