Re: Jackrabbit 2.10 vs Oak 1.2.7

2016-04-19 Thread Marcel Reutegger
Hi Domenic,

I apologize for the late reply, but I finally found time
to look at your test.

The reason why Oak on MongoDB is so slow with your test is the
write concern that your test specifies when it constructs
the DocumentNodeStore. The test sets it to FSYNCED. This is
an appropriate write concern when you only have a single MongoDB
node, but it comes with very high latency. In general, MongoDB is
designed to run in production as a replica set, and the recommended
write concern for that deployment is MAJORITY.

More detail on why Oak on MongoDB performs badly with your test
is available in OAK-3554 [0].

So, you should either reduce the journalCommitInterval in MongoDB
or test with a replica set and the MAJORITY write concern. Both
should give you a significant speedup compared to your current
test setup.
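
For illustration, here is a minimal sketch - not code from this thread -
of how that could look with the MongoDB Java driver 2.x API that Oak 1.2
uses (host, port and database name are illustrative):

import com.mongodb.DB;
import com.mongodb.MongoClient;
import com.mongodb.WriteConcern;

import org.apache.jackrabbit.oak.plugins.document.DocumentMK;
import org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore;

public class StoreSetup {
    static DocumentNodeStore createStore() throws Exception {
        // Connect to (a member of) the replica set.
        MongoClient client = new MongoClient("mongo1", 27017);
        DB db = client.getDB("oak");
        // MAJORITY acknowledges once a majority of the replica set has
        // the write, which is much cheaper than waiting for an fsync to
        // disk (FSYNCED) on every commit.
        db.setWriteConcern(WriteConcern.MAJORITY);
        return new DocumentMK.Builder().setMongoDB(db).getNodeStore();
    }
}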

Regards
 Marcel

[0] 
https://issues.apache.org/jira/browse/OAK-3554?focusedCommentId=14991306&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14991306


On 06/04/16 16:20, "Domenic DiTano" wrote:

Hi Marcel,

I uploaded all the source to github along with a summary spreadsheet.  I
would appreciate any time you have to review.

https://github.com/Domenic-Ansys/Jackrabbit2-Oak-Tests

As you stated, the move is a non-goal, but in comparison to Jackrabbit 2 I
am also finding in my tests that create, update, and copy are all faster
in Jackrabbit 2 (10k nodes).  Any input would be appreciated...

Also, will MySql not be listed as "Experimental" at some point?

Thanks,
Domenic


Re: Jackrabbit 2.10 vs Oak 1.2.7

2016-04-08 Thread Domenic DiTano
Hi Michael,

First, thank you for your response.

My POV:
"You are essentially testing how fast Oak or JR can put nodes into
MySQL/Postgres/Mongo. IMO Oak’s design does not suggest that there should
be fundamental differences between JR and Oak for this isolated case. (*)"

Are you saying there should not be a difference for this test case between
Oak and JR?  I understand your point that I am testing how fast Oak/JR puts
things into a database, but from my perspective I am doing simple JCR
operations like creating/updating/moving a reasonable number of nodes, and
JR seems to be performing significantly better.  I also ran the tests at
100 nodes, and in general Jackrabbit 2's performance, in particular around
copies, updates, and moves, is better (I understand why for moves).  Is
this expected?

FYI, 1000- and 10,000-node creation are realistic use cases, as our
application generates very large datasets (it is common to see 500gb/1000
files or more get added to a repo in one user session).

"To explain:
Re 1: in reality you would usually have many reading threads for each
writing thread. Oak’s MVCC design caters for performance for such test
cases.
Can you point me to any test cases where I can see the configuration for
something like this?
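
To make sure I understand the pattern, here is a hypothetical sketch of
such a workload in plain JCR - not an existing Oak benchmark; thread
counts and credentials are illustrative:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import javax.jcr.NodeIterator;
import javax.jcr.Repository;
import javax.jcr.Session;
import javax.jcr.SimpleCredentials;

public class ReadWriteMix {
    public static void run(Repository repository) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(9);
        // One writer adding nodes and saving...
        pool.submit(() -> {
            Session s = repository.login(
                    new SimpleCredentials("admin", "admin".toCharArray()));
            try {
                for (int i = 0; i < 1000; i++) {
                    s.getRootNode().addNode("node" + i);
                    s.save();
                }
            } finally {
                s.logout();
            }
            return null;
        });
        // ...and eight readers traversing the same content concurrently.
        for (int r = 0; r < 8; r++) {
            pool.submit(() -> {
                Session s = repository.login(
                        new SimpleCredentials("admin", "admin".toCharArray()));
                try {
                    for (int i = 0; i < 100; i++) {
                        NodeIterator it = s.getRootNode().getNodes();
                        while (it.hasNext()) {
                            it.nextNode().getPath();
                        }
                        s.refresh(false); // pick up the writer's new revisions
                    }
                } finally {
                    s.logout();
                }
                return null;
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.MINUTES);
    }
}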

"Re 2: If you have many cluster nodes the MVCC becomes even more pronounced
(not only different threads but different processes).
Also, if you have observation listeners and many cluster nodes then I
expect to see substantial differences between Oak and JR."

Are there any performance metrics out there for Oak that use
DocumentNodeStore/FileDataStore that someone could share?  If I am
understanding correctly, I need to add nodes/scale horizontally for Oak's
performance to improve.  My overall goal here is to determine whether it
benefits us to upgrade from JR, but is it fair to compare the two?  FYI,
our application can be deployed as one or multiple nodes, on premise or in
a cloud.

thanks,
Domenic


Re: Jackrabbit 2.10 vs Oak 1.2.7

2016-04-07 Thread Michael Marth
Hi Domenic,

My POV:
You are essentially testing how fast Oak or JR can put nodes into 
MySQL/Postgres/Mongo. IMO Oak’s design does not suggest that there should be 
fundamental differences between JR and Oak for this isolated case. (*)

However, where Oak is expected to outperform JR is when
1) the test case reflects realistic usage patterns and
2) horizontal scalability becomes a topic.

To explain:
Re 1: in reality you would usually have many reading threads for each writing 
thread. Oak’s MVCC design caters for performance for such test cases.
Re 2: If you have many cluster nodes the MVCC becomes even more pronounced (not 
only different threads but different processes). Also, if you have observation 
listeners and many cluster nodes then I expect to see substantial differences 
between Oak and JR.

Cheers
Michael

(*) with the notable exception of TarMK which I expect to outperform anything 
on any test case ;)



On 06/04/16 16:20, "Domenic DiTano"  wrote:

>Hi Marcel,
>
>I uploaded all the source to github along with a summary spreadsheet.  I
>would appreciate any time you have to review.
>
>https://github.com/Domenic-Ansys/Jackrabbit2-Oak-Tests
>
>As you stated, the move is a non-goal, but in comparison to Jackrabbit 2 I
>am also finding in my tests that create, update, and copy are all faster
>in Jackrabbit 2 (10k nodes).  Any input would be appreciated...
>
>Also, will MySql not be listed as "Experimental" at some point?
>
>Thanks,
>Domenic
>


RE: Jackrabbit 2.10 vs Oak 1.2.7

2016-04-06 Thread Domenic DiTano
Hi Marcel,

I uploaded all the source to github along with a summary spreadsheet.  I
would appreciate any time you have to review.

https://github.com/Domenic-Ansys/Jackrabbit2-Oak-Tests

As you stated, the move is a non-goal, but in comparison to Jackrabbit 2 I
am also finding in my tests that create, update, and copy are all faster
in Jackrabbit 2 (10k nodes).  Any input would be appreciated...

Also, will MySql not be listed as "Experimental" at some point?

Thanks,
Domenic


Re: Jackrabbit 2.10 vs Oak 1.2.7

2016-03-31 Thread Marcel Reutegger
Hi Domenic,

On 30/03/16 14:34, "Domenic DiTano" wrote:
>"In contrast to Jackrabbit 2, a move of a large subtree is an expensive
>operation in Oak"
>So should I avoid doing a move of a large number of items using Oak?  If
>we
>are using Oak then should we avoid operations with a large number of items
>in general?

In general it is fine to have a large change set with Oak. With
Oak you can even have change sets that do not fit into the heap.

>  As a FYI - there are other benefits for us to move to Oak, but
>our application executes JCR operations with a large number of items
>quite often.  I am worried about the performance.
>
>The move method is pretty simple - should I be doing it differently?
>
>public static long moveNodes(Session session, Node node, String newNodeName)
>        throws Exception {
>    long start = System.currentTimeMillis();
>    session.move(node.getPath(), "/" + newNodeName);
>    session.save();
>    long end = System.currentTimeMillis();
>    return end - start;
>}

No, this is fine. As mentioned earlier, with Oak a move
operation is not cheap and is basically implemented as
copy to new location and delete at the old location.

A cheap move operation was considered a non-goal when
Oak was designed:
https://wiki.apache.org/jackrabbit/Goals%20and%20non%20goals%20for%20Jackrabbit%203
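
If you want to see the effect, a hypothetical harness like the following
(names and sizes are illustrative) should show the move cost growing with
the number of items moved:

import javax.jcr.Node;
import javax.jcr.Session;

public class MoveCostCheck {

    // Builds a flat subtree with the given number of child nodes.
    static Node createSubtree(Session session, String name, int children)
            throws Exception {
        Node parent = session.getRootNode().addNode(name);
        for (int i = 0; i < children; i++) {
            parent.addNode("child" + i);
        }
        session.save();
        return parent;
    }

    public static void run(Session session) throws Exception {
        for (int size : new int[] {100, 1000, 10000}) {
            Node subtree = createSubtree(session, "src" + size, size);
            long start = System.currentTimeMillis();
            session.move(subtree.getPath(), "/dst" + size);
            session.save();
            System.out.println(size + " nodes moved in "
                    + (System.currentTimeMillis() - start) + " ms");
        }
    }
}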


Regards
 Marcel



RE: Jackrabbit 2.10 vs Oak 1.2.7

2016-03-30 Thread Domenic DiTano
Hi Marcel,

Thanks for your input, it is really appreciated - yes, I can put all the
code on GitHub (I will need a day or two).

I did forget to mention that I ran the tests with the following settings:
-Xms1024m -Xmx2048m
-Doak.queryLimitInMemory=50
-Doak.queryLimitReads=10
-Dupdate.limit=25
-Doak.fastQuerySize=true

As for the data, there is one issue: in the second Jackrabbit 2 run there
were 100 files uploaded as opposed to 1000.  No, I did not mix up the other
results; I ran these tests about 10 times and the results were pretty
consistent.  I ran them on my local laptop, so I would assume that you would
get better results with a dedicated machine.

"In contrast to Jackrabbit 2, a move of a large subtree is an expensive
operation in Oak"
So should I avoid doing a move of a large number of items using Oak?  If
we are using Oak, then should we avoid operations with a large number of
items in general?  As a FYI - there are other benefits for us to move to
Oak, but our application executes JCR operations with a large number of
items quite often.  I am worried about the performance.

The move method is pretty simple - should I be doing it differently?

public static long moveNodes(Session session, Node node, String newNodeName)
        throws Exception {
    long start = System.currentTimeMillis();
    session.move(node.getPath(), "/" + newNodeName);
    session.save();
    long end = System.currentTimeMillis();
    return end - start;
}

Thanks,
Domenic


-Original Message-
From: Marcel Reutegger [mailto:mreut...@adobe.com]
Sent: Wednesday, March 30, 2016 4:42 AM
To: oak-dev@jackrabbit.apache.org
Subject: Re: Jackrabbit 2.10 vs Oak 1.2.7

Hi,

On 29/03/16 14:55, "Domenic DiTano" wrote:
>Sending the data again, I hope this makes it clearer.  I do not mind
>sharing the source, assuming you just want the code that does the
>creating, deleting, etc. of nodes (attached).  How I created the Document
>stores is in the previous email, but if you want I can send that also.

yes, I'm just interested in the test code. Can you please make it
available, e.g. on GitHub?

some comments on the results:

In contrast to Jackrabbit 2, a move of a large subtree is an expensive
operation in Oak. With Jackrabbit 2, both the content update as well as the
index update is rather cheap when a subtree is moved. With Oak, the cost
depends on the number of items you move.

Some of the results for Jackrabbit 2 with 10k nodes are better than with
just 1k. Did you mix up numbers?

As mentioned before you potentially get a speedup with Oak when you tweak
the update.limit for large change sets with 10k nodes.

Regards
 Marcel


RE: Jackrabbit 2.10 vs Oak 1.2.7

2016-03-29 Thread Domenic DiTano
Sorry, those images did not come through; posting the email again with the
raw data:

I work with a web application that has Jackrabbit 2.10 embedded, and we
wanted to try upgrading to Oak.  Our current configuration for Jackrabbit
2.10 is the FileDataStore along with MySql for the Persistence DataStore.
We wrote some test cases to measure the performance of Jackrabbit 2.10 vs
the latest Oak 1.2.  In the case of Jackrabbit 2.10, we used our current
application configuration - FileDataStore along with MySql.  In the case of
Oak we tried many configurations, but the one we settled on was a
DocumentNodeStore with a FileDataStore backend.  We tried all 3 RDB options
(Mongo, PostGress, MySql).  All test cases used the same code, which is
standard JCR 2.0 code.  The test cases did the following:

.   create 1000 & 10,000 nodes
.   move 1000 & 10,000 nodes
.   copy 1000 & 10,000 nodes
.   delete 1000 & 10,000 nodes
.   upload 100 files
.   read 1 property on 1000 & 10,000 nodes
.   update 1 property on 1000 & 10,000 nodes


The results were as follows (all results in milliseconds):

Oak tests ran with the creation, move, copy, delete, update, and read of
1000 nodes:

Create 1000 Nodes,Query Properties,Upload 100,Move 1000,Copied 1000,Delete
1000,Update 1000,Read 1000
MySql:3444,2,1445,96349,2246,92923,48647,98
Postgress:2483,19,1130,2404,556,1523,1055,111
Mongo:8497,2,845,14428,4432,7667,4640,142

Postgress seems to perform well overall.

In the case of Jackrabbit 2.10 (tests ran with the creation, move, copy,
delete, update, and read of 1000 nodes):
Create 1000 Nodes,Query Properties,Upload 100,Move 1000,Copied 1000,Delete
1000,Update 1000,Read 1000
MySql:3022,143,1105,16,764,1481,1139,12


Jackrabbit 2.10 performs slightly better than Oak.

The next set of tests were run with Oak with the creation, move, copy,
delete, update, and read of 10,000 nodes:

Create 10000 Nodes,Query Properties,Upload 100,Move 10000,Copied
10000,Delete 10000,Update 10000,Read 10000
MySql:31250,4,1146,741474,20755,728737,374387,2216
Postgress:16475,16,605,30339,7615,24461,12453,2989
Mongo:342192,2,753,406259,321040,43670,41053,968

Postgress once again performed OK.  Mongo and MySql did not do well around
moves, deletes, and updates.  Querying also did well, as indexes were
created.

In the case of Jackrabbit 2.10 (tests run with the creation, move, copy,
delete, update, and read of 10,000 nodes):
Create 10000 Nodes,Query Properties,Upload 100,Move 10000,Copied
10000,Delete 10000,Update 10000,Read 10000
MySql:8507,94,744,14,489,824,987,8

Jackrabbit 2.10 performed much better than Oak in general.

Based on the results I have a few questions/comments:

.   Are these fair comparisons between Jackrabbit and Oak?  In our
application it is very possible to create 1-10,000 nodes in a user
session.
.   Should I have assumed Oak would outperform Jackrabbit 2.10?
.   I understand MySql is experimental but Mongo is not - I would
assume Mongo would perform as well as, if not better than, Postgress
.   The performance bottlenecks seem to be at the JDBC level for
MySql.  I made some configuration changes which helped performance but the
changes would make MySql fail any ACID tests.

Just a few notes:

The same JCR code was used for creating, moving, deleting, etc. of nodes,
and was used for all the tests.  The tests were all run on the same
machine.

Used DocumentMK Builder for all DataStores:

Mongo:
DocumentNodeStore storeD = new DocumentMK.Builder()
    .setPersistentCache("D:\\ekm-oak\\Mongo,size=1024,binary=0")
    .setMongoDB(db)
    .setBlobStore(new DataStoreBlobStore(fds))
    .getNodeStore();

MySql:
RDBOptions options = new RDBOptions().tablePrefix(prefix).dropTablesOnClose(false);
DocumentNodeStore storeD = new DocumentMK.Builder()
    .setBlobStore(new DataStoreBlobStore(fds))
    .setClusterId(1)
    .memoryCacheSize(64 * 1024 * 1024)
    .setPersistentCache("D:\\ekm-oak\\MySql,size=1024,binary=0")
    .setRDBConnection(RDBDataSourceFactory.forJdbcUrl(url, userName, password), options)
    .getNodeStore();

PostGres:
RDBOptions options = new RDBOptions().tablePrefix(prefix).dropTablesOnClose(false);
DocumentNodeStore storeD = new DocumentMK.Builder()
    .setAsyncDelay(0)
    .setBlobStore(new DataStoreBlobStore(fds))
    .setClusterId(1)
    .memoryCacheSize(64 * 1024 * 1024)
    .setPersistentCache("D:\\ekm-oak\\postGress,size=1024,binary=0")
    .setRDBConnection(RDBDataSourceFactory.forJdbcUrl(url, userName, password), options)
    .getNodeStore();
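
The fds variable above is a Jackrabbit FileDataStore; its setup is not
shown in this email. A plausible initialization (paths are illustrative)
would be:

import org.apache.jackrabbit.core.data.FileDataStore;

FileDataStore fds = new FileDataStore();
fds.setPath("D:\\ekm-oak\\datastore"); // directory for the binaries
fds.init("D:\\ekm-oak");               // home directory of the store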

The repository was created the same way for all three:
Repository repository = new Jcr(new Oak(storeD))
    .with(new LuceneIndexEditorProvider())
    .with(configureSearch())
    .createRepository();

Any input is welcome...

Thanks,
Domenic


Re: Jackrabbit 2.10 vs Oak 1.2.7

2016-03-29 Thread Marcel Reutegger
Hi,

the graphs didn't make it through to the mailing list.
Can you please post raw numbers or a link to the graphs?

Without access to more data, my guess is that Oak on
DocumentNodeStore is slower with the bigger change set
because it internally creates a branch to stage changes
when it reaches a given threshold. This introduces more
traffic to the backend storage when save() is called,
because previously written data is retrieved again from
the backend.

Jackrabbit 2.10 on the other hand keeps the entire change set
in memory until save() is called.

You can increase the threshold for the DocumentNodeStore
with a system property: -Dupdate.limit=10

The default is 10'000.
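
Assuming the property is read once when the DocumentNodeStore classes are
initialized, it can also be set programmatically at startup; the value
below is purely illustrative:

// Must run before any Oak document classes are loaded.
System.setProperty("update.limit", "50000"); // illustrative value
// ...then build the DocumentNodeStore as usual.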

Regards
 Marcel

On 29/03/16 04:19, "Domenic DiTano" wrote:

>Hello,
>
>I work with a web application that has Jackrabbit 2.10 embedded and we
>wanted to try upgrading to Oak.  Our current configuration for Jackrabbit
>2.10 is the FileDataStore along with MySql for the Persistence DataStore.
>We wrote some test cases to measure the performance of Jackrabbit 2.10
>vs the latest Oak 1.2.  In the case of Jackrabbit 2.10, we used our
>current application configuration - FileDataStore along with MySql.  In
>the case of Oak we tried many configurations, but the one we settled on
>was a DocumentNodeStore with a FileDataStore backend.  We tried all 3 RDB
>options (Mongo, PostGress, MySql).
>All test cases used the same code, which is standard
>JCR 2.0 code.  The test cases did the following:
> 
>· create 1000 & 10,000 nodes
>· move 1000 & 10,000 nodes
>· copy 1000 & 10,000 nodes
>· delete 1000 & 10,000 nodes
>· upload 100 files
>· read 1 property on 1000 & 10,000 nodes
>· update 1 property on 1000 & 10,000 nodes
> 
> 
>The results were as follows (all results in milliseconds):
> 
>Oak tests ran with the creation, move, copy, delete, update, and read of
>1000 nodes:
> 
>
> 
>Postgress seems to perform well overall.
> 
>In the case of Jackrabbit 2.10 (tests ran with the creation, move, copy,
>delete, update, and read of 1000 nodes):
> 
>
> 
>Jackrabbit 2.10 performs slightly better than Oak.
> 
>The next set of tests were run with Oak with the creation, move, copy,
>delete, update, and read of 10,000 nodes:
> 
>
> 
>Postgress once again performed OK.  Mongo and MySql did not do well
>around moves, deletes, and updates.  Querying also did well, as indexes
>were created.
> 
>In the case of Jackrabbit 2.10 (tests ran with the creation, move, copy,
>delete, update, and read of 10,000 nodes):
> 
>
> 
>Jackrabbit 2.10 performed much
>better than Oak in general.
> 
>Based on the results I have a few questions/comments:
> 
>· Are these fair comparisons between Jackrabbit and Oak?  In our
>application it is very possible to create 1-10,000 nodes in a user
>session.
>· Should I have assumed Oak would outperform Jackrabbit 2.10?
>· I understand MySql is experimental but Mongo is not - I would assume
>Mongo would perform as well if not better than Postgress
>· The performance bottlenecks seem to be at the JDBC level for MySql.  I
>made some configuration changes which helped performance but the changes
>would make MySql fail any ACID tests.
> 
>Just a few notes:
> 
>The same JCR code was used for creating, moving, deleting, etc. of nodes,
>and was used for all the tests.  The tests were all run on the same
>machine.
> 
>Used DocumentMK Builder for all DataStores:
> 
>Mongo:
>DocumentNodeStore storeD = new DocumentMK.Builder()
>    .setPersistentCache("D:\\ekm-oak\\Mongo,size=1024,binary=0")
>    .setMongoDB(db)
>    .setBlobStore(new DataStoreBlobStore(fds))
>    .getNodeStore();
>
>MySql:
>RDBOptions options = new RDBOptions().tablePrefix(prefix).dropTablesOnClose(false);
>DocumentNodeStore storeD = new DocumentMK.Builder()
>    .setBlobStore(new DataStoreBlobStore(fds))
>    .setClusterId(1)
>    .memoryCacheSize(64 * 1024 * 1024)
>    .setPersistentCache("D:\\ekm-oak\\MySql,size=1024,binary=0")
>    .setRDBConnection(RDBDataSourceFactory.forJdbcUrl(url, userName, password), options)
>    .getNodeStore();
>
>PostGres:
>RDBOptions options = new RDBOptions().tablePrefix(prefix).dropTablesOnClose(false);
>DocumentNodeStore storeD = new DocumentMK.Builder()
>    .setAsyncDelay(0)
>    .setBlobStore(new DataStoreBlobStore(fds))
>    .setClusterId(1)
>    .memoryCacheSize(64 * 1024 * 1024)
>    .setPersistentCache("D:\\ekm-oak\\postGress,size=1024,binary=0")
>    .setRDBConnection(RDBDataSourceFactory.forJdbcUrl(url, userName, password), options)
>    .getNodeStore();
>
>The repository was created the same way for all three:
>Repository repository = new Jcr(new Oak(storeD))
>    .with(new LuceneIndexEditorProvider())
>    .with(configureSearch())
>    .createRepository();
> 
>Any input is welcome...
> 
>Thanks,
>Domenic
> 
>