Hi Marcel,

Thanks for your input, it is really appreciated - yes I can put all the code
in Github (I will need a day or two).

I did forget to mention that I ran the tests with the following settings:
-Xms1024m -Xmx2048m
-Doak.queryLimitInMemory=500000
-Doak.queryLimitReads=100000
-Dupdate.limit=250000
-Doak.fastQuerySize=true

As for the data, there is one issue in the second Jackrabbit 2 run there
were 100 files uploaded as opposed to 1000.  No I did not mix up the other
results, I ran these tests about 10 times and these results were pretty
consistent.  I ran them on my local laptop, so I would assume that you would
get better results with a dedicated machine.

"In contrast to Jackrabbit 2, a move of a large subtree is an expensive
operation in Oak"
So should I avoid doing a move of a large number of items using Oak?  If we
are using Oak then should we avoid operations with a large number of items
in general?  As a FYI - there are other benefits for us to move to Oak, but
our application uses executes JCR operations with a large number of items
quite often.  I am worried about the performance.

The move method is pretty simple - should I be doing it differently?

public static long moveNodes(Session session, Node node, String newNodeName)
throws Exception{
        long start = System.currentTimeMillis();
        session.move(node.getPath(), "/"+newNodeName);
             session.save();
        long end = System.currentTimeMillis();
        return end-start;
}

Thanks,
Domenic


-----Original Message-----
From: Marcel Reutegger [mailto:mreut...@adobe.com]
Sent: Wednesday, March 30, 2016 4:42 AM
To: oak-dev@jackrabbit.apache.org
Subject: Re: Jackrabbit 2.10 vs Oak 1.2.7

Hi,

On 29/03/16 14:55, "Domenic DiTano" wrote:
>Sending the data again, I hope this is makes it clearer.  I do not mind
>sharing the source, assuming you just want the code that does the
>creating, deleting etc of nodes (attached) How I created the Document
>stores is in the previous email, but if you  want I can send that also.

yes, I'm just interested in the test code. Can you please make it available,
e.g. over github?

some comments on the results:

In contrast to Jackrabbit 2, a move of a large subtree is an expensive
operation in Oak. With Jackrabbit 2, both the content update as well as the
index update is rather cheap when a subtree is moved. With Oak, the cost
depends on the number of items you move.

Some of the results for Jackrabbit 2 with 10k nodes are better than with
just 1k. Did you mix up numbers?

As mentioned before you potentially get a speedup with Oak when you tweak
the update.limit for large change sets with 10k nodes.

Regards
 Marcel

>All milliseconds...
>
>Oak:
>Create 1000 (Mysql,PostGress,Mongo) 3444,2483,8497 Query 1000
>(Mysql,PostGress,Mongo) 2,19,2 Upload 100 files (Mysql,PostGress,Mongo)
>1455,1130,845 Move 1000 (Mysql,PostGress,Mongo) 96349,2404,14428 Copied
>1000 (Mysql,PostGress,Mongo) 2246,556,4432 Delete 1000
>(Mysql,PostGress,Mongo) 92923,1523,7667 Update 1000
>(Mysql,PostGress,Mongo) 48647,1055,4640 Read 1000
>(Mysql,PostGress,Mongo) 98,111,142
>
>
>Jackrabbit 2:
>Create 1000 (Mysql) 3022
>Query 1000 (Mysql) 143
>Upload 100 files (Mysql) 1105
>Move 1000 (Mysql) 16
>Copied 1000 (Mysql) 764
>Delete 1000 (Mysql) 1481
>Update 1000 (Mysql) 1139
>Read 1000 (Mysql) 12
>
>
>Oak:
>Create 10000 (Mysql,PostGress,Mongo) 31250,16475,342192 Query 10000
>(Mysql,PostGress,Mongo) 4,16,2 Upload 100 files (Mysql,PostGress,Mongo)
>1146,605,753 Move 10000 (Mysql,PostGress,Mongo) 741474,30339,406259
>Copied 10000 (Mysql,PostGress,Mongo) 20755,7615,43670 Delete 10000
>(Mysql,PostGress,Mongo) 728737,24461,43670 Update 10000
>(Mysql,PostGress,Mongo) 374387,12453,41053 Read 10000
>(Mysql,PostGress,Mongo) 2216,2989,968
>
>
>Jackrabbit 2:
>Create 10000 (Mysql) 8507
>Query 10000 (Mysql) 94
>Upload 100 files (Mysql) 744
>Move 10000 (Mysql) 14
>Copied 10000 (Mysql) 489
>Delete 10000 (Mysql) 824
>Update 10000 (Mysql) 987
>Read 10000 (Mysql) 8
>
>
>On Tue, Mar 29, 2016 at 8:28 AM, Marcel Reutegger <mreut...@adobe.com>
>wrote:
>
>Hi Domenic,
>
>the number of test cases do not match the results you provided. i.e.
>the column headers do not match data columns. can you please clarify
>how the results map to the test cases?
>
>also, do you mind sharing the test code? I'd like to better understand
>what the tests do.
>
>Regards
> Marcel
>
>On 29/03/16 14:04, "Domenic DiTano" wrote:
>
>>Sorry those images did not come through, posting the email again with
>>the raw data:
>>
>>I work with web application that has Jackrabbit 2.10 embedded and we
>>wanted to try upgrading to Oak.  Our current configuration that we use
>>for Jackrabbit 2.10 is the FileDataStore along with MySql for the
>>Persistence DataStore.  We wrote some test cases to measure the
>>performance of JackRabbit 2.1.0 vs latest Oak 1.2.  In the case of
>>JackRabbit 2.10, we used what our current application configuration -
>>FileDataStore along with MySql.  In the case of Oak we tried many
>>configurations but the one we settled on was a DocumentNodeStore with
>>a FileDataStore backend. We tried all 3 RDB options (Mongo, PostGress,
>>MySql).  All Test cases used the same
>>code which standard JCR 2.0 code.   The test cases did the following:
>>
>>.      create 1000 & 10,000 nodes
>>.      move 1000 & 10,000 nodes
>>.      copy 1000 & 10,000 nodes
>>.      delete 1000 & 10,000 nodes
>>.      upload 100 files
>>.      read 1 property on 1000 & 10,000 nodes
>>.      update 1 property on 1000 & 10,000 nodes
>>
>>
>>The results were as follows (all results in milliseconds):
>>
>>Oak tests ran with the creation, move, copy, delete, update, and read
>>of
>>1000 nodes:
>>
>>Create 1000 Nodes,Query Properties,Upload 100,Move 1000,Copied
>>1000,Delete
>>1000
>>MySql:3444,2,1445,96349,2246,92923,48647,98
>>Postgress:2483,19,1130,2404,556,1523,1055,111
>>Mongo:8497,2,845,14428,4432,7667,4640,142
>>
>>Postgress seems to perform well overall.
>>
>>In the case of Jackrabbit 2.10 (tests ran with the creation, move,
>>copy, delete, update, and read of 1000 nodes):
>>Create 1000 Nodes,Query Properties,Upload 100,Move 1000,Copied
>>1000,Delete
>>1000
>>MySql:3022,143,1105,16,764,1481,1139,12
>>
>>
>>Jackrabbit 2.10 performs slightly better than Oak.
>>
>>The next set of tests were ran with Oak with the creation, move, copy,
>>delete, update, and read of 10000 nodes:
>>
>>Create 10000 Nodes,Query Properties,Upload 100,Move 10000,Copied
>>10000,Delete 10000,Update 10000,Read 10000
>>MySql:31250,4,1146,741474,20755,728737,374387,2216
>>Postgress:16475,16,605,30339,7615,24461,12453,2989
>>Mongo:342192,2,753,406259,321040,43670,41053,968
>>
>>Postgress once again performed ok.  Mongo and MySql did not do well
>>around Moves, deletes, and updates. Querying did well also as indexes
>>were created.
>>
>>In the case of Jackrabbit 2.10 (tests ran with the creation, move,
>>copy, delete, update, and read of 10000 nodes):
>>Create 10000 Nodes,Query Properties,Upload 100,Move 10000,Copied
>>10000,Delete 10000,Update 10000,Read 10000
>>MySql:8507,94,744,14,489,824,987,8
>>
>>Jackrabbit 2.10 performed much better than Oak in general.
>>
>>Based on the results I have a few questions/comments:
>>
>>.      Are these fair comparisons between Jackrabbit and Oak?  In our
>>application it is very possible to create 1-10,000 nodes in a user
>>session.
>>.      Should I have assumed Oak would outperform Jackrabbit 2.10?
>>.      I understand MySql is experimental but Mongo is not - I would
>>assume Mongo would perform as well if not better than Postgress
>>.      The performance bottlenecks seem to be at the JDBC level for
>>MySql.  I made some configuration changes which helped performance but
>>the changes would make MySql fail any ACID tests.
>>
>>Just a few notes:
>>
>>The same JCR code was used for creating, moving, deleting etc any nodes.
>>The JCR code was used for all the tests.  The tests were all run on
>>the same machine
>>
>>Used DocumentMK Builder for all DataStores:
>>
>>Mongo:
>>                DocumentNodeStore storeD = new
>>DocumentMK.Builder().setPersistentCache("D:\\ekm-oak\\Mongo,size=1024,
>>bin
>>a
>>ry=0").setMongoDB(db).setBlobStore(new
>>DataStoreBlobStore(fds)).getNodeStore();
>>
>>MySql:
>>       RDBOptions options = new
>>RDBOptions().tablePrefix(prefix).dropTablesOnClose(false);
>>        DocumentNodeStore storeD = new
>>DocumentMK.Builder().setBlobStore(new
>>DataStoreBlobStore(fds)).setClusterId(1).memoryCacheSize(64 * 1024 *
>>1024).
>>
>>setPersistentCache("D:\\ekm-oak\\MySql,size=1024,binary=0").setRDBConn
>>ect
>>i
>>on(RDBDataSourceFactory.forJdbcUrl(url, userName, password),
>>options).getNodeStore();
>>PostGres:
>>                                RDBOptions options = new
>>RDBOptions().tablePrefix(prefix).dropTablesOnClose(false);
>>        DocumentNodeStore storeD = new
>>DocumentMK.Builder().setAsyncDelay(0).setBlobStore(new
>>DataStoreBlobStore(fds)).setClusterId(1).memoryCacheSize(64 * 1024 *
>>1024).
>>
>>setPersistentCache("D:\\ekm-oak\\postGress,size=1024,binary=0").setRDB
>>Con
>>n
>>ection(RDBDataSourceFactory.forJdbcUrl(url, userName, password),
>>options).getNodeStore();
>>
>>The repository was created the same for all three:
>>Repository repository = new Jcr(new Oak(storeD)).with(new
>>LuceneIndexEditorProvider()).with(configureSearch()).createRepository(
>>);
>>
>>Any input is welcome..
>>
>>Thanks,
>>Domenic
>>
>>-----Original Message-----
>>From: Marcel Reutegger [mailto:mreut...@adobe.com]
>>Sent: Tuesday, March 29, 2016 4:41 AM
>>To: oak-dev@jackrabbit.apache.org
>>Subject: Re: Jackrabbit 2.10 vs Oak 1.2.7
>>
>>Hi,
>>
>>the graphs didn't make it through to the mailing list.
>>Can you please post raw numbers or a link to the graphs?
>>
>>Without access to more data, my guess is that Oak on DocumentNodeStore
>>is slower with the bigger changes set because it internally creates a
>>branch to stage changes when it reaches a given threshold. This
>>introduces more traffic to the backend storage when save() is called,
>>because previously written data is retrieved again from the backend.
>>
>>Jackrabbit 2.10 on the other hand keeps the entire changes in memory
>>until
>>save() is called.
>>
>>You can increase the threshold for the DocumentNodeStore with a system
>>property: -Dupdate.limit=100000
>>
>>The default is 10'000.
>>
>>Regards
>> Marcel
>>
>>On 29/03/16 04:19, "Domenic DiTano" wrote:
>>
>>>Hello,
>>>
>>>I work with web application that has Jackrabbit 2.10 embedded and we
>>>wanted to try upgrading to Oak.  Our current configuration that we
>>>use for Jackrabbit
>>>2.10 is the FileDataStore along with MySql for the Persistence
>>>DataStore.
>>> We wrote some test cases to measure the performance of JackRabbit
>>>2.1.0 vs latest Oak 1.2.  In the case of JackRabbit 2.10, we used
>>>what our current application configuration ­ FileDataStore along with
>>>MySql.
>>>In the case of Oak we tried many configurations but the one we
>>>settled on was a DocumentNodeStore with a FileDataStore backend.We
>>>tried all 3 RDB options (Mongo, PostGress, MySql).
>>>All Test cases used the same code which standard
>>>JCR 2.0 code.   The test cases did the following:
>>>
>>>.
>>>create 1000 & 10,000 nodes
>>>.
>>>move 1000 & 10,000 nodes
>>>.
>>>copy 1000 & 10,000 nodes
>>>.
>>>delete 1000 & 10,000 nodes
>>>.
>>>upload 100 files
>>>.
>>>read 1 property on 1000 & 10,000 nodes .
>>>update 1 property on 1000 & 10,000 nodes
>>>
>>>
>>>The results were as follows (all results in milliseconds):
>>>
>>>Oak tests ran with the creation, move, copy, delete, update, and read
>>>of
>>>1000 nodes:
>>>
>>>
>>>
>>>Postgress seems to perform well overall.
>>>
>>>In the case of Jackrabbit 2.10 (tests ran with the creation, move,
>>>copy, delete, update, and read of 1000 nodes):
>>>
>>>
>>>
>>>Jackrabbit 2.10 performs slightly better than Oak.
>>>
>>>The next set of tests were ran with Oak with the creation, move,
>>>copy, delete, update, and read of 10000 nodes:
>>>
>>>
>>>
>>>Postgress once again performed ok.  Mongo and MySql did not do well
>>>around Moves, deletes, and updates. Querying did well also as indexes
>>>were created.
>>>
>>>In the case of Jackrabbit 2.10 (tests ran with the creation, move,
>>>copy, delete, update, and read of 10000 nodes):
>>>
>>>
>>>
>>>Jackrabbit 2.10 performed much
>>>better than Oak in general.
>>>
>>>Based on the results I have a few questions/comments:
>>>
>>>.
>>>Are these fair comparisons between Jackrabbit and Oak?  In our
>>>application it is very possible to create 1-10,000 nodes in a user
>>>session.
>>>.
>>>Should I have assumed Oak would outperform Jackrabbit 2.10?
>>>.
>>>I understand MySql is experimental but Mongo is not ­ I would assume
>>>Mongo would perform as well if not better than Postgress .
>>>The performance bottlenecks seem to be at the JDBC level for MySql.
>>>I made some configuration changes which helped performance but the
>>>changes would make MySql fail any ACID tests.
>>>
>>>Just a few notes:
>>>
>>>The same JCR code was used for creating, moving, deleting etc any nodes.
>>>The JCR code was used for all the tests.  The tests were all run on
>>>the same machine
>>>
>>>Used DocumentMK Builder for all DataStores:
>>>
>>>Mongo:
>>>                DocumentNodeStore storeD = new
>>>DocumentMK.Builder().setPersistentCache("D:\\ekm-oak\\Mongo,size=1024
>>>,b
>>>ina
>>>ry=0").setMongoDB(db).setBlobStore(new
>>>DataStoreBlobStore(fds)).getNodeStore();
>>>
>>>MySql:
>>>       RDBOptions options = new
>>>RDBOptions().tablePrefix(prefix).dropTablesOnClose(false);
>>>        DocumentNodeStore storeD = new
>>>DocumentMK.Builder().setBlobStore(new
>>>DataStoreBlobStore(fds)).setClusterId(1).memoryCacheSize(64 * 1024 *
>>>1024).
>>>
>>>setPersistentCache("D:\\ekm-oak\\MySql,size=1024,binary=0").setRDBCon
>>>ne cti on(RDBDataSourceFactory.forJdbcUrl(url, userName, password),
>>>options).getNodeStore();
>>>PostGres:
>>>                                RDBOptions options = new
>>>RDBOptions().tablePrefix(prefix).dropTablesOnClose(false);
>>>        DocumentNodeStore storeD = new
>>>DocumentMK.Builder().setAsyncDelay(0).setBlobStore(new
>>>DataStoreBlobStore(fds)).setClusterId(1).memoryCacheSize(64 * 1024 *
>>>1024).
>>>
>>>setPersistentCache("D:\\ekm-oak\\postGress,size=1024,binary=0").setRD
>>>BC onn ection(RDBDataSourceFactory.forJdbcUrl(url, userName,
>>>password), options).getNodeStore();
>>>
>>>The repository was created the same for all three:
>>>Repository repository = new Jcr(new Oak(storeD)).with(new
>>>LuceneIndexEditorProvider()).with(configureSearch()).createRepository
>>>()
>>>;
>>>
>>>Any input is welcomeŠ.
>>>
>>>Thanks,
>>>Domenic
>>>
>>>
>
>
>
>
>
>
>
>
>
>
>--
>Domenic DiTano
>ANSYS, Inc.
>Tel: 1.724.514.3624
>
>domenic.dit...@ansys.com
>www.ansys.com <http://www.ansys.com>
>
>
>
>
>
>
>

Reply via email to