optimization of JenaDB

Jovanovska Sashka Mon, 17 Dec 2018 01:12:57 -0800

Dear all,

We are a group of developers from Macedonia who are currently working with 
JenaDB. After a lot of research, we decided that Jena may be the best approach 
for our project, and we are now trying to integrate it. Our requirement was a 
database suited to storing data about objects together with their classes and 
attributes. The database needs to offer the best possible performance for RDF 
data, so that objects can be retrieved in the shortest possible time. Our 
analysis showed that Jena TDB is the right choice for storing an RDF data 
model, since triple store technology is the best fit for RDF structures 
(subject, predicate and object). We concluded that Jena TDB is capable of 
persisting all objects according to the classes defined in our profile.


1. We have three environments. Their configurations are:

   1.1 CPU: 4 cores @ 2.2 GHz
   1.2 RAM: 16 GB
   1.3 Disk: 100 GB

   2.1 CPU: 16 cores @ 2.0 GHz
   2.2 RAM: 96 GB
   2.3 Disk: 500 GB

   3.1 CPU: 4 cores @ 3.2 GHz
   3.2 RAM: 16 GB
   3.3 Disk: 320 GB SSD

We are using Jena version 3.6.

2. We are working on a system that will contain data for objects, each with a 
unique object ID, the mRID (Master Resource ID). The main goal of the system is 
to exchange those objects between multiple systems. We import the model from a 
file using model.write, and that creates a dataset. We also create single 
objects with insert. Attached are examples of our model, together with its 
namespaces.

3. Our performance objectives are for create and get operations to take less 
than 30 ms each, and to fully load the database with 100M objects (~500M 
triples). We also need to export the database with the correct tag names from 
the objects' namespaces (currently all of them are exported with the 
rdf:Description tag).

4. Currently, creating one object takes 200 ms or more, and we think that we 
have reached the load limit of the database, which currently holds 80 million 
objects.
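
To put the figures from points 3 and 4 in perspective, here is a rough 
back-of-the-envelope sketch in Java. It assumes that objects are created 
strictly one at a time and that the per-object cost stays constant (an 
assumption only; in practice we see the cost grow with the database size). 
The class and method names are illustrative and are not part of our 
application or of Jena.

```java
import java.time.Duration;

final class LoadEstimate {
    // Figures taken from points 3 and 4 of this message.
    static final long TARGET_OBJECTS = 100_000_000L;
    static final long CURRENT_MS_PER_OBJECT = 200;
    static final long GOAL_MS_PER_OBJECT = 30;

    /**
     * Wall-clock time needed to create the given number of objects
     * sequentially, assuming a constant cost per object (hypothetical model).
     */
    static Duration sequentialLoadTime(long objects, long msPerObject) {
        return Duration.ofMillis(Math.multiplyExact(objects, msPerObject));
    }

    public static void main(String[] args) {
        Duration current = sequentialLoadTime(TARGET_OBJECTS, CURRENT_MS_PER_OBJECT);
        Duration goal = sequentialLoadTime(TARGET_OBJECTS, GOAL_MS_PER_OBJECT);
        // Whole days only; the fractional part is truncated.
        System.out.println("At 200 ms/object: about " + current.toDays() + " days"); // 231
        System.out.println("At 30 ms/object:  about " + goal.toDays() + " days");    // 34
    }
}
```

Under these assumptions, even the 30 ms goal means roughly a month for a full 
load done object by object, so the initial load and the single-object create 
path probably need to be treated separately.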



During development, however, we ran into problems. Our goal was to fill the 
database with 100M objects, and we encountered a lot of difficulties along the 
way. Nothing is working as it should, and the import process gets slower and 
slower as the database grows, even though it has not yet reached the target 
size.

Would you be able to answer a few of our questions?



  1.  Is it possible to optimize the application (and the database) so that it 
works faster and more reliably?
  2.  What is the correct way to use namespace prefixes so that the data is 
exported correctly?
  3.  Would it be possible for us to get your help in the form of a workshop or 
training?



I would like to thank you in advance for your help.

Kind Regards,

Sashka

sampleModel.xml