On 30/09/11 10:52, Shri :) wrote:
Hi All,
@Damian thanks for the link. I will now try increasing the buffer_pool_size
and carry out the loading. Will let you know how it goes.
@Andy: Are you using the SDB bulk loader or loading via your own code? What
format is the data in?
But why not use the sdbload tool? Take the source code and add whatever
extra timing you need (it can already print some timing info).
I am using the following code, which I don't think is very different from
the one that you suggested. *My data is in .TTL format.*
Here is a snippet of my code:
    StoreDesc storeDesc = StoreDesc.read("sdb2.ttl");
    IDBConnection conn = new DBConnection(DB_URL, DB_USER, DB_PASSWD, DB);
    conn.getConnection();
    SDBConnection sdbconn = SDBFactory.createConnection(conn.getConnection());
    Store store = SDBFactory.connectStore(sdbconn, storeDesc);
    Model model = SDBFactory.connectDefaultModel(store);
    // read data into the database
    InputStream inn = new FileInputStream("dataset_70000.nt");
    long start = System.currentTimeMillis();
    model.read(inn, "localhost", "TTL");
    loadtime = ext.elapsedTime(start);
    // Close the database connection
    store.close();
    System.out.println("Loading time: " + loadtime);
(Unreadable)
[
Damian - does model.read() go via the bulkloader, or is this code using
one transaction per triple?
]
Try putting this around the load:
store.getLoader().startBulkUpdate();
...
store.getLoader().finishBulkUpdate();
Using the Turtle reader for N-Triples is slightly slower, but only by tens
of percent.
Andy
@Dave I think I followed the pattern suggested in the link that you gave me
(http://openjena.org/wiki/SDB/Loading_data), the above is the snippet of my
source code.
And one more thing: I didn't understand the question "Are you wrapping the load in a
transaction to avoid auto-commit costs?". Can you please elaborate a bit on
this? Sorry, I am relatively new to this.
Any thoughts on this? Thank you very much! :)
BR,
shri
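
To illustrate the question about wrapping the load in a transaction, here is a minimal sketch in plain JDBC (not the SDB API). With auto-commit on, every statement is committed on its own; turning it off lets the whole load commit once at the end. The names TransactionalLoad, LoadAction and inOneTransaction are hypothetical and exist only for this example; the JDBC calls (getAutoCommit, setAutoCommit, commit, rollback) are standard java.sql.Connection methods.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical helper for illustration only; not part of Jena or SDB.
public final class TransactionalLoad {

    /** The work to run inside the transaction, e.g. the code that inserts the data. */
    @FunctionalInterface
    public interface LoadAction {
        void run() throws SQLException;
    }

    /**
     * Runs the load with auto-commit switched off, so that the work is
     * committed once at the end instead of once per statement.
     */
    public static void inOneTransaction(Connection conn, LoadAction load) throws SQLException {
        boolean previous = conn.getAutoCommit();
        conn.setAutoCommit(false);          // stop committing after every statement
        try {
            load.run();                     // all inserts happen here
            conn.commit();                  // a single commit for the whole load
        } catch (SQLException | RuntimeException e) {
            conn.rollback();                // undo the partial load on failure
            throw e;
        } finally {
            conn.setAutoCommit(previous);   // restore the caller's setting
        }
    }
}
```

In the snippet earlier in the thread, the JDBC connection would be the one returned by conn.getConnection(). Whether model.read() already batches its work is the open question Andy raises, so treat this only as an illustration of the idea; for SDB itself, the startBulkUpdate()/finishBulkUpdate() calls Andy suggests are the thing to try first.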
On Thu, Sep 29, 2011 at 12:00 AM, Shri :) wrote:
Hi Again,
I am supposed to evaluate the performance of a few triple stores as part of my
thesis work (this is in the specification, which I unfortunately cannot
change). One of them is Jena SDB with MySQL. I am using my own Java
code to load the data rather than the command line tool, as I wanted to
record the loading time. I am using the .NT format for the data.
I have 8 GB of RAM.
Any thoughts/suggestions on this? Thanks for your help.
On Wed, Sep 28, 2011 at 4:09 PM, Shri :) wrote:
Hi Everyone,
I am currently doing my master's thesis, in which I have to work with Jena SDB
using MySQL as the backend store. I have around 25 million triples to load,
which has taken more than 5 days on Windows, whereas
according to the Berlin Benchmark it took only 4 hours to load the same
number of triples on Linux. This has left me confused. Is the
enormous difference due to the platform, or should I do
some performance tuning/optimization to improve the load time?
Kindly give your suggestions/comments.
P.S. I am using WAMP.
Thanks
Shridevika
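
For scale, the figures in the original message can be turned into load rates. This is only a back-of-the-envelope sketch: it takes "around 25 million triples", treats "more than 5 days" as exactly 5 days (so the real Windows rate is lower still), and uses the quoted 4 hours for the benchmark. The class and method names are made up for this example.

```java
// Hypothetical helper for illustration; the inputs are the rough figures quoted in the thread.
public final class LoadThroughput {

    /** Average number of triples loaded per second. */
    public static double triplesPerSecond(long triples, long seconds) {
        if (seconds <= 0) {
            throw new IllegalArgumentException("seconds must be positive");
        }
        return (double) triples / seconds;
    }

    public static void main(String[] args) {
        long triples = 25_000_000L;             // "around 25 million triples"
        long windowsSeconds = 5L * 24 * 3600;   // "more than 5 days", taken as exactly 5
        long benchmarkSeconds = 4L * 3600;      // "only 4 hours" per the Berlin Benchmark

        double slow = triplesPerSecond(triples, windowsSeconds);
        double fast = triplesPerSecond(triples, benchmarkSeconds);

        System.out.printf("Reported load:  %.0f triples/s%n", slow);
        System.out.printf("Benchmark load: %.0f triples/s%n", fast);
        System.out.printf("Ratio:          %.0fx%n", fast / slow);
    }
}
```

That works out to roughly 58 triples per second against roughly 1,736 triples per second, a factor of about 30. A rate in the tens of triples per second is what one would expect if each triple were committed separately, which is why the replies focus on bulk loading and transactions rather than on the operating system.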