(at a conference with shoddy networking -- I guess my reply this morning is lost somewhere)
Sent from my iPhone

On 30 Sep 2011, at 15:10, Andy Seaborne <[email protected]> wrote:

> On 30/09/11 10:52, Shri :) wrote:
>> Hi All,
>>
>> @Damian thanks for the link, I will now try increasing the buffer_pool_size
>> and carry out the loading. Will let you know how it goes.
>>
>> @ Andy: Are you using the sdb bulk loader or loading via your own code? What
>> format is the data in?
>> But why not use the sdbload tool? Take the source code and add whatever
>> extras timing you need (it already can print some timing info).
>>
>> I am using the following code, which I don't think is very different from
>> the one that you suggested, *my data is in .TTL format*
>> Here is the snippet of my code:
>>
>> StoreDesc storeDesc = StoreDesc.read("sdb2.ttl") ;
>> IDBConnection conn = new DBConnection ( DB_URL, DB_USER, DB_PASSWD, DB );
>> conn.getConnection();
>> SDBConnection sdbconn = SDBFactory.createConnection( conn.getConnection()) ;
>> Store store = SDBFactory.connectStore(sdbconn, storeDesc) ;
>> Model model= SDBFactory.connectDefaultModel(store);
>> //read data into the database
>> InputStream inn= new FileInputStream ("dataset_70000.nt");
>> long start = System.currentTimeMillis();
>> model.read(inn, "localhost", "TTL");
>> loadtime=ext.elapsedTime(start);
>> // Close the database connection
>> store.close();
>> System.out.println("Loading time: " + loadtime);
>
> (Unreadable)
>
> [
> Damian - does model.read() go via the bulkloader or is this code using one
> transaction per triple

Certainly should do. It would explain a lot if it didn't. I thought the readers signalled bulk loading. Will check.

Damian

> ]
>
> Try putting around the load:
> store.getLoader().startBulkUpdate();
> ...
> store.getLoader().finishBulkUpdate();
>
> Using the Turtle reader for N-Triples is slightly slower - but only tens of %.
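
For reference, a minimal sketch of the bulk-update pattern suggested above. It deliberately does not use the real SDB classes: BulkLoader, CountingLoader and addTriple(String) are hypothetical stand-ins, and only the startBulkUpdate()/finishBulkUpdate() names come from the suggestion. It just counts commits, on the assumption that an unwrapped load commits once per triple, to show what wrapping the load would save.

```java
// Hypothetical stand-in for the loader returned by store.getLoader().
// Only startBulkUpdate()/finishBulkUpdate() are named in the thread;
// addTriple(String) is invented for this illustration.
interface BulkLoader {
    void startBulkUpdate();
    void addTriple(String triple);
    void finishBulkUpdate();
}

// In-memory loader that counts commits instead of talking to a database.
final class CountingLoader implements BulkLoader {
    private boolean inBulk = false;
    private int pending = 0;
    int commits = 0;
    int stored = 0;

    @Override
    public void startBulkUpdate() {
        inBulk = true;
    }

    @Override
    public void addTriple(String triple) {
        pending++;
        if (!inBulk) {
            commit(); // outside a bulk update, each triple is committed alone
        }
    }

    @Override
    public void finishBulkUpdate() {
        inBulk = false;
        if (pending > 0) {
            commit(); // one commit for everything queued since startBulkUpdate()
        }
    }

    private void commit() {
        stored += pending;
        pending = 0;
        commits++;
    }
}

final class BulkLoadSketch {
    // Adds triples one at a time; returns the number of commits performed.
    static int loadOneByOne(CountingLoader loader, String[] triples) {
        for (String t : triples) {
            loader.addTriple(t);
        }
        return loader.commits;
    }

    // Wraps the same loop in a bulk update; try/finally ensures the batch is
    // finished even if the loop throws.
    static int loadInBulk(CountingLoader loader, String[] triples) {
        loader.startBulkUpdate();
        try {
            for (String t : triples) {
                loader.addTriple(t);
            }
        } finally {
            loader.finishBulkUpdate();
        }
        return loader.commits;
    }

    public static void main(String[] args) {
        String[] triples = {"<s1> <p> <o1> .", "<s2> <p> <o2> .", "<s3> <p> <o3> ."};
        System.out.println("one by one: " + loadOneByOne(new CountingLoader(), triples) + " commits");
        System.out.println("bulk:       " + loadInBulk(new CountingLoader(), triples) + " commit(s)");
    }
}
```

In the real code the same try/finally shape would go around model.read(...), with store.getLoader() in place of the stand-in. Whether model.read() already does this internally is the thing still to be checked.
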
>
> Andy
>
>> @Dave I think I followed the pattern suggested in the link that you gave me
>> (http://openjena.org/wiki/SDB/Loading_data), the above is the snippet of my
>> source code.
>> And one more thing, I didn't get the idea of "Are you wrapping the load in a
>> transaction to avoid auto-commit costs?", can you please elaborate a bit on
>> this? Sorry, I am relatively a novice.
>>
>> Any thoughts over this? Thank you very much! :)
>>
>> BR,
>> shri
>>
>> On Thu, Sep 29, 2011 at 12:00 AM, Shri :)<[email protected]> wrote:
>>
>>> Hi Again,
>>>
>>> I am supposed to evaluate the performance of a few triple stores as a part
>>> of my thesis work (which is the specification which I cannot change
>>> unfortunately); one among them is Jena SDB with MySQL. I am using my own
>>> Java code to load the data and not the command line tool, as I wanted to
>>> make note of the loading time. I am using .NT format of data for loading.
>>>
>>> I have 8 GB of RAM.
>>>
>>> Any thoughts/suggestions over this? Thanks for your help.
>>>
>>> On Wed, Sep 28, 2011 at 4:09 PM, Shri :)<[email protected]> wrote:
>>>
>>>> Hi Everyone,
>>>>
>>>> I am currently doing my master thesis wherein I have to work with Jena SDB
>>>> using MySQL as a backend store. I have around 25 million triples to load
>>>> which has taken more than 5 days to load in windows platform, whereas
>>>> according to the Berlin Benchmark, it took only 4 hours to load the same
>>>> number of triples but in Linux platform, this has left me confused. Is the
>>>> enormous difference because of the difference in the platform or should I
>>>> do any performance tuning/optimization to improve the load time?
>>>>
>>>> Kindly give your suggestions/comments
>>>>
>>>> P.S I am using WAMP
>>>>
>>>> Thanks
>>>>
>>>> Shridevika
