Hi All,

I did the bulk loading through the command-line utility (with time), after some
tuning of MySQL (buffer pool and key buffer size), and the final loading time
came out to roughly five and a half hours for ~24 million triples, which seems
okay to me. Indexing this dataset took another 4 hours.
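
For reference, the load was run roughly like this (the dataset file name below
is a placeholder, and the buffer sizes are only indicative numbers for an 8 GB
machine, so please don't take them as a recommendation):

    sdbload --sdb=sdb2.ttl dataset.nt

with these two settings raised in my.ini before the load:

    [mysqld]
    innodb_buffer_pool_size = 4G
    key_buffer_size = 512M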

Any comments here??

I am now querying this dataset through the command-line utility again, where
the resulting tuples are printed along with the execution time. I would like to
know whether this reported execution time includes the printing time as well
(which I would *not* prefer). Kindly let me know.
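
In case it is useful, the way I plan to measure it from my own Java code
instead (so that printing stays out of the measurement) is roughly the sketch
below; here store is the SDB store opened as in my loading snippet further down
in this thread, and queryString is just a placeholder for the SPARQL query:

    // query classes are from com.hp.hpl.jena.query.*
    Dataset ds = SDBFactory.connectDataset(store);
    Query query = QueryFactory.create(queryString);
    QueryExecution qe = QueryExecutionFactory.create(query, ds);

    long start = System.currentTimeMillis();
    ResultSet rs = qe.execSelect();
    // consume() iterates over every result row without printing anything,
    // so the elapsed time below should exclude the printing cost
    int rows = ResultSetFormatter.consume(rs);
    long execMillis = System.currentTimeMillis() - start;
    qe.close();

    System.out.println(rows + " rows, executed in " + execMillis + " ms");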

Thanks to all of you for your advice, it was very helpful to me :)

BR,
Shri


On Fri, Sep 30, 2011 at 2:54 AM, Shri :) <[email protected]> wrote:

> Hello, sorry, my dataset is in .NT format.
>
>
> On Fri, Sep 30, 2011 at 2:52 AM, Shri :) <[email protected]> wrote:
>
>> Hi All,
>>
>>
>> @Damian: thanks for the link. I will now try increasing the
>> buffer_pool_size and carry out the loading. I will let you know how it goes.
>>
>> @Andy: "Are you using the sdb bulk loader or loading via your own code? What
>> format is the data in?"
>> "But why not use the sdbload tool? Take the source code and add whatever
>> extra timing you need (it already can print some timing info)."
>>
>>
>> I am using the following code, which I don't think is very different from
>> the one you suggested; *my data is in .TTL format*. Here is a snippet of my
>> code:
>>
>> StoreDesc storeDesc = StoreDesc.read("sdb2.ttl");
>> IDBConnection conn = new DBConnection(DB_URL, DB_USER, DB_PASSWD, DB);
>> conn.getConnection();
>> SDBConnection sdbconn = SDBFactory.createConnection(conn.getConnection());
>> Store store = SDBFactory.connectStore(sdbconn, storeDesc);
>> Model model = SDBFactory.connectDefaultModel(store);
>>
>> // read data into the database
>> InputStream inn = new FileInputStream("dataset_70000.nt");
>> long start = System.currentTimeMillis();
>> model.read(inn, "localhost", "TTL");
>> loadtime = ext.elapsedTime(start);
>>
>> // close the database connection
>> store.close();
>> System.out.println("Loading time: " + loadtime);
>>
>>
>>
>> @Dave: I think I followed the pattern suggested in the link you gave me
>> (http://openjena.org/wiki/SDB/Loading_data); the snippet above is from my
>> source code.
>> And one more thing: I didn't get the idea behind "Are you wrapping the load
>> in a transaction to avoid auto-commit costs?" Can you please elaborate a bit
>> on this? Sorry, I am still a relative novice.
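>>
>> From what I could read up so far, I *think* it means doing something like
>> the following around the model.read() call, but please correct me if I am
>> wrong (untested sketch, using the same conn and model objects as in my
>> snippet above):
>>
>> Connection jdbc = conn.getConnection();   // java.sql.Connection
>> jdbc.setAutoCommit(false);                // stop MySQL committing each statement
>> model.read(inn, "localhost", "TTL");
>> jdbc.commit();                            // commit the whole load at once
>> jdbc.setAutoCommit(true);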
>>
>>
>> Any thoughts on this? Thank you very much! :)
>>
>> BR,
>> shri
>>
>> On Thu, Sep 29, 2011 at 12:00 AM, Shri :) <[email protected]> wrote:
>>
>>>
>>> Hi Again,
>>>
>>> I am supposed to evaluate the performance of a few triple stores as part of
>>> my thesis work (a specification which, unfortunately, I cannot change). One
>>> of them is Jena SDB with MySQL. I am using my own Java code to load the data
>>> rather than the command-line tool, as I wanted to record the loading time. I
>>> am using data in .NT format for loading.
>>>
>>> I have 8 GB of RAM.
>>>
>>> Any thoughts/suggestions on this? Thanks for your help.
>>>
>>>
>>>
>>> On Wed, Sep 28, 2011 at 4:09 PM, Shri :) <[email protected]> wrote:
>>>
>>>> Hi Everyone,
>>>>
>>>> I am currently doing my master's thesis, in which I have to work with Jena
>>>> SDB using MySQL as the backend store. I have around 25 million triples to
>>>> load, and loading them has taken more than 5 days on Windows, whereas
>>>> according to the Berlin Benchmark it took only 4 hours to load the same
>>>> number of triples on Linux. This has left me confused: is the enormous
>>>> difference due to the platform, or should I do some performance
>>>> tuning/optimization to improve the load time?
>>>>
>>>> Kindly give your suggestions/comments.
>>>>
>>>> P.S. I am using WAMP.
>>>>
>>>>
>>>> Thanks
>>>>
>>>> Shridevika
>>>>
>>>
>>>
>>
>
