Very good thinking, Batman!
On the wireless at the Panera but going to abandon it soon as they don't 
believe in AC here.
Da Stink
On Jun 24, 2012, at 9:37 AM, Stefan Scheffler wrote:

> On 24.06.2012 18:29, Paolo Castagna wrote:
>> Stefan Scheffler wrote:
>>> Hey Paolo.
>>> Thanks for your reply.
>>> I used tdbloader2 with my own Tokenizer / ErrorHandler (which just
>>> catches / skips errors and writes them into a file).
>>> The command was ./tdbloader2 --loc=<store> <srcpath>/*
>>> 
>>> Is there a possibility to do incremental loads with the script files, or
>>> do I have to write my own program?
>> Hi Stefan,
>> if you want to run an incremental load you should use tdbloader, not
>> tdbloader2.
>> tdbloader supports incremental loads; tdbloader2 does not.
>> 
>> If you are loading large datasets make sure you have enough RAM (you can load
>> data on a machine with a lot of RAM and move indexes elsewhere).
>> 
>> Paolo
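[For readers following the thread: the distinction Paolo describes can be sketched with two commands. This is a minimal illustration, not from the original mails; the store directory and data paths are hypothetical.]

```shell
# One-off bulk load into an EMPTY store: tdbloader2 is fastest here,
# but it rebuilds the store from scratch (no incremental support).
tdbloader2 --loc=/data/tdb /data/dump/*.nt

# Later, additional data goes through tdbloader, which appends to the
# existing store instead of overwriting it.
tdbloader --loc=/data/tdb /data/updates/*.nt
```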
> Thank you. I will try it tomorrow.
> Stefan
>> 
>>> Regards,
>>> Stefan
>>> 
>>> On 24.06.2012 10:42, Paolo Castagna wrote:
>>>> Hi Stefan,
>>>> as Rob said, loading data into an empty TDB store is different from
>>>> loading data into an existing TDB store.
>>>> 
>>>> I assume that for your second data load you used tdbloader not
>>>> tdbloader2.
>>>> 
>>>> tdbloader2 does not even support incremental data loads (i.e. it will
>>>> overwrite
>>>> your existing data). I suspect this is what is going on.
>>>> 
>>>> Can you share the exact commands you used as well as links to the RDF
>>>> data?
>>>> (this way others can replicate your experiments).
>>>> 
>>>> Regards,
>>>> Paolo
>>>> 
>>>> Stefan Scheffler wrote:
>>>>> Hello,
>>>>> At the moment I am doing some performance checks on TDB. The first thing I
>>>>> checked was the import with tdbloader2, and I got some weird results.
>>>>> Maybe someone can help me out. Here is my test setup and the results.
>>>>> 
>>>>> The first test was to store 12 GB of triples into an empty store (I used
>>>>> the German DBpedia).
>>>>> 
>>>>> Load time: 16 minutes
>>>>> average loading rate: ca. 81,000 triples / second
>>>>> index time: 40 minutes
>>>>> store size: 9.3 GB
>>>>> 
>>>>> 
>>>>> The second test was to store the same data into an already filled store.
>>>>> Before I started the import I created a store with 348,398,593 triples from
>>>>> DNB and HBZ (which are German libraries; store size: 33 GB).
>>>>> Then I started to load the German DBpedia into it.
>>>>> 
>>>>> Load time: 3 hours and 4 minutes
>>>>> average loading rate: ca. 7,200 triples / second
>>>>> index time: 38 minutes
>>>>> store size: 19 GB!!!!!
>>>>> 
>>>>> Why does the loading time increase so much? My expectation was that
>>>>> the index time would increase, but it does not. There were no other big
>>>>> processes running at the time. And why did the store size shrink to 19 GB?
>>>>> I am totally confused about that point.
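[A quick arithmetic sanity check on the figures reported above, using only the numbers from the email: the average rates and load durations imply roughly the same total triple count for both runs, which makes the ~11x slowdown the interesting part, not the data volume.]

```python
# Estimate total triples loaded from an average rate and a duration.
# All input numbers are taken from the email above; nothing is measured here.

def triples_loaded(rate_per_second: int, minutes: int) -> int:
    """Total triples = average rate * duration in seconds."""
    return rate_per_second * minutes * 60

# First test: empty store, ca. 81,000 triples/s for 16 minutes.
empty_store = triples_loaded(81_000, 16)        # ~77.8 million triples

# Second test: pre-filled store, ca. 7,200 triples/s for 3 h 4 min.
filled_store = triples_loaded(7_200, 3 * 60 + 4)  # ~79.5 million triples

slowdown = 81_000 / 7_200  # 11.25x slower on the pre-filled store

print(empty_store, filled_store, slowdown)
```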
>>>>> 
>>>>> With kind regards
>>>>> Stefan
>>>>> 
>>> 
> 
> 
