Very good thinking, Batman! On the wireless at the Panera, but going to abandon it soon, as they don't believe in AC here.

Da Stink

On Jun 24, 2012, at 9:37 AM, Stefan Scheffler wrote:
> On 24.06.2012 18:29, Paolo Castagna wrote:
>> Stefan Scheffler wrote:
>>> Hey Paolo,
>>> Thanks for your reply. I used tdbloader2 with my own Tokenizer /
>>> ErrorHandler (which just catches/skips errors and writes them into a
>>> file). The command was: ./tdbloader2 --loc=<store> <srcpath>/*
>>>
>>> Is there a way to do incremental loads with the script files, or do I
>>> have to write my own program?
>> Hi Stefan,
>> if you want to run an incremental load you should use tdbloader, not
>> tdbloader2. tdbloader supports incremental loads; tdbloader2 does not.
>>
>> If you are loading large datasets, make sure you have enough RAM (you
>> can load the data on a machine with a lot of RAM and move the indexes
>> elsewhere afterwards).
>>
>> Paolo
> Thank you, I will try it tomorrow.
> Stefan
>>
>>> Regards,
>>> Stefan
>>>
>>> On 24.06.2012 10:42, Paolo Castagna wrote:
>>>> Hi Stefan,
>>>> as Rob said, loading data into an empty TDB store is different from
>>>> loading data into an existing TDB store.
>>>>
>>>> I assume that for your second data load you used tdbloader, not
>>>> tdbloader2.
>>>>
>>>> tdbloader2 does not support incremental data loads at all (i.e. it
>>>> will overwrite your existing data). I suspect this is what is going
>>>> on.
>>>>
>>>> Can you share the exact commands you used, as well as links to the
>>>> RDF data? (That way others can replicate your experiments.)
>>>>
>>>> Regards,
>>>> Paolo
>>>>
>>>> Stefan Scheffler wrote:
>>>>> Hello,
>>>>> at the moment I am doing some performance checks on TDB. The first
>>>>> thing I checked was the import with tdbloader2, and I got some weird
>>>>> results. Maybe someone can help me out. Here are my test setup and
>>>>> the results.
>>>>>
>>>>> The first test was to load 12 GB of triples into an empty store (I
>>>>> used the German DBpedia).
>>>>>
>>>>> Load time: 16 minutes
>>>>> Average load rate: ca. 81,000 triples/second
>>>>> Index time: 40 minutes
>>>>> Store size: 9.3 GB
>>>>>
>>>>> The second test was to load the same data into an already filled
>>>>> store. Before I started the import, I created a store with
>>>>> 348,398,593 triples from DNB and HBZ (which are German libraries;
>>>>> store size: 33 GB). Then I started to load the German DBpedia into
>>>>> it.
>>>>>
>>>>> Load time: 3 hours and 4 minutes
>>>>> Average load rate: ca. 7,200 triples/second
>>>>> Index time: 38 minutes
>>>>> Store size: 19 GB!
>>>>>
>>>>> Why does the load time increase so immensely? My expectation was
>>>>> that the index time would increase, but it does not. There were no
>>>>> other big processes running at the time. And why does the store size
>>>>> shrink to 19 GB? I am totally confused about that point.
>>>>>
>>>>> With friendly regards,
>>>>> Stefan
>>>>>
>>>
>
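For readers following along: based on Paolo's advice above, an incremental load workflow would look roughly like the sketch below. The store location and data file paths are placeholders, and it assumes the Jena TDB command-line scripts are on the PATH.

```shell
# Initial bulk load into an empty (or not-yet-existing) store location.
# tdbloader creates the store if it does not exist.
tdbloader --loc=/data/tdb-store dnb/*.nt hbz/*.nt

# Incremental load: tdbloader adds the new data to the EXISTING store,
# keeping the triples that are already there.
tdbloader --loc=/data/tdb-store dbpedia-de/*.nt

# tdbloader2, by contrast, must only ever be pointed at an empty/new
# location -- it rebuilds the indexes from scratch, so running it against
# an existing store clobbers the data that was there before.
tdbloader2 --loc=/data/tdb-store-new dbpedia-de/*.nt
```

Note that an incremental load into a large existing store is expected to be slower per triple than a load into an empty store, since new entries must be inserted into already-populated indexes rather than written sequentially.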