Hi,
I pinged Chris to ask him if he has solved his problem.
Local expertise is not in the office this week! :-)

Paolo Castagna wrote:
> Hi Chris,
> from your message it is not clear if you are using tdbloader, tdbloader2 or
> something else.
> 
> Also, you should make absolutely sure you are not running Fuseki or any other
> Java application pointing at the same TDB location while you do a load.

This was the issue.

Chris was loading data into TDB while running (and querying with) Fuseki.

TDB has concurrency checking only within a single JVM.
Running two JVMs pointed at the same TDB location is not supported
and is not safe.

Chris has confirmed that he reloaded all of his data successfully, making
sure Fuseki was not running at load time.

Incremental bulk loading without shutting down Fuseki is IMHO an interesting
use case for those using Fuseki in production to run their SPARQL endpoints.
I guess the answer to that is to use SPARQL Update or the Graph Store HTTP
Protocol, isn't it?
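
To make that concrete, here is a rough sketch of what the two HTTP requests
could look like, so that all writes go through the running Fuseki JVM instead
of a second process touching the TDB files. The endpoint URLs
(http://localhost:3030/ds/data and http://localhost:3030/ds/update), the graph
name and the function names are assumptions for illustration only; check the
service names your own Fuseki instance actually exposes.

```python
from urllib.parse import urlencode
from urllib.request import Request


def gsp_post_request(data_endpoint, turtle_bytes, graph_uri=None):
    """Build a Graph Store HTTP Protocol POST that adds triples to a graph.

    data_endpoint is a hypothetical Fuseki graph store URL.
    With graph_uri=None the request targets the default graph.
    """
    query = urlencode({"graph": graph_uri}) if graph_uri else "default"
    return Request(
        f"{data_endpoint}?{query}",
        data=turtle_bytes,
        method="POST",  # POST merges into the graph; PUT would replace it
        headers={"Content-Type": "text/turtle"},
    )


def sparql_update_request(update_endpoint, update_string):
    """Build a SPARQL 1.1 Update request sent as a direct POST body.

    update_endpoint is a hypothetical Fuseki update service URL.
    """
    return Request(
        update_endpoint,
        data=update_string.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/sparql-update"},
    )
```

Either request can then be sent with urllib.request.urlopen(req) against a
live server. For large incremental loads it is probably wise to send the data
in batches rather than in a single request, but I have not measured that.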

Cheers,
Paolo

> 
> Andy Seaborne wrote:
>> On 07/10/11 23:49, Chris Clarke wrote:
>>> Hi all,
>> Hi Chris,
>>
>>> I've loaded around 200M triples (from around 25 source datasets) into
>>> single tdb dataset. Fronted by Fuseki, I now get the following error
>>> when trying to SPARQL:
>> Which way are you loading them?  There are several ways of getting files
>> in with SPARQL: Update/LOAD or HTTP PUT/POST
>>
>>> Error 500: BlockMgrFile: Bounds exception: /mnt/ceu/DB/node2id.idn:
>>> (8220,7168)
>> I've not seen that before.  Is there more of a stacktrace in the log file?
>>
>> Does it occur at a particular point in the load?
>>
>>> Fuseki - version 0.2.1-SNAPSHOT (Date: 2011-09-08T16:38:26+0000)
>>>
>>> Any ideas?
>> There are 2 bulkloaders for loading from scratch - they both work by
>> directly manipulating the database so aren't suitable for use with an
>> online service.  But an online service is going to be pretty offline
>> while a large load is done so having a published copy of the database
>> and a staging version might make sense for other reasons.
>>
>>     Andy
>>
>> PS There is local expertise near you that is very familiar with the
>> codebase.
> 
> True, but posting questions on jena-users mailing list is much better than
> asking "local expertise" since others can benefit from the answers.
> Also, we get to see problems and/or needs of users wanting to use Fuseki.
> 
> So, I encourage Chris (now a Fuseki and TDB user) to continue posting his
> questions on jena-users mailing list. ;-)
> 
> (By the way, I fear the next big load which might come from Chris's team:
> 1 or 2 billion triples|quads. How much RAM would that need? Has someone
> ever attempted that before?)
> 
> Paolo
> 
>>> Chris
>>>
