Firstly, I run "riot --validate" on the data before loading.
After loading I do
SELECT (count(*) as ?c)
{ { ?s ?p ?o } UNION { GRAPH ?g { ?s ?p ?o } } }
The key is that it goes to the end of the SPO index (hence checking
that the end of the load happened) and it is reasonably quick.
Note that this is the triple count, so if you have duplicates in the
data it will be less than wc -l of the N-Quads files.
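
To get a number from the input files that is closer to the query result
than a plain wc -l, the duplicate lines can be collapsed first. The
following is only a rough sketch: distinct_quads is a made-up helper
name, and the comparison is purely textual, so two lines that spell the
same quad differently (whitespace, escapes) still count twice.

```shell
# Hypothetical helper: count distinct N-Quads lines across one or more files.
# Blank lines and comment lines (starting with '#') are skipped.
distinct_quads() {
    cat -- "$@" |
        awk 'NF && $1 !~ /^#/' |   # drop empty lines and comments
        LC_ALL=C sort -u |         # collapse exact duplicate lines
        awk 'END { print NR }'     # print how many lines are left
}
```

Called as distinct_quads dump-*.nq (file names are placeholders), it
prints a single number to set against ?c from the count query.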
I also do a
SELECT * { <uri> ?p ?o }
for a <uri> known to be a subject.
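
If no subject is known offhand, a few can be pulled out of the input
files. A minimal sketch, assuming every data line starts with its
subject and that IRI subjects are enough for a spot check (blank-node
subjects are skipped); spot_check_queries is a made-up name.

```shell
# Hypothetical helper: print one spot-check query per sampled subject IRI.
# $1 = N-Quads file, $2 = how many subjects to take (default 3).
spot_check_queries() {
    file=$1
    n=${2:-3}
    # Take the leading <...> of each line, de-duplicate, keep the first n.
    grep -o '^<[^>]*>' "$file" | LC_ALL=C sort -u | head -n "$n" |
    while IFS= read -r subject; do
        printf 'SELECT * { %s ?p ?o }\n' "$subject"
    done
}
```

For subjects that only occur in named graphs, the pattern would need to
sit inside GRAPH ?g { ... } unless the store is set up with a union
default graph.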
Andy
On 09/11/11 23:59, Paolo Castagna wrote:
Peter Jungen wrote:
Hello dear jena team,
I have loaded a dump (nquads format), numerous files, into my TDB using
tdbloader2.
How do I verify the loading of the dump was indeed completely successful?
regards
Pete
Hi Pete,
I usually use tdbdump. See tdbdump --help:
tdbdump : Write N-Quads to stdout
  Location
      --loc=DIR      Location (a directory)
      --tdb=         Assembler description file
  Symbol definition
      --set          Set a configuration symbol to a value
      --strict       Operate in strict SPARQL mode (no extensions of any kind)
      --desc=        Assembler description file
  General
      -v --verbose   Verbose
      -q --quiet     Run with minimal output
      --debug        Output information for debugging
      --help
      --version      Version information
You can also load the data with tdbloader, dump that out, sort it, and
diff it with what you get from tdbloader2 (there should be no
differences apart from blank nodes).
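
The sort-and-diff step could look roughly like the sketch below. It
assumes the two dumps have already been written to files (for example
by redirecting the output of tdbdump --loc=DIR for each store);
compare_dumps is a made-up name. The blank-node filter is crude: it
drops any line with "_:" at the start or after whitespace, which would
also drop a line whose literal happens to contain that text.

```shell
# Hypothetical helper: compare two N-Quads dumps, ignoring blank-node lines.
# Returns 0 if the remaining lines are identical, non-zero otherwise.
compare_dumps() {
    a_sorted=$(mktemp) || return 2
    b_sorted=$(mktemp) || return 2
    # Drop lines mentioning a blank node, then sort for a stable comparison.
    awk '!/(^|[ \t])_:/' "$1" | LC_ALL=C sort > "$a_sorted"
    awk '!/(^|[ \t])_:/' "$2" | LC_ALL=C sort > "$b_sorted"
    status=0
    diff "$a_sorted" "$b_sorted" || status=$?
    rm -f "$a_sorted" "$b_sorted"
    return "$status"
}
```

Lines with blank nodes are left out entirely here, so they still need
to be checked some other way, for instance by comparing their counts.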
A small TDB "verifier" is here:
https://github.com/castagna/tdbloader3/blob/master/src/test/java/dev/TDBVerifier.java
It's not an "official" command for TDB, just a quick hack, but you
could help improve it and contribute it back to TDB as a command that
checks the integrity of an index and helps people understand whether
an index is corrupted and why.
Hope this helps.
Paolo