Re: [Virtuoso-users] What is the best way of migrating large RDF database from Virtuoso 5.0.x to 6

2009-07-07 Thread Tim Haynes

Matthias Samwald wrote:


Dear all,
 
I am facing a relatively simple problem, but it seems like there is no 
mechanism to really address this issue: I have a large RDF database 
(>400 million triples, dozens of named graphs) in a Virtuoso 5 instance 
that I want to transfer to a Virtuoso 6 instance. What is the best way 
of doing that? The main problem is re-creating the named graphs. 
Virtuoso still has no mechanism to dump RDF data including graph 
provenance (correct?),


You can write a simple enough stored procedure to do that.

-- Dump a graph to specified filename in DAV
create procedure me..dumpgraph (in iri varchar, in fname varchar)
{
  declare res, spq, state, msg, maxrows, metas, rset, ses, triples, s, path any;


  spq:=sprintf('sparql CONSTRUCT { ?s ?p ?o . } from <%s> WHERE { ?s ?p ?o . }',iri);


-- first dump to TTL files
  dbg_printf('Preparing query..\n');
  ses:=string_output();

  exec (spq, state, msg, vector(), maxrows, metas, rset);

  path:=sprintf('/DAV/home/me/dumps/%s.ttl', fname);
  dbg_printf('Path is: [%s]', path);

  triples := rset[0][0];
  rset := dict_list_keys (triples, 1);
  DB.DBA.RDF_TRIPLES_TO_TTL (rset, ses);
  s := string_output_string (ses);

  -- Turtle content, so use a Turtle MIME type; the last argument is the DAV password
  DAV_RES_UPLOAD (path, s, 'text/turtle', '110100100NM', 'me', null, 'me', 'mypassword');


-- Then dump to RDF/XML files
  dbg_printf('Preparing query..\n');
  ses:=string_output();
  exec (spq, state, msg, vector(), maxrows, metas, rset);

  path:=sprintf('/DAV/home/me/dumps/%s.rdf', fname);
  dbg_printf('Path is: [%s]', path);

  triples := rset[0][0];
  rset := dict_list_keys (triples, 1);
  DB.DBA.RDF_TRIPLES_TO_RDF_XML_TEXT (rset, 1, ses);
  s := string_output_string (ses);

  DAV_RES_UPLOAD (path, s, 'application/rdf+xml', '110100100NM', 'me', null, 'me', 'mypassword');


-- Log the graph IRI/fname mapping
  spq:=sprintf('sparql PREFIX rdfs: 
\ninsert into graph 
 { <%s>  
 }', iri, fname);


  dbg_printf('Updating quadstore with [%s]\n', spq);
  exec (spq, state, msg, vector(), maxrows, metas, rset);
};
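
To cover every named graph, you could wrap that in a loop along these 
lines. This is only an untested sketch: the procedure name 
me..dumpallgraphs and the graph1, graph2, ... filenames are made up for 
illustration, and the select distinct over DB.DBA.RDF_QUAD may take a 
while on a store of this size.

-- Call dumpgraph once per named graph (untested sketch)
create procedure me..dumpallgraphs ()
{
  declare n integer;
  n := 0;

  for (select distinct id_to_iri (G) as giri from DB.DBA.RDF_QUAD) do
    {
      n := n + 1;
      -- hypothetical naming scheme: graph1, graph2, ...
      me..dumpgraph (giri, sprintf ('graph%d', n));
    }
};

The IRI/fname mapping that dumpgraph logs is what lets you load each 
file back into the right graph on the Virtuoso 6 side.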

Obviously you don't have to use DAV output if you don't want to; you can 
write to the filesystem instead, as long as the target directory is listed 
in DirsAllowed in the ini-file.
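
For the filesystem route, something like the following should do. Again 
a sketch, not tested against your setup: the name me..dumpgraph_fs and 
the ./dumps directory are assumptions (the directory has to be covered by 
DirsAllowed), and the -2 mode asks string_to_file to truncate the file 
before writing.

-- Dump a graph as TTL to a server-side file instead of DAV (sketch)
create procedure me..dumpgraph_fs (in iri varchar, in fname varchar)
{
  declare spq, state, msg, maxrows, metas, rset, ses, triples, s, path any;

  spq := sprintf ('sparql CONSTRUCT { ?s ?p ?o . } from <%s> WHERE { ?s ?p ?o . }', iri);
  exec (spq, state, msg, vector(), maxrows, metas, rset);

  -- same serialisation steps as in dumpgraph above
  triples := dict_list_keys (rset[0][0], 1);
  ses := string_output();
  DB.DBA.RDF_TRIPLES_TO_TTL (triples, ses);
  s := string_output_string (ses);

  -- assumed path; must be under a directory listed in DirsAllowed
  path := sprintf ('./dumps/%s.ttl', fname);
  string_to_file (path, s, -2);
};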


HTH,

~Tim
--
Tim Haynes
Product Development Consultant
OpenLink Software




[Virtuoso-users] What is the best way of migrating large RDF database from Virtuoso 5.0.x to 6

2009-07-07 Thread Matthias Samwald
Dear all,

I am facing a relatively simple problem, but it seems like there is no 
mechanism to really address this issue: I have a large RDF database (>400 
million triples, dozens of named graphs) in a Virtuoso 5 instance that I want 
to transfer to a Virtuoso 6 instance. What is the best way of doing that? The 
main problem is re-creating the named graphs. Virtuoso still has no mechanism 
to dump RDF data including graph provenance (correct?), and using the 
backup/restore functionality will not work when different versions of Virtuoso 
are involved (correct?).

It would be great to have a simple feature in Virtuoso that allows the user to 
dump RDF data including the graph information (e.g., in TriX format, or in 
'N-Quadruples' format).

Cheers,
Matthias