OK, I investigated further: I had a backup of the Virtuoso DB from just after loading the DBpedia dumps (en & de) and installing the DBpedia and rdf_mappers packages. I restored this backup and executed all 3 queries:
On Thursday 19 August 2010, Jörn Hees wrote:
> On Thursday 19 August 2010, Hugh Williams wrote:
> > SPARQL SELECT ?g count(*) WHERE { GRAPH ?g {?s ?p ?o.} } GROUP BY ?g
> > ORDER BY DESC 2;
> >
> > SPARQL SELECT DISTINCT ?g WHERE { GRAPH ?g {?s ?p ?o.} };
> >
> > select * from SPARQL_SELECT_KNOWN_GRAPHS_T;

All three return the same 26 graphs.

I then went to the Conductor / RDF / Schemas tab and imported
http://www.w3.org/2004/02/skos/core/history/2006-04-18.rdf .
Then I went to the Graphs tab, deleted the http://www.w3.org/2004/02/skos/core graph (worked fine), and renamed http://www.w3.org/2004/02/skos/core/history/2006-04-18.rdf to http://www.w3.org/2004/02/skos/core .

After this the first query returns 27 graphs, the other two 26. (In the first one, http://www.w3.org/2004/02/skos/core/history/2006-04-18.rdf still exists.)

I went back to the Schemas tab and deleted the http://www.w3.org/2004/02/skos/core/history/2006-04-18.rdf schema. The behavior stays the same: 27 vs. 26 graphs.

When I now try to delete http://www.w3.org/2004/02/skos/core from the Graphs tab, the first query returns 26 graphs, the second and third 25. The http://www.w3.org/2004/02/skos/core/history/2006-04-18.rdf graph is still there.

I then tried to *manually delete* the persisting graph:

    sparql clear graph <http://www.w3.org/2004/02/skos/core/history/2006-04-18.rdf>;

The result is even stranger. If I run

    sparql select count(*) from <http://www.w3.org/2004/02/skos/core/history/2006-04-18.rdf> where {?s ?p ?o};

I get 0 rows. BUT: my first SPARQL query still tells me that there are 1196 triples in that graph!

Also, if I execute this SQL query:

    select distinct id_to_iri(g) from rdf_quad;

or this one:

    select distinct id_to_iri(g), count(*) from rdf_quad group by g;

the *graph still exists*.

A control query:

    sparql select count(*) from <http://dbpedia.org> where {?s ?p ?o} ;

tells me that there are 258867871 rows in my DBpedia dump.
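
A narrower check on just that one graph, going directly to the quad table instead of through SPARQL, could look like the sketch below. It is an illustration only, not a result from the run described above: it assumes that the built-in iri_to_id() resolves the IRI to the same ID that is stored in the G column of DB.DBA.RDF_QUAD.

```sql
-- Sketch: count the quads still stored for the graph that refuses to go away.
-- Assumes iri_to_id() maps the IRI to the ID used in RDF_QUAD.G.
select count(*)
  from DB.DBA.RDF_QUAD
 where G = iri_to_id ('http://www.w3.org/2004/02/skos/core/history/2006-04-18.rdf');
```

If this returned a non-zero count while the SPARQL count on the same graph returns 0, that would point at leftover rows in the quad table rather than at the graph list alone.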
As a side question: why do all those queries take so long? Isn't there a primary key index on the rdf_quad table for g,s,p,o which they should be able to use to return the count in a split second?

If you want, I can provide a detailed description of how I imported the DBpedia dumps (en & de), and I could even give you the backup (11 GB gzipped).

Jörn