Hi Hugh,

It appears that the triples are being loaded from the sitemap file correctly, because I can query against them.
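
A query along these lines is one way to check which named graphs actually hold the crawled triples. It is only a sketch: it assumes it is run from isql, where the SPARQL keyword hands the statement to the SPARQL engine, and the ?triples alias is just an illustrative name.

```sql
-- Sketch: list every named graph in the store with its triple count,
-- largest graph first. System graphs will also appear in the result.
SPARQL
SELECT ?g (COUNT(*) AS ?triples)
WHERE { GRAPH ?g { ?s ?p ?o } }
GROUP BY ?g
ORDER BY DESC(?triples);
```

If the crawled data is present, the species documents should each appear as their own row in ?g.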

One related issue is that the setup for crawling the sitemap uses the sitemap URL as the named graph:

http://lod.taxonconcept.org/sitemap.xml.gz

Unfortunately, the named graph http://lod.taxonconcept.org does not show up as a named graph in Virtuoso. The crawl seems to default to using the URI of each species RDF document as the named graph, for example:

http://lod.taxonconcept.org/ses/v6n7p.rdf

I wonder if it would be useful to add a field to the setup menu that lets you choose which named graph the data set should be loaded into, something like:

Load these into this <> named graph.

This would allow me to add data sets from many different crawl locations into one named graph.

Thanks!

- Pete

On Fri, Jun 11, 2010 at 12:47 PM, Hugh Williams <[email protected]> wrote:

> Hi Peter,
>
> So you are using the dump_graphs() function detailed at:
>
> http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtDumpLoadRdfGraphs
>
> I presume you can query these sitemap triples in your Virtuoso triple
> store? How were these sitemap triples loaded into Virtuoso in the first
> place? Is it just the sitemap triples that are missing, or possibly others?
>
> You could try dumping the triples again and use the Virtuoso trace_on()
> function to trace the dump sequence, to see which triples are being dumped
> and whether any errors are occurring. The trace output is written to the
> Virtuoso log file, and details on the trace_on() function can be found at:
>
> http://docs.openlinksw.com/virtuoso/fn_trace_on.html
>
> Best Regards
> Hugh Williams
> Professional Services
> OpenLink Software
> Web: http://www.openlinksw.com
> Support: http://support.openlinksw.com
> Forums: http://boards.openlinksw.com/support
> Twitter: http://twitter.com/OpenLink
>
> On 11 Jun 2010, at 16:53, Peter DeVries wrote:
>
> I am trying to get a data dump of all the triples in my Virtuoso Open Source
> triple store.
>
> I have this procedure defined:
>
> create procedure dump_graphs (in dir varchar := '/usr/share/virtuoso/dump/',
>                               in file_length_limit integer := 1000000000)
> {
>   declare inx int;
>   inx := 1;
>   set isolation = 'uncommitted';
>   -- one row per named graph, skipping the virtrdf: system graph
>   for (select * from (sparql define input:storage ""
>                       select distinct ?g { graph ?g { ?s ?p ?o } .
>                                            filter ( ?g != virtrdf: ) } ) as sub option (loop)) do
>     {
>       -- "g" is the ?g column of the current row, i.e. the graph IRI
>       dump_one_graph ("g", sprintf ('%s/graph%06d_', dir, inx), file_length_limit);
>       inx := inx + 1;
>     }
> }
> ;
>
> When I run
>
> dump_graphs();
>
> I get a set of .ttl files, the largest of which is about 10 MB.
>
> These contain what appear to be Virtuoso triples, but not the triples I
> have loaded via my sitemap crawl.
>
> Any suggestions?
>
> Thanks!
>
> - Pete
>
> ----------------------------------------------------------------
> Pete DeVries
> Department of Entomology
> University of Wisconsin - Madison
> 445 Russell Laboratories
> 1630 Linden Drive
> Madison, WI 53706
> GeoSpecies Knowledge Base
> About the GeoSpecies Knowledge Base
> ------------------------------------------------------------
> _______________________________________________
> Virtuoso-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/virtuoso-users
