Peter DeVries wrote:
Hi Hugh,

It appears that they are being loaded from the sitemap file correctly because I can query against them.

One related issue is that the setup for crawling the sitemap uses the sitemap URL as the named graph.

http://lod.taxonconcept.org/sitemap.xml.gz

Unfortunately the named graph http://lod.taxonconcept.org does not appear to show up as a named graph on Virtuoso.

Instead, it seems to default to using the URI of each species RDF file as the named graph.
The source file URL is the eventual Graph IRI.


http://lod.taxonconcept.org/ses/v6n7p.rdf
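
One way to see which graph IRIs the crawl actually produced is to list them with their triple counts. The following is a minimal sketch for the isql command line, assuming the crawled graphs all share the http://lod.taxonconcept.org/ prefix seen in the URLs above:

```sql
-- List every graph IRI under the site prefix together with its triple count.
-- The prefix is taken from the URLs quoted in this thread.
SPARQL
SELECT ?g (COUNT(*) AS ?triples)
WHERE
  {
    GRAPH ?g { ?s ?p ?o } .
    FILTER (REGEX (STR (?g), "^http://lod.taxonconcept.org/"))
  }
GROUP BY ?g
ORDER BY DESC (?triples);
```

If the crawler stores each document under its own URL, this should return one row per species RDF file and no row for http://lod.taxonconcept.org itself.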

I wonder if it would be useful to add a field to the setup menu that allows you to choose what named graph the data set should be loaded into.

Yes, this is a long overdue feature re. Crawler and Sponger.

Load these into this <> named graph.

This would allow me to add data sets from many different crawl locations into one named graph.

Yes.
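
Until the crawler offers such a setting, a single document can be loaded by hand into a graph of your choosing. This is only a sketch of that workaround, not the crawler's own mechanism; it uses DB.DBA.RDF_LOAD_RDFXML with the species URL from this thread, and the choice of http://lod.taxonconcept.org as the target graph is an assumption for illustration:

```sql
-- Fetch one species document and store its triples in a chosen graph.
-- Arguments: RDF/XML text, base IRI for resolving relative IRIs, target graph IRI.
DB.DBA.RDF_LOAD_RDFXML (
  http_get ('http://lod.taxonconcept.org/ses/v6n7p.rdf'),
  'http://lod.taxonconcept.org/ses/v6n7p.rdf',
  'http://lod.taxonconcept.org');

-- Confirm that the triples landed in the chosen graph.
SPARQL SELECT COUNT(*) FROM <http://lod.taxonconcept.org> WHERE { ?s ?p ?o };
```

Repeating the call for documents from other locations, with the same third argument, collects them all in one named graph.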

Kingsley

Thanks!

- Pete

On Fri, Jun 11, 2010 at 12:47 PM, Hugh Williams wrote:

    Hi Peter,

    So you are using the dump_graphs() function detailed at:

    http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtDumpLoadRdfGraphs

    I presume you can query these sitemap triples in your Virtuoso
    triple store? How were these sitemap triples loaded into
    Virtuoso in the first place? Is it just the sitemap triples that
    are missing, or possibly others as well?

    You could try dumping the triples again and use the Virtuoso
    trace_on() function to trace the dump sequence, to see which triples
    are being dumped and whether any errors are occurring. The trace
    output is written to the Virtuoso log file, and details on
    the trace_on() function can be found at:

    http://docs.openlinksw.com/virtuoso/fn_trace_on.html
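
    A complementary check is to dump a single graph that is known to come from the sitemap crawl, which narrows down whether the problem lies in the graph enumeration or in the dump itself. This is a rough sketch, assuming dump_one_graph is already defined as on the wiki page above and that the output directory is writable by the server (listed in DirsAllowed in virtuoso.ini); the graph IRI and directory are taken from elsewhere in this thread, and the file prefix single_test_ is a made-up name:

    ```sql
    -- Dump one graph that is known to come from the sitemap crawl.
    -- Arguments: source graph IRI, output file prefix, per-file size limit in bytes.
    dump_one_graph (
      'http://lod.taxonconcept.org/ses/v6n7p.rdf',
      '/usr/share/virtuoso/dump/single_test_',
      1000000000);
    ```

    If this produces a .ttl file containing the species triples, the dump procedure itself works, and attention can shift to which graphs the loop in dump_graphs() visits.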

    Best Regards
    Hugh Williams
    Professional Services
    OpenLink Software
    Web: http://www.openlinksw.com
    Support: http://support.openlinksw.com
    Forums: http://boards.openlinksw.com/support
    Twitter: http://twitter.com/OpenLink

    On 11 Jun 2010, at 16:53, Peter DeVries wrote:

    I am trying to get a data dump of all the triples in my Virtuoso
    Opensource triple store.

    I have this procedure defined:

    create procedure dump_graphs (
      in dir varchar := '/usr/share/virtuoso/dump/',
      in file_length_limit integer := 1000000000)
    {
      declare inx int;
      inx := 1;
      set isolation = 'uncommitted';
      -- Enumerate every graph except Virtuoso's own virtrdf: metadata graph.
      for (select * from (sparql define input:storage ""
             select distinct ?g { graph ?g { ?s ?p ?o } . filter ( ?g != virtrdf: ) }
           ) as sub option (loop)) do
        {
          -- "g" is a quoted identifier naming the ?g result column, not a string literal.
          dump_one_graph ("g", sprintf ('%s/graph%06d_', dir, inx), file_length_limit);
          inx := inx + 1;
        }
    }
    ;

    When I run dump_graphs():

    I get a set of .ttl files, the largest of which is about 10MB.

    These contain what appear to be Virtuoso triples, but not the
    triples I have loaded via my sitemap crawl.

    Any suggestions?

    Thanks!

    - Pete



    ----------------------------------------------------------------
    Pete DeVries
    Department of Entomology
    University of Wisconsin - Madison
    445 Russell Laboratories
    1630 Linden Drive
    Madison, WI 53706
    GeoSpecies Knowledge Base
    About the GeoSpecies Knowledge Base
    ------------------------------------------------------------
    
    _______________________________________________
    Virtuoso-users mailing list
    [email protected]
    https://lists.sourceforge.net/lists/listinfo/virtuoso-users






--

Regards,

Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen




