Hi Peter,

It would be useful to see what is actually in the sub-directories, especially 
the config files holding the graph names, which are the key here.  You give an 
example of "taxonconcept.ext.graph", which may be the source of your problem. 
This needs clarifying in the online docs, which also need an example using 
"ld_dir_all". The "ext" in the graph file name should actually be the extension 
of the data file you are loading, so "taxonconcept.ext.graph" should probably 
be "taxonconcept.rdf.graph", assuming you have a dataset file named 
"taxonconcept.rdf" in that directory.

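As a rough sketch of that naming rule, assuming a POSIX shell and that every 
.rdf file in a directory should go into the same graph, something like the 
following would create the companion files. The helper name write_graph_files 
is made up for illustration and is not part of Virtuoso:

```shell
#!/bin/sh
# Sketch only: write_graph_files is a hypothetical helper, not a Virtuoso tool.
# For each *.rdf dataset file in a directory it writes "<file>.graph",
# e.g. taxonconcept.rdf -> taxonconcept.rdf.graph, holding the graph name.
write_graph_files() {
    dir=$1      # directory holding the dataset files
    graph=$2    # graph name to record for each dataset file
    for data in "$dir"/*.rdf; do
        [ -e "$data" ] || continue               # the glob matched nothing
        printf '%s\n' "$graph" > "$data.graph"   # keep the full data file name
    done
}
```

For instance, calling write_graph_files /home/lod_data/taxonconcept 
'urn:org:linkedopenspeciesdata:dataspace:taxonconcept' would leave a 
taxonconcept.rdf.graph next to taxonconcept.rdf, if that file exists.
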
Basically, if a file called "global.graph" exists in a directory, its contents 
override the graph name specified in ld_dir_all() and are used for all dataset 
files in that directory.  The exception is a dataset file that has a companion 
file of the same name with ".graph" on the end: that file's contents are used 
instead as the graph name for loading that dataset file.  For example:

[virtuoso@opllinux5 database]$ ls -l dump
total 32
drwxrwxr-x 2 virtuoso virtuoso 4096 Oct 20 08:37 d1
drwxrwxr-x 2 virtuoso virtuoso 4096 Oct 20 08:36 d2
-rw-rw-r-- 1 virtuoso virtuoso  474 Oct 20 08:35 file.n3
-rw-rw-r-- 1 virtuoso virtuoso   12 Oct 20 08:37 global.graph
[virtuoso@opllinux5 database]$ ls -l dump/d1
total 16
-rw-rw-r-- 1 virtuoso virtuoso 474 Oct 20 08:35 file1.n3
-rw-rw-r-- 1 virtuoso virtuoso  13 Oct 20 08:37 file1.n3.graph
-rw-rw-r-- 1 virtuoso virtuoso  12 Oct 20 09:20 global.graph
[virtuoso@opllinux5 database]$ ls -l dump/d2
total 16
-rw-rw-r-- 1 virtuoso virtuoso 474 Oct 20 08:36 file2.n3
-rw-rw-r-- 1 virtuoso virtuoso  13 Oct 20 08:36 file2.n3.graph
-rw-rw-r-- 1 virtuoso virtuoso  12 Oct 20 09:20 global.graph
[virtuoso@opllinux5 database]$ more dump/global.graph 
http://file
[virtuoso@opllinux5 database]$ more dump/d1/file1.n3.graph 
http://file1
[virtuoso@opllinux5 database]$ more dump/d2/file2.n3.graph 
http://file2
$ isql 1162
Connected to OpenLink Virtuoso
Driver: 06.02.3128 OpenLink Virtuoso ODBC Driver
OpenLink Interactive SQL (Virtuoso), version 0.9849b.
Type HELP; for help and EXIT; to exit.
SQL> ld_dir_all ('./dump', '*.n3', 'http://hugh');

Done. -- 4 msec.
SQL> rdf_loader_run ();

Done. -- 5 msec.
SQL> SPARQL SELECT ?g count(*) WHERE {GRAPH ?g {?s ?p ?o} };
g                                           callret-1
VARCHAR                                     VARCHAR
_______________________________________________________________________________

http://file2                                3
http://file1                                3
http://opllinux5.usnet.private:8862/DAV     2245
http://file                                 3
http://www.openlinksw.com/schemas/virtrdf#  1813
http://www.w3.org/2002/07/owl#              160

6 Rows. -- 22 msec.
SQL>

So the "global.graph" name http://file in the dump directory overrode the graph 
name given in the ld_dir_all() call, and the file1.n3.graph name http://file1 
in the dump/d1 directory overrode the global.graph name; ditto for the dump/d2 
directory ...
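
To make that precedence concrete, here is a small shell sketch that mimics the 
rule for a single dataset file. It is only a model of the behaviour described 
in this message, not how Virtuoso implements it. resolve_graph is a 
hypothetical helper, and it only looks for global.graph in the dataset file's 
own directory, since whether a parent directory's global.graph is inherited is 
not covered here:

```shell
#!/bin/sh
# Sketch only: models the precedence described above, NOT Virtuoso's code.
# resolve_graph is a hypothetical helper that prints the graph name the
# rules would select for one dataset file.
resolve_graph() {
    data=$1       # path to the dataset file, e.g. dump/d1/file1.n3
    default=$2    # graph name passed to ld_dir_all()
    dir=$(dirname "$data")
    if [ -f "$data.graph" ]; then
        head -n 1 "$data.graph"          # 1. per-file "<file>.graph" wins
    elif [ -f "$dir/global.graph" ]; then
        head -n 1 "$dir/global.graph"    # 2. else the directory's global.graph
    else
        printf '%s\n' "$default"         # 3. else the ld_dir_all() argument
    fi
}
```

With the dump layout shown earlier, resolve_graph dump/d1/file1.n3 http://hugh 
would print http://file1, and resolve_graph dump/file.n3 http://hugh would 
print http://file.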

We are going to update the online docs to make this clearer ...

Best Regards
Hugh Williams
Professional Services
OpenLink Software
Web: http://www.openlinksw.com
Support: http://support.openlinksw.com
Forums: http://boards.openlinksw.com/support
Twitter: http://twitter.com/OpenLink

On 20 Oct 2010, at 02:54, Peter DeVries wrote:

> Questions about bulk loading .rdf from a directory path
> 
> I have a directory of .rdf files partitioned into the following graph 
> structure.
> 
> within /home/lod_data I have the following directories that contain .rdf 
> files.
> 
> /home/lod_data/bbc
> /home/lod_data/dbpedia
> /home/lod_data/eunis
> /home/lod_data/geonames
> /home/lod_data/global.graph
> /home/lod_data/gni
> /home/lod_data/index.rdf
> /home/lod_data/taxonconcept
> 
> There is a global.graph file, and a file in each of the directories for 
> the specific subgraph
> 
> for instance: taxonconcept/taxonconcept.ext.graph
> 
> Most of these are just one directory deep with .rdf files, one "dbpedia" has 
> subdirectories for species, authors etc.
> 
> If I try to run the following from isql-vt
> 
> SQL> ld_dir_all ('/home/lod_data', '*.rdf', 
> 'http://lsd.taxonconcept.org/dataspace');
> SQL> rdf_loader_run ();
> 
> It loads all the triples into the global graph 
> <http://lsd.taxonconcept.org/dataspace>
> 
> When I do a count by graph I don't see the subgraphs
> 
> SQL> SPARQL SELECT ?g count(*) WHERE {GRAPH ?g {?s ?p ?o} };
> 
> I had thought that this would work
> 
> ld_dir ('/home/lod_data', '*.rdf', 'http://lsd.taxonconcept.org/dataspace');
> 
> In the past ld_dir only worked when I load one graph at a time, for instance.
> 
> ld_dir ('/home/lod_data/taxonconcept', '*.rdf', 
> 'urn:org:linkedopenspeciesdata:dataspace:taxonconcept');
> rdf_loader_run ();
> 
> In my experience ld_dir works when the .rdf is all within the first 
> directory, no subdirectories.
> 
> While ld_dir_all works with directories that have subdirectories.
> 
> I have a slightly modified version of the example dbpedia bulk_loader.isql, 
> which I changed to use a different global graph and .rdf rather than .n3
> 
> I will attach to this message.
> 
> My goal is to understand some of the nuances of how to bulk load .rdf 
> from a directory structure like the one above, so I can make it easy for 
> others.
> 
> Specifically, a set of procedures which will create a global graph that 
> contains subgraphs based on a structure similar to what I have described.
> 
> Thanks in Advance,
> 
> - Pete
> 
> ----------------------------------------------------------------
> Pete DeVries
> Department of Entomology
> University of Wisconsin - Madison
> 445 Russell Laboratories
> 1630 Linden Drive
> Madison, WI 53706
> TaxonConcept Knowledge Base / GeoSpecies Knowledge Base
> About the GeoSpecies Knowledge Base
> ------------------------------------------------------------
> <bulk_loader_txn.isql>
> _______________________________________________
> Virtuoso-users mailing list
> https://lists.sourceforge.net/lists/listinfo/virtuoso-users
