Oh I forgot to include the error message in the database. <DB NULL>

Query result:

VQ_HOST         VARCHAR       xapi.vocab.pub
VQ_TS           DATETIME      2015-10-27 23:25:38.613142
VQ_URL          VARCHAR       /datasets/adl/verbs/
VQ_ROOT         VARCHAR       home/dba/rdf_sink/adl/index.html
VQ_STAT         VARCHAR       retrieved
VQ_OTHER        VARCHAR       <DB NULL>
VQ_ERROR        LONG VARCHAR  <DB NULL>
VQ_LEVEL        INTEGER       0
VQ_VIA_SITEMAP  INTEGER       0
VQ_DT           TIMESTAMP     2015-11-12 21:08:17.809787
VQ_ORIGIN       IRI_ID        <DB NULL>

On Thu, Nov 12, 2015 at 4:16 PM, Haag, Jason <jhaa...@gmail.com> wrote:

> Hi All,
>
> I have been trying to understand how virtuoso's crawler content import and
> sponging features work. I'm currently evaluating virtuoso using 07.20.3214
> VOS.
>
> I set up three crawl jobs for three different HTML/RDFa files and received
> no errors.
>
> When I attempt to use the sparql interface to query the data it doesn't
> show up:
>
> For example, http://w3id.org/xapi/adb/verbs/ is the target URL of a crawl
> job I set up in conductor under content imports. I am using the xhtml/HTM5
> variants cartridge with the following options:
>
> fallback-mode=no
> rdfa=yes
> reify_html5md=0
> reify_rdfa=1
> reify_jsonld=0
> reify_all_grddl=0
> reify_html=0
> passthrough_mode=yes
> loose=yes
> reify_html_misc=no
> reify_turtle=no
>
> If I go to http://54.152.125.100:8890/sparql and use the following sparql
> query it returns no results:
>
> #Query all Verb IRIs
> PREFIX xapi: <https://w3id.org/xapi/ontology#>
>
> SELECT DISTINCT ?Verb
>
> WHERE {
> ?Verb a xapi:Verb .
>
> }
>
>
> However, the data does start to show up in this query if I subsequently
> add http://w3id.org/xapi/adb/verbs/ as the default data set name / graph
> IRI in the sparql interface and also select the sponging option to download
> all RDF resources.
>
> Is this sponging option from the sparql interface actually adding/download
> the triples? Wouldn't this allow anyone to add triples that has access to
> the sparql interface? The faceted search interface seems to indicate so as
> I did this with
> the following graph IRI, http://adlnet.gov/expapi/verbs
>
>
> http://54.152.125.100:8890/describe/?url=http%3A%2F%2Fadlnet.gov%2Fexpapi%2Fverbs&sid=4
>
> I tried to set up this IRI as a crawl job and it never populated
> virtuoso's data store. But as soon as I add it as a graph IRI using the
> sparql interface and sponging it shows up. Is this the expected behavior /
> by design for this sparql sponging option? I thought graphs and triples
> could only be added with special SPARQL permissions and using INSERT.
>
> I still don't think the crawler feature is working for HTML/RDFa. It
> appears to be processing and storing the HTML file in the
> repository/locally in virtuoso, but it doesn't seem to actually add the
> graph or triples to the database.
>
> Thanks in advance for your patience and help!
>
> J Haag
>
> -------------------------------------------------------
>
>
>
> On Wed, Oct 28, 2015 at 5:17 AM, Tim Haynes <thay...@openlinksw.com>
> wrote:
>
>>
>> On 27 October 2015 at 20:49, Haag, Jason <jhaa...@gmail.com> wrote:
>>
>>> I think I know the answer to my last two questions. I had additional
>>> html files below the /verbs/ directory. I believe that is where the
>>> duplicates came from. I'm guessing sponger also looks for any html files at
>>> the specified path, not just the "index.html" file that was specified as a
>>> target URL. Can anyone verify this?
>>
>>
>> Hi,
>>
>> It's unlikely - I don't know of anything in the Sponger that implements
>> directory browsing, but it may well be following e.g. <link
>> rel="alternate" href="...." /> to RSS/Atom feeds, etc.
>>
>> As Kingsley says, Faceted Browser will show you what graphs the triples
>> appear in.
>>
>> When a page is sponged, its URL becomes 1:1 the graph IRI in which data
>> from/about/in that resource is stored. Multiple graphs implies multiple
>> sponging events.
>>
>> HTH,
>>
>> ~Tim
>> --
>> Tim Haynes
>> Product Development Consultant
>> OpenLink Software
>> <http://www.openlinksw.com/>
>> <http://twitter.com/openlink>
>>
>
>
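
For anyone following the thread: since a sponged page's URL becomes the graph IRI holding its data, one way to check whether the crawl job stored any triples is to ask which named graphs contain the xapi:Verb instances. The following is only a rough sketch of how such a request could be built. The function names are made up for illustration, the endpoint is the one quoted in the message above, and it relies only on the `query` and `default-graph-uri` parameters of the standard SPARQL protocol. It builds the request URL and does not send anything.

```python
from urllib.parse import urlencode

# Prefix IRI taken from the query quoted in the thread.
XAPI_PREFIX = "https://w3id.org/xapi/ontology#"


def graphs_for_verbs_query(prefix_iri=XAPI_PREFIX):
    """Return a SPARQL query listing each xapi:Verb with the graph it is stored in."""
    return (
        f"PREFIX xapi: <{prefix_iri}>\n"
        "SELECT DISTINCT ?g ?Verb\n"
        "WHERE {\n"
        "  GRAPH ?g { ?Verb a xapi:Verb . }\n"
        "}\n"
    )


def sparql_get_url(endpoint, query, default_graph_uri=None):
    """Build a SPARQL-protocol GET URL (hypothetical helper; no request is sent)."""
    params = [("query", query)]
    if default_graph_uri is not None:
        # Optional: restrict the default graph, as done in the SPARQL form.
        params.append(("default-graph-uri", default_graph_uri))
    return f"{endpoint}?{urlencode(params)}"


# Example: URL for the endpoint mentioned in the thread.
url = sparql_get_url("http://54.152.125.100:8890/sparql", graphs_for_verbs_query())
```

If the query returns rows only after the sponging option has been used in the SPARQL form, that would suggest the triples came from the sponge request and not from the crawl job.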
------------------------------------------------------------------------------
_______________________________________________
Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users