Oh, I forgot to include the error message in the database. <DB NULL>

Query result:

Column          Type          Value
--------------  ------------  --------------------------------
VQ_HOST         VARCHAR       xapi.vocab.pub
VQ_TS           DATETIME      2015-10-27 23:25:38.613142
VQ_URL          VARCHAR       /datasets/adl/verbs/
VQ_ROOT         VARCHAR       home/dba/rdf_sink/adl/index.html
VQ_STAT         VARCHAR       retrieved
VQ_OTHER        VARCHAR       <DB NULL>
VQ_ERROR        LONG VARCHAR  <DB NULL>
VQ_LEVEL        INTEGER       0
VQ_VIA_SITEMAP  INTEGER       0
VQ_DT           TIMESTAMP     2015-11-12 21:08:17.809787
VQ_ORIGIN       IRI_ID        <DB NULL>



On Thu, Nov 12, 2015 at 4:16 PM, Haag, Jason <jhaa...@gmail.com> wrote:

> Hi All,
>
> I have been trying to understand how Virtuoso's crawler content import and
> sponging features work. I'm currently evaluating Virtuoso using 07.20.3214
> VOS.
>
> I set up three crawl jobs for three different HTML/RDFa files and received
> no errors.
>
> When I query the data through the sparql interface, it doesn't show up.
>
> For example, http://w3id.org/xapi/adb/verbs/ is the target URL of a crawl
> job I set up in conductor under content imports. I am using the xhtml/HTML5
> variants cartridge with the following options:
>
> fallback-mode=no
> rdfa=yes
> reify_html5md=0
> reify_rdfa=1
> reify_jsonld=0
> reify_all_grddl=0
> reify_html=0
> passthrough_mode=yes
> loose=yes
> reify_html_misc=no
> reify_turtle=no
>
> If I go to http://54.152.125.100:8890/sparql and use the following sparql
> query it returns no results:
>
> #Query all Verb IRIs
> PREFIX xapi: <https://w3id.org/xapi/ontology#>
>
> SELECT DISTINCT ?Verb
>
> WHERE {
>    ?Verb a xapi:Verb .
>
> }
>
>
> However, the data does start to show up in this query if I subsequently
> add http://w3id.org/xapi/adb/verbs/ as the default data set name / graph
> IRI in the sparql interface and also select the sponging option to download
> all RDF resources.
>
> Is this sponging option from the sparql interface actually
> adding/downloading the triples? Wouldn't this allow anyone who has access
> to the sparql interface to add triples? The faceted search interface seems
> to indicate so, as I did this with the following graph IRI,
> http://adlnet.gov/expapi/verbs
>
>
> http://54.152.125.100:8890/describe/?url=http%3A%2F%2Fadlnet.gov%2Fexpapi%2Fverbs&sid=4
>
> I tried to set up this IRI as a crawl job and it never populated
> virtuoso's data store. But as soon as I add it as a graph IRI using the
> sparql interface with sponging enabled, it shows up. Is this the expected
> behavior / by design for this sparql sponging option? I thought graphs and
> triples could only be added with special SPARQL permissions and using INSERT.
>
> I still don't think the crawler feature is working for HTML/RDFa. It
> appears to be processing and storing the HTML file in the
> repository/locally in virtuoso, but it doesn't seem to actually add the
> graph or triples to the database.
>
> Thanks in advance for your patience and help!
>
> J Haag
>
> -------------------------------------------------------
>
>
>
> On Wed, Oct 28, 2015 at 5:17 AM, Tim Haynes <thay...@openlinksw.com>
> wrote:
>
>>
>> On 27 October 2015 at 20:49, Haag, Jason <jhaa...@gmail.com> wrote:
>>
>>> I think I know the answer to my last two questions. I had additional
>>> html files below the /verbs/ directory. I believe that is where the
>>> duplicates came from. I'm guessing sponger also looks for any html files at
>>> the specified path, not just the "index.html" file that was specified as a
>>> target URL. Can anyone verify this?
>>
>>
>> Hi,
>>
>> It's unlikely - I don't know of anything in the Sponger that implements
>> directory browsing, but it may well be following e.g. <link
>> rel="alternate" href="...." /> to RSS/Atom feeds, etc.
>>
>> As Kingsley says, Faceted Browser will show you what graphs the triples
>> appear in.
>>
>> When a page is sponged, its URL becomes 1:1 the graph IRI in which data
>> from/about/in that resource is stored. Multiple graphs implies multiple
>> sponging events.
>>
>> HTH,
>>
>> ~Tim
>> --
>> Tim Haynes
>> Product Development Consultant
>> OpenLink Software
>> <http://www.openlinksw.com/>
>> <http://twitter.com/openlink>
>>
>
>
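
For anyone trying to reproduce the graph-scoped query described in the quoted
message without the web form, here is a minimal sketch in Python. It only
builds the request URL using the standard SPARQL Protocol parameters `query`
and `default-graph-uri`. The helper name `build_sparql_url` and the
`extra_params` argument are hypothetical, introduced for illustration; the
sponging checkbox of the web form is not modelled here, since any
endpoint-specific parameter for it would be an assumption.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# The verb query from the quoted message, unchanged in meaning.
VERB_QUERY = """PREFIX xapi: <https://w3id.org/xapi/ontology#>
SELECT DISTINCT ?Verb
WHERE { ?Verb a xapi:Verb . }"""


def build_sparql_url(endpoint, query, default_graph=None, extra_params=None):
    """Build a SPARQL Protocol GET URL (hypothetical helper).

    `query` and `default-graph-uri` are standard SPARQL Protocol parameters.
    `extra_params` is an assumption: a pass-through for any endpoint-specific
    options the caller knows about; nothing is added by default.
    """
    params = [("query", query)]
    if default_graph:
        params.append(("default-graph-uri", default_graph))
    if extra_params:
        params.extend(sorted(extra_params.items()))
    return endpoint + "?" + urlencode(params)


# Endpoint and graph IRI are the ones mentioned in the quoted message.
url = build_sparql_url(
    "http://54.152.125.100:8890/sparql",
    VERB_QUERY,
    default_graph="http://w3id.org/xapi/adb/verbs/",
)
print(url)
```

Leaving `default_graph` unset should correspond to the first query in the
quoted message, which ran without a named graph and returned no results.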
------------------------------------------------------------------------------
_______________________________________________
Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users
