Re: [Virtuoso-users] Automating RDF data imports in VIrtuoso

2015-09-30 Thread Haag, Jason
Following up on my original inquiry:

What is the best option for automating the import and update of
RDFa/HTML data on a regular basis into the virtuoso DB?

I'm able to use the crawler to import RDF/XML graph (.rdf) URIs
directly, but I receive the following error page when I use an
HTML/RDFa URI:

This page contains the following errors:

error on line 22 at column 8: Opening and ending tag mismatch: link
line 0 and head

Below is a rendering of the page up to the first error.


There are no further details. There are no HTML validation errors. The
options I checked during import include:

Semantic Web Crawling
Follow URLs outside target host
Accept RDF

When looking at the source of the HTML page, line 22 is where the
<head> element ends, and this is as far as the page rendered. There
are no errors in the HTML (I'm using HTML5), but I wonder whether
Virtuoso only works with XHTML doctype declarations. I'd appreciate
any ideas or experience you can share on RDFa support.




---
+1.850.266.7100(office)
+1.850.471.1300 (mobile)
jhaag75 (skype)
http://jasonhaag.com (Web)
http://twitter.com/mobilejson (Twitter)
http://linkedin.com/in/jasonhaag (LinkedIn)



On Tue, Sep 29, 2015 at 9:57 AM, Haag, Jason  wrote:
> Following up on my original inquiry: I currently have several RDF
> datasets available on my server. Each data set has an RDF dump
> available as RDF/XML, JSON-LD, and Turtle. These dumps are generated
> automatically without virtuoso from an HTML page marked up using RDFa.
>
> What is the best option for automating the import of this data on a
> regular basis into the virtuoso DB? Ideally I would like to import
> the RDFa data automatically, but RDF/XML or Turtle files would be
> fine too. I tried this with the attached settings, but the data
> doesn't appear in the database. What do I need to enable or change in
> my settings in order to automatically import RDF data? See attached
> screen captures. Thanks for any tips or advice!
>
>
> On Mon, Sep 28, 2015 at 4:20 PM, Haag, Jason  wrote:
>> What would the steps/instructions be to set up an automatic import for
>> 7.2.1? The instructions and screens here don't match the new interface
>> and field options:
>> http://docs.openlinksw.com/virtuoso/rdfinsertmethods.html#rdfinsertmethodvirtuosocrawler
>>
>> For example, there is no longer a field for "Local WebDAV Identifier"
>> which was previously required.
>>
>> On Sat, Sep 26, 2015 at 5:39 PM, Paul Houle  wrote:
>>> I like the cloud solution of creating a new Virtuoso system, doing the
>>> load, having plenty of time to test it, then replacing the production
>>> instance with the new instance and retiring the old production instance.
>>>
>>> The main advantage here is that there is no way a screw-up in the load
>>> procedure can trash the production system. Even if Virtuoso were entirely
>>> reliable, as the data sources grow the rate of exceptional events (say you
>>> fill the disk) goes up. The temporary server approach eliminates a lot of
>>> headaches and it is good cloud economics (if you run a server at AMZN for
>>> 1 hour a day to update, the cost of your system only goes up by about 4%).
>>>
>>> I was having good luck with this approach until Virtuoso 7.2.0 came along.
>>> Since then I've had problems similar in severity to what the N.I.H. was
>>> reporting: it really looked like massive corruption of the data structures,
>>> and 7.2.1 did not help.
>>>
>>> I don't know if these issues are fixed in the current TRUNK but if they are
>>> it would be nice to get an official release.
>>>
>>> On Fri, Sep 25, 2015 at 1:31 PM, Haag, Jason  wrote:


 Hi Users,

 I'm trying to determine the best option for my situation for importing RDF
 data into Virtuoso. Here's my situation:

 I currently have several RDF datasets available on my server. Each data
 set has an RDF dump available as RDF/XML, JSON-LD, and Turtle. These dumps
 are generated automatically without virtuoso from an HTML page marked up
 using RDFa.

 What is the best option for automating the import of this data on a
 regular basis into the virtuoso DB? The datasets may grow so it should not
 just import the data once, but import on a regular basis, perhaps daily or
 weekly.

 Based on what I've read in the 

Re: [Virtuoso-users] Automating RDF data imports in VIrtuoso

2015-09-30 Thread Davis, Daniel (NIH/NLM) [C]
> What is the best option for automating the import and update of RDFa/HTML 
> data on a regular basis into the virtuoso DB?

Jason, 

Are you running virtuoso on a Windows system or on a UNIX/Linux system?

In either case, I suggest you:
- Write a script that talks to the server on the database port
- Good choices are Perl/Python with ODBC or Java with JDBC
- Debug the script

If you are running this from UNIX/Linux, you can schedule the script as a
regular job by editing your crontab file with:
crontab -e

If that doesn't work, you may have to ask your system administrators for help.
Entering the command 'man crontab' may help.
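
As a rough illustration of the script-plus-cron approach suggested above,
here is a minimal Python sketch. It only generates text: the SQL that a
scheduled job could hand to Virtuoso's bulk loader (ld_dir and
rdf_loader_run, the same loader used elsewhere on this list) and a matching
crontab entry. The directory, file mask, graph IRI, and script path are all
hypothetical. How the SQL is delivered (isql, ODBC, JDBC) is left open, and
the dump directory is assumed to be readable by the server.

```python
import shlex


def sql_quote(value):
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def bulk_load_sql(directory, mask, graph):
    """Return SQL that registers matching files and runs the bulk loader.

    directory, mask and graph are placeholders chosen by the caller.
    """
    return "\n".join([
        f"ld_dir({sql_quote(directory)}, {sql_quote(mask)}, {sql_quote(graph)});",
        "rdf_loader_run();",
        "checkpoint;",
    ])


def crontab_line(script_path, minute=0, hour=2):
    """Return a crontab entry that runs script_path once a day."""
    return f"{minute} {hour} * * * {shlex.quote(script_path)}"


if __name__ == "__main__":
    # Hypothetical locations; adjust to the real dump directory and graph.
    print(bulk_load_sql("/data/dumps", "*.ttl", "http://example.org/graph"))
    print(crontab_line("/usr/local/bin/load_rdf.sh"))
```

The generated crontab line can be pasted into the editor opened by
crontab -e. Files the loader has already registered may need to be cleared
from its load list before they are picked up again, so check the bulk loader
documentation before relying on this for repeated imports.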



--
___
Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users


Re: [Virtuoso-users] how to disable transitivity in sparql queries

2015-09-30 Thread Kingsley Idehen
On 9/30/15 9:11 AM, Adam Sanchez wrote:
> Hello
>
> I would like to retrieve only explicitly stated triples for this simple query
>
> SELECT DISTINCT ?class {
> ?instance rdf:type ?class.
> FILTER (?instance = <http://dbpedia.org/resource/Lovedean>)
> }
>
> but Virtuoso includes inferred triples in the results as well.
>
> So instead of getting only
>
> http://dbpedia.org/ontology/Settlement
>
> I also get
>
> http://dbpedia.org/ontology/Place
> http://dbpedia.org/ontology/PopulatedPlace
>
> How can I disable this feature at run time using OPTION?
>
> Thanks
>
> Adam

Adam,

Virtuoso doesn't inject inferred triples into SPARQL solutions unless
inference rule pragmas are set.

You are getting an unaltered solution based on the data in the DBpedia
instance.

[1]
http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org=SELECT+DISTINCT+%3Fg+%3Fclass+%7B+graph+%3Fg+%7B%0D%0A%3Finstance+rdf%3Atype+%3Fclass.%0D%0AFILTER+%28%3Finstance+%3D+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FLovedean%3E%29%7D%0D%0A%7D=text%2Fhtml_redir_for_subjs=121_redir_for_hrefs==3=on
 
-- Run this query.
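
For readers who cannot follow the link, the query it carries can be decoded
from the URL parameters: it adds a GRAPH ?g pattern to the original query so
that each rdf:type statement is reported together with the named graph that
holds it. The Python sketch below rebuilds an equivalent request URL. The
helper name sparql_url is made up for this example, the sketch only builds
the URL without sending it, and the undeclared rdf: prefix is assumed to be
predefined on the endpoint.

```python
from urllib.parse import urlencode

# Query decoded from the link above: list each class together with the
# named graph that contains the rdf:type statement.
QUERY = """SELECT DISTINCT ?g ?class {
  GRAPH ?g {
    ?instance rdf:type ?class .
    FILTER (?instance = <http://dbpedia.org/resource/Lovedean>)
  }
}"""


def sparql_url(endpoint, query, default_graph=None):
    """Build a SPARQL protocol GET URL for the given endpoint and query."""
    params = {"query": query}
    if default_graph:
        params["default-graph-uri"] = default_graph
    return endpoint + "?" + urlencode(params)


if __name__ == "__main__":
    print(sparql_url("http://dbpedia.org/sparql", QUERY, "http://dbpedia.org"))
```

If the classes come back in a stored graph, they are explicit statements in
the data rather than results of inference, which is the point being made
above.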

-- 
Regards,

Kingsley Idehen   
Founder & CEO 
OpenLink Software 
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this






[Virtuoso-users] how to disable transitivity in sparql queries

2015-09-30 Thread Adam Sanchez
Hello

I would like to retrieve only explicitly stated triples for this simple query

SELECT DISTINCT ?class {
?instance rdf:type ?class.
FILTER (?instance = <http://dbpedia.org/resource/Lovedean>)
}

but Virtuoso includes inferred triples in the results as well.

So instead of getting only

http://dbpedia.org/ontology/Settlement

I also get

http://dbpedia.org/ontology/Place
http://dbpedia.org/ontology/PopulatedPlace

How can I disable this feature at run time using OPTION?

Thanks

Adam



Re: [Virtuoso-users] Automating RDF data imports in VIrtuoso

2015-09-30 Thread Haag, Jason
Thanks Daniel. I'm running Linux/Debian + Ubuntu. I am able to import
the HTML/RDFa file using the crawler. I'm able to find it in WEBDAV.
It just doesn't populate the database. If I import the same data in
RDF/XML format it does populate the database.








Re: [Virtuoso-users] Automating RDF data imports in VIrtuoso

2015-09-30 Thread Hugh Williams
Hi Jason,

Have you installed the Virtuoso Sponger VAD [1], which installs the
RDF mappers [2] needed to transform RDFa and other structured data
into RDF for storage in the Quad Store?

[1] 
http://s3.amazonaws.com/opldownload/uda/vad-vos-packages/7.2/rdf_mappers_dav.vad
[2] http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtSponger

Best Regards
Hugh Williams
Professional Services
OpenLink Software, Inc.  //  http://www.openlinksw.com/
Weblog   -- http://www.openlinksw.com/blogs/
LinkedIn -- http://www.linkedin.com/company/openlink-software/
Twitter  -- http://twitter.com/OpenLink
Google+  -- http://plus.google.com/100570109519069333827/
Facebook -- http://www.facebook.com/OpenLinkSoftware
Universal Data Access, Integration, and Management Technology Providers





Re: [Virtuoso-users] Automating RDF data imports in VIrtuoso

2015-09-30 Thread Kingsley Idehen
On 9/30/15 6:10 PM, Haag, Jason wrote:
> Thanks Daniel. I'm running Linux/Debian + Ubuntu. I am able to import
> the HTML/RDFa file using the crawler. I'm able to find it in WEBDAV.
> It just doesn't populate the database. If I import the same data in
> RDF/XML format it does populate the database.

Did you set a Named Graph IRI in the import/crawl job? That's how you
get data into the quad store. Storing the ingested data in WebDAV is
totally optional. Also note that you can invoke the Sponger as part
of this crawling functionality.

You can also create Linked Data Folder Types [1] that are mapped to named
graph IRIs as part of the folder config. Once in place, you can make that
folder the target of RDF content that you want to import into the quad
store.

[1] https://www.pinterest.com/kidehen/virtuoso-universal-server-related/
-- I've added 4 screenshots showcasing Linked Data Folder Type setup


Kingsley




Re: [Virtuoso-users] Virtuoso DBpedia load - parsing errors

2015-09-30 Thread Roman Sokolov
Could somebody help me understand how to deal with this error while
importing the data?
/btc2014_unzipped/01/data.nq-10
http://fake-latest.org
2   2015.9.22 23:10.20 322216000   2015.9.22 23:10.38 888367000   0   NULL
42000 RDFGE: RDF box with a geometry RDF type and a non-geometry content

There is no indication of which lines cause the error, so I am stuck and
cannot remove or change them.
Alternatively, how can I load the data without the lines containing errors?

Thank you.
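
One way to narrow down the offending lines before loading is to scan the
N-Quads file for literals typed as geometry whose text does not look like
WKT. The Python sketch below is a heuristic only. It assumes the error is
triggered by literals with the datatype
http://www.openlinksw.com/schemas/virtrdf#Geometry, and it treats any value
not starting with a common WKT keyword as suspect. Both the datatype IRI and
the keyword list are assumptions that may need adjusting for this dataset.

```python
import re

# Assumed datatype behind the RDFGE error; adjust if the dataset uses another.
GEOMETRY_TYPE = "http://www.openlinksw.com/schemas/virtrdf#Geometry"

# A quoted literal (with backslash escapes) followed by ^^<datatype>.
TYPED_LITERAL = re.compile(r'"((?:[^"\\]|\\.)*)"\^\^<([^>]*)>')

# Heuristic: a geometry value is expected to start with a WKT keyword.
WKT_START = re.compile(
    r"\s*(POINT|LINESTRING|POLYGON|MULTIPOINT|MULTILINESTRING|"
    r"MULTIPOLYGON|GEOMETRYCOLLECTION)\b",
    re.IGNORECASE,
)


def find_suspect_lines(lines):
    """Return 1-based numbers of lines with a geometry-typed, non-WKT literal."""
    suspects = []
    for number, line in enumerate(lines, start=1):
        for lexical, datatype in TYPED_LITERAL.findall(line):
            if datatype == GEOMETRY_TYPE and not WKT_START.match(lexical):
                suspects.append(number)
                break
    return suspects
```

Passing an open file object (for example one opened with encoding="utf-8"
and errors="replace") to find_suspect_lines yields the line numbers to
inspect or filter out before running the loader again.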


On 23 September 2015 at 16:12, Roman Sokolov  wrote:

> Thanks a lot for your help, Patrick!
> Yes, my mistake, it is BTC dataset, not DBpedia.
> I changed the literal types from XML to Plain and the errors disappeared.
>
> But now I get a new error:
> /btc2014_unzipped/01/data.nq-10
> http://fake-latest.org
> 2   2015.9.22 23:10.20 322216000   2015.9.22 23:10.38 888367000   0   NULL
> 42000 RDFGE: RDF box with a geometry RDF type and a non-geometry content
>
> This error is quite frequent in the dataset, and I guess it is related to
> geo-data. But the problem is that, in contrast to the previous error, I cannot
> see the details or the line where the error occurred, so I cannot check
> which line in the dataset caused it. Strange that there are no details...
>
> Thank you.
>
> On 18 September 2015 at 13:42, Patrick van Kleef 
> wrote:
>
>> Hi Roman,
>>
>> > Hello.
>> > I get a lot of errors when I try to load the DBpedia dataset using isql
>> > with the command:
>> > ld_dir('/workingDir/btc2014_unzipped/01', 'data.nq-*', 'http://fake.org');
>> >
>> > Example error:
>> >
>> >  22007 XM003: XML parser detected an error: ERROR  : Tag nesting
>> >  error: name 'img' of end tag does not match the name 'p' of start tag
>> >  at line 4 column 432 at line 4 column 438 of source text
>> >  04/02/skos/core#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#
>> ">
>> >  --^
>> >
>> > OK, let's find the line where the error occurred (I added a line break so
>> > it is easier to see):
>> >
>> >  <
>> http://purl.org/rss/1.0/modules/content/encoded> "> http://www.w3.org/1999/xhtml\; xmlns:content=\"
>> http://purl.org/rss/1.0/modules/content/\; xmlns:dc=\"
>> http://purl.org/dc/terms/\; xmlns:foaf=\"http://xmlns.com/foaf/0.1/\;
>> xmlns:og=\"http://ogp.me/ns#\; xmlns:rdfs=\"
>> http://www.w3.org/2000/01/rdf-schema#\; xmlns:sioc=\"
>> http://rdfs.org/sioc/ns#\; xmlns:sioct=\"http://rdfs.org/sioc/types#\;
>> xmlns:skos=\"http://www.w3.org/2004/02/skos/core#\; xmlns:xsd=\"
>> http://www.w3.org/2001/XMLSchema#\;>What data are exposed\n> xmlns=\"http://www.w3.org/1999/xhtml\; xmlns:content=\"
>> http://purl.org/rss/1.0/modules/content/\; xmlns:dc=\"
>> http://purl.org/dc/terms/\; xmlns:foaf=\"http://xmlns.com/foaf/0.1/\;
>> xmlns:og=\"http://ogp.me/ns#\; xmlns:rdfs=\"
>> http://www.w3.org/2000/01/rdf-schema#\; xmlns:sioc=\"
>> http://rdfs.org/sioc/ns#\; xmlns:sioct=\"http://rdfs.org/sioc/types#\;
>> xmlns:skos=\"http://www.w3.org/2004/02/skos/core#\; xmlns:xsd=\"
>> http://www.w3.org/2001/XMLSchema#\;>The CORE project exposes data about
>> the aggregated content. The following schema shows the kind of metadata
>> CORE holds about each resource. \n> http://www.w3.org/1999/xhtml\; xmlns:content=\"
>> http://purl.org/rss/1.0/modules/content/\; xmlns:dc=\"
>> http://purl.org/dc/terms/\; xmlns:foaf=\"http://xmlns.com/foaf/0.1/\;
>> xmlns:og=\"http://ogp.me/ns#\; xmlns:rdfs=\"
>> http://www.w3.org/2000/01/rdf-schema#\; xmlns:sioc=\"
>> http://rdfs.org/sioc/ns#\; xmlns:sioct=\"http://rdfs.org/sioc/types#\;
>> xmlns:skos=\"http://www.w3.org/2004/02/skos/core#\; xmlns:xsd=\"
>> http://www.w3.org/2001/XMLSchema#\;>Data Schema\n> http://www.w3.org/1999/xhtml\; xmlns:content=\"
>> http://purl.org/rss/1.0/modules/content/\; xmlns:dc=\"
>> http://purl.org/dc/terms/\; xmlns:foaf=\"http://xmlns.com/foaf/0.1/\;
>> xmlns:og=\"http://ogp.me/ns#\; xmlns:rdfs=\"
>> http://www.w3.org/2000/01/rdf-schema#\; xmlns:sioc=\"
>> http://rdfs.org/sioc/ns#\; xmlns:sioct=\"http://rdfs.org/sioc/types#\;
>> xmlns:skos=\"http://www.w3.org/2004/02/skos/core#\; xmlns:xsd=\"
>> http://www.w3.org/2001/XMLSchema#\;>
>> > \nhttp://www.w3.org/1999/xhtml\; xmlns:content=\"
>> http://purl.org/rss/1.0/modules/content/\; xmlns:dc=\"
>> http://purl.org/dc/terms/\; xmlns:foaf=\"http://xmlns.com/foaf/0.1/\;
>> xmlns:og=\"http://ogp.me/ns#\; xmlns:rdfs=\"
>> http://www.w3.org/2000/01/rdf-schema#\; xmlns:sioc=\"
>> http://rdfs.org/sioc/ns#\; xmlns:sioct=\"http://rdfs.org/sioc/types#\;
>> xmlns:skos=\"http://www.w3.org/2004/02/skos/core#\; xmlns:xsd=\"
>> http://www.w3.org/2001/XMLSchema#\;>Data License\n> http://www.w3.org/1999/xhtml\; xmlns:content=\"
>> http://purl.org/rss/1.0/modules/content/\; xmlns:dc=\"
>> http://purl.org/dc/terms/\; xmlns:foaf=\"http://xmlns.com/foaf/0.1/\;
>>