Re: [Virtuoso-users] obtaining a copy/dump of an online SPARQL endpoint in a nice manner
On 31 Mar 2016, at 22:10, Kingsley Idehen wrote:

> How are you arriving at data devoid of metadata about its origins?

Link rot... The endpoint is still working, the dumps aren't.

> You would be better served, ultimately, instantiating a dedicated
> Virtuoso instance in the cloud for your specific needs. This instance
> could load datasets from wherever, using some of the existing endpoints
> (DBpedia and others) as a mechanism for exposing provenance data etc.

How would I do that? I'm not sure I fully understand... We already have a
Virtuoso instance in our local network for our working group. I usually load
it manually with the dumps of the datasets I want to run my algorithms
against, so I don't impede the public endpoints' operations or get blocked
for making too many requests.

> There is no nice way of trying to dump all the data from an existing
> SPARQL endpoint.

If there is no nice way, is there any way at all? I mean, I can't be the
first one who is interested in all the triples that a graph on a remote
endpoint contains (even if that graph is big).

Best,
Jörn

--
Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users
Re: [Virtuoso-users] obtaining a copy/dump of an online SPARQL endpoint in a nice manner
On 4/1/16 9:02 AM, Jörn Hees wrote:

> On 31 Mar 2016, at 22:10, Kingsley Idehen wrote:
>> How are you arriving at data devoid of metadata about its origins?
> Link rot... Endpoint is still working, dumps aren't...

Example please.

>> You would be better served, ultimately, instantiating a dedicated
>> Virtuoso instance in the cloud for your specific needs. This instance
>> could load datasets from wherever, using some of the existing endpoints
>> (DBpedia and others) as a mechanism for exposing provenance data etc.
> How would i do that? I'm not sure i fully understand... We have a Virtuoso
> instance in our local network for our working group already.

You could make a pre-configured DBpedia instance, or load the same datasets
we have in our LOD Cloud cache etc.

> I usually manually load it with the dumps of datasets i want to run my
> algorithms against so i don't impede the public endpoints operations / get
> blocked for doing too many requests.
>
>> There is no nice way of trying to dump all the data from an existing
>> SPARQL endpoint.
> If there is no nice way, is there any way at all?

There is a slow way using OFFSET and LIMIT with smaller sizes than what you
had in the initial post.

> I mean I can't be the first one who is interested in all triples a graph on a
> remote endpoint contains (even if that graph is big).

You need to be more specific about the data you seek and about the endpoint
and repository combination through which it is currently visible.
> Best,
> Jörn

--
Regards,

Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this
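The slow OFFSET/LIMIT approach mentioned in this reply could look roughly like the Python sketch below. It is only an illustration under several assumptions: the function names (`page_query`, `http_fetch`, `dump_graph`) are made up for this example, the endpoint is assumed to accept the query as a `query` GET parameter and to return N-Triples for the `Accept` header used here, and the page size is assumed to stay below whatever row cap the endpoint enforces. The `ORDER BY` is there to avoid skipping or repeating triples between pages, but it may itself be expensive or time out on a large graph.

```python
import time
import urllib.parse
import urllib.request
from typing import Callable, Iterator


def page_query(graph: str, limit: int, offset: int) -> str:
    """Build the CONSTRUCT query for one page of a named graph.

    ORDER BY makes the paging deterministic (assumption: the endpoint
    honours it and can afford the sort).
    """
    return (
        f"CONSTRUCT {{ ?s ?p ?o }} "
        f"WHERE {{ GRAPH <{graph}> {{ ?s ?p ?o }} }} "
        f"ORDER BY ?s ?p ?o LIMIT {limit} OFFSET {offset}"
    )


def http_fetch(endpoint: str, query: str, timeout: float = 60.0) -> str:
    """Send one query via HTTP GET and return the response body as text.

    Assumption: the endpoint serves N-Triples for this Accept header;
    some installations may need a different media type.
    """
    url = f"{endpoint}?{urllib.parse.urlencode({'query': query})}"
    request = urllib.request.Request(
        url, headers={"Accept": "application/n-triples"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def dump_graph(
    endpoint: str,
    graph: str,
    page_size: int = 1000,
    delay: float = 1.0,
    fetch: Callable[[str, str], str] = http_fetch,
) -> Iterator[str]:
    """Yield N-Triples lines page by page, pausing between requests."""
    offset = 0
    while True:
        body = fetch(endpoint, page_query(graph, page_size, offset))
        # N-Triples is line based: one triple per non-empty, non-comment line.
        lines = [
            line for line in body.splitlines()
            if line.strip() and not line.startswith("#")
        ]
        if not lines:
            break
        yield from lines
        if len(lines) < page_size:
            break  # a short page means the end of the graph was reached
        offset += page_size
        time.sleep(delay)  # be polite to the public endpoint
```

To use it, one would iterate over something like `dump_graph("http://dbpedia.org/sparql", "http://dbpedia.org")` and write each line to a local `.nt` file, which can then be loaded into the local Virtuoso instance like any other dump. The `fetch` parameter exists only so the paging logic can be tried without touching a live endpoint.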
Re: [Virtuoso-users] obtaining a copy/dump of an online SPARQL endpoint in a nice manner
On 3/31/16 1:30 PM, Jörn Hees wrote:

> Hi,
>
> i developed some machine learning algorithms that i'd like to run against
> various datasets.
> Many of them provide Virtuoso powered SPARQL endpoints online, but running my
> algorithms against them would for sure not be considered "fair use".
>
> Some datasets provide dumps, so i'm able to play nice, load the dumps on a
> local Virtuoso instance and torture that local instance with my algorithms.
>
> How can i do something similar in case there is no dump available for
> download, but only a SPARQL endpoint?
>
> I was thinking about issuing a `construct where { ?s ?p ?o } limit X offset
> Y` and stepping through the endpoint like that once, but the bigger the
> offset, the slower the response time:
>
> http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&qtxt=select+*+where+{%3Fs+%3Fp+%3Fo.}+limit+1+offset+40002&format=text%2Fhtml&CXML_redir_for_subjs=121&CXML_redir_for_hrefs=&timeout=3&debug=on
>
> Any suggestions how to improve this and do this in a "nice" way?
> Also maybe without the danger of skipping a lot of data by different orders?
>
> Best,
> Jörn

How are you arriving at data devoid of metadata about its origins?

You would be better served, ultimately, instantiating a dedicated Virtuoso
instance in the cloud for your specific needs. This instance could load
datasets from wherever, using some of the existing endpoints (DBpedia and
others) as a mechanism for exposing provenance data etc.

There is no nice way of trying to dump all the data from an existing SPARQL
endpoint.
Regards,
Kingsley Idehen