Thanks a lot, Kingsley, that really seems to solve our problem.
Best,
Heiko.
On 24.04.2012 at 15:36, Kingsley Idehen wrote:
On 4/24/12 8:01 AM, Heiko Paulheim wrote:
I am trying to get *all* the triples that match the following query:
SELECT DISTINCT ?p ?s
FROM <http://dbpedia.org>
WHERE
{ ?s ?p <http://dbpedia.org/resource/Germany> }
By retrieving chunks of 1000, I will start with
SELECT DISTINCT ?p ?s
FROM <http://dbpedia.org>
WHERE
{ ?s ?p <http://dbpedia.org/resource/Germany> }
ORDER BY ASC(?p)
OFFSET 0
LIMIT 1000
and then increase OFFSET by 1000 with each pass, keeping LIMIT at 1000.
Eventually, I arrive at the point where OFFSET=40000, and the
reported error occurs.
Best,
Heiko.
Okay, try this subquery approach instead:
SELECT ?p ?s
WHERE {
{SELECT DISTINCT ?p ?s FROM <http://dbpedia.org>
WHERE
{ ?s ?p
<http://dbpedia.org/resource/Germany> }
ORDER BY ASC(?p) }
}
OFFSET 50000 LIMIT 1000
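For a client that generates these requests programmatically, the subquery form above could be assembled as in the following minimal sketch. The function name `build_paged_query` and its parameters are hypothetical; only the query text follows the message above. The point is that ORDER BY sits inside the subquery while OFFSET and LIMIT apply to the outer SELECT.

```python
def build_paged_query(resource_uri, offset, limit=1000, graph="http://dbpedia.org"):
    """Return the subquery-style paging query for one window.

    The inner SELECT does the DISTINCT and the sorting; the outer SELECT
    only applies OFFSET and LIMIT to the already sorted result.
    """
    return (
        "SELECT ?p ?s\n"
        "WHERE {\n"
        "  { SELECT DISTINCT ?p ?s FROM <" + graph + ">\n"
        "    WHERE { ?s ?p <" + resource_uri + "> }\n"
        "    ORDER BY ASC(?p) }\n"
        "}\n"
        "OFFSET " + str(offset) + " LIMIT " + str(limit)
    )


if __name__ == "__main__":
    # Same window as in the message above: rows 50000 to 50999.
    print(build_paged_query("http://dbpedia.org/resource/Germany", 50000))
```

Calling it with increasing offsets (0, 1000, 2000, ...) yields one query per window.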
BTW -- thanks for raising this matter. We are going to put out a tips
and tricks note about this.
Kingsley
On 24.04.2012 at 13:58, Kingsley Idehen wrote:
On 4/24/12 7:46 AM, Heiko Paulheim wrote:
Dear Kingsley,
the window size is only 1000, and it also occurs with window size
100. As far as I understand the error, the problem is that the
underlying collection is larger than 40000.
You can try it yourself at
http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=SELECT+DISTINCT++%3Fp+%3Fs%0D%0AFROM+%3Chttp%3A%2F%2Fdbpedia.org%3E%0D%0AWHERE%0D%0A++{+%3Fs+%3Fp+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FGermany%3E+}%0D%0AORDER+BY+ASC%28%3Fp%29%0D%0AOFFSET++40000%0D%0ALIMIT+++1000&format=text%2Fhtml&timeout=0&debug=on
Best,
Heiko.
Why not:
http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&qtxt=SELECT+DISTINCT++%3Fp+%3Fs%0D%0AFROM+%3Chttp%3A%2F%2Fdbpedia.org%3E%0D%0AWHERE%0D%0A++%7B+%3Fs+%3Fp+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FGermany%3E+%7D%0D%0AORDER+BY+ASC%28%3Fp%29%0D%0AOFFSET++4000%0D%0ALIMIT+++1000&format=text%2Fhtml&timeout=150000&debug=on
Kingsley
On 24.04.2012 at 13:00, Kingsley Idehen wrote:
On 4/24/12 6:52 AM, Heiko Paulheim wrote:
Dear Kingsley,
thank you again for your advice. We are now using the solution
you suggested, which works fine until there are more than
40000 results (see below) -- e.g., when asking for all statements
related to Germany.
Do you have any idea how to fix that?
Why not make a smaller window/cursor? Reduce your OFFSET.
Kingsley
Best,
Heiko.
HttpException: HttpException: 500 SPARQL Request Failed
Virtuoso 22023 Error SR353: Sorted TOP clause specifies more then
41000 rows to sort. Only 40000 are allowed. Either decrease the
offset and/or row count or use a scrollable cursor
SPARQL query:
define sql:big-data-const 0 SELECT DISTINCT ?p ?s
FROM <http://dbpedia.org>
WHERE
{ ?s ?p <http://dbpedia.org/resource/Germany> }
ORDER BY ASC(?p)
OFFSET 40000
LIMIT 1000
: HttpException: 500 SPARQL Request Failed
Virtuoso 22023 Error SR353: Sorted TOP clause specifies more then
41000 rows to sort. Only 40000 are allowed. Either decrease the
offset and/or row count or use a scrollable cursor
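The figures in this message suggest that the server counts OFFSET plus LIMIT (40000 + 1000 = 41000) against a ceiling of 40000 sorted rows. The sketch below works through that arithmetic. It is a reading of the error text, not documented server behaviour, and the names `SORT_LIMIT`, `rows_to_sort` and `first_failing_offset` are hypothetical.

```python
SORT_LIMIT = 40000  # ceiling quoted in the SR353 message


def rows_to_sort(offset, limit):
    """Rows a sorted OFFSET/LIMIT query asks for (assumed: offset + limit)."""
    return offset + limit


def first_failing_offset(limit=1000, sort_limit=SORT_LIMIT):
    """Smallest window start (a multiple of limit) that exceeds the ceiling."""
    offset = 0
    while rows_to_sort(offset, limit) <= sort_limit:
        offset += limit
    return offset


if __name__ == "__main__":
    print(rows_to_sort(40000, 1000))   # the 41000 reported in the error
    print(first_failing_offset(1000))  # window size 1000
    print(first_failing_offset(100))   # window size 100
```

Under this reading a smaller window does not help, because the offset alone already reaches the ceiling.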
On 11.04.2012 at 16:44, Kingsley Idehen wrote:
On 4/11/12 10:14 AM, Heiko Paulheim wrote:
Dear Kingsley,
OK, here's a basic example URL of a query I use:
http://dbpedia.org/sparql?query=SELECT+DISTINCT+%3Fp+%3Ft%0D%0AFROM+%3Chttp%3A%2F%2Fdbpedia.org%3E%0D%0AWHERE+{%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FEngland%3E+%3Fp+%3Fo.+%3Fo+a+%3Ft+}&format=text%2Fhtml&timeout=120000
Links:
1. http://dbpedia.org/c/Z63DBC -- query result
2. http://dbpedia.org/c/ZNS2TM -- query text.
Page through the data using:
1. SELECT * WHERE {?s a ?o} OFFSET 0 LIMIT 1000 -- iteration 1
2. SELECT * WHERE {?s a ?o} OFFSET 1000 LIMIT 1000 -- iteration 2
3. Ditto with OFFSET incremented in blocks of 1000 .
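The iteration scheme above can be sketched as a small generator plus a loop. `run_query` is a hypothetical callable that sends one query and returns its rows as a list; stopping at the first page shorter than the window is an assumption, not something stated in this thread. Note that without an ORDER BY the SPARQL specification does not guarantee a stable order across pages.

```python
from itertools import islice


def paged_queries(base_query, page_size=1000):
    """Yield base_query with OFFSET 0, page_size, 2 * page_size, ..."""
    offset = 0
    while True:
        yield f"{base_query} OFFSET {offset} LIMIT {page_size}"
        offset += page_size


def fetch_all(base_query, run_query, page_size=1000):
    """Collect rows page by page until a page comes back short."""
    rows = []
    for query in paged_queries(base_query, page_size):
        page = run_query(query)  # hypothetical: performs the HTTP request
        rows.extend(page)
        if len(page) < page_size:
            return rows


if __name__ == "__main__":
    # Print the first three iterations from the list above.
    for query in islice(paged_queries("SELECT * WHERE {?s a ?o}"), 3):
        print(query)
```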
Kingsley
Best,
Heiko.
On 11.04.2012 at 16:09, Kingsley Idehen wrote:
On 4/11/12 10:07 AM, Heiko Paulheim wrote:
Dear Kingsley,
as I said, there is no query that times out as such. All the
queries work fine in isolation. It is rather a problem of a
longer series of queries that eventually provokes the
"Bandwidth Limit Exceeded" exception.
Thus, my question is what that limit exactly is, and how many
queries can be issued from a single client per minute/hour. I
also have no problems in restricting my client to that limit,
I just need to know it.
Please just send me a URL and then I can send one back to you
that shows the setting you need re. Virtuoso's Anytime Query
feature. This feature is to be used in combination with OFFSET
and LIMIT re. the public endpoint, in line with how it is
deliberately configured.
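A request that combines a timeout with OFFSET and LIMIT could be built as in this minimal sketch. The parameter names (`default-graph-uri`, `query`, `format`, `timeout`) are the ones visible in the URLs elsewhere in this thread, with the timeout given in milliseconds; the helper `sparql_url` itself is hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "http://dbpedia.org/sparql"


def sparql_url(query, timeout_ms=120000, graph="http://dbpedia.org",
               result_format="text/html"):
    """Build a GET URL for the endpoint with a timeout in milliseconds."""
    params = {
        "default-graph-uri": graph,
        "query": query,
        "format": result_format,
        "timeout": str(timeout_ms),
    }
    # urlencode percent-encodes the query text and joins the parameters.
    return ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    url = sparql_url("SELECT * WHERE {?s a ?o} OFFSET 0 LIMIT 1000")
    print(url)
    # Decode the URL again to check what the server would receive.
    print(parse_qs(urlsplit(url).query)["timeout"])
```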
Kingsley
Best,
Heiko.
On 11.04.2012 at 15:41, Kingsley Idehen wrote:
On 4/11/12 9:14 AM, Heiko Paulheim wrote:
Dear Kingsley,
the query in question is looking for types of objects
related to a resource (e.g. a person knows some scientists,
has written some books, etc.), i.e.
SELECT DISTINCT ?p ?t
FROM <http://dbpedia.org>
WHERE {?s ?p <objectInQuestion>. ?s a ?t }
SELECT DISTINCT ?p ?t
FROM <http://dbpedia.org>
WHERE {<objectInQuestion> ?p ?o. ?o a ?t }
The particular query where the program terminates works
fine in isolation, just like about 200 before.
I have set the timeout to 120 seconds and retry a failed
query after waiting for 1 second.
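The retry behaviour described here could look roughly like the sketch below. `run_query` is a hypothetical callable that performs one request and raises on failure. The one-second wait comes from the message above, while the retry count of 3 is an assumption, since the message does not say how often a query is retried.

```python
import time


def run_with_retry(run_query, query, retries=3, wait_seconds=1.0,
                   sleep=time.sleep):
    """Run a query, waiting wait_seconds between failed attempts.

    Makes at most retries + 1 attempts and re-raises the last error
    if every attempt fails.
    """
    last_error = None
    for attempt in range(retries + 1):
        try:
            return run_query(query)
        except Exception as error:  # a real client would catch HTTP errors only
            last_error = error
            if attempt < retries:
                sleep(wait_seconds)
    raise last_error
```

Passing `sleep` as a parameter keeps the function testable without real waiting.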
Please send a SPARQL URL of a query that times out.
Kingsley
Best,
Heiko.
On 11.04.2012 at 14:55, Kingsley Idehen wrote:
On 4/11/12 6:47 AM, Heiko Paulheim wrote:
Hi all,
I am currently experiencing a repeated Bandwidth Limit Exceeded
exception, always occurring at the same position in my program (i.e.,
after a certain number of requests have been issued in a certain
time). Since the program can be properly started again afterwards (and
runs to that very point), I assume that limiting the number of requests
per minute/hour within the program would solve the problem.
Does anybody know detailed figures about the bandwidth restrictions of
DBpedia, in particular w.r.t. the dbpedia.org/sparql endpoint?
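Limiting the number of requests per minute on the client side could be done with a simple throttle like the sketch below. The class `MinIntervalThrottle` is hypothetical, and the rate in the usage example is a placeholder, because the actual limit of the endpoint is not given in this thread.

```python
import time


class MinIntervalThrottle:
    """Space calls so that at most max_per_minute happen per minute."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute
        self.clock = clock
        self.sleep = sleep
        self.last_call = None

    def wait(self):
        """Block until min_interval has passed since the previous call."""
        now = self.clock()
        if self.last_call is not None:
            remaining = self.min_interval - (now - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_call = now


if __name__ == "__main__":
    throttle = MinIntervalThrottle(max_per_minute=600)  # placeholder rate
    for i in range(3):
        throttle.wait()  # call before each request to the endpoint
        print("request", i)
```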
Thanks,
Heiko.
What is your query?
Are you using timeouts?
The DBpedia instance is configured to serve the world and
all its associated idiosyncrasies.
_______________________________________________
Dbpedia-discussion mailing list
Dbpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
--
Dr. Heiko Paulheim
Knowledge Engineering Group
Technische Universität Darmstadt
Phone: +49 6151 16 6634
Fax: +49 6151 16 5482
http://www.ke.tu-darmstadt.de/staff/heiko-paulheim
--
Regards,
Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen