I ran tcpdump to see the exact packets on the wire. I can confirm that DBpedia is sending empty responses:

stanbol > dbpedia

GET /sparql?query=SELECT+DISTINCT+?id+%0AWHERE+%7B+%0A++%7B+%0A++++?id+%3Chttp://www.w3.org/1999/02/22-rdf-syntax-ns%23type%3E+%3Chttp://dbpedia.org/ontology/Organisation%3E+.+%0A++++?id+%3Chttp://www.w3.org/2000/01/rdf-schema%23label%3E+?tmp1+.+%0A++++++?tmp1+bif:contains+'(%22Research%22+AND+%22in%22+AND+%22Motion%22)'+.+%0A%7D+%0A%7D+%0AORDER+BY+DESC+(+%3CLONG::IRI_RANK%3E+(?id)+)+%0ALIMIT+20+%0A&format=application/sparql-results%2Bjson HTTP/1.1
Accept: application/sparql-results+json
User-Agent: Java/1.6.0_20
Host: dbpedia.org
Connection: keep-alive

dbpedia > stanbol

HTTP/1.1 200 OK
Date: Tue, 30 Aug 2011 09:05:41 GMT
Content-Type: application/sparql-results+json; charset=UTF-8
Connection: keep-alive
Server: Virtuoso/06.02.3130 (Linux) x86_64-generic-linux-glibc25-64 VDB
Accept-Ranges: bytes
Content-Length: 203

-----

The Content-Length is 203, but no body follows.

Another test, this time with curl:

$ curl "http://dbpedia.org/sparql?query=SELECT+DISTINCT+?id+%0AWHERE+%7B+%0A++%7B+%0A++++?id+%3Chttp://www.w3.org/2000/01/rdf-schema%23label%3E+?tmp1+.+%0A++++++?tmp1+bif:contains+'(%22London%22+AND+%22Metropolitan%22+AND+%22Police%22)'+.+%0A%7D+%0A%7D+%0AORDER+BY+DESC+(+%3CLONG::IRI_RANK%3E+(?id)+)+%0ALIMIT+20+%0A&format=application/sparql-results%2Bjson"
*curl: (18) transfer closed with 311 bytes remaining to read*

David

On Tue, Aug 30, 2011 at 12:04 PM, Rupert Westenthaler <[email protected]> wrote:

> On Tue, Aug 30, 2011 at 10:45 AM, David Riccitelli
> <[email protected]> wrote:
> > Thanks Rupert,
> >>
> >> So for me it looks like that the dbpedia.org SPARQL server has
> >> currently some problems in writing JSON responses.
> >
> > According to you, if I would create an offline cache of DBpedia, would I be
> > able to overcome this issue?
>
> Queries are not supported by "caches" (cache strategy "used").
> Only if you use a local index (cache strategy "all") are queries also
> executed on the locally available data.
>
> You could simply go to the configuration tab of the Felix Web Console
> (http://localhost:8080/system/console/configMgr), open the
> configuration for the dbpedia Referenced Site and change the
> configuration of the CacheStrategy from "used" to "all". After that
> queries should be executed locally over the entities you have
> previously cached from dbpedia.org.
>
> best
> Rupert Westenthaler
>
> BTW: I forgot to mention in my last mail that I am getting the exact
> same JSONExceptions as you mentioned.
>
> Caused by: org.apache.stanbol.entityhub.servicesapi.site.ReferencedSiteException: Unable to execute query on remote site http://dbpedia.org/sparql with entitySearcher org.apache.stanbol.entityhub.searcher.VirtuosoSearcher!
>     at org.apache.stanbol.entityhub.core.impl.ReferencedSiteImpl.findEntities(ReferencedSiteImpl.java:318)
>     at org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine.computeEntityRecommentations(NamedEntityTaggingEngine.java:408)
>     at org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine.computeEnhancements(NamedEntityTaggingEngine.java:320)
>     ... 44 more
> Caused by: java.io.IOException: Unable to parse JSON Result Set for parsed query
>     at org.apache.stanbol.entityhub.site.linkeddata.impl.SparqlSearcher.extractEntitiesFromJsonResult(SparqlSearcher.java:113)
>     at org.apache.stanbol.entityhub.site.linkeddata.impl.VirtuosoSearcher.findEntities(VirtuosoSearcher.java:92)
>     at org.apache.stanbol.entityhub.core.impl.ReferencedSiteImpl.findEntities(ReferencedSiteImpl.java:316)
>     ... 46 more
> Caused by: org.codehaus.jettison.json.JSONException: A JSONObject text must begin with '{' at character 0 of
>     at org.codehaus.jettison.json.JSONTokener.syntaxError(JSONTokener.java:439)
>     at org.codehaus.jettison.json.JSONObject.<init>(JSONObject.java:169)
>     at org.codehaus.jettison.json.JSONObject.<init>(JSONObject.java:266)
>     at org.apache.stanbol.entityhub.site.linkeddata.impl.SparqlSearcher.extractEntitiesFromJsonResult(SparqlSearcher.java:87)
>     ... 48 more
>
> > BR
> > David
> >
> > On Tue, Aug 30, 2011 at 11:39 AM, Rupert Westenthaler
> > <[email protected]> wrote:
> >>
> >> Hi David
> >>
> >> On Mon, Aug 29, 2011 at 4:40 PM, David Riccitelli
> >> <[email protected]> wrote:
> >> > Hi Rupert,
> >> >
> >> > I think it is this one:
> >> >
> >> > http://dbpedia.org/sparql?query=SELECT+DISTINCT+?id+%0AWHERE+%7B+%0A++%7B+%0A++++?id+%3Chttp://www.w3.org/1999/02/22-rdf-syntax-ns%23type%3E+%3Chttp://dbpedia.org/ontology/Person%3E+.+%0A++++?id+%3Chttp://www.w3.org/2000/01/rdf-schema%23label%3E+?tmp1+.+%0A++++++?tmp1+bif:contains+'%22Plantar%22'+.+%0A%7D+%0A%7D+%0AORDER+BY+DESC+(+%3CLONG::IRI_RANK%3E+(?id)+)+%0ALIMIT+20+%0A&format=application/sparql-results%2Bjson
> >> >
> >>
> >> This corresponds to this SPARQL query:
> >>
> >> SELECT DISTINCT ?id
> >> WHERE {
> >> {
> >> ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> >> <http://dbpedia.org/ontology/Person> .
> >> ?id <http://www.w3.org/2000/01/rdf-schema#label> ?tmp1 .
> >> ?tmp1 bif:contains '"Plantar"' .
> >> }
> >> }
> >> ORDER BY DESC ( <LONG::IRI_RANK> (?id) )
> >> LIMIT 20
> >>
> >> This is a typical query as used by the NamedEntityTaggingEngine to look up
> >> an Entity with the name "Plantar" for a TextAnnotation with the
> >> dc:type dbpedia-ont:Person
> >>
> >> > In fact before that exception I have this one:
> >> >
> >> > 29.08.2011 08:51:54.939 *WARN* [1605442425@qtp-23480987-1]
> >> > org.apache.felix.http.jetty /engines/
> >> > (org.apache.stanbol.enhancer.servicesapi.EngineException:
> >> > 'NamedEntityTaggingEngine' failed to process content item
> >> > 'urn:content-item-sha1-feef2e6a8bb37b08ff71e7b2d27582ae4c640adb' with type 'text/plain':
> >> > org.apache.stanbol.entityhub.servicesapi.site.ReferencedSiteException:
> >> > Unable to execute query on remote site http://dbpedia.org/sparql with
> >> > entitySearcher org.apache.stanbol.entityhub.searcher.VirtuosoSearcher!)
> >> > org.apache.stanbol.enhancer.servicesapi.EngineException:
> >> > 'NamedEntityTaggingEngine' failed to process content item
> >> > 'urn:content-item-sha1-feef2e6a8bb37b08ff71e7b2d27582ae4c640adb' with type 'text/plain':
> >> > org.apache.stanbol.entityhub.servicesapi.site.ReferencedSiteException:
> >> > Unable to execute query on remote site http://dbpedia.org/sparql with
> >> > entitySearcher org.apache.stanbol.entityhub.searcher.VirtuosoSearcher!
> >> > (...)
> >> > Caused by: java.io.IOException: Server returned HTTP response code: 500 for URL:
> >> > http://dbpedia.org/sparql?query=SELECT+DISTINCT+?id+%0AWHERE+%7B+%0A++%7B+%0A++++?id+%3Chttp://www.w3.org/1999/02/22-rdf-syntax-ns%23type%3E+%3Chttp://dbpedia.org/ontology/Person%3E+.+%0A++++?id+%3Chttp://www.w3.org/2000/01/rdf-schema%23label%3E+?tmp1+.+%0A++++++?tmp1+bif:contains+'%22Plantar%22'+.+%0A%7D+%0A%7D+%0AORDER+BY+DESC+(+%3CLONG::IRI_RANK%3E+(?id)+)+%0ALIMIT+20+%0A&format=application/sparql-results%2Bjson
> >> > (...)
> >> >
> >> In the earlier mail you were getting a JSONException while parsing
> >> the response of a query. Here you are getting a 500, indicating that the
> >> dbpedia SPARQL endpoint had some problems.
> >>
> >> Testing this query today resulted in a 200. However, I could observe
> >> strange behavior while testing.
> >> Precisely: sending the above URL with curl -v printed the headers of
> >> the response, but after that the curl command did not terminate as
> >> expected. It kept running until I manually terminated it.
> >> Making the same request but without the
> >> "&format=application/sparql-results%2Bjson" parameter I was also
> >> getting a 200 (as expected), but this time curl terminated and printed
> >>
> >> <sparql xmlns="http://www.w3.org/2005/sparql-results#"
> >> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> >> xsi:schemaLocation="http://www.w3.org/2001/sw/DataAccess/rf1/result2.xsd">
> >> <head>
> >> <variable name="id"/>
> >> </head>
> >> <results distinct="false" ordered="true">
> >> </results>
> >> </sparql>
> >>
> >> ... an empty result set.
> >>
> >> If I make a request with the above URL (including the
> >> "&format=application/sparql-results%2Bjson") directly in the browser I
> >> get an empty response. As far as I know this would not be valid JSON
> >> and could explain the JSONException mentioned in the earlier mail.
> >>
> >> So for me it looks like that the dbpedia.org SPARQL server has
> >> currently some problems in writing JSON responses.
> >>
> >> On Mon, Aug 29, 2011 at 4:51 PM, David Riccitelli
> >> <[email protected]> wrote:
> >> > If I run
> >> >
> >> > $ curl "http://dbpedia.org/sparql?query=SELECT+DISTINCT+?id+%ntax-ns%23type%3E+%3Chttp://dbpedia.org/ontology/Person%3E+.+%0A++++?id+%3Chttp://www.w3.org/2000/01/rdf-schema%23label%3E+?tmp1+.+%0A++++++?tmp1+bif:contains+'%22Plantar%22'+.+%0A%7D+%0A%7D+%0AORDER+BY+DESC+(+%3CLONG::IRI_RANK%3E+(?id)+)+%0ALIMIT+20+%0A&format=application/sparql-results%2Bjson"
> >> >
> >> > I get a:
> >> >
> >> > { "head": { "link": [], "vars": ["id"] },
> >> > "results": { "distinct": false, "ordered": true, "bindings": [ ] } }
> >> >
> >> I was not able to confirm this. The above URL is not a valid request
> >> because in
> >>
> >> SELECT+DISTINCT+?id+%ntax-ns%23type%3E+
> >>
> >> "%ntax-ns" is not valid. Could it be that you missed some parts
> >> when copying this request into the mail?
> >>
> >> However, the result noted in this mail would be the expected response
> >> if the query did not return any results.
> >>
> >> On Mon, Aug 29, 2011 at 4:40 PM, David Riccitelli
> >> <[email protected]> wrote:
> >> > And then I have lots of these (at least 20):
> >> >
> >> > 29.08.2011 08:53:35.924 *INFO* [1605442425@qtp-23480987-1]
> >> > org.apache.stanbol.enhancer.servicesapi.helper.EnhancementEngineHelper
> >> > No Triple found for <urn:enhancement-fa2cfb8d-5cad-5e80-ea03-b2b39be6f404>
> >> > and property <http://purl.org/dc/terms/type>! -> return null
> >> > 29.08.2011 08:53:35.924 *WARN* [1605442425@qtp-23480987-1]
> >> > org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine
> >> > Unable to process TextAnnotation
> >> > <urn:enhancement-fa2cfb8d-5cad-5e80-ea03-b2b39be6f404> because
> >> > property<http://purl.org/dc/terms/type> is not present
> >> >
> >>
> >> Before processing TextAnnotations the NamedEntityTaggingEngine checks
> >> that the "dc:type" and "selected-text" properties are present for each
> >> TextAnnotation. So if you see such warnings there must be some other
> >> enhancement engine present that creates TextAnnotations that are
> >> missing the "dc:type" property.
> >>
> >> best
> >> Rupert Westenthaler
> >>
> >>
> >> --
> >> | Rupert Westenthaler [email protected]
> >> | Bodenlehenstraße 11 ++43-699-11108907
> >> | A-5500 Bischofshofen
> >
> >
> > --
> > David Riccitelli
> > -----
> > Skype: ziodave
> > Twitter: @ziodave
> > LinkedIn: http://it.linkedin.com/in/riccitelli
> > -----
> > Interact SpA
> > Via A. Bargoni 78 (scala F)
> > 00153 Roma
> >
> > T +39 06 58318 301
> > F +39 06 58318 303
>
>
> --
> | Rupert Westenthaler [email protected]
> | Bodenlehenstraße 11 ++43-699-11108907
> | A-5500 Bischofshofen
>

--
David Riccitelli
-----
Skype: ziodave
Twitter: @ziodave
LinkedIn: http://it.linkedin.com/in/riccitelli
-----
Interact SpA
Via A. Bargoni 78 (scala F)
00153 Roma

T +39 06 58318 301
F +39 06 58318 303
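
P.S. The symptom in this thread (a Content-Length header announcing bytes that never arrive, followed by a JSON parser complaining about an empty string) could be caught earlier with a small check before parsing. The following is only a minimal Python sketch of that idea, not Stanbol code: the function name parse_sparql_json and the plain dict of headers are assumptions made for illustration.

```python
import json


def parse_sparql_json(headers, body):
    """Parse a SPARQL JSON result, failing early on a short or empty body.

    headers: dict mapping response header names to values
             (hypothetical input shape, chosen for this sketch).
    body:    raw response body as bytes.
    """
    # Header names are case-insensitive, so compare in lower case.
    declared = None
    for name, value in headers.items():
        if name.lower() == "content-length":
            declared = int(value)

    # Fewer bytes than announced means the transfer was cut short.
    if declared is not None and len(body) < declared:
        missing = declared - len(body)
        raise ValueError(
            "truncated response: %d of %d bytes missing" % (missing, declared)
        )

    # An empty body is not valid JSON; report it explicitly.
    if not body.strip():
        raise ValueError("empty response body")

    return json.loads(body.decode("utf-8"))
```

Called with the values from the capture above (Content-Length 203 and an empty body), this raises a ValueError that names the truncation instead of a generic parse error. A complete body such as the empty-bindings result quoted earlier parses normally.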
