This still seems to be an issue: I get the same error on 3.13.1 with qexec.execAsk(), which doesn't provide content negotiation. In addition, a method to skip certificate validation for HTTPS would make sense here, since a large number of SPARQL sites seem to have invalid certificates.
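For anyone who wants to check their own setup, here is a minimal sketch (not Jena code) that prints which StAX provider the JVM picks up and then tries to parse a small XML 1.1 document containing the `<variable name="class"/>` form mentioned further down the thread. The class name `StaxProviderCheck`, its helper methods and the sample document are made up for illustration; only the standard `javax.xml.stream` API is used. Whether the parse fails will depend on the JDK version and on whether another provider such as Woodstox is on the classpath, so no expected output is shown.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxProviderCheck {

    /** Returns the class name of the StAX factory selected by the standard lookup. */
    static String providerName() {
        return XMLInputFactory.newInstance().getClass().getName();
    }

    /** Reads the document to the end; returns null on success, otherwise the failure. */
    static Throwable tryParse(String xml) {
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                reader.next();
            }
            reader.close();
            return null;
        } catch (XMLStreamException | RuntimeException e) {
            // RuntimeException is caught as well because the reported failure is an NPE.
            return e;
        }
    }

    public static void main(String[] args) {
        // Hypothetical, cut-down ASK result declared as XML 1.1 with a self-closing element.
        String xml11 = "<?xml version=\"1.1\"?>"
            + "<sparql xmlns=\"http://www.w3.org/2005/sparql-results#\">"
            + "<head><variable name=\"class\"/></head>"
            + "<boolean>true</boolean></sparql>";

        System.out.println("StAX provider: " + providerName());
        Throwable failure = tryParse(xml11);
        System.out.println(failure == null
            ? "XML 1.1 document parsed without error"
            : "XML 1.1 document failed: " + failure);
    }
}
```

If the provider printed is the JDK's built-in one, adding org.codehaus.woodstox:wstx-asl to the dependencies (the workaround suggested below) should change the first line of output, which is a quick way to confirm that the workaround is actually taking effect.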

On Tue, May 19, 2015 at 6:31 PM Andy Seaborne <a...@apache.org> wrote:

> On 19/05/15 17:18, Olivier Rossel wrote:
> > Should we still ask DBPedia to switch back to XML 1.0 ?
>
> It is not just Jena - it's the laggardly state of the JVM so it might
> affect others. If it turns out to be a recurrent question, then it
> would be good to let them know -- I guess the reason is to get the wider
> character support in XML 1.1, not tied to a version of Unicode.
> Of course, wikipedia can be a bit messy but fixing on ingestion helps
> everyone.
>
> Andy
>
> http://norman.walsh.name/2004/09/30/xml11
>
> >
> > On Wed, May 13, 2015 at 9:02 PM, Andy Seaborne <a...@apache.org> wrote:
> >> On 13/05/15 15:27, Rob Vesse wrote:
> >>>
> >>> I assume you'll go ahead and file a bug against Xerces?
> >>
> >> The issue does not seem to be in Apache Xerces.
> >>
> >> Jena is picking up the JDK XMLStreamReader implementation.
> >>
> >> Xerces does not provide javax.xml.stream.XMLInputFactory and XMLStreamReader
> >> at least its not in META-INF/services
> >>
> >> It means that adding org.codehaus.woodstox:wstx-asl is a valid workaround
> >> always as the default JDK provider is not used unless there are no
> >> XMLInputFactory registered (ServiceLoader).
> >>
> >> Its surprising that the JDK bug is still open as the fix for the JDK looks
> >> small.
> >>
> >> Andy
> >>
> >>>
> >>> Rob
> >>>
> >>> On 13/05/2015 14:56, "Andy Seaborne" <a...@apache.org> wrote:
> >>>
> >>>> So far we know:
> >>>>
> >>>> It is a bug in Xerces handling of 1.1
> >>>>
> >>>> Specifically, an NPE
> >>>> XML11NSDocumentScannerImpl:scanStartElement line 356
> >>>>
> >>>> (a big +1 to open source here)
> >>>>
> >>>> 1/ The first problem line hit is <variable name="class"/>
> >>>>
> >>>> "/>" is the trigger.
> >>>>
> >>>> <variable name="class"></variable> would work.
> >>>>
> >>>> 2/ It affects Xerces 2.11.0 and also the Xerces inside OpenJDK.
> >>>> https://bugs.openjdk.java.net/browse/JDK-8029437
> >>>>
> >>>> 3/ Adding org.codehaus.woodstox:wstx-asl to the dependencies can fix it
> >>>> (may depend on ordering) - e.g. add jena-text to your project (!!!).
> >>>> because it picks up a different STaX parser.
> >>>>
> >>>> Andy
> >>>>
> >>>> On 13/05/15 14:06, Rob Vesse wrote:
> >>>>>
> >>>>> Jeremy
> >>>>>
> >>>>> Looks like someone else just ran into the same issue and filed a bug -
> >>>>> JENA-940 [1] - feel free to add a comment there indicating that this
> >>>>> appears to be the same issue you encounter
> >>>>>
> >>>>> Apparently the issue has something to do with DBPedia adopting XML 1.1
> >>>>> and a lack of support for that in Xerces (or at least the version of
> >>>>> Xerces Jena currently uses)
> >>>>>
> >>>>> Rob
> >>>>>
> >>>>> [1] https://issues.apache.org/jira/browse/JENA-940
> >>>>>
> >>>>> On 13/05/2015 12:27, "Jeremy Debattista" <debat...@iai.uni-bonn.de>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi Rob,
> >>>>>>
> >>>>>> Yes that is what I suspect as well, even though when I use a curl
> >>>>>> function with content negotiation [1], the returned results look good
> >>>>>> (and well formed). Anyway, this is the complete error stack:
> >>>>>>
> >>>>>> com.hp.hpl.jena.sparql.resultset.ResultSetException: Failed when initializing the StAX parsing engine
> >>>>>>     at com.hp.hpl.jena.sparql.resultset.XMLInputStAX.<init>(XMLInputStAX.java:119)
> >>>>>>     at com.hp.hpl.jena.sparql.resultset.XMLInput.make(XMLInput.java:73)
> >>>>>>     at com.hp.hpl.jena.sparql.resultset.XMLInput.fromXML(XMLInput.java:42)
> >>>>>>     at com.hp.hpl.jena.sparql.resultset.XMLInput.fromXML(XMLInput.java:37)
> >>>>>>     at com.hp.hpl.jena.query.ResultSetFactory.fromXML(ResultSetFactory.java:312)
> >>>>>>     at com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP.execSelect(QueryEngineHTTP.java:372)
> >>>>>>     at de.unibonn.iai.eis.linda.helper.SPARQLHandler.executeQuery(SPARQLHandler.java:41)
> >>>>>>     at de.unibonn.iai.eis.linda.helper.SPARQLHandler.getLabelFromNode(SPARQLHandler.java:80)
> >>>>>>     at de.unibonn.iai.eis.linda.querybuilder.classes.RDFClass.<init>(RDFClass.java:62)
> >>>>>>     at de.unibonn.iai.eis.linda.querybuilder.classes.RDFClass.searchRDFClass(RDFClass.java:228)
> >>>>>>     at de.unibonn.iai.eis.linda.querybuilder.classes.RDFClass.searchRDFClass(RDFClass.java:222)
> >>>>>>     at com.servlet.routes.BuilderRoute.getProperties(BuilderRoute.java:172)
> >>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>>>>>     at java.lang.reflect.Method.invoke(Method.java:606)
> >>>>>>
> >>>>>> Cheers,
> >>>>>> Jeremy
> >>>>>>
> >>>>>> [1] curl -H "Accept: application/sparql-results+xml" -g
> >>>>>> "http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=PREFIX+rdf%3A%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23%3E+PREFIX+rdfs%3A%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E+PREFIX+owl%3A%3Chttp%3A%2F%2Fwww.w3.org%2F2002%2F07%2Fowl%23%3E++SELECT+distinct+%3Fclass+%3Flabel++WHERE+%7B+%7B%3Fclass+rdf%3Atype+owl%3AClass%7D+UNION+%7B%3Fclass+rdf%3Atype+rdfs%3AClass%7D.+%3Fclass+rdfs%3Alabel+%3Flabel.+++FILTER%28bound%28%3Flabel%29++%26%26+REGEX%28%3Flabel%2C+%22%5C%5Cbact%22%2C%22i%22%29%29%7D+ORDER+BY+%3Fclass%0D%0A"
> >>>>>>
> >>>>>> On 13 May 2015, at 12:32, Rob Vesse <rve...@dotnetrdf.org> wrote:
> >>>>>>
> >>>>>>> What is the error message you get?
> >>>>>>>
> >>>>>>> It is not unheard of for Virtuoso (the software that powers DBPedia)
> >>>>>>> to produce bad output particularly if the data has not been
> >>>>>>> appropriately sanitised so I would suspect Virtuoso before suspecting
> >>>>>>> Jena in a case like this
> >>>>>>>
> >>>>>>> Rob
> >>>>>>>
> >>>>>>> On 13/05/2015 10:16, "Jeremy Debattista" <debat...@iai.uni-bonn.de>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Dear All,
> >>>>>>>>
> >>>>>>>> I am trying to query the DBpedia SPARQL endpoint using the
> >>>>>>>> QueryExecutionFactory sparqlService and execSelect(), but I’m given
> >>>>>>>> the following error:
> >>>>>>>> com.hp.hpl.jena.sparql.resultset.ResultSetException:
> >>>>>>>> Failed when initializing the StAX parsing engine
> >>>>>>>>
> >>>>>>>> The query in question is
> >>>>>>>> PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX
> >>>>>>>> rdfs:<http://www.w3.org/2000/01/rdf-schema#> PREFIX
> >>>>>>>> owl:<http://www.w3.org/2002/07/owl#> SELECT distinct ?class ?label
> >>>>>>>> WHERE { {?class rdf:type owl:Class} UNION {?class rdf:type
> >>>>>>>> rdfs:Class}.
> >>>>>>>> ?class rdfs:label ?label. FILTER(bound(?label) && REGEX(?label,
> >>>>>>>> "\\bact","i"))} ORDER BY ?class
> >>>>>>>>
> >>>>>>>> which gives a result in dbpedia sparql web interface [1].
> >>>>>>>>
> >>>>>>>> The code in question is the following:
> >>>>>>>>
> >>>>>>>> public static ResultSet executeQuery(String uri, String queryString)
> >>>>>>>> {
> >>>>>>>>     Query query = QueryFactory.create(queryString);
> >>>>>>>>     QueryExecution qexec =
> >>>>>>>>         QueryExecutionFactory.sparqlService(uri, query);
> >>>>>>>>     try {
> >>>>>>>>         ResultSet results = qexec.execSelect();
> >>>>>>>>         return results;
> >>>>>>>>     } finally {
> >>>>>>>>
> >>>>>>>>     }
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> After debugging, the problem seems to be related to how the XML
> >>>>>>>> parser is reading the stream input. Would you have any other idea
> >>>>>>>> how I can go around it?
> >>>>>>>>
> >>>>>>>> Best Regards,
> >>>>>>>> Jeremy
> >>>>>>>>
> >>>>>>>> [1]
> >>>>>>>> http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=PREFIX+rdf%3A%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23%3E+PREFIX+rdfs%3A%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E+PREFIX+owl%3A%3Chttp%3A%2F%2Fwww.w3.org%2F2002%2F07%2Fowl%23%3E++SELECT+distinct+%3Fclass+%3Flabel++WHERE+%7B+%7B%3Fclass+rdf%3Atype+owl%3AClass%7D+UNION+%7B%3Fclass+rdf%3Atype+rdfs%3AClass%7D.+%3Fclass+rdfs%3Alabel+%3Flabel.+++FILTER%28bound%28%3Flabel%29++%26%26+REGEX%28%3Flabel%2C+%22%5C%5Cbact%22%2C%22i%22%29%29%7D+ORDER+BY+%3Fclass%0D%0A&format=text%2Fhtml&timeout=30000&debug=on

--
---
Marco Neumann
KONA