[ 
https://issues.apache.org/jira/browse/JENA-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190011#comment-13190011
 ] 

Rob Vesse edited comment on JENA-199 at 1/20/12 9:57 PM:
---------------------------------------------------------

I dug into this a lot more and was able to reproduce the issue in code, 
isolated from Fuseki.

It appears that something in this data causes TDB to return a null for one of 
the variables, which is bizarre because there are no optional variables and it 
shouldn't be possible to set a null on a Binding AFAIK.

The stack trace from the code I will shortly attach is as follows:

null
java.lang.NullPointerException
        at com.hp.hpl.jena.sparql.engine.binding.BindingBase.hashCode(BindingBase.java:204)
        at com.hp.hpl.jena.sparql.engine.binding.BindingBase.hashCode(BindingBase.java:185)
        at java.util.HashMap.put(HashMap.java:372)
        at java.util.HashSet.add(HashSet.java:200)
        at org.openjena.atlas.data.SortedDataBag.add(SortedDataBag.java:109)
        at org.openjena.atlas.data.DistinctDataNet.netAdd(DistinctDataNet.java:58)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIterDistinct.fill(QueryIterDistinct.java:87)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIterDistinct.moveToNextBinding(QueryIterDistinct.java:118)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.nextBinding(QueryIteratorBase.java:152)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.moveToNextBinding(QueryIteratorWrapper.java:43)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.nextBinding(QueryIteratorBase.java:152)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.moveToNextBinding(QueryIteratorWrapper.java:43)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.nextBinding(QueryIteratorBase.java:152)
        at com.hp.hpl.jena.sparql.engine.ResultSetStream.nextBinding(ResultSetStream.java:84)
        at com.hp.hpl.jena.sparql.engine.ResultSetStream.nextSolution(ResultSetStream.java:102)
        at com.hp.hpl.jena.sparql.engine.ResultSetStream.next(ResultSetStream.java:111)
        at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:43)
        at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
        at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:481)
        at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:459)
        at bugs.tdb.TDBEmptyOutput.test(TDBEmptyOutput.java:44)
        at bugs.tdb.TDBEmptyOutput.main(TDBEmptyOutput.java:29)
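
To illustrate the failure mode in the trace (HashSet.add calling hashCode on a 
binding that holds a null), here is a minimal sketch. ToyBinding and 
NullBindingDemo are hypothetical stand-ins written for this example, not Jena's 
actual BindingBase code; the only assumption is that the hash is computed from 
every variable's value without a null check.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class NullBindingDemo {
    /** Hypothetical stand-in for a binding (variable -> value). Not Jena's BindingBase. */
    static final class ToyBinding {
        private final Map<String, String> values = new LinkedHashMap<String, String>();

        ToyBinding add(String var, String value) {
            values.put(var, value); // nothing here prevents a null value
            return this;
        }

        @Override
        public int hashCode() {
            int hash = 0;
            for (Map.Entry<String, String> e : values.entrySet()) {
                // Assumes every variable is bound; a null value throws an NPE here.
                hash ^= e.getKey().hashCode() ^ e.getValue().hashCode();
            }
            return hash;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof ToyBinding
                    && values.equals(((ToyBinding) other).values);
        }
    }

    /** Returns true if adding the binding to a HashSet throws a NullPointerException. */
    static boolean addFails(ToyBinding binding) {
        Set<ToyBinding> seen = new HashSet<ToyBinding>();
        try {
            seen.add(binding); // HashSet.add -> HashMap.put -> binding.hashCode()
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ToyBinding ok = new ToyBinding().add("person", "_:b1").add("name", "Paul Erdoes");
        ToyBinding bad = new ToyBinding().add("person", "_:b1").add("name", null);
        System.out.println("complete binding fails: " + addFails(ok));
        System.out.println("null-valued binding fails: " + addFails(bad));
    }
}
```

In this sketch the exception only surfaces when the binding is hashed for 
duplicate detection, which matches the trace passing through 
QueryIterDistinct.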

So the lack of Text, CSV and TSV output appears to be because the query hits 
this error partway through.  Those writers either do a single flush at the end 
of output, flush only once they reach whatever Jetty's flush threshold is, or, 
in the case of the Text output, have to iterate over the complete ResultSet 
before they can write anything.  Hence there is no output for these formats, 
yet there is output in the XML/JSON cases.
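
The difference between the two kinds of writer can be sketched as follows. This 
is only an illustration under assumed names (PartialOutputDemo, failingRows, 
writeStreaming and writeBuffered are hypothetical), not the actual Fuseki or 
ResultSetFormatter code: one writer emits each row as it is produced, the other 
collects everything and writes once at the end, and the row source fails 
partway through.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class PartialOutputDemo {
    /** Row source that throws a NullPointerException when it reaches a null row. */
    static Iterator<String> failingRows() {
        List<String> rows = Arrays.asList("row1", "row2", null, "row4");
        final Iterator<String> it = rows.iterator();
        return new Iterator<String>() {
            public boolean hasNext() { return it.hasNext(); }
            public String next() {
                String row = it.next();
                if (row == null) throw new NullPointerException("null row");
                return row;
            }
            public void remove() { throw new UnsupportedOperationException(); }
        };
    }

    /** Writes each row to the sink as soon as it is produced. */
    static void writeStreaming(Iterator<String> rows, PrintStream sink) {
        while (rows.hasNext()) {
            sink.print(rows.next() + "\n");
        }
    }

    /** Collects all rows first and writes only if the iteration completes. */
    static void writeBuffered(Iterator<String> rows, PrintStream sink) {
        StringBuilder buffer = new StringBuilder();
        while (rows.hasNext()) {
            buffer.append(rows.next()).append('\n');
        }
        sink.print(buffer);
    }

    /** Runs one writer against the failing source and returns what reached the sink. */
    static String run(boolean streaming) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        PrintStream sink = new PrintStream(bytes, true);
        try {
            if (streaming) writeStreaming(failingRows(), sink);
            else writeBuffered(failingRows(), sink);
        } catch (NullPointerException e) {
            // Swallowed here, standing in for a server layer that catches the error.
        }
        return bytes.toString();
    }

    public static void main(String[] args) {
        System.out.println("streaming wrote: " + run(true).trim().replace("\n", ", "));
        System.out.println("buffered wrote:  " + run(false).trim().replace("\n", ", "));
    }
}
```

With the failing source above, the streaming writer gets the first two rows 
out before the error, while the buffered writer produces nothing at all.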

What I failed to notice previously was that the XML/JSON output was incomplete: 
strangely it was valid, but it doesn't include as many results as it should.  
Something in the Fuseki layer appears to catch the error and clean up the 
incomplete output, whereas when evaluated via the code you get larger but 
incomplete output.
                
> BindingBase can hit a null pointer exception on certain queries against a TDB 
> dataset
> -------------------------------------------------------------------------------------
>
>                 Key: JENA-199
>                 URL: https://issues.apache.org/jira/browse/JENA-199
>             Project: Jena
>          Issue Type: Bug
>          Components: TDB
>            Reporter: Rob Vesse
>              Labels: csv, results, sparql, tdb, tsv
>         Attachments: 5b.txt, 8.txt, TDBEmptyOutput.java, sp2b10k.nt
>
>
> This is a strange bug which I have been unable to reduce to a more minimal 
> example than the files I will attach, so I apologize for that.
> Essentially the problem manifests as follows: when using a TDB dataset with 
> Fuseki, some queries will return blank output if the user requests Text, CSV 
> or TSV.  When using XML/JSON the output is fine.
> The test data used is SP2B 10k; two of the SP2B queries that exhibit this 
> issue are as follows:
> PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
> PREFIX bench: <http://localhost/vocabulary/bench/>
> PREFIX dc:    <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?person ?name
> WHERE {
>   ?article rdf:type bench:Article .
>   ?article dc:creator ?person .
>   ?inproc rdf:type bench:Inproceedings .
>   ?inproc dc:creator ?person .
>   ?person foaf:name ?name
> }
> And:
> PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#> 
> PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
> PREFIX dc:   <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?name
> WHERE {
>   ?erdoes rdf:type foaf:Person .
>   ?erdoes foaf:name "Paul Erdoes"^^xsd:string .
>   {
>     ?document dc:creator ?erdoes .
>     ?document dc:creator ?author .
>     ?document2 dc:creator ?author .
>     ?document2 dc:creator ?author2 .
>     ?author2 foaf:name ?name
>     FILTER (?author!=?erdoes &&
>             ?document2!=?document &&
>             ?author2!=?erdoes &&
>             ?author2!=?author)
>   } UNION {
>     ?document dc:creator ?erdoes.
>     ?document dc:creator ?author.
>     ?author foaf:name ?name
>     FILTER (?author!=?erdoes)
>   }
> }
> I will attach these as files as well for convenience.
> If you run Fuseki with a memory dataset using the --mem option, load this 
> data and run the same queries, the Text, CSV and TSV output works fine.  This 
> implies that there is something in the TDB code related to its return of 
> results or iterators which somehow causes the Text, CSV and TSV formatters to 
> either error or to believe that they have no results to format.
> I'm completely unfamiliar with the TDB codebase so I haven't attempted to 
> discover what the cause of the issue is, though I may poke around anyway.

        
