[ 
https://issues.apache.org/jira/browse/JENA-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502472#comment-17502472
 ] 

Claus Stadler commented on JENA-2302:
-------------------------------------

Hm, there was a time when XML was the default result set format in Jena. I 
still use XML most of the time for compatibility with Virtuoso Open Source, 
which incorrectly returns an empty result set as an empty JSON object. That's 
probably why I never noticed the change.

* Repeated header fields: I find it hard to imagine a SPARQL engine 
implementation that would return a JSON result set with a repeated header 
(possibly even with changing content).

* Putting the header after the body: There might be use cases for this, such 
as when bindings are generated by custom SPARQL extensions, e.g. SERVICE 
<loadCsvFileAsBindings> {}, so that the effective variables cannot be derived 
from the SPARQL query. Optimistic parsing would still be possible, but 
RowSet.getResultVars would then have to block and buffer the stream (probably 
still using BagFactory) until the header is seen.
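
To illustrate the header-after-body case, here is a minimal sketch of the
buffering idea. It is not Jena's RowSet API: the class LateHeaderRows, its
Event type and the method names are hypothetical, rows are simplified to
Map<String, String>, and the buffer is a plain in-memory deque rather than a
spilling bag from BagFactory.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical sketch, not Jena's RowSet: rows stream through, the header may arrive late. */
final class LateHeaderRows {
    /** One parse event: either the header (vars set) or a single row (row set). */
    static final class Event {
        final List<String> vars;
        final Map<String, String> row;
        private Event(List<String> vars, Map<String, String> row) {
            this.vars = vars;
            this.row = row;
        }
        static Event ofHead(List<String> vars) { return new Event(vars, null); }
        static Event ofRow(Map<String, String> row) { return new Event(null, row); }
    }

    private final Iterator<Event> events;
    // Rows that were read but not yet handed out (in-memory only in this sketch).
    private final Deque<Map<String, String>> buffered = new ArrayDeque<>();
    private List<String> vars;

    LateHeaderRows(Iterator<Event> events) { this.events = events; }

    /** Reads ahead, buffering every row on the way, until the header has been seen. */
    List<String> getResultVars() {
        while (vars == null && events.hasNext()) {
            accept(events.next());
        }
        if (vars == null) throw new IllegalStateException("stream ended without a header");
        return vars;
    }

    /** Pulls events only until one row is available, so plain iteration stays streaming. */
    boolean hasNext() {
        while (buffered.isEmpty() && events.hasNext()) {
            accept(events.next());
        }
        return !buffered.isEmpty();
    }

    Map<String, String> next() {
        if (!hasNext()) throw new NoSuchElementException();
        return buffered.poll();
    }

    private void accept(Event e) {
        if (e.vars != null) vars = e.vars; else buffered.add(e.row);
    }
}
```

In this sketch, a consumer that only iterates never holds more than one row,
while a call to getResultVars() before the header has arrived pays for the
buffering.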

I have written streaming JSON parsers using Google Gson, which provides a 
streaming pull API via gson.newJsonReader(inputStreamReader); it seems this 
is not supported by the RIOT JSON parser?
It seems to me that it shouldn't be too hard to adapt the existing code to 
parse optimistically using the Gson API. Would that be of interest?
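
As a rough sketch of what pull parsing with Gson's JsonReader could look like
for a SPARQL JSON result document (this is not the existing Jena code, and not
a proposed patch): the class name StreamingSparqlJson and its methods are
hypothetical, each binding is reduced to its "value" string (type, datatype
and language tag are dropped), the ASK "boolean" key is simply skipped, and
the header is assumed to come before the results.

```java
import com.google.gson.stream.JsonReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical sketch: hands each row to onRow as soon as it has been parsed. */
final class StreamingSparqlJson {

    /** Parses the document from in and returns the variables found in "head". */
    static List<String> parse(Reader in, Consumer<Map<String, String>> onRow) throws IOException {
        List<String> vars = new ArrayList<>();
        try (JsonReader r = new JsonReader(in)) {
            r.beginObject();
            while (r.hasNext()) {
                switch (r.nextName()) {
                    case "head":    readHead(r, vars); break;
                    case "results": readResults(r, onRow); break;
                    default:        r.skipValue(); // e.g. "boolean" (ASK), not handled here
                }
            }
            r.endObject();
        }
        return vars;
    }

    private static void readHead(JsonReader r, List<String> vars) throws IOException {
        r.beginObject();
        while (r.hasNext()) {
            if (r.nextName().equals("vars")) {
                r.beginArray();
                while (r.hasNext()) vars.add(r.nextString());
                r.endArray();
            } else {
                r.skipValue();
            }
        }
        r.endObject();
    }

    private static void readResults(JsonReader r, Consumer<Map<String, String>> onRow) throws IOException {
        r.beginObject();
        while (r.hasNext()) {
            if (r.nextName().equals("bindings")) {
                r.beginArray();
                // One row at a time: nothing beyond the current binding is kept in memory.
                while (r.hasNext()) onRow.accept(readBinding(r));
                r.endArray();
            } else {
                r.skipValue();
            }
        }
        r.endObject();
    }

    private static Map<String, String> readBinding(JsonReader r) throws IOException {
        Map<String, String> row = new LinkedHashMap<>();
        r.beginObject();
        while (r.hasNext()) {
            String var = r.nextName();
            String value = null;
            r.beginObject();
            while (r.hasNext()) {
                if (r.nextName().equals("value")) value = r.nextString();
                else r.skipValue(); // simplification: ignore type, datatype, xml:lang
            }
            r.endObject();
            row.put(var, value);
        }
        r.endObject();
        return row;
    }
}
```

A caller would pass something like new InputStreamReader(httpBody,
StandardCharsets.UTF_8) together with a callback; a real reader would also
need to build proper RDF terms and cope with a late or missing header.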



> RowSetReaderJSON is not streaming
> ---------------------------------
>
>                 Key: JENA-2302
>                 URL: https://issues.apache.org/jira/browse/JENA-2302
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: ARQ
>    Affects Versions: Jena 4.5.0
>            Reporter: Claus Stadler
>            Priority: Major
>
> Retrieving all data from our TDB2 endpoint with Jena 4.5.0-SNAPSHOT is no 
> longer streaming for the JSON format. I tracked the issue to RowSetReaderJson, 
> which reads everything into memory (and then checks whether it is a SPARQL 
> ASK result):
> {code:java}
> public class RowSetReaderJson {
>     private void parse(InputStream in) {
>         JsonObject obj = JSON.parse(in); // !!! Loads everything !!!
>         // Boolean?
>         if ( obj.hasKey(kBoolean) ) { ... }
>     }
> }
> {code}
> Streaming works when switching to RS_XML in the example below:
> {code:java}
> public class Main {
>     public static void main(String[] args) {
>         System.out.println("Test Started");
>         try (QueryExecution qe = QueryExecutionHTTP.create()
>                 .acceptHeader(ResultSetLang.RS_JSON.getContentType().getContentTypeStr())
>                 .endpoint("http://moin.aksw.org/sparql")
>                 .queryString("SELECT * { ?s ?p ?o }")
>                 .build()) {
>             qe.execSelect().forEachRemaining(System.out::println);
>         }
>         System.out.println("Done");
>     }
> }
> {code}
> For completeness, I can rule out any problem with TDB2 because streaming of 
> JSON works just fine with: 
> {code:bash}
> curl --data-urlencode "query=select * { ?s ?p ?o }" "http://moin.aksw.org/sparql"
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
