[ 
https://issues.apache.org/jira/browse/JENA-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502941#comment-17502941
 ] 

Andy Seaborne commented on JENA-2302:
-------------------------------------

A quick glance and it looks good.

It needs to cope with the results-then-head case, though that path need not 
be as performant. If that situation occurs in the wild, I would expect it to 
be more likely with small results than with large ones. It is possible to 
determine the variables from the query itself, so there is no need to scan 
the results.
 # Is results-then-head tested? It probably isn't in the ARQ test suite (please 
add!)
 # I don't see ASK results covered. Maybe I missed that.
 # Do you have performance measurements?
 # Formally {{DataBag}} is not order preserving (IIRC).
 # When does it spill? Or is it in-memory only for the delayed results?
 # Please use constants for keywords: {{JSONResultsKW}}
 # I didn't see coverage of the legacy case {{kTypedLiteral}} "typed-literal"
 # No author tags please.
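
To make the results-then-head and ordering points concrete, here is a minimal sketch in plain Java. It is not the proposed patch and not Jena API: the class {{DelayedRowsSketch}} and its methods {{onHead}}/{{onRow}} are hypothetical names, and rows are modelled as simple string maps. Rows that arrive before the head are held in an order-preserving list and flushed once the head is seen; rows that arrive afterwards pass straight through.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch, not Jena code: every name here is made up for illustration.
public class DelayedRowsSketch {
    private final Consumer<Map<String, String>> sink;
    // An ArrayList keeps arrival order; a spilling collection would need the same guarantee.
    private final List<Map<String, String>> delayed = new ArrayList<>();
    private List<String> vars = null;   // null until the head has been seen
    private int maxDelayed = 0;         // high-water mark of buffered rows

    public DelayedRowsSketch(Consumer<Map<String, String>> sink) {
        this.sink = sink;
    }

    // Called when the "head" section is parsed: record the variables, flush delayed rows in order.
    public void onHead(List<String> headVars) {
        vars = List.copyOf(headVars);
        delayed.forEach(sink);
        delayed.clear();
    }

    // Called for each binding: stream it if the head is known, otherwise hold it back.
    public void onRow(Map<String, String> row) {
        if (vars == null) {
            delayed.add(row);
            maxDelayed = Math.max(maxDelayed, delayed.size());
        } else {
            sink.accept(row);
        }
    }

    public List<String> vars() { return vars; }

    public int maxDelayed() { return maxDelayed; }

    public static void main(String[] args) {
        List<Map<String, String>> out = new ArrayList<>();
        DelayedRowsSketch reader = new DelayedRowsSketch(out::add);
        // Results-then-head: two rows arrive before the head, one after.
        reader.onRow(Map.of("s", "a"));
        reader.onRow(Map.of("s", "b"));
        reader.onHead(List.of("s"));
        reader.onRow(Map.of("s", "c"));
        System.out.println(out + " delayed=" + reader.maxDelayed());
    }
}
```

In the usual head-then-results order nothing is buffered, so only the unusual order pays the memory cost. That is why the delayed path does not have to be as fast, but it does have to preserve order.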

GSON:
 # How big is GSON (with dependencies, i.e. impact on Fuseki)?
 # Is the proposed code only using the parser with no data mapping? (Data 
mapping is where the risk of injection attacks arises, as Jackson went through.)

Two things next:
 * PR
 * Could you please email users@ to say the work is in-progress? See if we can 
identify any corner cases.

The JSON results are quite important so we have to take care that a release is 
not going to cause problems.

 

> RowSetReaderJSON is not streaming
> ---------------------------------
>
>                 Key: JENA-2302
>                 URL: https://issues.apache.org/jira/browse/JENA-2302
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: ARQ
>    Affects Versions: Jena 4.5.0
>            Reporter: Claus Stadler
>            Priority: Major
>
> Retrieving all data from our TDB2 endpoint with Jena 4.5.0-SNAPSHOT is no 
> longer streaming for the JSON format. I tracked the issue to RowSetReaderJson, 
> which reads everything into memory (and then checks whether it is a SPARQL 
> ASK result):
> {code:java}
> public class RowSetReaderJson {
>     private void parse(InputStream in) {
>         JsonObject obj = JSON.parse(in); // !!! Loads everything !!!
>         // Boolean?
>         if ( obj.hasKey(kBoolean) ) { ... }
>     }
> }
> {code}
> Streaming works when switching to RS_XML in the example below:
> {code:java}
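> import org.apache.jena.query.QueryExecution;
> import org.apache.jena.riot.resultset.ResultSetLang;
> import org.apache.jena.sparql.exec.http.QueryExecutionHTTP;
>
> public class Main {
>     public static void main(String[] args) {
>         System.out.println("Test Started");
>         // Request SPARQL JSON results; switch to RS_XML to compare.
>         try (QueryExecution qe = QueryExecutionHTTP.create()
>                 .acceptHeader(ResultSetLang.RS_JSON.getContentType().getContentTypeStr())
>                 .endpoint("http://moin.aksw.org/sparql")
>                 .queryString("SELECT * { ?s ?p ?o }")
>                 .build()) {
>             // Rows should print as they arrive if the reader streams.
>             qe.execSelect().forEachRemaining(System.out::println);
>         }
>         System.out.println("Done");
>     }
> }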
> {code}
> For completeness, I can rule out any problem with TDB2 because streaming of 
> JSON works just fine with: 
> {code:bash}
> curl --data-urlencode "query=select * { ?s ?p ?o }" "http://moin.aksw.org/sparql"
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
