[ 
https://issues.apache.org/jira/browse/JENA-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502673#comment-17502673
 ] 

Claus Stadler commented on JENA-2302:
-------------------------------------

I have cleaned up the code and tested my streaming implementation against the 
current one. On my test dataset (219.441 triples, about 200MB, including 
RDF-star data) the results match.

[Latest 
State|https://github.com/Scaseco/jenax/tree/develop/jenax-io-parent/jenax-io-core/src/main/java/org/aksw/jenax/io/json]

I had a look at the Jakarta JSON API, but I am afraid Gson is much more 
succinct (and thus less painful to use) for writing streaming parsers, as you 
can probably see from the code. So maybe in this case it's worth adding it to 
the zoo? If so, I would move my code into a Jena PR and wire it up with the 
test cases.


> RowSetReaderJSON is not streaming
> ---------------------------------
>
>                 Key: JENA-2302
>                 URL: https://issues.apache.org/jira/browse/JENA-2302
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: ARQ
>    Affects Versions: Jena 4.5.0
>            Reporter: Claus Stadler
>            Priority: Major
>
> Retrieving all data from our TDB2 endpoint with Jena 4.5.0-SNAPSHOT is no 
> longer streaming for the JSON format. I tracked the issue to RowSetReaderJson, 
> which reads everything into memory (and only then checks whether it is a 
> SPARQL ASK result):
> {code:java}
> public class RowSetReaderJson {
>     private void parse(InputStream in) {
>         JsonObject obj = JSON.parse(in); // !!! Loads everything !!!
>         // Boolean?
>         if ( obj.hasKey(kBoolean) ) { ... }
>     }
> }
> {code}
> Streaming works when switching to RS_XML in the example below:
> {code:java}
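> import org.apache.jena.query.QueryExecution;
> import org.apache.jena.riot.resultset.ResultSetLang;
> import org.apache.jena.sparql.exec.http.QueryExecutionHTTP;
>
> public class Main {
>     public static void main(String[] args) {
>         System.out.println("Test Started");
>         // Request SPARQL JSON results; swap in RS_XML to compare behaviour.
>         try (QueryExecution qe = QueryExecutionHTTP.create()
>                 .acceptHeader(ResultSetLang.RS_JSON.getContentType().getContentTypeStr())
>                 .endpoint("http://moin.aksw.org/sparql")
>                 .queryString("SELECT * { ?s ?p ?o }")
>                 .build()) {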
>             qe.execSelect().forEachRemaining(System.out::println);
>         }
>         System.out.println("Done");
>     }
> }
> {code}
> For completeness, I can rule out any problem with TDB2 because streaming of 
> JSON works just fine with: 
> {code:bash}
> curl --data-urlencode "query=select * { ?s ?p ?o }" "http://moin.aksw.org/sparql"
> {code}



