[
https://issues.apache.org/jira/browse/JENA-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17503498#comment-17503498
]
Claus Stadler commented on JENA-2302:
-------------------------------------
* When does it spill? Or is it even in-memory only for the delayed results?
I am relying on the default databag system here that is also used by
UpdateEngineWorker to implement deletions. Relevant code:
{code:java}
ThresholdPolicy<Binding> policy = ThresholdPolicyFactory.policyFromContext(cxt);
DataBag<Binding> r = BagFactory.newDefaultBag(policy,
        SerializationFactoryFinder.bindingSerializationFactory());

// In ThresholdPolicyFactory:
public static <E> ThresholdPolicy<E> policyFromContext(Context context) {
    long threshold = context.getLong(ARQ.spillToDiskThreshold, defaultThreshold);
    // ...
}

// Use the never() policy by default
private static final long defaultThreshold = -1;
{code}
I am a bit surprised that it does not spill to disk by default, but that
does not seem critical.
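To make the effect of the -1 default concrete, here is a minimal, self-contained
sketch of a count-based spill policy. It is not Jena's DataBag code: the class
SpillingBag and all of its methods are hypothetical names chosen for
illustration, items are plain strings rather than Bindings, and the exact
comparison used against the threshold is an assumption. It only shows the
general idea that a negative threshold keeps everything in memory, while a
non-negative one moves buffered items to a temporary file once the count is
exceeded.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of a threshold-based spilling bag (not Jena's implementation). */
class SpillingBag implements AutoCloseable {
    private final long threshold;                 // negative means "never spill"
    private final List<String> memory = new ArrayList<>();
    private Path spillFile;                       // created lazily on the first spill

    SpillingBag(long threshold) {
        this.threshold = threshold;
    }

    /** Adds an item; items must not contain line breaks in this sketch. */
    void add(String item) {
        memory.add(item);
        if (threshold >= 0 && memory.size() > threshold) {
            spill();
        }
    }

    /** Appends the in-memory items to the temp file and clears the buffer. */
    private void spill() {
        try {
            if (spillFile == null) {
                spillFile = Files.createTempFile("bag-spill-", ".tmp");
            }
            Files.write(spillFile, memory, StandardCharsets.UTF_8, StandardOpenOption.APPEND);
            memory.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    boolean hasSpilled() {
        return spillFile != null;
    }

    /** Returns spilled items first, then the items still held in memory. */
    List<String> readAll() {
        List<String> out = new ArrayList<>();
        try {
            if (spillFile != null) {
                out.addAll(Files.readAllLines(spillFile, StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        out.addAll(memory);
        return out;
    }

    /** Deletes the temp file, if one was created. */
    @Override
    public void close() {
        try {
            if (spillFile != null) {
                Files.deleteIfExists(spillFile);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With new SpillingBag(-1) nothing is ever written to disk, which mirrors the
never() behaviour described above; with a small non-negative threshold the
bag starts writing to a temporary file as soon as the buffer grows past it.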
I think I have now covered all raised points except for the performance
measures.
> RowSetReaderJSON is not streaming
> ---------------------------------
>
> Key: JENA-2302
> URL: https://issues.apache.org/jira/browse/JENA-2302
> Project: Apache Jena
> Issue Type: Improvement
> Components: ARQ
> Affects Versions: Jena 4.5.0
> Reporter: Claus Stadler
> Priority: Major
>
> Retrieving all data from our TDB2 endpoint with Jena 4.5.0-SNAPSHOT is no
> longer streaming for the JSON format. I tracked the issue to RowSetReaderJson,
> which reads everything into memory (and then checks whether it is a SPARQL
> ASK result):
> {code:java}
> public class RowSetReaderJson {
>     private void parse(InputStream in) {
>         JsonObject obj = JSON.parse(in); // !!! Loads everything !!!
>         // Boolean?
>         if ( obj.hasKey(kBoolean) ) { ... }
>     }
> }
> {code}
> Streaming works when switching to RS_XML in the example below:
> {code:java}
> public class Main {
>     public static void main(String[] args) {
>         System.out.println("Test Started");
>         try (QueryExecution qe = QueryExecutionHTTP.create()
>                 .acceptHeader(ResultSetLang.RS_JSON.getContentType().getContentTypeStr())
>                 .endpoint("http://moin.aksw.org/sparql")
>                 .queryString("SELECT * { ?s ?p ?o }")
>                 .build()) {
>             qe.execSelect().forEachRemaining(System.out::println);
>         }
>         System.out.println("Done");
>     }
> }
> {code}
> For completeness, I can rule out any problem with TDB2 because streaming of
> JSON works just fine with:
> {code:bash}
> curl --data-urlencode "query=select * { ?s ?p ?o }" "http://moin.aksw.org/sparql"
> {code}