Epic JENA-2125 to track this with tickets for each part.
> ResultSet(resources) - RowSet (Nodes)
> RDFConnection - RDFLink
> QueryExecution - QueryExec
Andy
On 28/06/2021 18:00, Andy Seaborne wrote:
Jena currently uses Apache HttpClient v4 for HTTP.
This supports HTTP 1.1.
Apache HttpClient v5 supports HTTP/2 and there is a migration path from
v4 to new style v5 but the path is not seamless. It is at least package
renaming followed by API changes.
https://hc.apache.org/httpcomponents-client-5.1.x/migration-guide/index.html
and
https://hc.apache.org/httpcomponents-client-5.1.x/migration-guide/migration-to-classic.html
For most Jena users, there are no application changes needed because
SPARQL operations are packed up into the Jena APIs. But if an
application is doing detailed HTTP setup - most importantly, that
includes authentication - there is going to be a migration impact.
Java 11 now has an API, java.net.http, an all-new way to work with HTTP
including HTTP/2. (And there are other HTTP clients - I haven't used any
of those others).
Should we update to java.net.http or Apache HttpClient v5 or other?
Given the JDK has a decent HTTP client, my preference is to switch to
use java.net.http unless there is a positive reason to use a specific
external one.
The JDK-provided one means fewer dependencies, is always present, and
fixes/improvements (if any) come by updating the JVM used.
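
As a rough sketch of what java.net.http usage looks like for a SPARQL
query over GET, including the authentication setup that would be the
main migration impact. The class name SparqlHttpSketch, the helper
names and the timeout values are made up for illustration; only the
java.net and java.net.http calls are JDK API. This is not Jena code.

```java
import java.net.Authenticator;
import java.net.PasswordAuthentication;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

public class SparqlHttpSketch {

    /** Build a client that prefers HTTP/2 and answers credential challenges. */
    public static HttpClient newClient(String user, char[] password) {
        return HttpClient.newBuilder()
                // A preference only: the client falls back to HTTP/1.1
                // if the server does not support HTTP/2.
                .version(HttpClient.Version.HTTP_2)
                .connectTimeout(Duration.ofSeconds(10))   // illustrative value
                .authenticator(new Authenticator() {
                    @Override
                    protected PasswordAuthentication getPasswordAuthentication() {
                        return new PasswordAuthentication(user, password);
                    }
                })
                .build();
    }

    /** Build a SPARQL query request using GET with a "query=" parameter. */
    public static HttpRequest queryRequest(String endpoint, String sparql) {
        String queryString = "query=" + URLEncoder.encode(sparql, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder()
                .uri(URI.create(endpoint + "?" + queryString))
                .header("Accept", "application/sparql-results+json")
                .timeout(Duration.ofSeconds(30))          // illustrative value
                .GET()
                .build();
    }
}
```

Sending is then client.send(request, HttpResponse.BodyHandlers.ofString())
or client.sendAsync(...) for the non-blocking form.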
----
And also if HTTP support in Jena is being upgraded ... the code could do
with some work. Some of it is really old and is showing its age.
Areas:
RDFConnection,
SPARQL/HTTP QueryExecution and UpdateProcessor,
Graph Store Protocol
SERVICE.
== Improvements
+ Builder style for constructing the more complicated objects
(e.g. anything HTTP!)
+ Both Model and Graph / Statement and Triple level APIs
(Model-level being adapters of Graph level engines)
ResultSet(resources) - RowSet (Nodes)
RDFConnection - RDFLink
QueryExecution - QueryExec
(not an issue with UpdateProcessor)
+ Deprecation of QueryExecution.setTimeout and setInitialBinding
(use a builder)
+ Switch to rewrite for initial bindings
This will work for remote usage which currently is unsupported.
+ Explicit GSP engine - include support for quads operations.
+ SERVICE rewrite to use the new classes.
- HttpOp : Direct use of java.net.http covers the complex cases so
this class can be smaller and focused on the common cases.
(I doubt it's used much directly)
+ Utilities: HttpRDF, AsyncHttpRDF, HttpOp
AsyncHttpRDF should at least cover async GET so apps can
gather data from several places in parallel.
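
To illustrate the async GET point, a minimal sketch of gathering data
from several places in parallel with java.net.http. AsyncGetSketch and
its method names are hypothetical and are not the AsyncHttpRDF API; the
"text/turtle" Accept header is just an example choice.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

public class AsyncGetSketch {

    /** Start one GET without blocking; the future completes with the response body. */
    public static CompletableFuture<String> getAsync(HttpClient client, URI uri, String accept) {
        HttpRequest request = HttpRequest.newBuilder(uri)
                .header("Accept", accept)
                .GET()
                .build();
        return client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
                     .thenApply(HttpResponse::body);
    }

    /** Combine many futures into one that yields all results in input order. */
    public static <T> CompletableFuture<List<T>> gather(List<CompletableFuture<T>> futures) {
        return CompletableFuture
                .allOf(futures.toArray(new CompletableFuture<?>[0]))
                // All futures are complete here, so join() does not block.
                .thenApply(ignored -> futures.stream()
                        .map(CompletableFuture::join)
                        .collect(Collectors.toList()));
    }

    /** Issue all GETs up front, then wait for the whole set. */
    public static CompletableFuture<List<String>> getAll(HttpClient client, List<URI> uris) {
        List<CompletableFuture<String>> inFlight = uris.stream()
                .map(uri -> getAsync(client, uri, "text/turtle"))
                .collect(Collectors.toList());
        return gather(inFlight);
    }
}
```

If any one request fails, the combined future completes exceptionally;
an application wanting partial results would need to handle failures
per request before gathering.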
== Migration
If we leave the old code for SPARQL execution (QueryEngineHTTP and
HttpQuery) in-place, with Apache HttpClient4, apply copious deprecations
then, mostly, we have less sudden change. We then remove it in a couple
of releases' time.
Deprecate all QueryExecutionFactory.sparqlService, createServiceRequest
and refer to (new) QueryExecutionHTTPBuilder
Deprecate QueryExecution.setTimeout and setInitialBinding - they
should not be where they are.
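
A sketch of the builder idea, assuming timeout and initial bindings are
fixed at build time rather than set on the execution object afterwards.
Everything here (QueryBuilderSketch, PreparedQuery, substitution) is
hypothetical and is not the proposed QueryExecutionHTTPBuilder API. The
substitution is a naive textual replacement to show why rewriting works
for remote use (the server only ever sees the rewritten query); a real
implementation would rewrite the parsed query, not the string.

```java
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QueryBuilderSketch {

    /** Immutable result: nothing can be changed once execution could start. */
    public static final class PreparedQuery {
        public final String endpoint;
        public final String queryString;
        public final Duration timeout;

        PreparedQuery(String endpoint, String queryString, Duration timeout) {
            this.endpoint = endpoint;
            this.queryString = queryString;
            this.timeout = timeout;
        }
    }

    public static final class Builder {
        private String endpoint;
        private String query;
        private Duration timeout = Duration.ofSeconds(30);   // illustrative default
        private final Map<String, String> substitutions = new LinkedHashMap<>();

        public Builder endpoint(String url)  { this.endpoint = url; return this; }
        public Builder query(String sparql)  { this.query = sparql; return this; }
        public Builder timeout(Duration d)   { this.timeout = d; return this; }

        /** Replace variable varName (without '?') by an RDF term in SPARQL syntax. */
        public Builder substitution(String varName, String term) {
            substitutions.put(varName, term);
            return this;
        }

        public PreparedQuery build() {
            if (endpoint == null || query == null)
                throw new IllegalStateException("endpoint and query are required");
            String rewritten = query;
            for (Map.Entry<String, String> e : substitutions.entrySet()) {
                // Naive: matches ?name or $name as a whole word, anywhere in the text,
                // including inside string literals and the SELECT clause.
                Pattern var = Pattern.compile("[?$]" + Pattern.quote(e.getKey()) + "\\b");
                rewritten = var.matcher(rewritten)
                               .replaceAll(Matcher.quoteReplacement(e.getValue()));
            }
            return new PreparedQuery(endpoint, rewritten, timeout);
        }
    }

    public static Builder newBuilder() { return new Builder(); }
}
```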
Update documentation
== Improvements
Code:
https://github.com/afs/jena-http
which at the moment needs a custom Jena build because of misc cleanup
and things found while writing jena-http and not PR'ed to Jena.
Using a different HttpClient should not be too difficult as jena-http
internally encapsulates HttpClient usage. But a switchable HttpClient
isn't so easy and is also not invisible to users because authentication
setup is implementation-specific. We can't abstract authentication
without significant costs in support and maintenance to the project.
Andy