Epic JENA-2125 to track this with tickets for each part.

>       ResultSet(resources) - RowSet (Nodes)
>       RDFConnection - RDFLink
>       QueryExecution - QueryExec

    Andy

On 28/06/2021 18:00, Andy Seaborne wrote:
Jena currently uses Apache HttpClient v4 for HTTP.
This supports HTTP 1.1.

Apache HttpClient v5 supports HTTP/2, and there is a migration path from v4 to the new-style v5 API, but the path is not seamless. At a minimum it involves package renaming followed by API changes.

https://hc.apache.org/httpcomponents-client-5.1.x/migration-guide/index.html
   and
https://hc.apache.org/httpcomponents-client-5.1.x/migration-guide/migration-to-classic.html


For most Jena users, there are no application changes needed because SPARQL operations are packaged up in the Jena APIs. But if an application is doing detailed HTTP setup (most importantly, that includes authentication) there is going to be a migration impact.

Java 11 now has an API, java.net.http, an all-new way to work with HTTP, including HTTP/2. (There are other HTTP clients as well; I haven't used any of those.)


Should we update to java.net.http or Apache HttpClient v5 or other?


Given the JDK has a decent HTTP client, my preference is to switch to use java.net.http unless there is a positive reason to use a specific external one.

The JDK-provided one means no extra dependency, it is always present, and fixes/improvements (if any) come by updating the JVM used.
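
For anyone who has not looked at java.net.http yet, here is a rough sketch of what a SPARQL query request could look like with it. This is not Jena code: the class name, the endpoint URL and the timeout values are all made up for illustration, and nothing is sent over the network.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

// Hypothetical helper class, for illustration only (not part of Jena).
class SparqlHttpSketch {
    // Assumed local endpoint; replace with a real SPARQL service URL.
    static final String ENDPOINT = "http://localhost:3030/ds/query";

    /** A client that prefers HTTP/2; the JDK falls back to HTTP/1.1 if the server does not support it. */
    static HttpClient newClient() {
        return HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_2)
                .connectTimeout(Duration.ofSeconds(10))
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
    }

    /** SPARQL Protocol "query via GET": the query text goes in the "query" URL parameter. */
    static HttpRequest queryRequest(String endpoint, String sparql) {
        String queryString = "query=" + URLEncoder.encode(sparql, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(endpoint + "?" + queryString))
                .header("Accept", "application/sparql-results+json")
                .timeout(Duration.ofSeconds(30))   // per-request timeout
                .GET()
                .build();
    }
}
```

Sending is then `newClient().send(request, HttpResponse.BodyHandlers.ofInputStream())`, with the response body handed to a result set parser.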

----

And also if HTTP support in Jena is being upgraded ... the code could do with some work. Some of it is really old and is showing its age.

Areas:
    RDFConnection,
    SPARQL/HTTP QueryExecution and UpdateProcessor,
    Graph Store Protocol
    SERVICE.

== Improvements

+ Builder style for constructing the more complicated objects
   (e.g. anything HTTP!)
+ Both Model and Graph / Statement and Triple level APIs
   (Model-level being adapters of Graph level engines)

      ResultSet(resources) - RowSet (Nodes)
      RDFConnection - RDFLink
      QueryExecution - QueryExec
      (not an issue with UpdateProcessor)

+ Deprecation of QueryExecution.setTimeout and setInitialBinding
   (use a builder)
+ Switch to query rewriting for initial bindings
   This will work for remote usage, which currently is unsupported.
+ Explicit GSP engine - include support for quads operations.

. SERVICE rewrite to use the new classes.

- HttpOp : Direct use of java.net.http covers the complex cases so
   this class can be smaller and focused on the common cases.
   (I doubt it's used much directly)

+ Utilities: HttpRDF, AsyncHttpRDF, HttpOp
   AsyncHttpRDF should at least cover async GET so apps can
   gather data from several places in parallel.
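
To make the async GET point concrete, here is a minimal sketch using plain java.net.http. It is not the AsyncHttpRDF class: the class and method names are hypothetical, and bodies are returned as strings rather than parsed into graphs.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical helper class, for illustration only (not part of Jena).
class ParallelGetSketch {
    /**
     * Issue one async GET per URL, then wait for all of them.
     * Response bodies are returned in the same order as the input URLs.
     */
    static List<String> fetchAll(HttpClient client, List<String> urls) {
        // Phase 1: start every request without blocking, so they are all in flight at once.
        List<CompletableFuture<String>> futures = new ArrayList<>();
        for (String url : urls) {
            HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                    .header("Accept", "text/turtle")
                    .GET()
                    .build();
            CompletableFuture<String> body =
                    client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
                          .thenApply(HttpResponse::body);
            futures.add(body);
        }
        // Phase 2: collect the results; join() throws CompletionException if a request failed.
        List<String> bodies = new ArrayList<>();
        for (CompletableFuture<String> future : futures) {
            bodies.add(future.join());
        }
        return bodies;
    }
}
```

An app would call `ParallelGetSketch.fetchAll(HttpClient.newHttpClient(), urls)` and parse each body; the total wait is roughly that of the slowest source rather than the sum of all of them.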

== Migration

If we leave the old code for SPARQL execution (QueryEngineHTTP and HttpQuery) in place, with Apache HttpClient4, and apply copious deprecations, then, mostly, we have less sudden change. We then remove the old code in a couple of releases' time.

Deprecate all the QueryExecutionFactory.sparqlService and createServiceRequest operations and refer to the (new) QueryExecutionHTTPBuilder.

Deprecate QueryExecution.setTimeout and setInitialBinding - they should not be where they are.

Update documentation

== Improvements

Code:

   https://github.com/afs/jena-http

which at the moment needs a custom Jena build because of miscellaneous cleanup and things found while writing jena-http that have not been PR'ed to Jena.

Using a different HttpClient should not be too difficult, as jena-http internally encapsulates HttpClient usage. But a switchable HttpClient isn't so easy, and it is not invisible to users because authentication setup is implementation-specific. We can't abstract authentication without significant costs in support and maintenance to the project.
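
As an illustration of the implementation-specific part: with java.net.http, password authentication is set on the HttpClient through a java.net.Authenticator, whereas Apache HttpClient uses its own CredentialsProvider. A sketch of the java.net.http side only (the class name is hypothetical and the credentials are placeholders supplied by the caller):

```java
import java.net.Authenticator;
import java.net.PasswordAuthentication;
import java.net.http.HttpClient;

// Hypothetical helper class, for illustration only (not part of Jena).
class AuthSketch {
    /** Build an HttpClient that supplies the given credentials when challenged. */
    static HttpClient clientWithPassword(String user, char[] password) {
        Authenticator authenticator = new Authenticator() {
            @Override
            protected PasswordAuthentication getPasswordAuthentication() {
                // Invoked by the HTTP client when a server (401) or proxy (407) asks for credentials.
                return new PasswordAuthentication(user, password);
            }
        };
        return HttpClient.newBuilder()
                .authenticator(authenticator)
                .build();
    }
}
```

Application code that sets this up is tied to the JDK classes, which is why swapping the underlying client cannot be hidden from such applications.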

     Andy
