I think the new text looks good and is quite easy to understand.

Could you add a paragraph about when the configured client is used? It might not be clear when this HttpClient is consulted and when it is not. For instance, I assume it would be used for remote SPARQL queries and for loading HTTP URLs through RDFDataMgr, but it may not be propagated through to JSON-LD Java's @context loading, which has a similar httpclient setting and documentation on how to configure caching [1].
[1] https://github.com/jsonld-java/jsonld-java#controlling-network-traffic

On 9 November 2016 at 15:42, A. Soroka <[email protected]> wrote:
> Done. I'll wait to hear from other folks before pulling a trigger on
> (re)publishing the site.
>
> ---
> A. Soroka
> The University of Virginia Library
>
>> On Nov 9, 2016, at 6:30 AM, Andy Seaborne <[email protected]> wrote:
>>
>> Great -
>>
>> One (!) other thing:
>>
>> A section specifically calling out migrating SPARQL remote calls:
>> QueryExecutionFactory.sparqlService and QueryEngineHTTP.
>>
>> On the latter, older code may still be directly using
>> QueryEngineHTTP.setBasicAuthentication
>>
>> Andy
>>
>> On 08/11/16 17:58, A. Soroka wrote:
>>> I've made those changes-- should be restaging now.
>>>
>>> ---
>>> A. Soroka
>>> The University of Virginia Library
>>>
>>>> On Nov 8, 2016, at 12:40 PM, Andy Seaborne <[email protected]> wrote:
>>>>
>>>>
>>>>
>>>> On 08/11/16 16:59, A. Soroka wrote:
>>>>> This commit includes the new docs for HTTP behavior in Jena 3.1.1. I
>>>>> can't find any way to see a view of this on the staging site--
>>>>> https://jena.staging.apache.org/ just seems to proxy
>>>>> https://cms.apache.org/, for some reason?
>>>>>
>>>>
>>>> It does not for me.
>>>>
>>>> Try http://jena.staging.apache.org/ (not https)
>>>>
>>>> PDF attached, cc'ed to you in the hope it get through.
>>>>
>>>> Comments:
>>>>
>>>> 1/ I'd put the current (3.1.1) text first and the previous second so the
>>>> current is more visible.
>>>>
>>>> Links at the end of the intro to "current" and "previous", or in the intro
>>>> as this difference is mentioned.
>>>>
>>>> 2/ Title tweaking:
>>>>
>>>> "HTTP Authentication after Jena 3.1.0" ->
>>>> "HTTP Authentication from Jena 3.1.1"
>>>>
>>>> "HTTP Authentication before Jena 3.1.0" =>
>>>> "HTTP Authentication from Jena 3.0.0 to 3.1.0"
>>>>
>>>> (so the range includes 3.1.0 !)
>>>>
>>>> Mentioning Jena 2.x is not necessary IMO - the additional detail adds
>>>> confusion for current users and 3.x upgrading users (the majority).
>>>>
>>>> 3/
>>>> "Simple authentication using username and password"
>>>>
>>>> "Authenticating via a form"
>>>>
>>>> The <h5> don't show up as different on the screen for me so maybe bump
>>>> <h4> "Examples of authentication" up a level to <h3> and move sub <5> to
>>>> <h4> .
>>>>
>>>> Maybe drop <h3> "Applying Authentication" (section title immediately after
>>>> a section title) and have the paragraph there straight away.
>>>>
>>>> Andy
>>>>
>>>>> ---
>>>>> A. Soroka
>>>>> The University of Virginia Library
>>>>>
>>>>>> On Nov 8, 2016, at 11:53 AM, [email protected] wrote:
>>>>>>
>>>>>> Author: ajs6f
>>>>>> Date: Tue Nov 8 16:53:48 2016
>>>>>> New Revision: 1768736
>>>>>>
>>>>>> URL: http://svn.apache.org/viewvc?rev=1768736&view=rev
>>>>>> Log:
>>>>>> Updates for HTTP behavior in Jena 3.1.1
>>>>>>
>>>>>> Modified:
>>>>>> jena/site/trunk/content/documentation/query/http-auth.mdtext
>>>>>> jena/site/trunk/content/documentation/query/service.mdtext
>>>>>>
>>>>>> Modified: jena/site/trunk/content/documentation/query/http-auth.mdtext
>>>>>> URL:
>>>>>> http://svn.apache.org/viewvc/jena/site/trunk/content/documentation/query/http-auth.mdtext?rev=1768736&r1=1768735&r2=1768736&view=diff
>>>>>> ==============================================================================
>>>>>> --- jena/site/trunk/content/documentation/query/http-auth.mdtext
>>>>>> (original)
>>>>>> +++ jena/site/trunk/content/documentation/query/http-auth.mdtext Tue Nov
>>>>>> 8 16:53:48 2016
>>>>>> @@ -16,10 +16,12 @@ Notice: Licensed to the Apache Softwa
>>>>>> specific language governing permissions and limitations
>>>>>> under the License.
>>>>>>
>>>>>> -As of ARQ 2.11.0 there is a new unified HTTP operation framework that
>>>>>> provides a uniform mechanism for
>>>>>> -HTTP authentication that also allows ARQ to support a broader range of
>>>>>> authentication mechanisms than were previously possible.
>>>>>> +From ARQ 2.11.0 through ARQ 3.1.0 there is a Jena-specific unified HTTP
>>>>>> operation framework that provides a uniform mechanism for
>>>>>> +HTTP authentication that also allows ARQ to support a broader range of
>>>>>> authentication mechanisms than were previously possible. After ARQ
>>>>>> 3.1.0, Jena exposes the underlying HTTP Commons functionality to the
>>>>>> same end. This documentation is therefore devided into two sections. The
>>>>>> first explains the older Jena-specific functionality, and the second
>>>>>> explains how to use HTTP Commons code to the same ends.
>>>>>>
>>>>>> -## Applying Authentication
>>>>>> +## HTTP Authentication before ARQ 3.1.0
>>>>>> +
>>>>>> +### Applying Authentication
>>>>>>
>>>>>> APIs that support authentication typically provide two methods for
>>>>>> providing authenticators, a `setAuthentication(String username, char[]
>>>>>> password)` method
>>>>>> which merely configures a `SimpleAuthenticator`. There will also be a
>>>>>> `setAuthenticator(HttpAuthenticator authenticator)` method
>>>>>> @@ -41,14 +43,14 @@ avoids the needs to cast and manually se
>>>>>> ...
>>>>>> }
>>>>>>
>>>>>> -### Authenticators
>>>>>> +#### Authenticators
>>>>>>
>>>>>> Authentication mechanisms are provided by [HttpAuthenticator][1]
>>>>>> implementations of which a number are provided built into ARQ.
>>>>>>
>>>>>> This API provides the authenticator with access to the `HttpClient`,
>>>>>> `HttpContext` and target `URI` of the request that is about to be carried
>>>>>> out. This allows for authenticators to add credentials to requests on a
>>>>>> per-request basis and/or to use different mechanisms and credentials for
>>>>>> different services.
>>>>>>
>>>>>> -#### SimpleAuthenticator
>>>>>> +##### SimpleAuthenticator
>>>>>>
>>>>>> The [simple authenticator][2] is as the name suggests the simplest
>>>>>> implementation. It takes a single set of credentials which is applied to
>>>>>> any service.
>>>>>> @@ -56,7 +58,7 @@ any service.
>>>>>> Authentication however is not preemptive so unless the remote service
>>>>>> sends a HTTP challenge (401 Unauthorized or 407 Proxy Authorization
>>>>>> Required) then credentials will not actually be submitted.
>>>>>>
>>>>>> -#### ScopedAuthenticator
>>>>>> +##### ScopedAuthenticator
>>>>>>
>>>>>> The [scoped authenticator][3] is an authenticator which maps credentials
>>>>>> to different service URIs. This allows you to specify different
>>>>>> credentials for different services as appropriate. Similarly to the
>>>>>> simple authenticator this is not preemptive authentication so
>>>>>> credentials are
>>>>>> @@ -67,13 +69,13 @@ if you define credentialsfor `http://exa
>>>>>> e.g. `http://example.org/some/path`. However if you had also defined
>>>>>> credentials for `http://example.org/some/path` then these would be
>>>>>> used in favor of those for `http://example.org`
>>>>>>
>>>>>> -#### ServiceAuthenticator
>>>>>> +##### ServiceAuthenticator
>>>>>>
>>>>>> The [service authenticator][4] is an authenticator which uses
>>>>>> information encoded in the ARQ context and basically provides access to
>>>>>> the
>>>>>> existing credential provision mechanisms provided for the `SERVICE`
>>>>>> clause, see [Basic Federated Query][5] for more information on
>>>>>> configuration for this.
>>>>>>
>>>>>> -#### FormsAuthenticator
>>>>>> +##### FormsAuthenticator
>>>>>>
>>>>>> The [forms authenticator][6] is an authenticator usable with services
>>>>>> that require form based logins and use session cookies to verify login
>>>>>> state.
>>>>>> This is intended for use with services that don't support HTTP's
>>>>>> built-in authentication mechanisms for whatever reason.
>>>>>> One example of
>>>>>> this
>>>>>> @@ -104,7 +106,7 @@ that maps each service to an associated
>>>>>>
>>>>>> Currently forms based login that require more than just a username and
>>>>>> password are not supported.
>>>>>>
>>>>>> -#### PreemptiveBasicAuthenticator
>>>>>> +##### PreemptiveBasicAuthenticator
>>>>>>
>>>>>> This [authenticator][8] is a decorator over another authenticator that
>>>>>> enables preemptive basic authentication, this **only** works for servers
>>>>>> that support basic authentication and so will cause authentication
>>>>>> failures when any other authentication scheme is required. You should
>>>>>> **only**
>>>>>> @@ -121,20 +123,12 @@ Also be aware that basic authentication
>>>>>> many servers will use more secure schemes like Digest authentication
>>>>>> which **cannot** be done preemptively as they require more complex
>>>>>> challenge response sequences.
>>>>>>
>>>>>> -#### DelegatingAuthenticator
>>>>>> +##### DelegatingAuthenticator
>>>>>>
>>>>>> The [delegating authenticator][12] allows for mapping different
>>>>>> authenticators to different services, this is useful when you need to
>>>>>> mix and
>>>>>> match the types of authentication needed.
>>>>>>
>>>>>> -### Debugging Authentication
>>>>>> -
>>>>>> -ARQ uses [Apache Http Client][14] for all its HTTP operations and this
>>>>>> provides detailed logging information that can be used for debugging. To
>>>>>> -see this information you need to configure your logging framework to
>>>>>> set the `org.apache.http` package to either `DEBUG` or `TRACE` level.
>>>>>> -
>>>>>> -The `DEBUG` level will give you general diagnostic information about
>>>>>> requests and responses while the `TRACE` level will give you detailed
>>>>>> -HTTP traces i.e. allow you to see the exact HTTP requests and responses
>>>>>> which can be extremely useful for debugging authentication problems.
>>>>>> -
>>>>>> -### The Default Authenticator
>>>>>> +#### The Default Authenticator
>>>>>>
>>>>>> Since it may not always be possible/practical to configure
>>>>>> authenticators on a per-request basis the API includes a means to
>>>>>> specify a default
>>>>>> authenticator that is used when no authenticator is explicitly
>>>>>> specified. This may be configured via the
>>>>>> @@ -148,6 +142,82 @@ provided that it is using ARQs APIs to m
>>>>>>
>>>>>> Note that the default authenticator may be disabled by setting it to
>>>>>> `null`.
>>>>>>
>>>>>> +## HTTP Authentication after ARQ 3.1.0
>>>>>> +
>>>>>> +### Applying Authentication
>>>>>> +
>>>>>> +APIs that support authentication typically provide methods for
>>>>>> providing an [HttpClient] for use with the given instance of that API
>>>>>> class. `HttpClient` is [extremely flexible][16] and can handle most
>>>>>> scenarios very well. Since it may not always be possible/practical to
>>>>>> configure authenticators on a per-request basis the API includes a means
>>>>>> to specify a default client that is used when no other client is
>>>>>> explicitly specified. This may be configured via the
>>>>>> +`setDefaultHttpClient(HttpClient httpClient)` method of the
>>>>>> [HttpOp][13] class. This allows for static-scoped configuration of HTTP
>>>>>> behavior.
>>>>>> +
>>>>>> +#### Examples of authentication
>>>>>> +
>>>>>> +This section includes a series of examples showing how to use HTTP
>>>>>> Commons classes to perform authenticated work. Most of them take
>>>>>> advantage of `HttpOp.setDefaultHttpClient` as described above.
>>>>>> +
>>>>>> +##### Simple authentication using username and password
>>>>>> +
>>>>>> +First we build an authenticating client:
>>>>>> +
>>>>>> +    CredentialsProvider credsProvider = new BasicCredentialsProvider();
>>>>>> +    Credentials credentials = new UsernamePasswordCredentials("user",
>>>>>> "passwd");
>>>>>> +    credsProvider.setCredentials(AuthScope.ANY, credentials);
>>>>>> +    HttpClient httpclient = HttpClients.custom()
>>>>>> +        .setDefaultCredentialsProvider(credsProvider)
>>>>>> +        .build();
>>>>>> +    HttpOp.setDefaultHttpClient(httpclient);
>>>>>> +
>>>>>> +Notice that we gave no scope for use with the credentials
>>>>>> (`AuthScope.ANY`). We can make further use of that parameter if we want
>>>>>> to assign a scope for some credentials:
>>>>>> +
>>>>>> +    CredentialsProvider credsProvider = new BasicCredentialsProvider();
>>>>>> +    Credentials unscopedCredentials = new
>>>>>> UsernamePasswordCredentials("user", "passwd");
>>>>>> +    credsProvider.setCredentials(AuthScope.ANY, unscopedCredentials);
>>>>>> +    Credentials scopedCredentials = new
>>>>>> UsernamePasswordCredentials("user", "passwd");
>>>>>> +    final String host = "http://example.com/sparql";
>>>>>> +    final int port = 80;
>>>>>> +    final String realm = "aRealm";
>>>>>> +    final String schemeName = "DIGEST";
>>>>>> +    AuthScope authscope = new AuthScope(host, port, realm, schemeName);
>>>>>> +    credsProvider.setCredentials(authscope, scopedCredentials);
>>>>>> +    HttpClient httpclient = HttpClients.custom()
>>>>>> +        .setDefaultCredentialsProvider(credsProvider)
>>>>>> +        .build();
>>>>>> +    HttpOp.setDefaultHttpClient(httpclient);
>>>>>> +
>>>>>> +##### Authenticating via a form
>>>>>> +
>>>>>> +For this case we introduce an [HttpClientContext][17], which we can use
>>>>>> to retrieve the cookie we get from logging into a form. We then use the
>>>>>> cookie to authenticate elsewhere.
>>>>>> +
>>>>>> +    // we'll use this context to maintain our HTTP "conversation"
>>>>>> +    HttpClientContext httpContext = new HttpClientContext();
>>>>>> +
>>>>>> +    // first we use a method on HttpOp to log in and get our cookie
>>>>>> +    Params params = new Params();
>>>>>> +    params.addParam("username", "Bob Wu");
>>>>>> +    params.addParam("password", "my password");
>>>>>> +    HttpOp.execHttpPostForm("http://example.com/loginform", params ,
>>>>>> null, null, null, httpContext);
>>>>>> +
>>>>>> +    // now our cookie is stored in httpContext
>>>>>> +    CookieStore cookieStore = httpContext.getCookieStore();
>>>>>> +
>>>>>> +    // lastly we build a client that uses that cookie
>>>>>> +    HttpClient httpclient = HttpClients.custom()
>>>>>> +        .setDefaultCookieStore(cookieStore)
>>>>>> +        .build();
>>>>>> +    HttpOp.setDefaultHttpClient(httpclient);
>>>>>> +
>>>>>> +## Other concerns
>>>>>> +
>>>>>> +### Debugging Authentication
>>>>>> +
>>>>>> +ARQ uses [Apache Http Client][14] for all its HTTP operations and this
>>>>>> provides detailed logging information that can be used for debugging. To
>>>>>> +see this information you need to configure your logging framework to
>>>>>> set the `org.apache.http` package to either `DEBUG` or `TRACE` level.
>>>>>> +
>>>>>> +The `DEBUG` level will give you general diagnostic information about
>>>>>> requests and responses while the `TRACE` level will give you detailed
>>>>>> +HTTP traces i.e. allow you to see the exact HTTP requests and responses
>>>>>> which can be extremely useful for debugging authentication problems.
>>>>>> +
>>>>>> +### Authenticating to a SPARQL federated service
>>>>>> +
>>>>>> +ARQ allows the user to configure HTTP behavior to use on a
>>>>>> per-`SERVICE` basis, including authentication behavior such as is
>>>>>> described above. This works via the ARQ context. See [Basic Federated
>>>>>> Query][5] for more information on configuring this functionality.
>>>>>> +
>>>>>> [1]:
>>>>>> http://jena.apache.org/documentation/javadoc/arq/org/apache/jena/atlas/web/auth/HttpAuthenticator.html
>>>>>> [2]:
>>>>>> http://jena.apache.org/documentation/javadoc/arq/org/apache/jena/atlas/web/auth/SimpleAuthenticator.html
>>>>>> [3]:
>>>>>> http://jena.apache.org/documentation/javadoc/arq/org/apache/jena/atlas/web/auth/ScopedAuthenticator.html
>>>>>> @@ -161,4 +231,7 @@ Note that the default authenticator may
>>>>>> [11]:
>>>>>> http://jena.apache.org/documentation/javadoc/arq/org/apache/jena/web/DatasetGraphAccessorHTTP.html
>>>>>> [12]:
>>>>>> http://jena.apache.org/documentation/javadoc/arq/org/apache/jena/atlas/web/auth/DelegatingAuthenticator.html
>>>>>> [13]:
>>>>>> http://jena.apache.org/documentation/javadoc/arq/org/apache/jena/riot/web/HttpOp.html
>>>>>> - [14]: http://hc.apache.org
>>>>>> \ No newline at end of file
>>>>>> + [14]: http://hc.apache.org
>>>>>> + [15]:
>>>>>> https://hc.apache.org/httpcomponents-client-ga/httpclient/apidocs/org/apache/http/client/HttpClient.html
>>>>>> + [16]: https://hc.apache.org/httpcomponents-client-ga/examples.html
>>>>>> + [17]:
>>>>>> https://hc.apache.org/httpcomponents-client-ga/httpclient/apidocs/org/apache/http/client/protocol/HttpClientContext.html
>>>>>> \ No newline at end of file
>>>>>>
>>>>>> Modified: jena/site/trunk/content/documentation/query/service.mdtext
>>>>>> URL:
>>>>>> http://svn.apache.org/viewvc/jena/site/trunk/content/documentation/query/service.mdtext?rev=1768736&r1=1768735&r2=1768736&view=diff
>>>>>> ==============================================================================
>>>>>> --- jena/site/trunk/content/documentation/query/service.mdtext (original)
>>>>>> +++ jena/site/trunk/content/documentation/query/service.mdtext Tue Nov
>>>>>> 8 16:53:48 2016
>>>>>> @@ -48,19 +48,18 @@ distributed query evaluation. The algebr
>>>>>> without regard to how selective the pattern is. So the order of the
>>>>>> query will affect the speed of execution.
>>>>>> Because it involves HTTP
>>>>>> operations, asking the query in the right order matters a lot.
>>>>>> -Don't ask for the whole of a bookstore just to find book whose
>>>>>> +Don't ask for the whole of a bookstore just to find a book whose
>>>>>> title comes from a local RDF file - ask the bookshop a query with
>>>>>> the title already bound from earlier in the query.
>>>>>>
>>>>>> ## Controlling `SERVICE` requests.
>>>>>>
>>>>>> -The `SERVICE` operation in a SPARQL query may be configured via the
>>>>>> Context.
>>>>>> -The values for configuration can be set in the global context (accessed
>>>>>> via
>>>>>> +The `SERVICE` operation in a SPARQL query may be configured via the
>>>>>> Context. The values for configuration can be set in the global context
>>>>>> (accessed via
>>>>>> `ARQ.getContext()`) or in the per-query execution context.
>>>>>>
>>>>>> The prefix `srv:` is the IRI `<http://jena.hpl.hp.com/Service#>`.
>>>>>>
>>>>>> -### Summary
>>>>>> +### Configuration for ARQ through version 3.1.0
>>>>>>
>>>>>> Symbol | Usage
>>>>>> ------ | -----
>>>>>> @@ -71,7 +70,7 @@ Symbol | Usage
>>>>>> `srv:queryAuthPwd` | Basic authentication
>>>>>> `srv:queryContext` | Per-endpoint configuration
>>>>>>
>>>>>> -### `srv:queryTimeout`
>>>>>> +#### `srv:queryTimeout`
>>>>>>
>>>>>> Set the connect and read timeouts for the query.
>>>>>>
>>>>>> @@ -86,21 +85,21 @@ read timout = 0
>>>>>>
>>>>>> Values of 0 indicate no timeout and service operation will wait until
>>>>>> the remote server responds.
>>>>>>
>>>>>> -### `srv:queryGzip`
>>>>>> +#### `srv:queryGzip`
>>>>>>
>>>>>> Sets the allow Gzip flag.
>>>>>>
>>>>>> Boolean: True indicates that gzip compressed data is acceptable.
>>>>>> false
>>>>>>
>>>>>> -### `srv:queryDeflate`
>>>>>> +#### `srv:queryDeflate`
>>>>>>
>>>>>> Sets the allow Deflate flag.
>>>>>>
>>>>>> Boolean: True indicates that deflate compression is acceptable
>>>>>> False
>>>>>>
>>>>>> -### `srv:queryAuthUser`
>>>>>> +#### `srv:queryAuthUser`
>>>>>>
>>>>>> Sets the user id for basic auth.
>>>>>>
>>>>>> @@ -108,7 +107,7 @@ String: The user id to log in with
>>>>>>
>>>>>> If null or null length no user id is sent.
>>>>>>
>>>>>> -### `srv:queryAuthPwd`
>>>>>> +#### `srv:queryAuthPwd`
>>>>>>
>>>>>> Sets the password for basic auth.
>>>>>>
>>>>>> @@ -116,13 +115,43 @@ String: The password to log in with.
>>>>>>
>>>>>> If null or null length no password is sent.
>>>>>>
>>>>>> -### srv:serviceContext
>>>>>> +#### `srv:serviceContext`
>>>>>> Provides a mechanism to override system context settings on a per URI
>>>>>> basis.
>>>>>>
>>>>>> The value is a `Map<String,Context>` where the map key is the URI of the
>>>>>> service endpoint, and the `Context` is a set of values to override the
>>>>>> default values.
>>>>>>
>>>>>> If a context is provided for the URI the system context is copied and
>>>>>> the URI specific values are then copied in. This ensures that any URI
>>>>>> specific settings will be used.
>>>>>>
>>>>>> +### Configuration for ARQ after version 3.1.0
>>>>>>
>>>>>> +Symbol | Usage | Default
>>>>>> +------ | ----- | -------
>>>>>> +`srv:queryTimeout` | Set timeouts | none
>>>>>> +`srv:queryCompression` | Enable use of deflation and GZip | true
>>>>>> +`srv:queryClient` | Enable use of a specific client | none
>>>>>> +`srv:queryContext` | Per-endpoint configuration | none
>>>>>> +
>>>>>> +#### `srv:queryTimeout`
>>>>>> +
>>>>>> +As documented above.
>>>>>> +
>>>>>> +
>>>>>> +#### `srv:queryCompression`
>>>>>> +
>>>>>> +Sets the flag for use of deflation and GZip.
>>>>>> +
>>>>>> +Boolean: True indicates that gzip compressed data is acceptable.
>>>>>> +
>>>>>> +#### `srv:queryClient`
>>>>>> +
>>>>>> +Enable use of a specific client
>>>>>> +
>>>>>> +Provides a slot for a specific [HttpClient][1] for use with a specific
>>>>>> `SERVICE`
>>>>>> +
>>>>>> +#### `srv:serviceContext`
>>>>>> +
>>>>>> +As documented above.
>>>>>>
>>>>>> [ARQ documentation index](index.html)
>>>>>> +
>>>>>> +[1]:
>>>>>> https://hc.apache.org/httpcomponents-client-ga/httpclient/apidocs/org/apache/http/client/HttpClient.html
>>>>>>
>>>>>>
>>>>>
>>>
>

--
Stian Soiland-Reyes
http://orcid.org/0000-0001-9842-9718
