bartek commented on code in PR #1737:
URL: https://github.com/apache/tika/pull/1737#discussion_r1583509595


##########
tika-pipes/tika-fetchers/tika-fetcher-http/src/main/java/org/apache/tika/pipes/fetcher/http/HttpFetcher.java:
##########
@@ -143,10 +143,24 @@ public InputStream fetch(String fetchKey, Metadata metadata) throws IOException,
                         .setMaxRedirects(maxRedirects)
                         .setRedirectsEnabled(true).build();
         get.setConfig(requestConfig);
-        if (! StringUtils.isBlank(userAgent)) {
+        setHttpRequestHeaders(metadata, get);
+        return execute(get, metadata, httpClient, true);
+    }
+
+    private void setHttpRequestHeaders(Metadata metadata, HttpGet get) {
+        if (!StringUtils.isBlank(userAgent)) {
             get.setHeader(USER_AGENT, userAgent);
         }
-        return execute(get, metadata, httpClient, true);
+        // additional http request headers can be sent in here.
+        String[] httpRequestHeaders = metadata.getValues("httpRequestHeaders");

Review Comment:
   Should the key here be `httpHeaders`, as in the http-fetcher schema? It looks
   like this code is reading from the user-provided `Metadata`. (I am embedding
   the schema here because I am not sure where it is hosted.)
   
   ```json
   {
       "$schema": "http://json-schema.org/draft-07/schema#",
       "type": "object",
       "properties": {
           "authScheme": {
               "type": "string"
           },
           "connectTimeout": {
               "type": "integer"
           },
           "httpHeaders": {
               "type": "array",
               "items": {
                   "type": "string"
               }
           },
           "jwtExpiresInSeconds": {
               "type": "integer"
           },
           "jwtIssuer": {
               "type": "string"
           },
           "jwtPrivateKeyBase64": {
               "type": "string"
           },
           "jwtSecret": {
               "type": "string"
           },
           "jwtSubject": {
               "type": "string"
           },
           "maxConnections": {
               "type": "integer"
           },
           "maxConnectionsPerRoute": {
               "type": "integer"
           },
           "maxErrMsgSize": {
               "type": "integer"
           },
           "maxRedirects": {
               "type": "integer"
           },
           "maxSpoolSize": {
               "type": "integer"
           },
           "ntDomain": {
               "type": "string"
           },
           "overallTimeout": {
               "type": "integer"
           },
           "password": {
               "type": "string"
           },
           "proxyHost": {
               "type": "string"
           },
           "proxyPort": {
               "type": "integer"
           },
           "requestTimeout": {
               "type": "integer"
           },
           "socketTimeout": {
               "type": "integer"
           },
           "userAgent": {
               "type": "string"
           },
           "userName": {
               "type": "string"
           }
       }
   }
   ```
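
To make the question concrete, here is a minimal sketch of how string-valued header entries read from metadata might be split before being set on a request. It is not the PR's implementation: the class name `HeaderSketch` and the helper `parseHeaders` are hypothetical, and the `Name: value` entry format is an assumption, since the schema only says that `httpHeaders` is an array of strings. Only the two key names come from the diff and the schema above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HeaderSketch {
    // Key read by the diff under review.
    static final String METADATA_KEY = "httpRequestHeaders";
    // Key declared in the http-fetcher schema.
    static final String SCHEMA_KEY = "httpHeaders";

    /**
     * Splits entries of the ASSUMED form "Name: value" into a name-to-value map.
     * Only the first colon separates name from value, so values may contain colons.
     * Entries without a colon or with an empty name are skipped.
     */
    static Map<String, String> parseHeaders(String[] rawHeaders) {
        Map<String, String> headers = new LinkedHashMap<>();
        if (rawHeaders == null) {
            return headers;
        }
        for (String raw : rawHeaders) {
            if (raw == null) {
                continue;
            }
            int idx = raw.indexOf(':');
            if (idx <= 0) {
                continue; // no colon, or nothing before it
            }
            String name = raw.substring(0, idx).trim();
            String value = raw.substring(idx + 1).trim();
            if (!name.isEmpty()) {
                headers.put(name, value);
            }
        }
        return headers;
    }

    public static void main(String[] args) {
        // Stand-in for the values that would be read from the metadata key.
        String[] fromMetadata = {
            "Accept: application/json",
            "X-Trace-Id: abc:123",
            "malformed"
        };
        parseHeaders(fromMetadata)
                .forEach((name, value) -> System.out.println(name + " -> " + value));
    }
}
```

Whichever key name is chosen, using the same constant for the config property and the metadata lookup would keep the two from drifting apart.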



