Copilot commented on code in PR #2894:
URL: https://github.com/apache/tika/pull/2894#discussion_r3421938137
##########
tika-server/tika-server-core/pom.xml:
##########
@@ -95,7 +95,12 @@
</dependency>
<dependency>
<groupId>org.eclipse.jetty.http2</groupId>
- <artifactId>http2-server</artifactId>
+ <artifactId>jetty-http2-server</artifactId>
+ </dependency>
+ <dependency>
+ <groupId>jakarta.servlet</groupId>
+ <artifactId>jakarta.servlet-api</artifactId>
+ <version>6.0.0</version>
</dependency>
Review Comment:
`jakarta.servlet-api` is pinned to an explicit version here, while the rest
of this POM relies on parent dependencyManagement. To avoid version drift
(especially across the Jetty/CXF upgrade), it would be better to manage the
servlet API version in `tika-parent/pom.xml` and omit the version here so all
modules stay aligned.
##########
tika-pipes/tika-pipes-plugins/tika-pipes-solr/src/main/java/org/apache/tika/pipes/emitter/solr/SolrEmitter.java:
##########
@@ -108,26 +109,39 @@ private static SolrClient
buildSolrClient(SolrEmitterConfig config) throws TikaC
if (config.solrUrls() == null || config.solrUrls().isEmpty()) {
// Use ZooKeeper-based CloudSolrClient
- Http2SolrClient.Builder http2SolrClientBuilder = new
Http2SolrClient.Builder();
+ HttpJettySolrClient.Builder jettyClientBuilder = new
HttpJettySolrClient.Builder();
if (!StringUtils.isBlank(httpClientFactory.getUserName())) {
-
http2SolrClientBuilder.withBasicAuthCredentials(httpClientFactory.getUserName(),
httpClientFactory.getPassword());
+
jettyClientBuilder.withBasicAuthCredentials(httpClientFactory.getUserName(),
httpClientFactory.getPassword());
}
- http2SolrClientBuilder
+ if (!StringUtils.isBlank(config.proxyHost())) {
+ jettyClientBuilder.withProxyConfiguration(config.proxyHost(),
+ config.proxyPort() != null ? config.proxyPort() : 0,
false, false);
+ }
Review Comment:
This proxy configuration passes `0` when `proxyPort` is null. `0` is
typically not a valid proxy port and is inconsistent with the earlier logic
that only sets a proxy port when it’s > 0. Consider failing fast when
`proxyHost` is set without a valid `proxyPort` (or skip configuring the proxy).
##########
tika-pipes/tika-pipes-plugins/tika-pipes-solr/src/main/java/org/apache/tika/pipes/emitter/solr/SolrEmitter.java:
##########
@@ -108,26 +109,39 @@ private static SolrClient
buildSolrClient(SolrEmitterConfig config) throws TikaC
if (config.solrUrls() == null || config.solrUrls().isEmpty()) {
// Use ZooKeeper-based CloudSolrClient
- Http2SolrClient.Builder http2SolrClientBuilder = new
Http2SolrClient.Builder();
+ HttpJettySolrClient.Builder jettyClientBuilder = new
HttpJettySolrClient.Builder();
if (!StringUtils.isBlank(httpClientFactory.getUserName())) {
-
http2SolrClientBuilder.withBasicAuthCredentials(httpClientFactory.getUserName(),
httpClientFactory.getPassword());
+
jettyClientBuilder.withBasicAuthCredentials(httpClientFactory.getUserName(),
httpClientFactory.getPassword());
}
- http2SolrClientBuilder
+ if (!StringUtils.isBlank(config.proxyHost())) {
+ jettyClientBuilder.withProxyConfiguration(config.proxyHost(),
+ config.proxyPort() != null ? config.proxyPort() : 0,
false, false);
+ }
+ jettyClientBuilder
.withRequestTimeout(httpClientFactory.getRequestTimeoutMillis(),
TimeUnit.MILLISECONDS)
.withConnectionTimeout(config.getConnectionTimeoutMillisOrDefault(),
TimeUnit.MILLISECONDS);
- Http2SolrClient http2SolrClient = http2SolrClientBuilder.build();
return new CloudSolrClient.Builder(config.solrZkHosts(),
Optional.ofNullable(config.solrZkChroot()))
- .withHttpClient(http2SolrClient)
+ .withInternalClientBuilder(jettyClientBuilder)
.build();
} else {
- // Use direct URL-based LBHttpSolrClient
- return new LBHttpSolrClient.Builder()
+ // Use direct URL-based LBJettySolrClient
+ HttpJettySolrClient.Builder jettyClientBuilder = new
HttpJettySolrClient.Builder();
+ if (!StringUtils.isBlank(httpClientFactory.getUserName())) {
+
jettyClientBuilder.withBasicAuthCredentials(httpClientFactory.getUserName(),
httpClientFactory.getPassword());
+ }
+ if (!StringUtils.isBlank(config.proxyHost())) {
+ jettyClientBuilder.withProxyConfiguration(config.proxyHost(),
+ config.proxyPort() != null ? config.proxyPort() : 0,
false, false);
+ }
Review Comment:
Same proxy-port issue as above: this block passes `0` when `proxyPort` is
null. Prefer rejecting `proxyHost` without a valid (positive) `proxyPort` to
avoid configuring an invalid proxy.
##########
tika-pipes/tika-pipes-plugins/tika-pipes-solr/src/main/java/org/apache/tika/pipes/iterator/solr/SolrPipesIterator.java:
##########
@@ -180,27 +181,37 @@ private SolrClient createSolrClient() throws
TikaConfigException {
List<String> solrZkHosts = config.getSolrZkHosts() != null ?
config.getSolrZkHosts() : Collections.emptyList();
if (solrUrls.isEmpty()) {
- //TODO -- there's more that we need to pass through, including ssl
etc.
- Http2SolrClient.Builder http2SolrClientBuilder = new
Http2SolrClient.Builder();
+ HttpJettySolrClient.Builder jettyClientBuilder = new
HttpJettySolrClient.Builder();
if (!StringUtils.isBlank(httpClientFactory.getUserName())) {
-
http2SolrClientBuilder.withBasicAuthCredentials(httpClientFactory.getUserName(),
httpClientFactory.getPassword());
+
jettyClientBuilder.withBasicAuthCredentials(httpClientFactory.getUserName(),
httpClientFactory.getPassword());
}
Review Comment:
`HttpClientFactory` supports non-basic auth schemes (e.g. NTLM), but this
always configures Basic Auth when a username is present. That can lead to
confusing behavior if `authScheme` is set to something else. Prefer failing
fast for non-basic schemes (or implement the equivalent Jetty client auth).
##########
tika-pipes/tika-pipes-plugins/tika-pipes-solr/src/main/java/org/apache/tika/pipes/emitter/solr/SolrEmitter.java:
##########
@@ -108,26 +109,39 @@ private static SolrClient
buildSolrClient(SolrEmitterConfig config) throws TikaC
if (config.solrUrls() == null || config.solrUrls().isEmpty()) {
// Use ZooKeeper-based CloudSolrClient
- Http2SolrClient.Builder http2SolrClientBuilder = new
Http2SolrClient.Builder();
+ HttpJettySolrClient.Builder jettyClientBuilder = new
HttpJettySolrClient.Builder();
if (!StringUtils.isBlank(httpClientFactory.getUserName())) {
-
http2SolrClientBuilder.withBasicAuthCredentials(httpClientFactory.getUserName(),
httpClientFactory.getPassword());
+
jettyClientBuilder.withBasicAuthCredentials(httpClientFactory.getUserName(),
httpClientFactory.getPassword());
}
- http2SolrClientBuilder
+ if (!StringUtils.isBlank(config.proxyHost())) {
+ jettyClientBuilder.withProxyConfiguration(config.proxyHost(),
+ config.proxyPort() != null ? config.proxyPort() : 0,
false, false);
+ }
+ jettyClientBuilder
.withRequestTimeout(httpClientFactory.getRequestTimeoutMillis(),
TimeUnit.MILLISECONDS)
.withConnectionTimeout(config.getConnectionTimeoutMillisOrDefault(),
TimeUnit.MILLISECONDS);
- Http2SolrClient http2SolrClient = http2SolrClientBuilder.build();
return new CloudSolrClient.Builder(config.solrZkHosts(),
Optional.ofNullable(config.solrZkChroot()))
- .withHttpClient(http2SolrClient)
+ .withInternalClientBuilder(jettyClientBuilder)
.build();
} else {
- // Use direct URL-based LBHttpSolrClient
- return new LBHttpSolrClient.Builder()
+ // Use direct URL-based LBJettySolrClient
+ HttpJettySolrClient.Builder jettyClientBuilder = new
HttpJettySolrClient.Builder();
Review Comment:
The PR description says the affected Solr call sites are marked with `TODO`
comments for the follow-up (Jetty-native proxy/auth rework), but there isn’t a
TODO here anymore. Either add a short TODO (so it’s discoverable in code) or
update the PR description to match reality.
##########
tika-pipes/tika-pipes-plugins/tika-pipes-solr/src/main/java/org/apache/tika/pipes/emitter/solr/SolrEmitter.java:
##########
@@ -108,26 +109,39 @@ private static SolrClient
buildSolrClient(SolrEmitterConfig config) throws TikaC
if (config.solrUrls() == null || config.solrUrls().isEmpty()) {
// Use ZooKeeper-based CloudSolrClient
- Http2SolrClient.Builder http2SolrClientBuilder = new
Http2SolrClient.Builder();
+ HttpJettySolrClient.Builder jettyClientBuilder = new
HttpJettySolrClient.Builder();
if (!StringUtils.isBlank(httpClientFactory.getUserName())) {
-
http2SolrClientBuilder.withBasicAuthCredentials(httpClientFactory.getUserName(),
httpClientFactory.getPassword());
+
jettyClientBuilder.withBasicAuthCredentials(httpClientFactory.getUserName(),
httpClientFactory.getPassword());
}
Review Comment:
`HttpClientFactory` supports both `basic` and `ntlm` auth schemes, but this
code always configures Basic Auth when a username is present. That can silently
mis-handle configs that set `authScheme=ntlm`. If Jetty/SolrJ’s
`HttpJettySolrClient` doesn’t support NTLM, fail fast with a clear config error
instead of proceeding with Basic Auth.
##########
tika-pipes/tika-pipes-plugins/tika-pipes-solr/src/main/java/org/apache/tika/pipes/iterator/solr/SolrPipesIterator.java:
##########
@@ -180,27 +181,37 @@ private SolrClient createSolrClient() throws
TikaConfigException {
List<String> solrZkHosts = config.getSolrZkHosts() != null ?
config.getSolrZkHosts() : Collections.emptyList();
if (solrUrls.isEmpty()) {
- //TODO -- there's more that we need to pass through, including ssl
etc.
- Http2SolrClient.Builder http2SolrClientBuilder = new
Http2SolrClient.Builder();
+ HttpJettySolrClient.Builder jettyClientBuilder = new
HttpJettySolrClient.Builder();
if (!StringUtils.isBlank(httpClientFactory.getUserName())) {
-
http2SolrClientBuilder.withBasicAuthCredentials(httpClientFactory.getUserName(),
httpClientFactory.getPassword());
+
jettyClientBuilder.withBasicAuthCredentials(httpClientFactory.getUserName(),
httpClientFactory.getPassword());
}
- http2SolrClientBuilder
+ if (!StringUtils.isBlank(httpClientFactory.getProxyHost())) {
+
jettyClientBuilder.withProxyConfiguration(httpClientFactory.getProxyHost(),
+ httpClientFactory.getProxyPort(), false, false);
+ }
+ jettyClientBuilder
.withRequestTimeout(httpClientFactory.getRequestTimeoutMillis(),
TimeUnit.MILLISECONDS)
.withConnectionTimeout(config.getConnectionTimeoutMillis(),
TimeUnit.MILLISECONDS);
-
- Http2SolrClient http2SolrClient = http2SolrClientBuilder.build();
return new CloudSolrClient.Builder(solrZkHosts,
Optional.ofNullable(config.getSolrZkChroot()))
- .withHttpClient(http2SolrClient)
+ .withInternalClientBuilder(jettyClientBuilder)
.build();
-
}
- return new LBHttpSolrClient.Builder()
- .withConnectionTimeout(config.getConnectionTimeoutMillis())
- .withSocketTimeout(config.getSocketTimeoutMillis())
- .withHttpClient(httpClientFactory.build())
- .withBaseSolrUrls(solrUrls.toArray(new String[]{}))
- .build();
+ HttpJettySolrClient.Builder jettyClientBuilder = new
HttpJettySolrClient.Builder();
+ if (!StringUtils.isBlank(httpClientFactory.getUserName())) {
+
jettyClientBuilder.withBasicAuthCredentials(httpClientFactory.getUserName(),
httpClientFactory.getPassword());
+ }
+ if (!StringUtils.isBlank(httpClientFactory.getProxyHost())) {
+
jettyClientBuilder.withProxyConfiguration(httpClientFactory.getProxyHost(),
+ httpClientFactory.getProxyPort(), false, false);
+ }
Review Comment:
Same potential invalid proxy setup in the LB client path:
`httpClientFactory.getProxyPort()` can be `0` unless explicitly configured, but
the proxy is still applied whenever `proxyHost` is non-blank. Consider
requiring a positive proxy port when enabling the proxy.
##########
tika-pipes/tika-pipes-plugins/tika-pipes-solr/src/main/java/org/apache/tika/pipes/iterator/solr/SolrPipesIterator.java:
##########
@@ -180,27 +181,37 @@ private SolrClient createSolrClient() throws
TikaConfigException {
List<String> solrZkHosts = config.getSolrZkHosts() != null ?
config.getSolrZkHosts() : Collections.emptyList();
if (solrUrls.isEmpty()) {
- //TODO -- there's more that we need to pass through, including ssl
etc.
- Http2SolrClient.Builder http2SolrClientBuilder = new
Http2SolrClient.Builder();
+ HttpJettySolrClient.Builder jettyClientBuilder = new
HttpJettySolrClient.Builder();
if (!StringUtils.isBlank(httpClientFactory.getUserName())) {
-
http2SolrClientBuilder.withBasicAuthCredentials(httpClientFactory.getUserName(),
httpClientFactory.getPassword());
+
jettyClientBuilder.withBasicAuthCredentials(httpClientFactory.getUserName(),
httpClientFactory.getPassword());
}
- http2SolrClientBuilder
+ if (!StringUtils.isBlank(httpClientFactory.getProxyHost())) {
+
jettyClientBuilder.withProxyConfiguration(httpClientFactory.getProxyHost(),
+ httpClientFactory.getProxyPort(), false, false);
+ }
Review Comment:
This proxy configuration uses `httpClientFactory.getProxyPort()`, which will
be `0` unless explicitly set. Because `configure()` only sets the proxy port
when it’s > 0, `proxyHost` can be set with an effective port of `0` here. It’s
safer to reject `proxyHost` without a valid port (or skip proxy config) to
avoid an invalid proxy setup.
##########
tika-parent/pom.xml:
##########
@@ -1361,17 +1359,11 @@
<version>3.2.0</version>
<configuration>
<excludeCoordinates>
- <!-- solr emitter -->
- <coordinate>
- <groupId>org.eclipse.jetty</groupId>
- <artifactId>jetty-http</artifactId>
- <version>11.0.26</version>
- </coordinate>
- <!-- [CVE-2025-1948], used in the Solr emitter. No apparent
upgrade available yet. -->
+ <!-- [CVE-2025-1948] check if still present in 12.0.35;
CVE-2026-2332 is fixed. -->
<coordinate>
<groupId>org.eclipse.jetty.http2</groupId>
- <artifactId>http2-common</artifactId>
- <version>11.0.26</version>
+ <artifactId>jetty-http2-common</artifactId>
+ <version>12.0.35</version>
</coordinate>
Review Comment:
The OSS Index exclusion hard-codes the Jetty HTTP2 version (`12.0.35`).
Since this PR already centralizes Jetty versions in properties, hard-coding
here makes future upgrades easier to miss. Prefer referencing
`${jetty.http2.version}` so the exclusion stays in sync with the actual
dependency version.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]