rzepinskip opened a new issue, #19407:
URL: https://github.com/apache/druid/issues/19407
### Affected Version
Apache Druid 36.0.0
### Description
When a Druid data server (Historical or Broker) is under load and rejects a
`/druid/v2/` POST with HTTP 429 (via `LimitRequestsFilter`) or HTTP 503 (via
Jetty's `QoSFilter` suspend timeout), the response body is **HTML**, not JSON,
because Druid relies on Jetty's default `ErrorHandler`. The upstream Druid
Broker/Router does not check `Content-Type` before parsing, so Jackson fails on
the very first byte (`0x3c` = `<`), and the user-visible error is a generic
`Unknown exception` HTTP 500 instead of a meaningful `Too Many Requests` /
capacity-exceeded error.
#### Cluster size
Reproduced on a single-server `micro-quickstart` deployment (1×
Coordinator+Overlord, 1× Broker, 1× Historical, 1× Router, 1× MiddleManager).
#### Configurations in use
Historical `runtime.properties` (intentionally constrained to force
saturation; production-realistic values would be larger but the same code path
triggers under sufficient load):
```properties
druid.server.http.numThreads=4
druid.server.http.enableRequestLimit=true
druid.server.http.maxIdleTime=PT5S
druid.processing.numThreads=1
druid.processing.numMergeBuffers=2
druid.processing.buffer.sizeBytes=64MiB
```
Broker `runtime.properties`:
```properties
druid.broker.http.numConnections=20
druid.broker.http.readTimeout=PT2M
druid.broker.http.unusedConnectionTimeout=PT4S
druid.server.http.numThreads=20
```
Debug logging on Broker / Historical:
- `org.apache.druid.jetty.RequestLog=DEBUG`
- `org.apache.druid.client.DirectDruidClient=DEBUG`
- `org.apache.druid.client.JsonParserIterator=DEBUG`
- `org.eclipse.jetty.server.handler.ErrorHandler=DEBUG`
- `org.eclipse.jetty.util.thread.QueuedThreadPool=DEBUG`
#### Steps to reproduce
1. Apply the constrained Historical config above and restart the Historical.
2. Drive concurrent load against the Router using a small set of queries
on the `wikipedia` datasource (e.g., 1000 requests at concurrency 200).
3. Observe HTTP 500 responses on the client containing `JsonParseException`,
and matching HTTP 429 in the Historical request log.
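
For anyone without a load tool at hand, step 2 could be approximated with a small driver like the one below. This is only a sketch under assumptions that are not part of the original report: the Router is assumed to listen on `localhost:8888` (the quickstart default), and the timeseries query, the class name `RouterLoadSketch`, and its helper methods are illustrative placeholders.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class RouterLoadSketch {
    // Assumption: quickstart Router address; adjust to your deployment.
    static final String ROUTER_URL = "http://localhost:8888/druid/v2/";
    // Assumption: any cheap native query on `wikipedia` works; this one is illustrative.
    static final String QUERY =
        "{\"queryType\":\"timeseries\",\"dataSource\":\"wikipedia\","
        + "\"intervals\":[\"1000/3000\"],\"granularity\":\"all\","
        + "\"aggregations\":[{\"type\":\"count\",\"name\":\"rows\"}]}";

    // Builds a JSON POST for the native query endpoint.
    static HttpRequest buildRequest(String url, String json) {
        return HttpRequest.newBuilder(URI.create(url))
            .timeout(Duration.ofMinutes(2))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(json))
            .build();
    }

    // Counts how often each HTTP status (or -1 for a client-side failure) occurred.
    static Map<Integer, Integer> tally(List<Integer> statuses) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int status : statuses) {
            counts.merge(status, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        int total = 1000;       // request count from the steps above
        int concurrency = 200;  // concurrency from the steps above
        HttpClient client = HttpClient.newHttpClient();
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        Callable<Integer> task = () -> client
            .send(buildRequest(ROUTER_URL, QUERY), HttpResponse.BodyHandlers.discarding())
            .statusCode();
        List<Future<Integer>> futures = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            futures.add(pool.submit(task));
        }
        List<Integer> statuses = new ArrayList<>();
        for (Future<Integer> future : futures) {
            try {
                statuses.add(future.get());
            } catch (Exception e) {
                statuses.add(-1); // timeout, connection reset, etc.
            }
        }
        pool.shutdown();
        System.out.println(tally(statuses));
    }
}
```

A non-zero count under key `500` in the printed map would correspond to the client-side symptom described in step 3.
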
A direct probe against the Historical (or any inspection of
`historical.log`) shows the HTML body:
```
HTTP/1.1 429 Too Many Requests
Content-Type: text/html;charset=ISO-8859-1

<html>
<head><meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 429 Too Many Requests</title></head>
<body><h2>HTTP ERROR 429 Too Many Requests</h2>
...
</body></html>
```
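
The direct probe can be scripted as well. The sketch below assumes the Historical is reachable at `localhost:8083` (the host shown in the Broker log) and is already saturated by the load from step 2; the query and the names `HistoricalProbeSketch` and `firstByteHex` are hypothetical. It prints the status, the `Content-Type`, and the first body byte, which is the byte Jackson later reports.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class HistoricalProbeSketch {
    // Assumption: Historical address, taken from the host in the Broker log.
    static final String HISTORICAL_URL = "http://localhost:8083/druid/v2/";
    // Assumption: illustrative query; any query on `wikipedia` would do.
    static final String QUERY =
        "{\"queryType\":\"timeseries\",\"dataSource\":\"wikipedia\","
        + "\"intervals\":[\"1000/3000\"],\"granularity\":\"all\","
        + "\"aggregations\":[{\"type\":\"count\",\"name\":\"rows\"}]}";

    // Formats the first body byte the way the Jackson message does (e.g. 0x3c for '<').
    static String firstByteHex(byte[] body) {
        return body.length == 0 ? "(empty)" : String.format("0x%02x", body[0] & 0xff);
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(HISTORICAL_URL))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(QUERY))
            .build();
        HttpResponse<byte[]> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofByteArray());
        System.out.println("status       = " + response.statusCode());
        System.out.println("content-type = "
            + response.headers().firstValue("Content-Type").orElse("(none)"));
        System.out.println("first byte   = " + firstByteHex(response.body()));
    }
}
```
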
#### Error message / stack trace
Broker:
```
WARN [ForkJoinPool-1-worker-3] org.apache.druid.client.JsonParserIterator - Query [5710bfb8-ade6-4d13-8211-641c7e2f512f] to host [localhost:8083] interrupted
com.fasterxml.jackson.core.JsonParseException: Invalid type marker byte 0x3c for expected value token
 at [Source: (SequenceInputStream); byte offset: #1]
    at com.fasterxml.jackson.core.base.ParserMinimalBase._reportUnexpectedChar(ParserMinimalBase.java:724)
    ...
    at org.apache.druid.client.JsonParserIterator.init(JsonParserIterator.java:216)
    at org.apache.druid.client.DirectDruidClient.run(DirectDruidClient.java:381)
```
Client (through Router):
```
HTTP/1.1 500 Internal Server Error
{"error":"Unknown exception",
"errorClass":"com.fasterxml.jackson.core.JsonParseException",
"host":"localhost:8083",
"errorCode":"legacyQueryException",
"persona":"OPERATOR",
"category":"RUNTIME_FAILURE"}
```
#### Debugging done
Cross-checked against Druid sources:
- `server/src/main/java/org/apache/druid/server/initialization/jetty/LimitRequestsFilter.java`
  emits `((HttpServletResponse) response).sendError(429, "Too Many Requests")`,
  which dispatches to Jetty's default `ErrorHandler` and therefore produces HTML.
- `server/src/main/java/org/apache/druid/server/initialization/jetty/JettyServerModule.java`
  (around line 469) only customizes the `ErrorHandler` when
  `druid.server.http.showDetailedJettyErrors=false`, and even then it merely
  clears `RequestDispatcher.ERROR_EXCEPTION` and delegates to
  `super.handle(...)`. The body rendering remains Jetty's default (HTML when no
  `Accept: application/json` header is sent, which is the case for
  `DirectDruidClient`).
- `server/src/main/java/org/apache/druid/client/DirectDruidClient.java` does
  not check `Content-Type`. The `InputStream` is handed straight to
  `JsonParserIterator.init()`
  (`server/src/main/java/org/apache/druid/client/JsonParserIterator.java`),
  which calls `objectMapper.getFactory().createParser(is); jp.nextToken();`,
  exactly where Jackson trips over the leading `<`.
- The `QoSFilter` registered in
  `services/src/main/java/org/apache/druid/cli/QueryJettyServerInitializer.java`
  produces HTTP 503 on suspend timeout via the same code path, so 503 responses
  can trigger the identical failure.
- The `docs/configuration/index.md` description of
  `druid.server.http.showDetailedJettyErrors` ("the JSON response only
  includes...") is misleading: the default error handler still performs
  `Accept`-header based content negotiation and falls back to HTML for clients
  that do not advertise `application/json`.
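
To illustrate the missing check, here is a minimal sketch of a guard that could run before the response stream reaches the JSON parser. It is not Druid's actual code: the class `UpstreamResponseGuardSketch`, its methods, and the error messages are hypothetical, and treating `application/x-jackson-smile` as a parseable type is an assumption about the encoding used between Druid servers.

```java
import java.util.Locale;
import java.util.Optional;

final class UpstreamResponseGuardSketch {
    // True when the media type (parameters such as charset ignored) is one the
    // parser is expected to handle. The Smile type is an assumption.
    static boolean isParseable(String contentType) {
        if (contentType == null) {
            return false;
        }
        String mediaType = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return mediaType.equals("application/json")
            || mediaType.equals("application/x-jackson-smile");
    }

    // Returns a reason to reject the response, or empty when it is safe to parse.
    static Optional<String> rejectReason(int status, String contentType) {
        if (isParseable(contentType)) {
            return Optional.empty(); // leave the body to the existing parser
        }
        switch (status) {
            case 429:
                return Optional.of("Too Many Requests (HTTP 429) from data server");
            case 503:
                return Optional.of("Service Unavailable (HTTP 503) from data server");
            default:
                return Optional.of(
                    "Unexpected Content-Type [" + contentType + "] with HTTP " + status);
        }
    }

    public static void main(String[] args) {
        // The HTML error response shown earlier in this report.
        System.out.println(rejectReason(429, "text/html;charset=ISO-8859-1"));
        // A normal JSON response.
        System.out.println(rejectReason(200, "application/json"));
    }
}
```

With a check of this kind, the 429 and 503 cases could surface to the user as capacity errors instead of a `JsonParseException` wrapped in HTTP 500.
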