rzepinskip opened a new issue, #19407:
URL: https://github.com/apache/druid/issues/19407

   ### Affected Version
   
   Apache Druid 36.0.0 
   
   ### Description
   
   When a Druid data server (Historical or Broker) is under load and rejects a 
`/druid/v2/` POST with HTTP 429 (via `LimitRequestsFilter`) or HTTP 503 (via 
Jetty's `QoSFilter` suspend timeout), the response body is **HTML**, not JSON, 
because Druid relies on Jetty's default `ErrorHandler`. The upstream Druid 
Broker/Router does not check `Content-Type` before parsing, so Jackson fails on 
the very first byte (`0x3c` = `<`). The user-visible error is then a generic 
`Unknown exception` HTTP 500 instead of a meaningful `Too Many Requests` / 
capacity-exceeded error.
   
   #### Cluster size
   
   Reproduced on a single-server `micro-quickstart` deployment (1× 
Coordinator+Overlord, 1× Broker, 1× Historical, 1× Router, 1× MiddleManager).
   
   #### Configurations in use
   
   Historical `runtime.properties` (intentionally constrained to force 
saturation; production-realistic values would be larger but the same code path 
triggers under sufficient load):
   
   ```properties
   druid.server.http.numThreads=4
   druid.server.http.enableRequestLimit=true
   druid.server.http.maxIdleTime=PT5S
   druid.processing.numThreads=1
   druid.processing.numMergeBuffers=2
   druid.processing.buffer.sizeBytes=64MiB
   ```
   
   Broker `runtime.properties`:
   
   ```properties
   druid.broker.http.numConnections=20
   druid.broker.http.readTimeout=PT2M
   druid.broker.http.unusedConnectionTimeout=PT4S
   druid.server.http.numThreads=20
   ```
   
   Debug logging on Broker / Historical:
   - `org.apache.druid.jetty.RequestLog=DEBUG`
   - `org.apache.druid.client.DirectDruidClient=DEBUG`
   - `org.apache.druid.client.JsonParserIterator=DEBUG`
   - `org.eclipse.jetty.server.handler.ErrorHandler=DEBUG`
   - `org.eclipse.jetty.util.thread.QueuedThreadPool=DEBUG`
   
   #### Steps to reproduce
   
   1. Apply the constrained Historical config above and restart the Historical.
   2. Drive concurrent load against the Router using a small set of queries 
on the `wikipedia` datasource (e.g., 1000 requests at concurrency 200).
   3. Observe HTTP 500 responses on the client containing `JsonParseException`, 
and matching HTTP 429 entries in the Historical request log.
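
The load in step 2 can be generated with any HTTP benchmarking tool. As an illustration only, here is a minimal Java 11+ sketch that sends 1000 requests at concurrency 200 and tallies the status codes. The class name `RouterLoadProbe`, the Router address `localhost:8888`, and the timeseries query with its interval are assumptions for this sketch, not the exact queries used in the reproduction.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;

/** Hypothetical load driver for the reproduction steps; not part of Druid. */
final class RouterLoadProbe {
    // Assumption: Router listening on the quickstart default address.
    static final String ROUTER_URL = "http://localhost:8888/druid/v2/";
    // Assumption: any cheap native query on `wikipedia` works; this one is illustrative.
    static final String QUERY =
        "{\"queryType\":\"timeseries\",\"dataSource\":\"wikipedia\","
        + "\"intervals\":[\"2015-09-12/2015-09-13\"],\"granularity\":\"all\","
        + "\"aggregations\":[{\"type\":\"count\",\"name\":\"rows\"}]}";

    /** Counts how often each HTTP status code was seen (-1 marks a transport failure). */
    static Map<Integer, Integer> tally(List<Integer> statuses) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int status : statuses) {
            counts.merge(status, 1, Integer::sum);
        }
        return counts;
    }

    /** Sends `total` POSTs with at most `concurrency` requests in flight. */
    static List<Integer> run(int total, int concurrency) throws InterruptedException {
        HttpClient client = HttpClient.newBuilder()
            .version(HttpClient.Version.HTTP_1_1)
            .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(ROUTER_URL))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(QUERY))
            .build();
        Semaphore permits = new Semaphore(concurrency);
        List<CompletableFuture<Integer>> inFlight = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            permits.acquire(); // blocks while `concurrency` requests are outstanding
            inFlight.add(client.sendAsync(request, HttpResponse.BodyHandlers.discarding())
                .thenApply(HttpResponse::statusCode)
                .exceptionally(t -> -1)
                .whenComplete((status, t) -> permits.release()));
        }
        List<Integer> statuses = new ArrayList<>();
        for (CompletableFuture<Integer> f : inFlight) {
            statuses.add(f.join());
        }
        return statuses;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(tally(run(1000, 200)));
    }
}
```

If the failure reproduces, the tally should contain HTTP 500 entries on the client side while the Historical request log records HTTP 429.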
   
   A direct probe against the Historical (or any inspection of 
`historical.log`) shows the HTML body:
   
   ```
   HTTP/1.1 429 Too Many Requests
   Content-Type: text/html;charset=ISO-8859-1
   
   <html>
   <head><meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
   <title>Error 429 Too Many Requests</title></head>
   <body><h2>HTTP ERROR 429 Too Many Requests</h2>
   ...
   </body></html>
   ```
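
For contrast, the sketch below renders a small JSON body for the same status, which is the kind of payload a custom error handler could write instead of the HTML page. The class `JsonErrorBody` and the field names `error` and `status` are hypothetical choices for illustration; they are not an existing Druid or Jetty format.

```java
/** Hypothetical renderer for a machine-readable error body; not a Druid or Jetty API. */
final class JsonErrorBody {
    /** Escapes the characters that may not appear raw inside a JSON string. */
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    /** Builds the body; the field names are assumptions made for this sketch. */
    static String render(int status, String reason) {
        return "{\"error\":\"" + escape(reason) + "\",\"status\":" + status + "}";
    }

    public static void main(String[] args) {
        System.out.println(render(429, "Too Many Requests"));
    }
}
```

A body like this starts with `{` rather than `<`, so a caller that expects JSON can at least parse it and surface the reason.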
   
   #### Error message / stack trace
   
   Broker:
   
   ```
   WARN [ForkJoinPool-1-worker-3] org.apache.druid.client.JsonParserIterator -
   Query [5710bfb8-ade6-4d13-8211-641c7e2f512f] to host [localhost:8083] interrupted
   
   com.fasterxml.jackson.core.JsonParseException: Invalid type marker byte 0x3c
   for expected value token
    at [Source: (SequenceInputStream); byte offset: #1]
       at com.fasterxml.jackson.core.base.ParserMinimalBase._reportUnexpectedChar(ParserMinimalBase.java:724)
       ...
       at org.apache.druid.client.JsonParserIterator.init(JsonParserIterator.java:216)
       at org.apache.druid.client.DirectDruidClient.run(DirectDruidClient.java:381)
   ```
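
The byte `0x3c` in the trace is the `<` that opens the HTML error page. As a rough sketch of how a client could detect this before the parser sees the stream, the hypothetical helper below peeks at the first byte without consuming it, using only `java.io.PushbackInputStream`. The class and method names are made up for illustration, and a `Content-Type` check is likely the more robust signal; sniffing is only a fallback.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical first-byte sniffer; not part of Druid. */
final class FirstBytePeek {
    /** Returns the first byte (0-255) without consuming it, or -1 if the stream is empty. */
    static int peek(PushbackInputStream in) {
        try {
            int b = in.read();
            if (b != -1) {
                in.unread(b); // push the byte back so the parser still sees the full body
            }
            return b;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True when the body starts with '<' (0x3c), as Jetty's HTML error page does. */
    static boolean looksLikeMarkup(PushbackInputStream in) {
        return peek(in) == '<';
    }

    public static void main(String[] args) {
        byte[] body = "<html><body>HTTP ERROR 429</body></html>"
            .getBytes(StandardCharsets.ISO_8859_1);
        PushbackInputStream in = new PushbackInputStream(new ByteArrayInputStream(body));
        System.out.println(looksLikeMarkup(in));
    }
}
```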
   
   Client (through Router):
   
   ```
   HTTP/1.1 500 Internal Server Error
   {"error":"Unknown exception",
    "errorClass":"com.fasterxml.jackson.core.JsonParseException",
    "host":"localhost:8083",
    "errorCode":"legacyQueryException",
    "persona":"OPERATOR",
    "category":"RUNTIME_FAILURE"}
   ```
   
   #### Debugging done
   
   Cross-checked against Druid sources:
   
   - `server/src/main/java/org/apache/druid/server/initialization/jetty/LimitRequestsFilter.java`
     emits `((HttpServletResponse) response).sendError(429, "Too Many Requests")`,
     which dispatches to Jetty's default `ErrorHandler`, producing HTML.
   - `server/src/main/java/org/apache/druid/server/initialization/jetty/JettyServerModule.java` (around line 469)
     only customizes the `ErrorHandler` when `druid.server.http.showDetailedJettyErrors=false`,
     and even then it merely clears `RequestDispatcher.ERROR_EXCEPTION` and delegates to
     `super.handle(...)`. The body rendering remains Jetty's default (HTML when no
     `Accept: application/json` header is sent, which is the case for `DirectDruidClient`).
   - `server/src/main/java/org/apache/druid/client/DirectDruidClient.java` does not check
     `Content-Type`. The `InputStream` is handed straight to
     `JsonParserIterator.init()` (`server/src/main/java/org/apache/druid/client/JsonParserIterator.java`),
     which calls `objectMapper.getFactory().createParser(is); jp.nextToken();`,
     exactly where Jackson trips over the leading `<`.
   - The `QoSFilter` registered in
     `services/src/main/java/org/apache/druid/cli/QueryJettyServerInitializer.java`
     produces HTTP 503 on suspend timeout via the same code path, so 503 responses can
     trigger the identical failure.
   - `docs/configuration/index.md` description of `druid.server.http.showDetailedJettyErrors`
     ("the JSON response only includes...") is misleading: the default error handler still
     performs `Accept`-header based content negotiation and falls back to HTML for clients
     that do not advertise `application/json`.
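
To make the missing `Content-Type` check concrete, here is a minimal sketch of the decision a client could take before handing the stream to the parser. `UpstreamResponseGuard` and its `Outcome` values are hypothetical names, not existing Druid classes, and the accepted media types (`application/json` and `application/x-jackson-smile`) are an assumption about what the data servers return for query results.

```java
import java.util.Locale;

/** Hypothetical pre-parse check; not part of DirectDruidClient. */
final class UpstreamResponseGuard {
    enum Outcome { PARSE_BODY, CAPACITY_EXCEEDED, UNEXPECTED_BODY }

    /** True for the media types this sketch assumes the query path can parse. */
    static boolean isParseable(String contentType) {
        if (contentType == null) {
            return false;
        }
        // Drop parameters such as ";charset=ISO-8859-1" before comparing.
        String mediaType = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return mediaType.equals("application/json")
            || mediaType.equals("application/x-jackson-smile");
    }

    /** Decides what to do with a response before any parser touches the body. */
    static Outcome classify(int status, String contentType) {
        if (isParseable(contentType)) {
            return Outcome.PARSE_BODY;
        }
        if (status == 429 || status == 503) {
            // HTML error page from an overloaded server: report it as such.
            return Outcome.CAPACITY_EXCEEDED;
        }
        return Outcome.UNEXPECTED_BODY;
    }

    public static void main(String[] args) {
        System.out.println(classify(429, "text/html;charset=ISO-8859-1"));
    }
}
```

With a check of this kind, the HTML 429 and 503 responses described above could be mapped to a capacity-related error instead of reaching Jackson and surfacing as `Unknown exception`.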


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

