wenzhenghu opened a new issue, #63325:
URL: https://github.com/apache/doris/issues/63325

   ## Problem Description
   
   When a low-level Python HTTP client performs Stream Load through FE, the request against Doris `master` may fail with connection errors such as:
   
   - `BrokenPipeError(32, 'Broken pipe')`
   - `ConnectionResetError(54, 'Connection reset by peer')`
   
   The same client behavior against Doris `3.0` FE can still receive a normal 
`307 Temporary Redirect` response without triggering the connection error.
   
   This indicates a compatibility regression in the FE Stream Load redirect 
path compared with Doris `3.0`.
   
   ## Impact
   
   This problem mainly affects clients that:
   
   - send Stream Load to FE instead of BE
   - use `HTTP/1.1`
   - start sending request body before fully processing the FE redirect
   - use chunked transfer or generator-based streaming
   - run in higher RTT environments
   
   Typical affected clients include:
   
   - Python `http.client`
   - Python `requests` in some streaming modes
   - Logstash HTTP-based output plugins
   
   ## Reproduction
   
   ### Target Instances
   
   Two Doris instances were used for comparison on the same host:
   
   #### Doris 3.0
   - MySQL: `172.16.0.90:9030`
   - FE HTTP: `172.16.0.90:8030`
   - BE HTTP: `172.16.0.90:8040`
   - Version: `doris-3.0.8-rc01-53c80683e85`
   
   #### Doris master
   - MySQL: `172.16.0.90:9034`
   - FE HTTP: `172.16.0.90:8034`
   - BE HTTP: `172.16.0.90:8044`
   - Version: `doris-0.0.0-b06684a15d5`
   
   ### Reproduction Script
   
   A Python reproduction script was prepared using `http.client` and chunked upload to simulate a client that keeps sending the request body while FE responds with a redirect:
   
   - client: Python `http.client`
   - request type: `PUT /api/{db}/{table}/_stream_load`
   - transfer mode: `Transfer-Encoding: chunked`
   - header: `Expect: 100-continue`
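
A minimal sketch of a reproduction along these lines is shown below. It is not the original script: the function names `paced_chunks` and `stream_load_via_fe`, the `x` filler payload, the chunk size, the pacing delay, and the 30-second timeout are all assumptions made for illustration. The result dictionary only mirrors the field names seen in the outputs reported in this issue.

```python
import base64
import http.client
import time


def paced_chunks(total_bytes, chunk_size=8192, delay_seconds=0.0):
    """Yield filler chunks, optionally sleeping between them to pace the upload."""
    sent = 0
    while sent < total_bytes:
        size = min(chunk_size, total_bytes - sent)
        # Filler bytes only (assumption): the issue states FE redirects
        # without consuming the request body.
        yield b"x" * size
        sent += size
        if delay_seconds:
            time.sleep(delay_seconds)


def stream_load_via_fe(host, port, db, table, body, user="root", password=""):
    """PUT a chunked body to the FE Stream Load endpoint and report what happened."""
    path = f"/api/{db}/{table}/_stream_load"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {
        "Authorization": f"Basic {token}",
        "Expect": "100-continue",
        "Transfer-Encoding": "chunked",
    }
    result = {"target": "fe", "url": f"http://{host}:{port}{path}", "client": "httpclient"}
    started = time.monotonic()
    conn = http.client.HTTPConnection(host, port, timeout=30)
    try:
        # encode_chunked=True makes http.client frame each yielded piece as one chunk.
        conn.request("PUT", path, body=body, headers=headers, encode_chunked=True)
        resp = conn.getresponse()
        result["status_code"] = resp.status
        result["headers"] = dict(resp.getheaders())
        result["body"] = resp.read().decode("utf-8", "replace")
    except (OSError, http.client.HTTPException) as exc:
        # BrokenPipeError and ConnectionResetError are both OSError subclasses.
        result["exception_type"] = type(exc).__name__
        result["exception"] = repr(exc)
    finally:
        conn.close()
        result["elapsed_seconds"] = round(time.monotonic() - started, 3)
    return result
```

Under these assumptions, a call such as `stream_load_via_fe("172.16.0.90", 8034, "wzhtest", "stream_load_redirect_repro", paced_chunks(64 * 1024 * 1024, delay_seconds=0.01))` would target the `master` FE listed above. The payload size and delay here are arbitrary.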
   
   ### Reproduction Result on Doris 3.0
   
   Request to FE:
   
   ```text
   http://172.16.0.90:8030/api/wzh/stream_load_redirect_repro/_stream_load
   ```
   
   Result:
   
   ```json
   {
     "target": "fe",
     "url": "http://172.16.0.90:8030/api/wzh/stream_load_redirect_repro/_stream_load",
     "client": "httpclient",
     "status_code": 307,
     "elapsed_seconds": 18.205,
     "headers": {
       "Location": "http://root:@172.16.0.90:8040/api/wzh/stream_load_redirect_repro/_stream_load?",
       "Connection": "close"
     },
     "body": ""
   }
   ```
   
   ### Reproduction Result on Doris master
   
   Request to FE:
   
   ```text
   http://172.16.0.90:8034/api/wzhtest/stream_load_redirect_repro/_stream_load
   ```
   
   Result:
   
   ```json
   {
     "target": "fe",
     "url": "http://172.16.0.90:8034/api/wzhtest/stream_load_redirect_repro/_stream_load",
     "client": "httpclient",
     "elapsed_seconds": 0.605,
     "exception_type": "ConnectionResetError",
     "exception": "ConnectionResetError(54, 'Connection reset by peer')"
   }
   ```
   
   In another run with a larger payload and paced chunk sending, Doris master 
also reproduced:
   
   ```json
   {
     "exception_type": "BrokenPipeError",
     "exception": "BrokenPipeError(32, 'Broken pipe')"
   }
   ```
   
   ## Comparison Between Doris 3.0 and master
   
   Using the same host, same network, same Python client style, and same 
reproduction approach:
   
   - Doris `3.0` FE returns a normal `307 Temporary Redirect`
   - Doris `master` FE closes or resets the connection before the client reads a response
   - This strongly suggests a compatibility regression in the FE Stream Load redirect path
   
   ## Root Cause Analysis
   
   The current FE Stream Load path on `master` still behaves as:
   
   1. validate request
   2. select target BE
   3. immediately return `307 Temporary Redirect`
   4. do not consume the request body
   
   At the same time, FE is now running on:
   
   - `Spring Boot 3`
   - `Spring Framework 6`
   - `Jetty 12`
   
   Jetty 12 is stricter about the `HTTP/1.1` case in which the application returns a response before consuming the request body.
   If the client is still sending body data after FE has sent the redirect and closed the connection, the client may observe:
   
   - `broken pipe`
   - `connection reset by peer`
   
   This matches the observed difference between Doris `3.0` and `master`.
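
The "respond before consuming the body" pattern can be imitated with a toy server, sketched below under stated assumptions. This is not Doris FE or Jetty code: `redirect_without_draining` and `start_toy_fe` are hypothetical names, and the server only reads the request head, writes a `307`, and closes. Whether a client that is still uploading then sees a broken pipe, a connection reset, or the `307` itself depends on timing and on the operating system, so the sketch illustrates the mechanism rather than reproducing the regression.

```python
import socket
import threading


def redirect_without_draining(listener, location):
    """Accept one connection, read only the request head, answer 307, close."""
    conn, _ = listener.accept()
    listener.close()
    with conn:
        head = b""
        while b"\r\n\r\n" not in head:
            data = conn.recv(4096)
            if not data:
                break
            head += data
        response = (
            "HTTP/1.1 307 Temporary Redirect\r\n"
            f"Location: {location}\r\n"
            "Connection: close\r\n"
            "Content-Length: 0\r\n\r\n"
        )
        conn.sendall(response.encode("ascii"))
        # Leaving the with-block closes the socket without reading any body
        # bytes the client may still be sending.


def start_toy_fe(location):
    """Start the toy server on an ephemeral local port; return (port, thread)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    thread = threading.Thread(
        target=redirect_without_draining, args=(listener, location), daemon=True
    )
    thread.start()
    return listener.getsockname()[1], thread
```

A client that sends only the request head and then reads should receive the `307` cleanly. The connection errors become possible only once the client keeps writing body data after the server has closed.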
   
   ## Proposed Fix
   
   Two service-side improvements are proposed:
   
   ### 1. Expose Jetty unconsumed request reads as FE config
   
   Expose Jetty's:
   
   - `maxUnconsumedRequestContentReads`
   
   as an FE configuration, for example:
   
   - `jetty_server_max_unconsumed_request_content_reads`
   
   and apply it to FE `HttpConfiguration`.
   
   ### 2. Add bounded drain compatibility logic for Stream Load redirect
   
   On FE Stream Load redirect path:
   
   - keep FE -> BE redirect architecture unchanged
   - do not proxy the full request body through FE
   - after writing `307`, drain/discard a bounded amount of remaining request 
body
   - control it through an FE config, for example:
     - `stream_load_redirect_bounded_drain_max_bytes`
   
   This should improve compatibility with older/general-purpose HTTP clients 
while preserving the current FE/BE architecture.
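
The bounded drain idea can be sketched as follows. The sketch is in Python for consistency with the reproduction script, whereas the real change would live in the FE's Java code. `drain_bounded` and its parameters are hypothetical names, and `max_bytes` stands in for the proposed `stream_load_redirect_bounded_drain_max_bytes` setting.

```python
import io


def drain_bounded(stream, max_bytes, buffer_size=8192):
    """Read and discard up to max_bytes from stream.

    Returns (drained, reached_eof). reached_eof is False when the limit
    was hit before end of stream was observed.
    """
    drained = 0
    while drained < max_bytes:
        data = stream.read(min(buffer_size, max_bytes - drained))
        if not data:
            return drained, True
        drained += len(data)
    return drained, False


# Small body: fully discarded, so the connection could be closed cleanly.
small = drain_bounded(io.BytesIO(b"a" * 100), max_bytes=1024)

# Large body: draining stops at the limit instead of reading the whole upload.
large = drain_bounded(io.BytesIO(b"a" * 5000), max_bytes=1024, buffer_size=512)
```

The limit is what keeps this from turning into proxying: a small remaining body is absorbed, while a large upload is cut off after `max_bytes` and the client is expected to follow the redirect to BE.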
   
   ## Environment
   
   - macOS client
   - Python 3.14
   - PyMySQL + Python `http.client`
   - same host and same network path for both Doris instances
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

