re20052 opened a new issue, #64836:
URL: https://github.com/apache/doris/issues/64836

   ### Search before asking
   
   - [x] I have searched the 
[issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no 
similar issues.
   
   
   ### Version
   
   Apache Doris 4.0
   
   ### What's Wrong?
   
   BE crashes with SIGSEGV when a Stream Load combines `strict_mode=true` 
with a `columns` header in which **every target column is assigned via an 
expression** (no direct-mapped column) and at least one row yields NULL for 
a derived column.
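
To check whether a given `columns` header meets the trigger condition above, something like the following sketch could help. It is not Doris code: `all_targets_derived` is a hypothetical helper, it assumes that a bare name counts as direct-mapped only when it matches a table column, and it assumes expressions contain no commas and entries carry no surrounding whitespace.

```cpp
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Doris): returns true when no table column
// is direct-mapped, i.e. every target column is set by a `name=expr` entry.
// Assumption: expressions contain no commas, so splitting on ',' is enough.
bool all_targets_derived(const std::string& columns_header,
                         const std::set<std::string>& table_columns) {
    std::istringstream in(columns_header);
    std::string entry;
    while (std::getline(in, entry, ',')) {
        // An entry with '=' defines a derived column, never a direct mapping.
        if (entry.find('=') != std::string::npos) continue;
        // A bare name that matches a table column is direct-mapped.
        if (table_columns.count(entry) > 0) return false;
    }
    return true;
}
```

For the reproduction in this report, `all_targets_derived("c1,c2,k=c1,v=c2", {"k", "v"})` returns true, because `c1` and `c2` are only source-field names and both `k` and `v` are assigned through expressions.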
   
   ### Crash location
   
   `FileScanner::_convert_to_output_block()` in 
`be/src/vec/exec/scan/file_scanner.cpp`.
   
   The strict-mode branch indexes `_src_slot_descs_order_by_dest[dest_index]` 
and `_dest_slot_to_src_slot_index[dest_index]` without checking their size.
   
   ### Root cause (short)
   
   These two members are populated only when FE sends 
`dest_sid_to_src_sid_without_trans`, which only happens if **at least one** 
target column is direct-mapped (no `=` in `columns`). When every target column 
uses an expression, both containers stay empty and the strict-mode branch reads 
them out of bounds.
   
   > Line numbers may differ slightly from upstream master; the function name 
and the two member variables are stable identifiers.
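
The root cause can be modelled with a small sketch. This is a simplified stand-in, not the actual scanner code: `ScannerModel` and `build_model` are hypothetical names, plain `int` vectors replace the real member types, and the `std::map` argument only plays the role of `dest_sid_to_src_sid_without_trans`.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Simplified stand-in for the two scanner members named in this report.
struct ScannerModel {
    std::vector<int> src_slot_by_dest;   // models _src_slot_descs_order_by_dest
    std::vector<int> dest_to_src_index;  // models _dest_slot_to_src_slot_index
};

// Models the population step: one entry is appended per direct mapping
// (dest slot id -> src slot id). With no direct mapping, nothing is appended.
ScannerModel build_model(const std::map<int, int>& dest_to_src_without_trans) {
    ScannerModel model;
    for (const auto& mapping : dest_to_src_without_trans) {
        const int next_index = static_cast<int>(model.src_slot_by_dest.size());
        model.dest_to_src_index.push_back(next_index);
        model.src_slot_by_dest.push_back(mapping.second);
    }
    return model;
}
```

Under these assumptions, `build_model({})` leaves both vectors empty, so indexing either one with any `dest_index`, including 0, reads past the end.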
   
   ### What You Expected?
   
   ```markdown
   Stream Load should not crash BE. The strict-mode branch should be guarded by 
a size / existence check on `_src_slot_descs_order_by_dest` and 
`_dest_slot_to_src_slot_index`, and fall through to the regular nullable check 
when no source-column mapping exists for a derived column.
   ```
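
One possible shape for the guard described above is sketched here. It is an illustration under stated assumptions, not a proposed patch: `has_src_mapping`, `choose_check` and `RowCheck` are hypothetical names, and an `int` vector stands in for `_dest_slot_to_src_slot_index`.

```cpp
#include <cstddef>
#include <vector>

// Which validation a destination column would receive in this sketch.
enum class RowCheck { kStrictMode, kNullableOnly };

// Hypothetical guard: true only when dest_index has a source-column mapping.
bool has_src_mapping(const std::vector<int>& dest_to_src_index,
                     std::size_t dest_index) {
    return dest_index < dest_to_src_index.size();
}

// Take the strict-mode branch only when a mapping exists; otherwise fall
// through to the regular nullable check instead of indexing out of bounds.
RowCheck choose_check(bool strict_mode,
                      const std::vector<int>& dest_to_src_index,
                      std::size_t dest_index) {
    if (strict_mode && has_src_mapping(dest_to_src_index, dest_index)) {
        return RowCheck::kStrictMode;
    }
    return RowCheck::kNullableOnly;
}
```

With an empty mapping, as in the reproduction below, `choose_check(true, {}, 0)` yields `RowCheck::kNullableOnly` rather than touching the container.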
   
   ### How to Reproduce?
   
   
   ````markdown
   **Table**
   
   ```sql
   CREATE TABLE sl_min (
       k BIGINT NOT NULL,
       v VARCHAR(64) NULL
   ) ENGINE=OLAP
   DUPLICATE KEY(k)
   DISTRIBUTED BY HASH(k) BUCKETS 1
   PROPERTIES ("replication_num" = "1");
   ```
   
   **Source file** `sl_min.csv` (one row, two fields):
   
   ```
   1,\N
   ```
   
   **Stream Load**
   
   ```bash
   curl --location-trusted -u root: \
       -H "label:sl_min_$(date +%s)" \
       -H "format:csv" \
       -H "column_separator:," \
       -H "columns:c1,c2,k=c1,v=c2" \
       -H "strict_mode:true" \
       -H "max_filter_ratio:0" \
       -T ./sl_min.csv \
       "http://<fe_host>:<fe_http_port>/api/<db>/sl_min/_stream_load"
   ```
   
   The BE process crashes with SIGSEGV. curl reports `Warning: Binary output 
can mess up your terminal` because the connection is dropped mid-response.
   ````
   
   
   ### Anything Else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   



