zhaorongsheng opened a new issue, #64006:
URL: https://github.com/apache/doris/issues/64006

   ## Search before asking
   
   I searched for "view schema drift index out of bounds" and "external catalog view schema change" in the issue tracker and did not find a duplicate.
   
   ## Doris Version
   
   master (confirmed in `LogicalView.java` as of commit aa9162840f1)
   
   ## What's Wrong
   
   Querying a Doris view that was created with `SELECT *` over an external (Hive) catalog table fails with an `IndexOutOfBoundsException` after:
   1. Adding a new column to the underlying Hive table (`ALTER TABLE … ADD COLUMNS`)
   2. Refreshing the base table metadata in Doris (`REFRESH TABLE <base_table>`)
   
   The view itself is **not** refreshed; only the base table is refreshed.
   
   **Error:**
   ```
   ERROR 1105 (HY000): errCode = 2, detailMessage = Index 3 out of bounds for length 3
   ```
   
   ## Reproducer
   
   ```sql
   -- Step 1: Create Hive table (3 columns + partition)
   -- (executed in Hive)
   CREATE TABLE test.test_view_schema_drift (
     id     bigint,
     name   string,
     age    string
   )
   PARTITIONED BY (dt string)
   STORED AS PARQUET;
   
   -- Step 2: In Doris (Hive catalog context)
   SWITCH hive;
   DESCRIBE test.test_view_schema_drift;   -- shows 3 non-partition columns
   
   CREATE VIEW test.test_view AS
     SELECT * FROM test.test_view_schema_drift
     WHERE dt = date_sub(current_date(), 1);
   
   SELECT * FROM test.test_view WHERE 1=0;
   -- OK: returns 3 columns (empty result)
   
   -- Step 3: Add a column in Hive
   ALTER TABLE test.test_view_schema_drift ADD COLUMNS (score string COMMENT 'new col');
   
   -- Step 4: Refresh base table in Doris
   SWITCH hive;
   REFRESH TABLE test.test_view_schema_drift;
   
   SELECT * FROM test.test_view_schema_drift WHERE 1=0;
   -- OK: returns 4 columns now
   
   SELECT * FROM test.test_view WHERE 1=0;
   -- FAIL: Index 3 out of bounds for length 3
   ```
   
   ## Root Cause
   
   `LogicalView.computeOutput()` iterates over `childOutput` (the output of the
   **re-analyzed** view body, which reflects the refreshed 4-column base table)
   and, for each slot at index `i`, calls `view.getFullSchema().get(i)`.
   
   `view.getFullSchema()` is derived from the view's metadata in the Hive
   metastore. That metadata was written when the view was created, at which
   point the base table had 3 columns. Since only `REFRESH TABLE base_table`
   was called (not `REFRESH TABLE view`), the view's stored schema still has
   3 columns, so `get(3)` throws `IndexOutOfBoundsException` when `i = 3`.
   
   ```java
   // LogicalView.java – before fix
   for (int i = 0; i < childOutput.size(); i++) {   // childOutput.size() = 4
       ...
       if (CollectionUtils.isEmpty(view.getFullSchema())) {
           qualified = originSlot.withQualifier(fullQualifiers);
       } else {
           // BUG: view.getFullSchema().size() == 3, crashes at i == 3
           qualified = originSlot.withOneLevelTableAndColumnAndQualifier(
               view, view.getFullSchema().get(i), fullQualifiers);
       }
   }
   ```
   
   The `isEmpty()` guard added in #40715 handles `null`/empty `fullSchema` but 
not the under-sized case introduced by schema drift.
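
The mismatch can be reproduced outside Doris with plain lists. The following is only a minimal sketch of the indexing pattern, not Doris code: `SchemaDriftDemo` and `lookup` are hypothetical names, and strings stand in for the real slot and column types.

```java
import java.util.List;

// Hypothetical stand-in for the unguarded loop; not part of Doris.
class SchemaDriftDemo {
    // Reads one stored-schema entry per child output slot, by position,
    // the same way the loop above indexes view.getFullSchema().
    static String lookup(List<String> childOutput, List<String> storedSchema) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < childOutput.size(); i++) {
            // Throws once i reaches storedSchema.size().
            sb.append(storedSchema.get(i)).append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Re-analyzed view body: 4 columns after the base table refresh.
        List<String> childOutput = List.of("id", "name", "age", "score");
        // Stale stored view schema: still 3 columns.
        List<String> storedSchema = List.of("id", "name", "age");
        try {
            lookup(childOutput, storedSchema);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

When both lists have the same length the lookup succeeds, which is consistent with the view working before the column was added.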
   
   ## Expected Behavior
   
   Querying the view should not throw. The column added after view creation
   should still appear in the result set, qualified through the
   `withQualifier()` fallback that the existing empty-schema branch already
   uses.
   
   ## Impact / Workaround
   
   **Impact**: Any user who
   1. Creates a Doris `VIEW` (using `SELECT *`) on an external catalog table, 
AND
   2. Later adds columns to the base table and calls `REFRESH TABLE 
<base_table>`
   
   will hit this crash when querying the view.
   
   **Workaround**: After adding columns, run `REFRESH TABLE <view>` (or drop
   and recreate the view with `DROP VIEW` followed by `CREATE VIEW`) so that
   the view's stored schema is also refreshed before querying.
   
   ## Proposed Fix
   
   Extend the guard condition to also cover `i >= fullSchema.size()`:
   
   ```java
   List<Column> fullSchema = view.getFullSchema();
   if (CollectionUtils.isEmpty(fullSchema) || i >= fullSchema.size()) {
       qualified = originSlot.withQualifier(fullQualifiers);
   } else {
       qualified = originSlot.withOneLevelTableAndColumnAndQualifier(
               view, fullSchema.get(i), fullQualifiers);
   }
   ```
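
To illustrate how the extended guard behaves on each index, here is a small standalone sketch. It is not the actual patch: `GuardedQualifierDemo` and `qualify` are hypothetical names, and the `column:` / `fallback:` string prefixes merely mark which branch was taken for each slot.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the guarded branch selection; not part of Doris.
class GuardedQualifierDemo {
    // For each child slot, use the stored column when the index is in range,
    // otherwise fall back to the slot itself (the withQualifier() path).
    static List<String> qualify(List<String> childOutput, List<String> fullSchema) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < childOutput.size(); i++) {
            if (fullSchema == null || fullSchema.isEmpty() || i >= fullSchema.size()) {
                result.add("fallback:" + childOutput.get(i));
            } else {
                result.add("column:" + fullSchema.get(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // 4 child slots against a stale 3-column stored schema.
        System.out.println(qualify(
                List.of("id", "name", "age", "score"),
                List.of("id", "name", "age")));
    }
}
```

In this sketch the first three slots take the stored-column branch and only the drifted fourth slot takes the fallback. Like the proposed guard, it matches purely by position, so it assumes the stored schema lines up with a prefix of the child output.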
   
   PR: (to be linked after submission)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
