zhaorongsheng opened a new issue, #64006:
URL: https://github.com/apache/doris/issues/64006
## Search before asking
I searched for "view schema drift index out of bounds" and "external catalog
view schema change" in the issue tracker and did not find a duplicate.
## Doris Version
master (confirmed in `LogicalView.java` as of commit aa9162840f1)
## What's Wrong
Querying a Doris view that was created with `SELECT *` over an external
(Hive) catalog table fails with an `IndexOutOfBoundsException` after:
1. Adding a new column to the underlying Hive table (`ALTER TABLE … ADD
COLUMNS`)
2. Refreshing the base table metadata in Doris (`REFRESH TABLE <base_table>`)
The view itself is **not** refreshed; only the base table is refreshed.
**Error:**
```
ERROR 1105 (HY000): errCode = 2, detailMessage = Index 3 out of bounds for length 3
```
## Reproducer
```sql
-- Step 1: Create Hive table (3 columns + partition)
-- (executed in Hive)
CREATE TABLE test.test_view_schema_drift (
id bigint,
name string,
age string
)
PARTITIONED BY (dt string)
STORED AS PARQUET;
-- Step 2: In Doris (Hive catalog context)
SWITCH hive;
DESCRIBE test.test_view_schema_drift; -- shows 3 non-partition columns
CREATE VIEW test.test_view AS
SELECT * FROM test.test_view_schema_drift
WHERE dt = date_sub(current_date(), 1);
SELECT * FROM test.test_view WHERE 1=0;
-- OK: returns 3 columns (empty result)
-- Step 3: Add a column in Hive
ALTER TABLE test.test_view_schema_drift ADD COLUMNS (score string COMMENT 'new col');
-- Step 4: Refresh base table in Doris
SWITCH hive;
REFRESH TABLE test.test_view_schema_drift;
SELECT * FROM test.test_view_schema_drift WHERE 1=0;
-- OK: returns 4 columns now
SELECT * FROM test.test_view WHERE 1=0;
-- FAIL: Index 3 out of bounds for length 3
```
## Root Cause
`LogicalView.computeOutput()` iterates over `childOutput` (the output of the
**re-analyzed** view body, which reflects the refreshed 4-column base table).
For each slot it calls `view.getFullSchema().get(i)`.
`view.getFullSchema()` is derived from the view's metadata in the Hive
metastore, which was created when the base table had 3 columns. Since only
`REFRESH TABLE base_table` was called (not `REFRESH TABLE view`), the view's
stored schema still has 3 columns. When `i = 3`, `get(3)` throws
`IndexOutOfBoundsException`.
```java
// LogicalView.java – before fix
for (int i = 0; i < childOutput.size(); i++) { // childOutput.size() = 4
    ...
    if (CollectionUtils.isEmpty(view.getFullSchema())) {
        qualified = originSlot.withQualifier(fullQualifiers);
    } else {
        // BUG: view.getFullSchema().size() == 3, crashes at i == 3
        qualified = originSlot.withOneLevelTableAndColumnAndQualifier(
                view, view.getFullSchema().get(i), fullQualifiers);
    }
}
```
The `isEmpty()` guard added in #40715 handles `null`/empty `fullSchema` but
not the under-sized case introduced by schema drift.
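
The failure mode can be shown in isolation with plain lists. The following is a minimal sketch, not Doris code: `SchemaDriftRepro` and `pairByIndex` are hypothetical names, and strings stand in for the slots and `Column` objects. It only illustrates that an emptiness check alone does not protect a positional `get(i)` when the stored list is shorter than the re-analyzed output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SchemaDriftRepro {
    // Hypothetical stand-in for the loop in LogicalView.computeOutput():
    // pairs each re-analyzed output column with the stored schema entry
    // at the same index, guarding only against a null or empty schema.
    static List<String> pairByIndex(List<String> childOutput, List<String> storedSchema) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < childOutput.size(); i++) {
            if (storedSchema == null || storedSchema.isEmpty()) {
                result.add(childOutput.get(i));
            } else {
                // Unchecked positional access: throws once i reaches storedSchema.size().
                result.add(childOutput.get(i) + " -> " + storedSchema.get(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Re-analyzed view body after the base table was refreshed (4 columns).
        List<String> childOutput = new ArrayList<>(Arrays.asList("id", "name", "age", "score"));
        // Stale stored view schema from view creation time (3 columns).
        List<String> storedSchema = new ArrayList<>(Arrays.asList("id", "name", "age"));
        try {
            pairByIndex(childOutput, storedSchema);
            System.out.println("no exception");
        } catch (IndexOutOfBoundsException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```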
## Expected Behavior
Querying the view should not throw. The column added after view creation
should appear in the result set with its qualifier applied, falling back to
`withQualifier()` in the same way the existing `isEmpty()` branch does.
## Impact / Workaround
**Impact**: Any user who
1. Creates a Doris `VIEW` (using `SELECT *`) on an external catalog table,
AND
2. Later adds columns to the base table and calls `REFRESH TABLE
<base_table>`
will hit this crash when querying the view.
**Workaround**: After adding columns, run `REFRESH TABLE <view>` (or `DROP VIEW`
followed by `CREATE VIEW`) before querying, so that the view's stored schema is
refreshed as well.
## Proposed Fix
Extend the guard condition to also cover `i >= fullSchema.size()`:
```java
List<Column> fullSchema = view.getFullSchema();
if (CollectionUtils.isEmpty(fullSchema) || i >= fullSchema.size()) {
    qualified = originSlot.withQualifier(fullQualifiers);
} else {
    qualified = originSlot.withOneLevelTableAndColumnAndQualifier(
            view, fullSchema.get(i), fullQualifiers);
}
```
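
To check the guard logic outside of Doris, here is a small standalone sketch under stated assumptions: `GuardedQualifierSketch` and `qualify` are hypothetical names, strings stand in for slots and `Column` objects, and the bracketed labels merely mark which branch was taken. It is not the actual patch, only an illustration of the extended bounds check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GuardedQualifierSketch {
    // Hypothetical stand-in for the patched loop: falls back to the
    // qualifier-only path when the stored schema is null, empty, or
    // shorter than the current index.
    static List<String> qualify(List<String> childOutput, List<String> storedSchema) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < childOutput.size(); i++) {
            String slot = childOutput.get(i);
            if (storedSchema == null || storedSchema.isEmpty() || i >= storedSchema.size()) {
                // Mirrors the withQualifier(...) fallback branch.
                result.add(slot + " [qualifier only]");
            } else {
                // Mirrors the withOneLevelTableAndColumnAndQualifier(...) branch.
                result.add(slot + " [bound to " + storedSchema.get(i) + "]");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> childOutput = Arrays.asList("id", "name", "age", "score");
        List<String> storedSchema = Arrays.asList("id", "name", "age");
        for (String line : qualify(childOutput, storedSchema)) {
            System.out.println(line);
        }
    }
}
```

The sketch pairs entries purely by position, as the original loop does, so it demonstrates only that the bounds check avoids the exception. Whether positions in the stored schema still line up with the re-analyzed output is a separate question that this example does not address.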
PR: (to be linked after submission)