tnakama created CALCITE-7547:
--------------------------------
Summary: `BIG_QUERY` conformance should allow field access on
`UNNEST(array_of_struct) AS alias`
Key: CALCITE-7547
URL: https://issues.apache.org/jira/browse/CALCITE-7547
Project: Calcite
Issue Type: Bug
Components: core
Affects Versions: 1.40.0
Environment: Environment: Calcite 1.40.0, Java 21.
Reporter: tnakama
Under SqlConformanceEnum.BIG_QUERY, the following query fails validation
with "Column 'NAME' not found in table 'I'", even though BigQuery itself
supports this pattern:
SELECT i.name FROM t, UNNEST(t.items) AS i
-- t(items ARRAY<ROW<name VARCHAR>>), ROW kind = PEEK_FIELDS_NO_EXPAND
Cause: `SqlConformanceEnum.BIG_QUERY.allowAliasUnnestItems()` returns false.
With the flag off, `SqlUnnestOperator.inferReturnType` flattens the STRUCT
into one column per field, then `AliasNamespace` renames the single column
to the alias name — so `i.name` can no longer be resolved. The flag was
originally added for Presto and was never extended to BIG_QUERY, even
though BigQuery has the same semantics. The same query validates cleanly
under SqlConformanceEnum.PRESTO; the conformance flag is the only
difference.
Reproducer (JUnit 5 + Calcite, no other dependencies):
---
import static org.junit.jupiter.api.Assertions.assertDoesNotThrow;
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;
import org.apache.calcite.rel.type.RelDataType;
import org.apache.calcite.rel.type.RelDataTypeFactory;
import org.apache.calcite.rel.type.StructKind;
import org.apache.calcite.schema.SchemaPlus;
import org.apache.calcite.schema.impl.AbstractTable;
import org.apache.calcite.sql.SqlNode;
import org.apache.calcite.sql.parser.SqlParser;
import org.apache.calcite.sql.type.SqlTypeName;
import org.apache.calcite.sql.validate.SqlConformance;
import org.apache.calcite.sql.validate.SqlConformanceEnum;
import org.apache.calcite.sql.validate.SqlValidator;
import org.apache.calcite.tools.FrameworkConfig;
import org.apache.calcite.tools.Frameworks;
import org.apache.calcite.tools.Planner;
import org.apache.calcite.tools.ValidationException;
import org.junit.jupiter.api.Test;
class CalciteBigQueryUnnestStructFieldReproducerTest {
private static final String QUERY = "SELECT i.name FROM t, UNNEST(t.items) AS
i";
@Test
void fieldAccessFails_underBigQueryConformance() {
ValidationException ex =
assertThrows(ValidationException.class, () ->
validate(SqlConformanceEnum.BIG_QUERY));
String msg = ex.getMessage() == null ? "" : ex.getMessage();
assertTrue(
msg.toLowerCase().contains("not found"),
() -> "expected validation error like \"Column 'NAME' not found\", got: " +
msg);
}
@Test
void fieldAccessWorks_underPrestoConformance() {
assertDoesNotThrow(() -> validate(SqlConformanceEnum.PRESTO));
}
private static void validate(SqlConformance conformance) throws Exception {
SchemaPlus root = Frameworks.createRootSchema(true);
root.add("T", new ItemsTable());
FrameworkConfig config =
Frameworks.newConfigBuilder()
.defaultSchema(root)
.parserConfig(SqlParser.config().withConformance(conformance))
.sqlValidatorConfig(SqlValidator.Config.DEFAULT.withConformance(conformance))
.build();
try (Planner planner = Frameworks.getPlanner(config)) {
SqlNode parsed = planner.parse(QUERY);
planner.validate(parsed);
}
}
/** In-memory table T(ITEMS ARRAY<ROW<NAME VARCHAR>>). */
private static final class ItemsTable extends AbstractTable {
@Override
public RelDataType getRowType(RelDataTypeFactory tf) {
RelDataType struct =
tf.builder()
.kind(StructKind.PEEK_FIELDS_NO_EXPAND)
.add("NAME", tf.createSqlType(SqlTypeName.VARCHAR))
.build();
RelDataType itemsArray = tf.createArrayType(struct, -1);
return tf.builder().add("ITEMS", itemsArray).build();
}
}
}
---
Proposed fix (either of):
1. Override `allowAliasUnnestItems()` to return true in `BIG_QUERY` (and
arguably any conformance modelling a system where UNNEST aliases bind
to the array element).
2. Decouple this behavior from SqlConformance entirely — base the
decision on StructKind (PEEK_FIELDS / PEEK_FIELDS_NO_EXPAND) of the
array element type, since the PRESTO path already relies on this kind
for downstream field resolution.
Note: enabling the flag exposes a downstream NPE in
`SqlToRelConverter.convertUnnest` for the 2-operand `AS(UNNEST, alias)`
form — see CALCITE-7546. The two are complementary; fixing this one alone
moves the failure from validation to conversion.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)