Can we introduce a server config property that explicitly conveys how the server treats the case of table/namespace/view identifiers? For example, a property like identifier-case-sensitive would allow engines or clients to determine the server’s behavior upfront and decide whether to proceed or fail fast. This would help avoid subtle compatibility issues. For example, when Spark interacts with a case-insensitive catalog, it could detect the mismatch early and stop rather than continue with inconsistent assumptions.
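
To make the idea concrete, here is a minimal sketch of the client-side check in Python. It assumes the property would be surfaced through the defaults/overrides maps that the REST catalog config endpoint already returns. The property name identifier-case-sensitive, its "true"/"false" values, and all helper names are hypothetical; nothing like this exists in the spec today.

```python
# Hypothetical property name taken from the proposal above; not part of the spec.
PROPERTY = "identifier-case-sensitive"


class IdentifierCaseMismatch(RuntimeError):
    """Engine and catalog disagree on identifier case sensitivity."""


def resolve_property(config_response, client_props, key):
    """Look up a key using REST catalog precedence:
    server overrides > client properties > server defaults."""
    sources = (
        config_response.get("overrides", {}),
        client_props,
        config_response.get("defaults", {}),
    )
    for source in sources:
        if key in source:
            return source[key]
    return None


def check_case_sensitivity(config_response, engine_case_sensitive, client_props=None):
    """Return the declared server behavior, or None if the server is silent.

    Raises IdentifierCaseMismatch when the declared behavior differs from
    what the engine assumes, so the caller can fail before touching tables.
    """
    raw = resolve_property(config_response, client_props or {}, PROPERTY)
    if raw is None:
        return None  # nothing declared; the caller decides how to proceed
    server_case_sensitive = str(raw).strip().lower() == "true"
    if server_case_sensitive != engine_case_sensitive:
        raise IdentifierCaseMismatch(
            f"catalog declares {PROPERTY}={server_case_sensitive}, "
            f"engine assumes {engine_case_sensitive}"
        )
    return server_case_sensitive


if __name__ == "__main__":
    # A case-insensitive catalog paired with an engine that assumes case sensitivity.
    config = {"defaults": {}, "overrides": {PROPERTY: "false"}}
    try:
        check_case_sensitivity(config, engine_case_sensitive=True)
    except IdentifierCaseMismatch as exc:
        print(f"refusing to continue: {exc}")
```

Returning None when the property is absent keeps older servers working; whether an engine should treat silence as "proceed" or "fail" is a separate decision.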

Yufei

On Sun, Oct 12, 2025 at 5:03 PM Maninder Parmar <[email protected]> wrote:

> Hi Nicolae,
>
> As you rightly pointed out, there's a lot of variation in identifier rules
> across catalogs and database engines. Among catalogs, we have Polaris that
> supports all the cases and characters, Glue supports lowercase while Unity
> supports lowercase for tables and both upper/lower case for columns.
> Similarly, database engines don't have uniformity across identifier
> enforcement even though there is an existing ANSI SQL spec that defines the
> identifier syntax. The variation in identifier syntax and resolution ranges
> from fully ANSI compatible engines (Snowflake, Flink (Calcite), DB2,
> Oracle), case sensitive resolution (PostgreSQL, Spark), case insensitive
> lowercase resolution (Trino, Hive) and further variation on support for
> delimited identifiers.
>
> In my opinion, it's good for Iceberg to be unopinionated about the
> identifier rules to accommodate different vendors and not attempt another
> standardization, as there's already one (ANSI) that exists but is not
> uniformly enforced.
>
> Thanks,
> Maninder
>
> On Thu, Oct 9, 2025, 6:47 AM Nicolae Vartolomei <[email protected]>
> wrote:
>
>> A tangential question is about other identifiers like table names. E.g.,
>> some catalogs support dots, others (like Unity) do not.
>>
>> Thinking out loud: what if we mandate that lowercase letters and, say,
>> underscores must be supported? The spec then escapes everything else
>> (i.e. hex encoding, including Unicode) and asks the query engine to
>> unescape what it can. But the APIs always use the escaped variant.
>>
>> On Thu, Oct 9, 2025 at 2:35 PM Nicolae Vartolomei <[email protected]>
>> wrote:
>> >
>> > Hi there,
>> >
>> > I noticed the Iceberg specification doesn't address column name case
>> > sensitivity. I encountered an issue where Glue Iceberg REST converts a
>> > column named "Foo" to lowercase, which affects other processes relying
>> > on case-sensitive column name matching.
>> >
>> > While Iceberg may not explicitly manage collisions, Glue does. This
>> > leads to errors when creating columns like "Foo" and "foo" in the same
>> > table due to perceived collisions.
>> >
>> > I suggest the Iceberg specification include guidance on how
>> > implementers should handle column name case sensitivity.
>> >
>> > What do you think?
>> >
>> > Nicolae
>> >
