This seems reasonable to me. One thing that I'm curious about:
>> The immediate motivation is enabling the ADBC data source in Spark (
>> apache/spark#54603 <https://github.com/apache/spark/issues/54603>)
>> without hardcoded per-dialect configuration in Spark code, the way the JDBC
>> source does today.

What are the corresponding definitions in JDBC? (Or should I read this as "the JDBC source currently has to hardcode per-dialect configuration [and we would like to avoid that for ADBC if possible]"?)

On Thu, Jun 4, 2026, at 15:09, Tornike Gurgenidze wrote:
> Hi, a gentle reminder that the PR's still waiting for a review.
>
> Thanks,
> Tornike
>
> On Sat, May 16, 2026 at 7:32 AM Tornike Gurgenidze <[email protected]>
> wrote:
>
>> Hi all,
>>
>> I'd like to propose adding four new SqlInfo codes to FlightSql.proto to
>> fill gaps in dialect metadata that clients need when compiling SQL
>> per-backend:
>>
>> - SQL_SUPPORTED_LIMIT_OFFSET (577): row-limit / offset grammar
>> (LIMIT/OFFSET, OFFSET…FETCH, TOP)
>> - SQL_SUPPORTED_NULLS_ORDERING (578): explicit NULLS FIRST / NULLS LAST
>> support in ORDER BY (distinct from the existing SQL_NULL_ORDERING (507),
>> which reports the server's *default* null ordering)
>> - SQL_SUPPORTED_BOOLEAN_LITERAL (579): accepted boolean literal forms
>> (TRUE/FALSE, 1/0)
>> - SQL_SUPPORTED_DATETIME_LITERAL (580): accepted date/time/timestamp
>> literal forms (ANSI DATE '…' keyword vs. bare quoted string)
>>
>> The goal here is intentionally narrow: to give clients just enough dialect
>> metadata to emit correct SQL for common pushdown operations (predicate
>> pushdown, projection pushdown, LIMIT/OFFSET, ORDER BY). It is explicitly
>> not an attempt to describe enough of each dialect to support
>> general-purpose SQL generation; Substrait is probably the right long-term
>> answer for engines that need to push arbitrary plans across backends. These
>> codes are a pragmatic solution for the much smaller surface area that
>> pushdown requires.
>>
>> All four are int32 bitmasks (not scalar enums), following the existing
>> SQL_SUPPORTED_GROUP_BY / SupportedSqlGrammar convention: dialects
>> frequently accept multiple forms (e.g. PostgreSQL supports both
>> LIMIT/OFFSET and OFFSET/FETCH; MySQL accepts both TRUE/FALSE and 1/0). The
>> accompanying enums are intentionally minimal, just enough for current use
>> cases.
>>
>> The immediate motivation is enabling the ADBC data source in Spark (
>> apache/spark#54603 <https://github.com/apache/spark/issues/54603>)
>> without hardcoded per-dialect configuration in Spark code, the way the JDBC
>> source does today. Since ADBC reuses Flight SQL's SqlInfo codes, the change
>> applies to both.
>>
>> - Issue: https://github.com/apache/arrow/issues/49792
>> - PR: https://github.com/apache/arrow/pull/49796
>>
>> Thanks,
>> Tornike
>>
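
For readers unfamiliar with the bitmask convention mentioned in the proposal, here is a minimal sketch in Python of how a client might decode such a value and choose a row-limit clause. It assumes that bit i of the int32 corresponds to enum ordinal i, as with the existing SQL_SUPPORTED_GROUP_BY code. The enum name LimitOffsetForm, its members, their ordinals, and the helper functions are hypothetical illustrations, not definitions taken from the PR.

```python
from enum import IntEnum


# Hypothetical enum: names and ordinals are illustrative only and are
# NOT taken from the proposed FlightSql.proto change.
class LimitOffsetForm(IntEnum):
    LIMIT_OFFSET = 0   # e.g. "LIMIT 10 OFFSET 5"
    OFFSET_FETCH = 1   # e.g. "OFFSET 5 ROWS FETCH NEXT 10 ROWS ONLY"
    TOP = 2            # e.g. "SELECT TOP 10 ..."


def encode_bitmask(members):
    """Server side: set bit (1 << ordinal) for every supported form."""
    mask = 0
    for member in members:
        mask |= 1 << member.value
    return mask


def decode_bitmask(mask, enum_cls):
    """Client side: return the enum members whose bit is set in mask."""
    return [member for member in enum_cls if mask & (1 << member.value)]


def limit_clause(mask, limit, offset=0):
    """Pick a trailing row-limit clause the backend reports as supported.

    Returns None when no trailing form is available (TOP is not a trailing
    clause, so it is not handled here); a caller would then skip the
    LIMIT pushdown instead of emitting SQL the backend may reject.
    """
    forms = decode_bitmask(mask, LimitOffsetForm)
    if LimitOffsetForm.LIMIT_OFFSET in forms:
        return f"LIMIT {limit} OFFSET {offset}"
    if LimitOffsetForm.OFFSET_FETCH in forms:
        return f"OFFSET {offset} ROWS FETCH NEXT {limit} ROWS ONLY"
    return None


# A PostgreSQL-like backend that accepts both trailing forms.
both = encode_bitmask([LimitOffsetForm.LIMIT_OFFSET, LimitOffsetForm.OFFSET_FETCH])
print(both, limit_clause(both, 10, 5))
```

A bitmask, unlike a scalar enum, lets one value report several accepted forms at once, which is why the client above can prefer one form and fall back to another.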
