Hi dev@, A quick update about a new Beam SQL feature.
In short, we have wired up the support for plugging table providers through Beam SQL API to allow obtaining table schemas from external sources. *What does it even mean?* Previously, in Java pipelines, you could apply a Beam SQL query to existing PCollections. We have a special SqlTransform to do that, it converts a SQL query to an equivalent PTransform that is applied to the PCollection of Rows . One major inconvenience in this approach is that to query something, it has to be a PCollection. I.e. you have to read the data from a specific source and then convert it to rows. Which can mean multiple complications, like potentially manually converting schemas from source to Beam, or having a completely different logic when changing the source. The new API allows you to plug a schema provider that can resolve the tables and schemas automatically if they already exist somewhere else. This way Beam SQL, with the help of the provider, does the table lookup, then IO configuration, and then schema conversion if needed. As an example, here's a query <https://github.com/apache/beam/blob/116600f32013620e748723b8022a7023fa8e2528/sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamSqlHiveSchemaTest.java#L175,L190>[1] that joins 2 existing PCollections with a table from Hive using HCatalogTableProvider. Hive table lookup is automatic, the table provider in this case will resolve the tables by talking to Hive Metastore and will read the data by configuring and applying the HCatalogIO, converting the records to Rows under the hood. *What's the status of this?* This is a working implementation, but the development is still ongoing, there are bugs, API might change, and there are few more things I can see coming related to this after further design discussions: * refactor of the underlying table/metadata provider code; * working out the design for supporting creating / updating the tables in the metadata provider; * creating a DDL syntax for it; * creating more providers; [1] https://github.com/apache/beam/blob/116600f32013620e748723b8022a7023fa8e2528/sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamSqlHiveSchemaTest.java#L175,L190
