This is an automated email from the ASF dual-hosted git repository.

damccorm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
     new 7eaf66c8feb Update managed-io.md for release 2.71.0-RC3. (#37333)
7eaf66c8feb is described below

commit 7eaf66c8feb7bac309e4e7100316dd8875613baf
Author: Danny McCormick <[email protected]>
AuthorDate: Thu Jan 22 09:39:23 2026 -0500

    Update managed-io.md for release 2.71.0-RC3. (#37333)
    
    Co-authored-by: damccorm <actions@GitHub Actions 1010719742.local>
---
 .../site/content/en/documentation/io/managed-io.md | 1022 +++++++-------------
 1 file changed, 373 insertions(+), 649 deletions(-)

diff --git a/website/www/site/content/en/documentation/io/managed-io.md 
b/website/www/site/content/en/documentation/io/managed-io.md
index ced0443c695..fab9e79e71a 100644
--- a/website/www/site/content/en/documentation/io/managed-io.md
+++ b/website/www/site/content/en/documentation/io/managed-io.md
@@ -58,6 +58,31 @@ and Beam SQL is invoked via the Managed API under the hood.
       <th>Read Configuration</th>
       <th>Write Configuration</th>
     </tr>
+    <tr>
+      <td><strong>ICEBERG</strong></td>
+      <td>
+        <strong>table</strong> (<code style="color: green">str</code>)<br>
+        catalog_name (<code style="color: green">str</code>)<br>
+        catalog_properties (<code>map[<span style="color: green;">str</span>, 
<span style="color: green;">str</span>]</code>)<br>
+        config_properties (<code>map[<span style="color: green;">str</span>, 
<span style="color: green;">str</span>]</code>)<br>
+        drop (<code>list[<span style="color: green;">str</span>]</code>)<br>
+        filter (<code style="color: green">str</code>)<br>
+        keep (<code>list[<span style="color: green;">str</span>]</code>)<br>
+      </td>
+      <td>
+        <strong>table</strong> (<code style="color: green">str</code>)<br>
+        catalog_name (<code style="color: green">str</code>)<br>
+        catalog_properties (<code>map[<span style="color: green;">str</span>, 
<span style="color: green;">str</span>]</code>)<br>
+        config_properties (<code>map[<span style="color: green;">str</span>, 
<span style="color: green;">str</span>]</code>)<br>
+        direct_write_byte_limit (<code style="color: #f54251">int32</code>)<br>
+        drop (<code>list[<span style="color: green;">str</span>]</code>)<br>
+        keep (<code>list[<span style="color: green;">str</span>]</code>)<br>
+        only (<code style="color: green">str</code>)<br>
+        partition_fields (<code>list[<span style="color: 
green;">str</span>]</code>)<br>
+        table_properties (<code>map[<span style="color: green;">str</span>, 
<span style="color: green;">str</span>]</code>)<br>
+        triggering_frequency_seconds (<code style="color: 
#f54251">int32</code>)<br>
+      </td>
+    </tr>
     <tr>
       <td><strong>KAFKA</strong></td>
       <td>
@@ -86,31 +111,6 @@ and Beam SQL is invoked via the Managed API under the hood.
         schema (<code style="color: green">str</code>)<br>
       </td>
     </tr>
-    <tr>
-      <td><strong>ICEBERG</strong></td>
-      <td>
-        <strong>table</strong> (<code style="color: green">str</code>)<br>
-        catalog_name (<code style="color: green">str</code>)<br>
-        catalog_properties (<code>map[<span style="color: green;">str</span>, 
<span style="color: green;">str</span>]</code>)<br>
-        config_properties (<code>map[<span style="color: green;">str</span>, 
<span style="color: green;">str</span>]</code>)<br>
-        drop (<code>list[<span style="color: green;">str</span>]</code>)<br>
-        filter (<code style="color: green">str</code>)<br>
-        keep (<code>list[<span style="color: green;">str</span>]</code>)<br>
-      </td>
-      <td>
-        <strong>table</strong> (<code style="color: green">str</code>)<br>
-        catalog_name (<code style="color: green">str</code>)<br>
-        catalog_properties (<code>map[<span style="color: green;">str</span>, 
<span style="color: green;">str</span>]</code>)<br>
-        config_properties (<code>map[<span style="color: green;">str</span>, 
<span style="color: green;">str</span>]</code>)<br>
-        direct_write_byte_limit (<code style="color: #f54251">int32</code>)<br>
-        drop (<code>list[<span style="color: green;">str</span>]</code>)<br>
-        keep (<code>list[<span style="color: green;">str</span>]</code>)<br>
-        only (<code style="color: green">str</code>)<br>
-        partition_fields (<code>list[<span style="color: 
green;">str</span>]</code>)<br>
-        table_properties (<code>map[<span style="color: green;">str</span>, 
<span style="color: green;">str</span>]</code>)<br>
-        triggering_frequency_seconds (<code style="color: 
#f54251">int32</code>)<br>
-      </td>
-    </tr>
     <tr>
       <td><strong>ICEBERG_CDC</strong></td>
       <td>
@@ -134,34 +134,12 @@ and Beam SQL is invoked via the Managed API under the 
hood.
       </td>
     </tr>
     <tr>
-      <td><strong>BIGQUERY</strong></td>
-      <td>
-        kms_key (<code style="color: green">str</code>)<br>
-        query (<code style="color: green">str</code>)<br>
-        row_restriction (<code style="color: green">str</code>)<br>
-        fields (<code>list[<span style="color: green;">str</span>]</code>)<br>
-        table (<code style="color: green">str</code>)<br>
-      </td>
-      <td>
-        <strong>table</strong> (<code style="color: green">str</code>)<br>
-        drop (<code>list[<span style="color: green;">str</span>]</code>)<br>
-        keep (<code>list[<span style="color: green;">str</span>]</code>)<br>
-        kms_key (<code style="color: green">str</code>)<br>
-        only (<code style="color: green">str</code>)<br>
-        triggering_frequency_seconds (<code style="color: 
#f54251">int64</code>)<br>
-      </td>
-    </tr>
-    <tr>
-      <td><strong>POSTGRES</strong></td>
+      <td><strong>SQLSERVER</strong></td>
       <td>
         <strong>jdbc_url</strong> (<code style="color: green">str</code>)<br>
-        connection_init_sql (<code>list[<span style="color: 
green;">str</span>]</code>)<br>
         connection_properties (<code style="color: green">str</code>)<br>
         disable_auto_commit (<code style="color: orange">boolean</code>)<br>
-        driver_class_name (<code style="color: green">str</code>)<br>
-        driver_jars (<code style="color: green">str</code>)<br>
         fetch_size (<code style="color: #f54251">int32</code>)<br>
-        jdbc_type (<code style="color: green">str</code>)<br>
         location (<code style="color: green">str</code>)<br>
         num_partitions (<code style="color: #f54251">int32</code>)<br>
         output_parallelization (<code style="color: orange">boolean</code>)<br>
@@ -174,11 +152,7 @@ and Beam SQL is invoked via the Managed API under the hood.
         <strong>jdbc_url</strong> (<code style="color: green">str</code>)<br>
         autosharding (<code style="color: orange">boolean</code>)<br>
         batch_size (<code style="color: #f54251">int64</code>)<br>
-        connection_init_sql (<code>list[<span style="color: 
green;">str</span>]</code>)<br>
         connection_properties (<code style="color: green">str</code>)<br>
-        driver_class_name (<code style="color: green">str</code>)<br>
-        driver_jars (<code style="color: green">str</code>)<br>
-        jdbc_type (<code style="color: green">str</code>)<br>
         location (<code style="color: green">str</code>)<br>
         password (<code style="color: green">str</code>)<br>
         username (<code style="color: green">str</code>)<br>
@@ -186,16 +160,13 @@ and Beam SQL is invoked via the Managed API under the 
hood.
       </td>
     </tr>
     <tr>
-      <td><strong>SQLSERVER</strong></td>
+      <td><strong>MYSQL</strong></td>
       <td>
         <strong>jdbc_url</strong> (<code style="color: green">str</code>)<br>
         connection_init_sql (<code>list[<span style="color: 
green;">str</span>]</code>)<br>
         connection_properties (<code style="color: green">str</code>)<br>
         disable_auto_commit (<code style="color: orange">boolean</code>)<br>
-        driver_class_name (<code style="color: green">str</code>)<br>
-        driver_jars (<code style="color: green">str</code>)<br>
         fetch_size (<code style="color: #f54251">int32</code>)<br>
-        jdbc_type (<code style="color: green">str</code>)<br>
         location (<code style="color: green">str</code>)<br>
         num_partitions (<code style="color: #f54251">int32</code>)<br>
         output_parallelization (<code style="color: orange">boolean</code>)<br>
@@ -210,9 +181,6 @@ and Beam SQL is invoked via the Managed API under the hood.
         batch_size (<code style="color: #f54251">int64</code>)<br>
         connection_init_sql (<code>list[<span style="color: 
green;">str</span>]</code>)<br>
         connection_properties (<code style="color: green">str</code>)<br>
-        driver_class_name (<code style="color: green">str</code>)<br>
-        driver_jars (<code style="color: green">str</code>)<br>
-        jdbc_type (<code style="color: green">str</code>)<br>
         location (<code style="color: green">str</code>)<br>
         password (<code style="color: green">str</code>)<br>
         username (<code style="color: green">str</code>)<br>
@@ -220,16 +188,29 @@ and Beam SQL is invoked via the Managed API under the 
hood.
       </td>
     </tr>
     <tr>
-      <td><strong>MYSQL</strong></td>
+      <td><strong>BIGQUERY</strong></td>
+      <td>
+        kms_key (<code style="color: green">str</code>)<br>
+        query (<code style="color: green">str</code>)<br>
+        row_restriction (<code style="color: green">str</code>)<br>
+        fields (<code>list[<span style="color: green;">str</span>]</code>)<br>
+        table (<code style="color: green">str</code>)<br>
+      </td>
+      <td>
+        <strong>table</strong> (<code style="color: green">str</code>)<br>
+        drop (<code>list[<span style="color: green;">str</span>]</code>)<br>
+        keep (<code>list[<span style="color: green;">str</span>]</code>)<br>
+        kms_key (<code style="color: green">str</code>)<br>
+        only (<code style="color: green">str</code>)<br>
+        triggering_frequency_seconds (<code style="color: 
#f54251">int64</code>)<br>
+      </td>
+    </tr>
+    <tr>
+      <td><strong>POSTGRES</strong></td>
       <td>
         <strong>jdbc_url</strong> (<code style="color: green">str</code>)<br>
-        connection_init_sql (<code>list[<span style="color: 
green;">str</span>]</code>)<br>
         connection_properties (<code style="color: green">str</code>)<br>
-        disable_auto_commit (<code style="color: orange">boolean</code>)<br>
-        driver_class_name (<code style="color: green">str</code>)<br>
-        driver_jars (<code style="color: green">str</code>)<br>
         fetch_size (<code style="color: #f54251">int32</code>)<br>
-        jdbc_type (<code style="color: green">str</code>)<br>
         location (<code style="color: green">str</code>)<br>
         num_partitions (<code style="color: #f54251">int32</code>)<br>
         output_parallelization (<code style="color: orange">boolean</code>)<br>
@@ -242,11 +223,7 @@ and Beam SQL is invoked via the Managed API under the hood.
         <strong>jdbc_url</strong> (<code style="color: green">str</code>)<br>
         autosharding (<code style="color: orange">boolean</code>)<br>
         batch_size (<code style="color: #f54251">int64</code>)<br>
-        connection_init_sql (<code>list[<span style="color: 
green;">str</span>]</code>)<br>
         connection_properties (<code style="color: green">str</code>)<br>
-        driver_class_name (<code style="color: green">str</code>)<br>
-        driver_jars (<code style="color: green">str</code>)<br>
-        jdbc_type (<code style="color: green">str</code>)<br>
         location (<code style="color: green">str</code>)<br>
         password (<code style="color: green">str</code>)<br>
         username (<code style="color: green">str</code>)<br>
@@ -258,7 +235,7 @@ and Beam SQL is invoked via the Managed API under the hood.
 
 ## Configuration Details
 
-### `KAFKA` Write
+### `ICEBERG` Write
 
 <div class="table-container-wrapper">
   <table class="table table-bordered">
@@ -269,245 +246,135 @@ and Beam SQL is invoked via the Managed API under the 
hood.
     </tr>
     <tr>
       <td>
-        <strong>bootstrap_servers</strong>
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        A list of host/port pairs to use for establishing the initial 
connection to the Kafka cluster. The client will make use of all servers 
irrespective of which servers are specified here for bootstrapping—this list 
only impacts the initial hosts used to discover the full set of servers. | 
Format: host1:port1,host2:port2,...
-      </td>
-    </tr>
-    <tr>
-      <td>
-        <strong>format</strong>
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        The encoding format for the data stored in Kafka. Valid options are: 
RAW,JSON,AVRO,PROTO
-      </td>
-    </tr>
-    <tr>
-      <td>
-        <strong>topic</strong>
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        n/a
-      </td>
-    </tr>
-    <tr>
-      <td>
-        file_descriptor_path
+        <strong>table</strong>
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        The path to the Protocol Buffer File Descriptor Set file. This file is 
used for schema definition and message serialization.
+        A fully-qualified table identifier. You may also provide a template to 
write to multiple dynamic destinations, for example: 
`dataset.my_{col1}_{col2.nested}_table`.
       </td>
     </tr>
     <tr>
       <td>
-        message_name
+        catalog_name
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        The name of the Protocol Buffer message to be used for schema 
extraction and data conversion.
+        Name of the catalog containing the table.
       </td>
     </tr>
     <tr>
       <td>
-        producer_config_updates
+        catalog_properties
       </td>
       <td>
         <code>map[<span style="color: green;">str</span>, <span style="color: 
green;">str</span>]</code>
       </td>
       <td>
-        A list of key-value pairs that act as configuration parameters for 
Kafka producers. Most of these configurations will not be needed, but if you 
need to customize your Kafka producer, you may use this. See a detailed list: 
https://docs.confluent.io/platform/current/installation/configuration/producer-configs.html
+        Properties used to set up the Iceberg catalog.
       </td>
     </tr>
     <tr>
       <td>
-        schema
+        config_properties
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code>map[<span style="color: green;">str</span>, <span style="color: 
green;">str</span>]</code>
       </td>
       <td>
-        n/a
+        Properties passed to the Hadoop Configuration.
       </td>
     </tr>
-  </table>
-</div>
-
-### `KAFKA` Read
-
-<div class="table-container-wrapper">
-  <table class="table table-bordered">
-    <tr>
-      <th>Configuration</th>
-      <th>Type</th>
-      <th>Description</th>
-    </tr>
     <tr>
       <td>
-        <strong>bootstrap_servers</strong>
+        direct_write_byte_limit
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: #f54251">int32</code>
       </td>
       <td>
-        A list of host/port pairs to use for establishing the initial 
connection to the Kafka cluster. The client will make use of all servers 
irrespective of which servers are specified here for bootstrapping—this list 
only impacts the initial hosts used to discover the full set of servers. This 
list should be in the form `host1:port1,host2:port2,...`
+        For a streaming pipeline, sets the limit for lifting bundles into the 
direct write path.
       </td>
     </tr>
     <tr>
       <td>
-        <strong>topic</strong>
+        drop
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code>list[<span style="color: green;">str</span>]</code>
       </td>
       <td>
-        n/a
+        A list of field names to drop from the input record before writing. Is 
mutually exclusive with 'keep' and 'only'.
       </td>
     </tr>
     <tr>
       <td>
-        allow_duplicates
+        keep
       </td>
       <td>
-        <code style="color: orange">boolean</code>
+        <code>list[<span style="color: green;">str</span>]</code>
       </td>
       <td>
-        If the Kafka read allows duplicates.
+        A list of field names to keep in the input record. All other fields 
are dropped before writing. Is mutually exclusive with 'drop' and 'only'.
       </td>
     </tr>
     <tr>
       <td>
-        confluent_schema_registry_subject
+        only
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        n/a
+        The name of a single record field that should be written. Is mutually 
exclusive with 'keep' and 'drop'.
       </td>
     </tr>
     <tr>
       <td>
-        confluent_schema_registry_url
+        partition_fields
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code>list[<span style="color: green;">str</span>]</code>
       </td>
       <td>
-        n/a
+        Fields used to create a partition spec that is applied when tables are 
created. For a field 'foo', the available partition transforms are:
+
+- `foo`
+- `truncate(foo, N)`
+- `bucket(foo, N)`
+- `hour(foo)`
+- `day(foo)`
+- `month(foo)`
+- `year(foo)`
+- `void(foo)`
+
+For more information on partition transforms, please visit 
https://iceberg.apache.org/spec/#partition-transforms.
       </td>
     </tr>
     <tr>
       <td>
-        consumer_config_updates
+        table_properties
       </td>
       <td>
         <code>map[<span style="color: green;">str</span>, <span style="color: 
green;">str</span>]</code>
       </td>
       <td>
-        A list of key-value pairs that act as configuration parameters for 
Kafka consumers. Most of these configurations will not be needed, but if you 
need to customize your Kafka consumer, you may use this. See a detailed list: 
https://docs.confluent.io/platform/current/installation/configuration/consumer-configs.html
-      </td>
-    </tr>
-    <tr>
-      <td>
-        file_descriptor_path
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        The path to the Protocol Buffer File Descriptor Set file. This file is 
used for schema definition and message serialization.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        format
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        The encoding format for the data stored in Kafka. Valid options are: 
RAW,STRING,AVRO,JSON,PROTO
-      </td>
-    </tr>
-    <tr>
-      <td>
-        message_name
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        The name of the Protocol Buffer message to be used for schema 
extraction and data conversion.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        offset_deduplication
-      </td>
-      <td>
-        <code style="color: orange">boolean</code>
-      </td>
-      <td>
-        If the redistribute is using offset deduplication mode.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        redistribute_by_record_key
-      </td>
-      <td>
-        <code style="color: orange">boolean</code>
-      </td>
-      <td>
-        If the redistribute keys by the Kafka record key.
+        Iceberg table properties to be set on the table when it is created.
+For more information on table properties, please visit 
https://iceberg.apache.org/docs/latest/configuration/#table-properties.
       </td>
     </tr>
     <tr>
       <td>
-        redistribute_num_keys
+        triggering_frequency_seconds
       </td>
       <td>
         <code style="color: #f54251">int32</code>
       </td>
       <td>
-        The number of keys for redistributing Kafka inputs.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        redistributed
-      </td>
-      <td>
-        <code style="color: orange">boolean</code>
-      </td>
-      <td>
-        If the Kafka read should be redistributed.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        schema
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        The schema in which the data is encoded in the Kafka topic. For AVRO 
data, this is a schema defined with AVRO schema syntax 
(https://avro.apache.org/docs/1.10.2/spec.html#schemas). For JSON data, this is 
a schema defined with JSON-schema syntax (https://json-schema.org/). If a URL 
to Confluent Schema Registry is provided, then this field is ignored, and the 
schema is fetched from Confluent Schema Registry.
+        For a streaming pipeline, sets the frequency at which snapshots are 
produced.
       </td>
     </tr>
   </table>
@@ -602,7 +469,7 @@ and Beam SQL is invoked via the Managed API under the hood.
   </table>
 </div>
 
-### `ICEBERG` Write
+### `KAFKA` Read
 
 <div class="table-container-wrapper">
   <table class="table table-bordered">
@@ -613,307 +480,162 @@ and Beam SQL is invoked via the Managed API under the 
hood.
     </tr>
     <tr>
       <td>
-        <strong>table</strong>
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        A fully-qualified table identifier. You may also provide a template to 
write to multiple dynamic destinations, for example: 
`dataset.my_{col1}_{col2.nested}_table`.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        catalog_name
+        <strong>bootstrap_servers</strong>
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Name of the catalog containing the table.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        catalog_properties
-      </td>
-      <td>
-        <code>map[<span style="color: green;">str</span>, <span style="color: 
green;">str</span>]</code>
-      </td>
-      <td>
-        Properties used to set up the Iceberg catalog.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        config_properties
-      </td>
-      <td>
-        <code>map[<span style="color: green;">str</span>, <span style="color: 
green;">str</span>]</code>
-      </td>
-      <td>
-        Properties passed to the Hadoop Configuration.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        direct_write_byte_limit
-      </td>
-      <td>
-        <code style="color: #f54251">int32</code>
-      </td>
-      <td>
-        For a streaming pipeline, sets the limit for lifting bundles into the 
direct write path.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        drop
-      </td>
-      <td>
-        <code>list[<span style="color: green;">str</span>]</code>
-      </td>
-      <td>
-        A list of field names to drop from the input record before writing. Is 
mutually exclusive with 'keep' and 'only'.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        keep
-      </td>
-      <td>
-        <code>list[<span style="color: green;">str</span>]</code>
-      </td>
-      <td>
-        A list of field names to keep in the input record. All other fields 
are dropped before writing. Is mutually exclusive with 'drop' and 'only'.
+        A list of host/port pairs to use for establishing the initial 
connection to the Kafka cluster. The client will make use of all servers 
irrespective of which servers are specified here for bootstrapping—this list 
only impacts the initial hosts used to discover the full set of servers. This 
list should be in the form `host1:port1,host2:port2,...`
       </td>
     </tr>
     <tr>
       <td>
-        only
+        <strong>topic</strong>
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        The name of a single record field that should be written. Is mutually 
exclusive with 'keep' and 'drop'.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        partition_fields
-      </td>
-      <td>
-        <code>list[<span style="color: green;">str</span>]</code>
-      </td>
-      <td>
-        Fields used to create a partition spec that is applied when tables are 
created. For a field 'foo', the available partition transforms are:
-
-- `foo`
-- `truncate(foo, N)`
-- `bucket(foo, N)`
-- `hour(foo)`
-- `day(foo)`
-- `month(foo)`
-- `year(foo)`
-- `void(foo)`
-
-For more information on partition transforms, please visit 
https://iceberg.apache.org/spec/#partition-transforms.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        table_properties
-      </td>
-      <td>
-        <code>map[<span style="color: green;">str</span>, <span style="color: 
green;">str</span>]</code>
-      </td>
-      <td>
-        Iceberg table properties to be set on the table when it is created.
-For more information on table properties, please visit 
https://iceberg.apache.org/docs/latest/configuration/#table-properties.
+        n/a
       </td>
     </tr>
     <tr>
       <td>
-        triggering_frequency_seconds
+        allow_duplicates
       </td>
       <td>
-        <code style="color: #f54251">int32</code>
+        <code style="color: orange">boolean</code>
       </td>
       <td>
-        For a streaming pipeline, sets the frequency at which snapshots are 
produced.
+        If the Kafka read allows duplicates.
       </td>
     </tr>
-  </table>
-</div>
-
-### `ICEBERG_CDC` Read
-
-<div class="table-container-wrapper">
-  <table class="table table-bordered">
-    <tr>
-      <th>Configuration</th>
-      <th>Type</th>
-      <th>Description</th>
-    </tr>
     <tr>
       <td>
-        <strong>table</strong>
+        confluent_schema_registry_subject
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Identifier of the Iceberg table.
+        n/a
       </td>
     </tr>
     <tr>
       <td>
-        catalog_name
+        confluent_schema_registry_url
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Name of the catalog containing the table.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        catalog_properties
-      </td>
-      <td>
-        <code>map[<span style="color: green;">str</span>, <span style="color: 
green;">str</span>]</code>
-      </td>
-      <td>
-        Properties used to set up the Iceberg catalog.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        config_properties
-      </td>
-      <td>
-        <code>map[<span style="color: green;">str</span>, <span style="color: 
green;">str</span>]</code>
-      </td>
-      <td>
-        Properties passed to the Hadoop Configuration.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        drop
-      </td>
-      <td>
-        <code>list[<span style="color: green;">str</span>]</code>
-      </td>
-      <td>
-        A subset of column names to exclude from reading. If null or empty, 
all columns will be read.
+        n/a
       </td>
     </tr>
     <tr>
       <td>
-        filter
+        consumer_config_updates
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code>map[<span style="color: green;">str</span>, <span style="color: 
green;">str</span>]</code>
       </td>
       <td>
-        SQL-like predicate to filter data at scan time. Example: "id > 5 AND 
status = 'ACTIVE'". Uses Apache Calcite syntax: 
https://calcite.apache.org/docs/reference.html
+        A list of key-value pairs that act as configuration parameters for 
Kafka consumers. Most of these configurations will not be needed, but if you 
need to customize your Kafka consumer, you may use this. See a detailed list: 
https://docs.confluent.io/platform/current/installation/configuration/consumer-configs.html
       </td>
     </tr>
     <tr>
       <td>
-        from_snapshot
+        file_descriptor_path
       </td>
       <td>
-        <code style="color: #f54251">int64</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        Starts reading from this snapshot ID (inclusive).
+        The path to the Protocol Buffer File Descriptor Set file. This file is 
used for schema definition and message serialization.
       </td>
     </tr>
     <tr>
       <td>
-        from_timestamp
+        format
       </td>
       <td>
-        <code style="color: #f54251">int64</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        Starts reading from the first snapshot (inclusive) that was created 
after this timestamp (in milliseconds).
+        The encoding format for the data stored in Kafka. Valid options are: 
RAW,STRING,AVRO,JSON,PROTO
       </td>
     </tr>
     <tr>
       <td>
-        keep
+        message_name
       </td>
       <td>
-        <code>list[<span style="color: green;">str</span>]</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        A subset of column names to read exclusively. If null or empty, all 
columns will be read.
+        The name of the Protocol Buffer message to be used for schema 
extraction and data conversion.
       </td>
     </tr>
     <tr>
       <td>
-        poll_interval_seconds
+        offset_deduplication
       </td>
       <td>
-        <code style="color: #f54251">int32</code>
+        <code style="color: orange">boolean</code>
       </td>
       <td>
-        The interval at which to poll for new snapshots. Defaults to 60 
seconds.
+        If the redistribute is using offset deduplication mode.
       </td>
     </tr>
     <tr>
       <td>
-        starting_strategy
+        redistribute_by_record_key
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: orange">boolean</code>
       </td>
       <td>
-        The source's starting strategy. Valid options are: "earliest" or 
"latest". Can be overriden by setting a starting snapshot or timestamp. 
Defaults to earliest for batch, and latest for streaming.
+        If the redistribute keys by the Kafka record key.
       </td>
     </tr>
     <tr>
       <td>
-        streaming
+        redistribute_num_keys
       </td>
       <td>
-        <code style="color: orange">boolean</code>
+        <code style="color: #f54251">int32</code>
       </td>
       <td>
-        Enables streaming reads, where source continuously polls for snapshots 
forever.
+        The number of keys for redistributing Kafka inputs.
       </td>
     </tr>
     <tr>
       <td>
-        to_snapshot
+        redistributed
       </td>
       <td>
-        <code style="color: #f54251">int64</code>
+        <code style="color: orange">boolean</code>
       </td>
       <td>
-        Reads up to this snapshot ID (inclusive).
+        If the Kafka read should be redistributed.
       </td>
     </tr>
     <tr>
       <td>
-        to_timestamp
+        schema
       </td>
       <td>
-        <code style="color: #f54251">int64</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        Reads up to the latest snapshot (inclusive) created before this 
timestamp (in milliseconds).
+        The schema in which the data is encoded in the Kafka topic. For AVRO 
data, this is a schema defined with AVRO schema syntax 
(https://avro.apache.org/docs/1.10.2/spec.html#schemas). For JSON data, this is 
a schema defined with JSON-schema syntax (https://json-schema.org/). If a URL 
to Confluent Schema Registry is provided, then this field is ignored, and the 
schema is fetched from Confluent Schema Registry.
       </td>
     </tr>
   </table>
 </div>
 
-### `BIGQUERY` Read
+### `KAFKA` Write
 
 <div class="table-container-wrapper">
   <table class="table table-bordered">
@@ -924,63 +646,85 @@ For more information on table properties, please visit 
https://iceberg.apache.or
     </tr>
     <tr>
       <td>
-        kms_key
+        <strong>bootstrap_servers</strong>
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Use this Cloud KMS key to encrypt your data
+        A list of host/port pairs to use for establishing the initial 
connection to the Kafka cluster. The client will make use of all servers 
irrespective of which servers are specified here for bootstrapping—this list 
only impacts the initial hosts used to discover the full set of servers. | 
Format: host1:port1,host2:port2,...
       </td>
     </tr>
     <tr>
       <td>
-        query
+        <strong>format</strong>
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        The SQL query to be executed to read from the BigQuery table.
+        The encoding format for the data stored in Kafka. Valid options are: 
RAW,JSON,AVRO,PROTO
       </td>
     </tr>
     <tr>
       <td>
-        row_restriction
+        <strong>topic</strong>
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Read only rows that match this filter, which must be compatible with 
Google standard SQL. This is not supported when reading via query.
+        n/a
       </td>
     </tr>
     <tr>
       <td>
-        fields
+        file_descriptor_path
       </td>
       <td>
-        <code>list[<span style="color: green;">str</span>]</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        Read only the specified fields (columns) from a BigQuery table. Fields 
may not be returned in the order specified. If no value is specified, then all 
fields are returned. Example: "col1, col2, col3"
+        The path to the Protocol Buffer File Descriptor Set file. This file is 
used for schema definition and message serialization.
       </td>
     </tr>
     <tr>
       <td>
-        table
+        message_name
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        The fully-qualified name of the BigQuery table to read from. Format: 
[${PROJECT}:]${DATASET}.${TABLE}
+        The name of the Protocol Buffer message to be used for schema 
extraction and data conversion.
+      </td>
+    </tr>
+    <tr>
+      <td>
+        producer_config_updates
+      </td>
+      <td>
+        <code>map[<span style="color: green;">str</span>, <span style="color: 
green;">str</span>]</code>
+      </td>
+      <td>
+        A list of key-value pairs that act as configuration parameters for 
Kafka producers. Most of these configurations will not be needed, but if you 
need to customize your Kafka producer, you may use this. See a detailed list: 
https://docs.confluent.io/platform/current/installation/configuration/producer-configs.html
+      </td>
+    </tr>
+    <tr>
+      <td>
+        schema
+      </td>
+      <td>
+        <code style="color: green">str</code>
+      </td>
+      <td>
+        n/a
       </td>
     </tr>
   </table>
 </div>
 
-### `BIGQUERY` Write
+### `ICEBERG_CDC` Read
 
 <div class="table-container-wrapper">
   <table class="table table-bordered">
@@ -997,306 +741,306 @@ For more information on table properties, please visit 
https://iceberg.apache.or
         <code style="color: green">str</code>
       </td>
       <td>
-        The bigquery table to write to. Format: 
[${PROJECT}:]${DATASET}.${TABLE}
+        Identifier of the Iceberg table.
       </td>
     </tr>
     <tr>
       <td>
-        drop
+        catalog_name
       </td>
       <td>
-        <code>list[<span style="color: green;">str</span>]</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        A list of field names to drop from the input record before writing. Is 
mutually exclusive with 'keep' and 'only'.
+        Name of the catalog containing the table.
       </td>
     </tr>
     <tr>
       <td>
-        keep
+        catalog_properties
       </td>
       <td>
-        <code>list[<span style="color: green;">str</span>]</code>
+        <code>map[<span style="color: green;">str</span>, <span style="color: 
green;">str</span>]</code>
       </td>
       <td>
-        A list of field names to keep in the input record. All other fields 
are dropped before writing. Is mutually exclusive with 'drop' and 'only'.
+        Properties used to set up the Iceberg catalog.
       </td>
     </tr>
     <tr>
       <td>
-        kms_key
+        config_properties
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code>map[<span style="color: green;">str</span>, <span style="color: 
green;">str</span>]</code>
       </td>
       <td>
-        Use this Cloud KMS key to encrypt your data
+        Properties passed to the Hadoop Configuration.
       </td>
     </tr>
     <tr>
       <td>
-        only
+        drop
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code>list[<span style="color: green;">str</span>]</code>
       </td>
       <td>
-        The name of a single record field that should be written. Is mutually 
exclusive with 'keep' and 'drop'.
+        A subset of column names to exclude from reading. If null or empty, 
all columns will be read.
       </td>
     </tr>
     <tr>
       <td>
-        triggering_frequency_seconds
+        filter
       </td>
       <td>
-        <code style="color: #f54251">int64</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        Determines how often to 'commit' progress into BigQuery. Default is 
every 5 seconds.
+        SQL-like predicate to filter data at scan time. Example: "id > 5 AND 
status = 'ACTIVE'". Uses Apache Calcite syntax: 
https://calcite.apache.org/docs/reference.html
       </td>
     </tr>
-  </table>
-</div>
-
-### `POSTGRES` Write
-
-<div class="table-container-wrapper">
-  <table class="table table-bordered">
-    <tr>
-      <th>Configuration</th>
-      <th>Type</th>
-      <th>Description</th>
-    </tr>
     <tr>
       <td>
-        <strong>jdbc_url</strong>
+        from_snapshot
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: #f54251">int64</code>
       </td>
       <td>
-        Connection URL for the JDBC sink.
+        Starts reading from this snapshot ID (inclusive).
       </td>
     </tr>
     <tr>
       <td>
-        autosharding
+        from_timestamp
       </td>
       <td>
-        <code style="color: orange">boolean</code>
+        <code style="color: #f54251">int64</code>
       </td>
       <td>
-        If true, enables using a dynamically determined number of shards to 
write.
+        Starts reading from the first snapshot (inclusive) that was created 
after this timestamp (in milliseconds).
       </td>
     </tr>
     <tr>
       <td>
-        batch_size
+        keep
       </td>
       <td>
-        <code style="color: #f54251">int64</code>
+        <code>list[<span style="color: green;">str</span>]</code>
       </td>
       <td>
-        n/a
+        A subset of column names to read exclusively. If null or empty, all 
columns will be read.
       </td>
     </tr>
     <tr>
       <td>
-        connection_init_sql
+        poll_interval_seconds
       </td>
       <td>
-        <code>list[<span style="color: green;">str</span>]</code>
+        <code style="color: #f54251">int32</code>
       </td>
       <td>
-        Sets the connection init sql statements used by the Driver. Only MySQL 
and MariaDB support this.
+        The interval at which to poll for new snapshots. Defaults to 60 
seconds.
       </td>
     </tr>
     <tr>
       <td>
-        connection_properties
+        starting_strategy
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Used to set connection properties passed to the JDBC driver not 
already defined as standalone parameter (e.g. username and password can be set 
using parameters above accordingly). Format of the string must be 
"key1=value1;key2=value2;".
+        The source's starting strategy. Valid options are: "earliest" or 
"latest". Can be overriden by setting a starting snapshot or timestamp. 
Defaults to earliest for batch, and latest for streaming.
       </td>
     </tr>
     <tr>
       <td>
-        driver_class_name
+        streaming
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: orange">boolean</code>
       </td>
       <td>
-        Name of a Java Driver class to use to connect to the JDBC source. For 
example, "com.mysql.jdbc.Driver".
+        Enables streaming reads, where the source continuously polls for snapshots 
forever.
       </td>
     </tr>
     <tr>
       <td>
-        driver_jars
+        to_snapshot
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: #f54251">int64</code>
       </td>
       <td>
-        Comma separated path(s) for the JDBC driver jar(s). This can be a 
local path or GCS (gs://) path.
+        Reads up to this snapshot ID (inclusive).
       </td>
     </tr>
     <tr>
       <td>
-        jdbc_type
+        to_timestamp
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: #f54251">int64</code>
       </td>
       <td>
-        Type of JDBC source. When specified, an appropriate default Driver 
will be packaged with the transform. One of mysql, postgres, oracle, or mssql.
+        Reads up to the latest snapshot (inclusive) created before this 
timestamp (in milliseconds).
       </td>
     </tr>
+  </table>
+</div>
+
+### `SQLSERVER` Write
+
+<div class="table-container-wrapper">
+  <table class="table table-bordered">
+    <tr>
+      <th>Configuration</th>
+      <th>Type</th>
+      <th>Description</th>
+    </tr>
     <tr>
       <td>
-        location
+        <strong>jdbc_url</strong>
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Name of the table to write to.
+        Connection URL for the JDBC sink.
       </td>
     </tr>
     <tr>
       <td>
-        password
+        autosharding
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: orange">boolean</code>
       </td>
       <td>
-        Password for the JDBC source.
+        If true, enables using a dynamically determined number of shards to 
write.
       </td>
     </tr>
     <tr>
       <td>
-        username
+        batch_size
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: #f54251">int64</code>
       </td>
       <td>
-        Username for the JDBC source.
+        n/a
       </td>
     </tr>
     <tr>
       <td>
-        write_statement
+        connection_properties
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        SQL query used to insert records into the JDBC sink.
+        Used to set connection properties passed to the JDBC driver not 
already defined as standalone parameter (e.g. username and password can be set 
using parameters above accordingly). Format of the string must be 
"key1=value1;key2=value2;".
       </td>
     </tr>
-  </table>
-</div>
-
-### `POSTGRES` Read
-
-<div class="table-container-wrapper">
-  <table class="table table-bordered">
-    <tr>
-      <th>Configuration</th>
-      <th>Type</th>
-      <th>Description</th>
-    </tr>
     <tr>
       <td>
-        <strong>jdbc_url</strong>
+        location
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Connection URL for the JDBC source.
+        Name of the table to write to.
       </td>
     </tr>
     <tr>
       <td>
-        connection_init_sql
+        password
       </td>
       <td>
-        <code>list[<span style="color: green;">str</span>]</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        Sets the connection init sql statements used by the Driver. Only MySQL 
and MariaDB support this.
+        Password for the JDBC source.
       </td>
     </tr>
     <tr>
       <td>
-        connection_properties
+        username
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Used to set connection properties passed to the JDBC driver not 
already defined as standalone parameter (e.g. username and password can be set 
using parameters above accordingly). Format of the string must be 
"key1=value1;key2=value2;".
+        Username for the JDBC source.
       </td>
     </tr>
     <tr>
       <td>
-        disable_auto_commit
+        write_statement
       </td>
       <td>
-        <code style="color: orange">boolean</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        Whether to disable auto commit on read. Defaults to true if not 
provided. The need for this config varies depending on the database platform. 
Informix requires this to be set to false while Postgres requires this to be 
set to true.
+        SQL query used to insert records into the JDBC sink.
       </td>
     </tr>
+  </table>
+</div>
+
+### `SQLSERVER` Read
+
+<div class="table-container-wrapper">
+  <table class="table table-bordered">
+    <tr>
+      <th>Configuration</th>
+      <th>Type</th>
+      <th>Description</th>
+    </tr>
     <tr>
       <td>
-        driver_class_name
+        <strong>jdbc_url</strong>
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Name of a Java Driver class to use to connect to the JDBC source. For 
example, "com.mysql.jdbc.Driver".
+        Connection URL for the JDBC source.
       </td>
     </tr>
     <tr>
       <td>
-        driver_jars
+        connection_properties
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Comma separated path(s) for the JDBC driver jar(s). This can be a 
local path or GCS (gs://) path.
+        Used to set connection properties passed to the JDBC driver not 
already defined as standalone parameter (e.g. username and password can be set 
using parameters above accordingly). Format of the string must be 
"key1=value1;key2=value2;".
       </td>
     </tr>
     <tr>
       <td>
-        fetch_size
+        disable_auto_commit
       </td>
       <td>
-        <code style="color: #f54251">int32</code>
+        <code style="color: orange">boolean</code>
       </td>
       <td>
-        This method is used to override the size of the data that is going to 
be fetched and loaded in memory per every database call. It should ONLY be used 
if the default value throws memory errors.
+        Whether to disable auto commit on read. Defaults to true if not 
provided. The need for this config varies depending on the database platform. 
Informix requires this to be set to false while Postgres requires this to be 
set to true.
       </td>
     </tr>
     <tr>
       <td>
-        jdbc_type
+        fetch_size
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: #f54251">int32</code>
       </td>
       <td>
-        Type of JDBC source. When specified, an appropriate default Driver 
will be packaged with the transform. One of mysql, postgres, oracle, or mssql.
+        This method is used to override the size of the data that is going to 
be fetched and loaded in memory per every database call. It should ONLY be used 
if the default value throws memory errors.
       </td>
     </tr>
     <tr>
@@ -1379,7 +1123,7 @@ For more information on table properties, please visit 
https://iceberg.apache.or
   </table>
 </div>
 
-### `SQLSERVER` Read
+### `MYSQL` Read
 
 <div class="table-container-wrapper">
   <table class="table table-bordered">
@@ -1432,28 +1176,6 @@ For more information on table properties, please visit 
https://iceberg.apache.or
         Whether to disable auto commit on read. Defaults to true if not 
provided. The need for this config varies depending on the database platform. 
Informix requires this to be set to false while Postgres requires this to be 
set to true.
       </td>
     </tr>
-    <tr>
-      <td>
-        driver_class_name
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        Name of a Java Driver class to use to connect to the JDBC source. For 
example, "com.mysql.jdbc.Driver".
-      </td>
-    </tr>
-    <tr>
-      <td>
-        driver_jars
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        Comma separated path(s) for the JDBC driver jar(s). This can be a 
local path or GCS (gs://) path.
-      </td>
-    </tr>
     <tr>
       <td>
         fetch_size
@@ -1465,17 +1187,6 @@ For more information on table properties, please visit 
https://iceberg.apache.or
         This method is used to override the size of the data that is going to 
be fetched and loaded in memory per every database call. It should ONLY be used 
if the default value throws memory errors.
       </td>
     </tr>
-    <tr>
-      <td>
-        jdbc_type
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        Type of JDBC source. When specified, an appropriate default Driver 
will be packaged with the transform. One of mysql, postgres, oracle, or mssql.
-      </td>
-    </tr>
     <tr>
       <td>
         location
@@ -1556,7 +1267,7 @@ For more information on table properties, please visit 
https://iceberg.apache.or
   </table>
 </div>
 
-### `SQLSERVER` Write
+### `MYSQL` Write
 
 <div class="table-container-wrapper">
   <table class="table table-bordered">
@@ -1622,223 +1333,258 @@ For more information on table properties, please 
visit https://iceberg.apache.or
     </tr>
     <tr>
       <td>
-        driver_class_name
+        location
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Name of a Java Driver class to use to connect to the JDBC source. For 
example, "com.mysql.jdbc.Driver".
+        Name of the table to write to.
       </td>
     </tr>
     <tr>
       <td>
-        driver_jars
+        password
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Comma separated path(s) for the JDBC driver jar(s). This can be a 
local path or GCS (gs://) path.
+        Password for the JDBC source.
       </td>
     </tr>
     <tr>
       <td>
-        jdbc_type
+        username
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Type of JDBC source. When specified, an appropriate default Driver 
will be packaged with the transform. One of mysql, postgres, oracle, or mssql.
+        Username for the JDBC source.
       </td>
     </tr>
     <tr>
       <td>
-        location
+        write_statement
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Name of the table to write to.
+        SQL query used to insert records into the JDBC sink.
       </td>
     </tr>
+  </table>
+</div>
+
+### `BIGQUERY` Write
+
+<div class="table-container-wrapper">
+  <table class="table table-bordered">
+    <tr>
+      <th>Configuration</th>
+      <th>Type</th>
+      <th>Description</th>
+    </tr>
     <tr>
       <td>
-        password
+        <strong>table</strong>
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Password for the JDBC source.
+        The BigQuery table to write to. Format: 
[${PROJECT}:]${DATASET}.${TABLE}
       </td>
     </tr>
     <tr>
       <td>
-        username
+        drop
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code>list[<span style="color: green;">str</span>]</code>
       </td>
       <td>
-        Username for the JDBC source.
+        A list of field names to drop from the input record before writing. Is 
mutually exclusive with 'keep' and 'only'.
       </td>
     </tr>
     <tr>
       <td>
-        write_statement
+        keep
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code>list[<span style="color: green;">str</span>]</code>
       </td>
       <td>
-        SQL query used to insert records into the JDBC sink.
+        A list of field names to keep in the input record. All other fields 
are dropped before writing. Is mutually exclusive with 'drop' and 'only'.
       </td>
     </tr>
-  </table>
-</div>
-
-### `MYSQL` Read
-
-<div class="table-container-wrapper">
-  <table class="table table-bordered">
     <tr>
-      <th>Configuration</th>
-      <th>Type</th>
-      <th>Description</th>
+      <td>
+        kms_key
+      </td>
+      <td>
+        <code style="color: green">str</code>
+      </td>
+      <td>
+        Use this Cloud KMS key to encrypt your data
+      </td>
     </tr>
     <tr>
       <td>
-        <strong>jdbc_url</strong>
+        only
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Connection URL for the JDBC source.
+        The name of a single record field that should be written. Is mutually 
exclusive with 'keep' and 'drop'.
       </td>
     </tr>
     <tr>
       <td>
-        connection_init_sql
+        triggering_frequency_seconds
       </td>
       <td>
-        <code>list[<span style="color: green;">str</span>]</code>
+        <code style="color: #f54251">int64</code>
       </td>
       <td>
-        Sets the connection init sql statements used by the Driver. Only MySQL 
and MariaDB support this.
+        Determines how often to 'commit' progress into BigQuery. Default is 
every 5 seconds.
       </td>
     </tr>
+  </table>
+</div>
+
+### `BIGQUERY` Read
+
+<div class="table-container-wrapper">
+  <table class="table table-bordered">
+    <tr>
+      <th>Configuration</th>
+      <th>Type</th>
+      <th>Description</th>
+    </tr>
     <tr>
       <td>
-        connection_properties
+        kms_key
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Used to set connection properties passed to the JDBC driver not 
already defined as standalone parameter (e.g. username and password can be set 
using parameters above accordingly). Format of the string must be 
"key1=value1;key2=value2;".
+        Use this Cloud KMS key to encrypt your data
       </td>
     </tr>
     <tr>
       <td>
-        disable_auto_commit
+        query
       </td>
       <td>
-        <code style="color: orange">boolean</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        Whether to disable auto commit on read. Defaults to true if not 
provided. The need for this config varies depending on the database platform. 
Informix requires this to be set to false while Postgres requires this to be 
set to true.
+        The SQL query to be executed to read from the BigQuery table.
       </td>
     </tr>
     <tr>
       <td>
-        driver_class_name
+        row_restriction
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Name of a Java Driver class to use to connect to the JDBC source. For 
example, "com.mysql.jdbc.Driver".
+        Read only rows that match this filter, which must be compatible with 
Google standard SQL. This is not supported when reading via query.
       </td>
     </tr>
     <tr>
       <td>
-        driver_jars
+        fields
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code>list[<span style="color: green;">str</span>]</code>
       </td>
       <td>
-        Comma separated path(s) for the JDBC driver jar(s). This can be a 
local path or GCS (gs://) path.
+        Read only the specified fields (columns) from a BigQuery table. Fields 
may not be returned in the order specified. If no value is specified, then all 
fields are returned. Example: "col1, col2, col3"
       </td>
     </tr>
     <tr>
       <td>
-        fetch_size
+        table
       </td>
       <td>
-        <code style="color: #f54251">int32</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        This method is used to override the size of the data that is going to 
be fetched and loaded in memory per every database call. It should ONLY be used 
if the default value throws memory errors.
+        The fully-qualified name of the BigQuery table to read from. Format: 
[${PROJECT}:]${DATASET}.${TABLE}
       </td>
     </tr>
+  </table>
+</div>
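+
+On the read side, the following Python sketch reads selected columns from a BigQuery table with a row filter. It again assumes the `managed.BIGQUERY` identifier; the project, dataset, column names, and filter are placeholders.
+
+```python
+import apache_beam as beam
+from apache_beam.transforms import managed
+
+# Minimal sketch (placeholder names): read two columns from a BigQuery table,
+# keeping only rows that satisfy the row_restriction filter.
+with beam.Pipeline() as pipeline:
+    rows = pipeline | managed.Read(
+        managed.BIGQUERY,  # assumed to exist in this SDK version
+        config={
+            "table": "my_project:my_dataset.my_table",
+            "row_restriction": "value > 0",
+            "fields": ["name", "value"],
+        })
+    rows | beam.Map(print)
+```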
+
+### `POSTGRES` Write
+
+<div class="table-container-wrapper">
+  <table class="table table-bordered">
+    <tr>
+      <th>Configuration</th>
+      <th>Type</th>
+      <th>Description</th>
+    </tr>
     <tr>
       <td>
-        jdbc_type
+        <strong>jdbc_url</strong>
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Type of JDBC source. When specified, an appropriate default Driver 
will be packaged with the transform. One of mysql, postgres, oracle, or mssql.
+        Connection URL for the JDBC sink.
       </td>
     </tr>
     <tr>
       <td>
-        location
+        autosharding
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: orange">boolean</code>
       </td>
       <td>
-        Name of the table to read from.
+        If true, writes using a dynamically determined number of shards.
       </td>
     </tr>
     <tr>
       <td>
-        num_partitions
+        batch_size
       </td>
       <td>
-        <code style="color: #f54251">int32</code>
+        <code style="color: #f54251">int64</code>
       </td>
       <td>
-        The number of partitions
+        n/a
       </td>
     </tr>
     <tr>
       <td>
-        output_parallelization
+        connection_properties
       </td>
       <td>
-        <code style="color: orange">boolean</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        Whether to reshuffle the resulting PCollection so results are 
distributed to all workers.
+        Sets connection properties passed to the JDBC driver that are not already covered by a standalone parameter (for example, username and password have dedicated parameters). The string must be formatted as "key1=value1;key2=value2;".
       </td>
     </tr>
     <tr>
       <td>
-        partition_column
+        location
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Name of a column of numeric type that will be used for partitioning.
+        Name of the table to write to.
       </td>
     </tr>
     <tr>
@@ -1854,30 +1600,30 @@ For more information on table properties, please visit 
https://iceberg.apache.or
     </tr>
     <tr>
       <td>
-        read_query
+        username
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        SQL query used to query the JDBC source.
+        Username for the JDBC source.
       </td>
     </tr>
     <tr>
       <td>
-        username
+        write_statement
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Username for the JDBC source.
+        SQL query used to insert records into the JDBC sink.
       </td>
     </tr>
   </table>
 </div>
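+
+Below is a hedged Python sketch of a Postgres write. It assumes your SDK's Managed API exposes a `POSTGRES` identifier (older releases may not include it); the connection details are placeholders, and in practice you would also supply credentials through the corresponding config fields.
+
+```python
+import apache_beam as beam
+from apache_beam.transforms import managed
+
+# Minimal sketch (placeholder connection details): write Beam Rows into a
+# Postgres table through the Managed API.
+with beam.Pipeline() as pipeline:
+    rows = pipeline | beam.Create([
+        beam.Row(id=1, name="a"),
+        beam.Row(id=2, name="b"),
+    ])
+    rows | managed.Write(
+        managed.POSTGRES,  # assumed identifier; check the managed module in your SDK version
+        config={
+            "jdbc_url": "jdbc:postgresql://localhost:5432/my_database",
+            "location": "my_table",
+            "username": "my_user",
+        })
+```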
 
-### `MYSQL` Write
+### `POSTGRES` Read
 
 <div class="table-container-wrapper">
   <table class="table table-bordered">
@@ -1894,95 +1640,73 @@ For more information on table properties, please visit 
https://iceberg.apache.or
         <code style="color: green">str</code>
       </td>
       <td>
-        Connection URL for the JDBC sink.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        autosharding
-      </td>
-      <td>
-        <code style="color: orange">boolean</code>
-      </td>
-      <td>
-        If true, enables using a dynamically determined number of shards to 
write.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        batch_size
-      </td>
-      <td>
-        <code style="color: #f54251">int64</code>
-      </td>
-      <td>
-        n/a
+        Connection URL for the JDBC source.
       </td>
     </tr>
     <tr>
       <td>
-        connection_init_sql
+        connection_properties
       </td>
       <td>
-        <code>list[<span style="color: green;">str</span>]</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        Sets the connection init sql statements used by the Driver. Only MySQL 
and MariaDB support this.
+        Sets connection properties passed to the JDBC driver that are not already covered by a standalone parameter (for example, username and password have dedicated parameters). The string must be formatted as "key1=value1;key2=value2;".
       </td>
     </tr>
     <tr>
       <td>
-        connection_properties
+        fetch_size
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: #f54251">int32</code>
       </td>
       <td>
-        Used to set connection properties passed to the JDBC driver not 
already defined as standalone parameter (e.g. username and password can be set 
using parameters above accordingly). Format of the string must be 
"key1=value1;key2=value2;".
+        Overrides the amount of data fetched and loaded into memory per database call. Only set this if the default value causes memory errors.
       </td>
     </tr>
     <tr>
       <td>
-        driver_class_name
+        location
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Name of a Java Driver class to use to connect to the JDBC source. For 
example, "com.mysql.jdbc.Driver".
+        Name of the table to read from.
       </td>
     </tr>
     <tr>
       <td>
-        driver_jars
+        num_partitions
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: #f54251">int32</code>
       </td>
       <td>
-        Comma separated path(s) for the JDBC driver jar(s). This can be a 
local path or GCS (gs://) path.
+        The number of partitions to use for parallel reads.
       </td>
     </tr>
     <tr>
       <td>
-        jdbc_type
+        output_parallelization
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: orange">boolean</code>
       </td>
       <td>
-        Type of JDBC source. When specified, an appropriate default Driver 
will be packaged with the transform. One of mysql, postgres, oracle, or mssql.
+        Whether to reshuffle the resulting PCollection so results are 
distributed to all workers.
       </td>
     </tr>
     <tr>
       <td>
-        location
+        partition_column
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Name of the table to write to.
+        Name of a column of numeric type that will be used for partitioning.
       </td>
     </tr>
     <tr>
@@ -1998,24 +1722,24 @@ For more information on table properties, please visit 
https://iceberg.apache.or
     </tr>
     <tr>
       <td>
-        username
+        read_query
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Username for the JDBC source.
+        SQL query used to read from the JDBC source.
       </td>
     </tr>
     <tr>
       <td>
-        write_statement
+        username
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        SQL query used to insert records into the JDBC sink.
+        Username for the JDBC source.
       </td>
     </tr>
   </table>
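+
+Finally, a sketch of a partitioned Postgres read under the same assumptions (a `POSTGRES` identifier in the `managed` module; placeholder connection details):
+
+```python
+import apache_beam as beam
+from apache_beam.transforms import managed
+
+# Minimal sketch (placeholder connection details): read a Postgres table in
+# parallel, partitioned on a numeric column.
+with beam.Pipeline() as pipeline:
+    rows = pipeline | managed.Read(
+        managed.POSTGRES,  # assumed identifier; check the managed module in your SDK version
+        config={
+            "jdbc_url": "jdbc:postgresql://localhost:5432/my_database",
+            "location": "my_table",
+            "username": "my_user",
+            "partition_column": "id",
+            "num_partitions": 4,
+        })
+    rows | beam.Map(print)
+```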
