This is an automated email from the ASF dual-hosted git repository.

damccorm pushed a commit to branch updates_managed_io_docs_2.71.0_rc2
in repository https://gitbox.apache.org/repos/asf/beam.git

commit f381a58bc0e9affec2a9a7fdbdb5b5c4f56d9902
Author: damccorm <actions@GitHub Actions 1010597242.local>
AuthorDate: Wed Jan 14 21:01:48 2026 +0000

    Update managed-io.md for release 2.71.0-RC2.
---
 .../site/content/en/documentation/io/managed-io.md | 958 ++++++++-------------
 1 file changed, 341 insertions(+), 617 deletions(-)

diff --git a/website/www/site/content/en/documentation/io/managed-io.md 
b/website/www/site/content/en/documentation/io/managed-io.md
index ced0443c695..f93d27fc44c 100644
--- a/website/www/site/content/en/documentation/io/managed-io.md
+++ b/website/www/site/content/en/documentation/io/managed-io.md
@@ -59,31 +59,25 @@ and Beam SQL is invoked via the Managed API under the hood.
       <th>Write Configuration</th>
     </tr>
     <tr>
-      <td><strong>KAFKA</strong></td>
+      <td><strong>ICEBERG_CDC</strong></td>
       <td>
-        <strong>bootstrap_servers</strong> (<code style="color: 
green">str</code>)<br>
-        <strong>topic</strong> (<code style="color: green">str</code>)<br>
-        allow_duplicates (<code style="color: orange">boolean</code>)<br>
-        confluent_schema_registry_subject (<code style="color: 
green">str</code>)<br>
-        confluent_schema_registry_url (<code style="color: 
green">str</code>)<br>
-        consumer_config_updates (<code>map[<span style="color: 
green;">str</span>, <span style="color: green;">str</span>]</code>)<br>
-        file_descriptor_path (<code style="color: green">str</code>)<br>
-        format (<code style="color: green">str</code>)<br>
-        message_name (<code style="color: green">str</code>)<br>
-        offset_deduplication (<code style="color: orange">boolean</code>)<br>
-        redistribute_by_record_key (<code style="color: 
orange">boolean</code>)<br>
-        redistribute_num_keys (<code style="color: #f54251">int32</code>)<br>
-        redistributed (<code style="color: orange">boolean</code>)<br>
-        schema (<code style="color: green">str</code>)<br>
+        <strong>table</strong> (<code style="color: green">str</code>)<br>
+        catalog_name (<code style="color: green">str</code>)<br>
+        catalog_properties (<code>map[<span style="color: green;">str</span>, 
<span style="color: green;">str</span>]</code>)<br>
+        config_properties (<code>map[<span style="color: green;">str</span>, 
<span style="color: green;">str</span>]</code>)<br>
+        drop (<code>list[<span style="color: green;">str</span>]</code>)<br>
+        filter (<code style="color: green">str</code>)<br>
+        from_snapshot (<code style="color: #f54251">int64</code>)<br>
+        from_timestamp (<code style="color: #f54251">int64</code>)<br>
+        keep (<code>list[<span style="color: green;">str</span>]</code>)<br>
+        poll_interval_seconds (<code style="color: #f54251">int32</code>)<br>
+        starting_strategy (<code style="color: green">str</code>)<br>
+        streaming (<code style="color: orange">boolean</code>)<br>
+        to_snapshot (<code style="color: #f54251">int64</code>)<br>
+        to_timestamp (<code style="color: #f54251">int64</code>)<br>
       </td>
       <td>
-        <strong>bootstrap_servers</strong> (<code style="color: 
green">str</code>)<br>
-        <strong>format</strong> (<code style="color: green">str</code>)<br>
-        <strong>topic</strong> (<code style="color: green">str</code>)<br>
-        file_descriptor_path (<code style="color: green">str</code>)<br>
-        message_name (<code style="color: green">str</code>)<br>
-        producer_config_updates (<code>map[<span style="color: 
green;">str</span>, <span style="color: green;">str</span>]</code>)<br>
-        schema (<code style="color: green">str</code>)<br>
+        Unavailable
       </td>
     </tr>
     <tr>
@@ -112,56 +106,39 @@ and Beam SQL is invoked via the Managed API under the 
hood.
       </td>
     </tr>
     <tr>
-      <td><strong>ICEBERG_CDC</strong></td>
-      <td>
-        <strong>table</strong> (<code style="color: green">str</code>)<br>
-        catalog_name (<code style="color: green">str</code>)<br>
-        catalog_properties (<code>map[<span style="color: green;">str</span>, 
<span style="color: green;">str</span>]</code>)<br>
-        config_properties (<code>map[<span style="color: green;">str</span>, 
<span style="color: green;">str</span>]</code>)<br>
-        drop (<code>list[<span style="color: green;">str</span>]</code>)<br>
-        filter (<code style="color: green">str</code>)<br>
-        from_snapshot (<code style="color: #f54251">int64</code>)<br>
-        from_timestamp (<code style="color: #f54251">int64</code>)<br>
-        keep (<code>list[<span style="color: green;">str</span>]</code>)<br>
-        poll_interval_seconds (<code style="color: #f54251">int32</code>)<br>
-        starting_strategy (<code style="color: green">str</code>)<br>
-        streaming (<code style="color: orange">boolean</code>)<br>
-        to_snapshot (<code style="color: #f54251">int64</code>)<br>
-        to_timestamp (<code style="color: #f54251">int64</code>)<br>
-      </td>
-      <td>
-        Unavailable
-      </td>
-    </tr>
-    <tr>
-      <td><strong>BIGQUERY</strong></td>
+      <td><strong>KAFKA</strong></td>
       <td>
-        kms_key (<code style="color: green">str</code>)<br>
-        query (<code style="color: green">str</code>)<br>
-        row_restriction (<code style="color: green">str</code>)<br>
-        fields (<code>list[<span style="color: green;">str</span>]</code>)<br>
-        table (<code style="color: green">str</code>)<br>
+        <strong>bootstrap_servers</strong> (<code style="color: 
green">str</code>)<br>
+        <strong>topic</strong> (<code style="color: green">str</code>)<br>
+        allow_duplicates (<code style="color: orange">boolean</code>)<br>
+        confluent_schema_registry_subject (<code style="color: 
green">str</code>)<br>
+        confluent_schema_registry_url (<code style="color: 
green">str</code>)<br>
+        consumer_config_updates (<code>map[<span style="color: 
green;">str</span>, <span style="color: green;">str</span>]</code>)<br>
+        file_descriptor_path (<code style="color: green">str</code>)<br>
+        format (<code style="color: green">str</code>)<br>
+        message_name (<code style="color: green">str</code>)<br>
+        offset_deduplication (<code style="color: orange">boolean</code>)<br>
+        redistribute_by_record_key (<code style="color: 
orange">boolean</code>)<br>
+        redistribute_num_keys (<code style="color: #f54251">int32</code>)<br>
+        redistributed (<code style="color: orange">boolean</code>)<br>
+        schema (<code style="color: green">str</code>)<br>
       </td>
       <td>
-        <strong>table</strong> (<code style="color: green">str</code>)<br>
-        drop (<code>list[<span style="color: green;">str</span>]</code>)<br>
-        keep (<code>list[<span style="color: green;">str</span>]</code>)<br>
-        kms_key (<code style="color: green">str</code>)<br>
-        only (<code style="color: green">str</code>)<br>
-        triggering_frequency_seconds (<code style="color: 
#f54251">int64</code>)<br>
+        <strong>bootstrap_servers</strong> (<code style="color: 
green">str</code>)<br>
+        <strong>format</strong> (<code style="color: green">str</code>)<br>
+        <strong>topic</strong> (<code style="color: green">str</code>)<br>
+        file_descriptor_path (<code style="color: green">str</code>)<br>
+        message_name (<code style="color: green">str</code>)<br>
+        producer_config_updates (<code>map[<span style="color: 
green;">str</span>, <span style="color: green;">str</span>]</code>)<br>
+        schema (<code style="color: green">str</code>)<br>
       </td>
     </tr>
     <tr>
       <td><strong>POSTGRES</strong></td>
       <td>
         <strong>jdbc_url</strong> (<code style="color: green">str</code>)<br>
-        connection_init_sql (<code>list[<span style="color: 
green;">str</span>]</code>)<br>
         connection_properties (<code style="color: green">str</code>)<br>
-        disable_auto_commit (<code style="color: orange">boolean</code>)<br>
-        driver_class_name (<code style="color: green">str</code>)<br>
-        driver_jars (<code style="color: green">str</code>)<br>
         fetch_size (<code style="color: #f54251">int32</code>)<br>
-        jdbc_type (<code style="color: green">str</code>)<br>
         location (<code style="color: green">str</code>)<br>
         num_partitions (<code style="color: #f54251">int32</code>)<br>
         output_parallelization (<code style="color: orange">boolean</code>)<br>
@@ -174,11 +151,7 @@ and Beam SQL is invoked via the Managed API under the hood.
         <strong>jdbc_url</strong> (<code style="color: green">str</code>)<br>
         autosharding (<code style="color: orange">boolean</code>)<br>
         batch_size (<code style="color: #f54251">int64</code>)<br>
-        connection_init_sql (<code>list[<span style="color: 
green;">str</span>]</code>)<br>
         connection_properties (<code style="color: green">str</code>)<br>
-        driver_class_name (<code style="color: green">str</code>)<br>
-        driver_jars (<code style="color: green">str</code>)<br>
-        jdbc_type (<code style="color: green">str</code>)<br>
         location (<code style="color: green">str</code>)<br>
         password (<code style="color: green">str</code>)<br>
         username (<code style="color: green">str</code>)<br>
@@ -189,13 +162,9 @@ and Beam SQL is invoked via the Managed API under the hood.
       <td><strong>SQLSERVER</strong></td>
       <td>
         <strong>jdbc_url</strong> (<code style="color: green">str</code>)<br>
-        connection_init_sql (<code>list[<span style="color: 
green;">str</span>]</code>)<br>
         connection_properties (<code style="color: green">str</code>)<br>
         disable_auto_commit (<code style="color: orange">boolean</code>)<br>
-        driver_class_name (<code style="color: green">str</code>)<br>
-        driver_jars (<code style="color: green">str</code>)<br>
         fetch_size (<code style="color: #f54251">int32</code>)<br>
-        jdbc_type (<code style="color: green">str</code>)<br>
         location (<code style="color: green">str</code>)<br>
         num_partitions (<code style="color: #f54251">int32</code>)<br>
         output_parallelization (<code style="color: orange">boolean</code>)<br>
@@ -208,11 +177,7 @@ and Beam SQL is invoked via the Managed API under the hood.
         <strong>jdbc_url</strong> (<code style="color: green">str</code>)<br>
         autosharding (<code style="color: orange">boolean</code>)<br>
         batch_size (<code style="color: #f54251">int64</code>)<br>
-        connection_init_sql (<code>list[<span style="color: 
green;">str</span>]</code>)<br>
         connection_properties (<code style="color: green">str</code>)<br>
-        driver_class_name (<code style="color: green">str</code>)<br>
-        driver_jars (<code style="color: green">str</code>)<br>
-        jdbc_type (<code style="color: green">str</code>)<br>
         location (<code style="color: green">str</code>)<br>
         password (<code style="color: green">str</code>)<br>
         username (<code style="color: green">str</code>)<br>
@@ -226,10 +191,7 @@ and Beam SQL is invoked via the Managed API under the hood.
         connection_init_sql (<code>list[<span style="color: 
green;">str</span>]</code>)<br>
         connection_properties (<code style="color: green">str</code>)<br>
         disable_auto_commit (<code style="color: orange">boolean</code>)<br>
-        driver_class_name (<code style="color: green">str</code>)<br>
-        driver_jars (<code style="color: green">str</code>)<br>
         fetch_size (<code style="color: #f54251">int32</code>)<br>
-        jdbc_type (<code style="color: green">str</code>)<br>
         location (<code style="color: green">str</code>)<br>
         num_partitions (<code style="color: #f54251">int32</code>)<br>
         output_parallelization (<code style="color: orange">boolean</code>)<br>
@@ -244,110 +206,36 @@ and Beam SQL is invoked via the Managed API under the 
hood.
         batch_size (<code style="color: #f54251">int64</code>)<br>
         connection_init_sql (<code>list[<span style="color: 
green;">str</span>]</code>)<br>
         connection_properties (<code style="color: green">str</code>)<br>
-        driver_class_name (<code style="color: green">str</code>)<br>
-        driver_jars (<code style="color: green">str</code>)<br>
-        jdbc_type (<code style="color: green">str</code>)<br>
         location (<code style="color: green">str</code>)<br>
         password (<code style="color: green">str</code>)<br>
         username (<code style="color: green">str</code>)<br>
         write_statement (<code style="color: green">str</code>)<br>
       </td>
     </tr>
-  </table>
-</div>
-
-## Configuration Details
-
-### `KAFKA` Write
-
-<div class="table-container-wrapper">
-  <table class="table table-bordered">
-    <tr>
-      <th>Configuration</th>
-      <th>Type</th>
-      <th>Description</th>
-    </tr>
-    <tr>
-      <td>
-        <strong>bootstrap_servers</strong>
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        A list of host/port pairs to use for establishing the initial 
connection to the Kafka cluster. The client will make use of all servers 
irrespective of which servers are specified here for bootstrapping—this list 
only impacts the initial hosts used to discover the full set of servers. | 
Format: host1:port1,host2:port2,...
-      </td>
-    </tr>
-    <tr>
-      <td>
-        <strong>format</strong>
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        The encoding format for the data stored in Kafka. Valid options are: 
RAW,JSON,AVRO,PROTO
-      </td>
-    </tr>
-    <tr>
-      <td>
-        <strong>topic</strong>
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        n/a
-      </td>
-    </tr>
-    <tr>
-      <td>
-        file_descriptor_path
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        The path to the Protocol Buffer File Descriptor Set file. This file is 
used for schema definition and message serialization.
-      </td>
-    </tr>
     <tr>
+      <td><strong>BIGQUERY</strong></td>
       <td>
-        message_name
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        The name of the Protocol Buffer message to be used for schema 
extraction and data conversion.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        producer_config_updates
-      </td>
-      <td>
-        <code>map[<span style="color: green;">str</span>, <span style="color: 
green;">str</span>]</code>
-      </td>
-      <td>
-        A list of key-value pairs that act as configuration parameters for 
Kafka producers. Most of these configurations will not be needed, but if you 
need to customize your Kafka producer, you may use this. See a detailed list: 
https://docs.confluent.io/platform/current/installation/configuration/producer-configs.html
-      </td>
-    </tr>
-    <tr>
-      <td>
-        schema
-      </td>
-      <td>
-        <code style="color: green">str</code>
+        kms_key (<code style="color: green">str</code>)<br>
+        query (<code style="color: green">str</code>)<br>
+        row_restriction (<code style="color: green">str</code>)<br>
+        fields (<code>list[<span style="color: green;">str</span>]</code>)<br>
+        table (<code style="color: green">str</code>)<br>
       </td>
       <td>
-        n/a
+        <strong>table</strong> (<code style="color: green">str</code>)<br>
+        drop (<code>list[<span style="color: green;">str</span>]</code>)<br>
+        keep (<code>list[<span style="color: green;">str</span>]</code>)<br>
+        kms_key (<code style="color: green">str</code>)<br>
+        only (<code style="color: green">str</code>)<br>
+        triggering_frequency_seconds (<code style="color: 
#f54251">int64</code>)<br>
       </td>
     </tr>
   </table>
 </div>
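+
+As an illustration of how these configuration entries map onto the Managed
+API, here is a minimal Python sketch of an Iceberg write (the table and
+catalog names are placeholders):
+
+```python
+import apache_beam as beam
+from apache_beam.transforms import managed
+
+with beam.Pipeline() as p:
+    rows = p | beam.Create([beam.Row(id=1, name="a")])
+    # Each key in `config` corresponds to a row in the tables above.
+    _ = rows | managed.Write(
+        managed.ICEBERG,
+        config={
+            "table": "db.users",             # placeholder identifier
+            "catalog_name": "demo_catalog",  # placeholder catalog
+        })
+```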
 
-### `KAFKA` Read
+## Configuration Details
+
+### `ICEBERG_CDC` Read
 
 <div class="table-container-wrapper">
   <table class="table table-bordered">
@@ -358,245 +246,156 @@ and Beam SQL is invoked via the Managed API under the 
hood.
     </tr>
     <tr>
       <td>
-        <strong>bootstrap_servers</strong>
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        A list of host/port pairs to use for establishing the initial 
connection to the Kafka cluster. The client will make use of all servers 
irrespective of which servers are specified here for bootstrapping—this list 
only impacts the initial hosts used to discover the full set of servers. This 
list should be in the form `host1:port1,host2:port2,...`
-      </td>
-    </tr>
-    <tr>
-      <td>
-        <strong>topic</strong>
+        <strong>table</strong>
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        n/a
-      </td>
-    </tr>
-    <tr>
-      <td>
-        allow_duplicates
-      </td>
-      <td>
-        <code style="color: orange">boolean</code>
-      </td>
-      <td>
-        If the Kafka read allows duplicates.
+        Identifier of the Iceberg table.
       </td>
     </tr>
     <tr>
       <td>
-        confluent_schema_registry_subject
+        catalog_name
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        n/a
+        Name of the catalog containing the table.
       </td>
     </tr>
     <tr>
       <td>
-        confluent_schema_registry_url
+        catalog_properties
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code>map[<span style="color: green;">str</span>, <span style="color: 
green;">str</span>]</code>
       </td>
       <td>
-        n/a
+        Properties used to set up the Iceberg catalog.
       </td>
     </tr>
     <tr>
       <td>
-        consumer_config_updates
+        config_properties
       </td>
       <td>
         <code>map[<span style="color: green;">str</span>, <span style="color: 
green;">str</span>]</code>
       </td>
       <td>
-        A list of key-value pairs that act as configuration parameters for 
Kafka consumers. Most of these configurations will not be needed, but if you 
need to customize your Kafka consumer, you may use this. See a detailed list: 
https://docs.confluent.io/platform/current/installation/configuration/consumer-configs.html
+        Properties passed to the Hadoop Configuration.
       </td>
     </tr>
     <tr>
       <td>
-        file_descriptor_path
+        drop
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code>list[<span style="color: green;">str</span>]</code>
       </td>
       <td>
-        The path to the Protocol Buffer File Descriptor Set file. This file is 
used for schema definition and message serialization.
+        A subset of column names to exclude from reading. If null or empty, 
all columns will be read.
       </td>
     </tr>
     <tr>
       <td>
-        format
+        filter
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        The encoding format for the data stored in Kafka. Valid options are: 
RAW,STRING,AVRO,JSON,PROTO
+        SQL-like predicate to filter data at scan time. Example: "id > 5 AND 
status = 'ACTIVE'". Uses Apache Calcite syntax: 
https://calcite.apache.org/docs/reference.html
       </td>
     </tr>
     <tr>
       <td>
-        message_name
+        from_snapshot
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: #f54251">int64</code>
       </td>
       <td>
-        The name of the Protocol Buffer message to be used for schema 
extraction and data conversion.
+        Starts reading from this snapshot ID (inclusive).
       </td>
     </tr>
     <tr>
       <td>
-        offset_deduplication
+        from_timestamp
       </td>
       <td>
-        <code style="color: orange">boolean</code>
+        <code style="color: #f54251">int64</code>
       </td>
       <td>
-        If the redistribute is using offset deduplication mode.
+        Starts reading from the first snapshot (inclusive) that was created 
after this timestamp (in milliseconds).
       </td>
     </tr>
     <tr>
       <td>
-        redistribute_by_record_key
+        keep
       </td>
       <td>
-        <code style="color: orange">boolean</code>
+        <code>list[<span style="color: green;">str</span>]</code>
       </td>
       <td>
-        If the redistribute keys by the Kafka record key.
+        A subset of column names to read exclusively. If null or empty, all 
columns will be read.
       </td>
     </tr>
     <tr>
       <td>
-        redistribute_num_keys
+        poll_interval_seconds
       </td>
       <td>
         <code style="color: #f54251">int32</code>
       </td>
       <td>
-        The number of keys for redistributing Kafka inputs.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        redistributed
-      </td>
-      <td>
-        <code style="color: orange">boolean</code>
-      </td>
-      <td>
-        If the Kafka read should be redistributed.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        schema
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        The schema in which the data is encoded in the Kafka topic. For AVRO 
data, this is a schema defined with AVRO schema syntax 
(https://avro.apache.org/docs/1.10.2/spec.html#schemas). For JSON data, this is 
a schema defined with JSON-schema syntax (https://json-schema.org/). If a URL 
to Confluent Schema Registry is provided, then this field is ignored, and the 
schema is fetched from Confluent Schema Registry.
-      </td>
-    </tr>
-  </table>
-</div>
-
-### `ICEBERG` Read
-
-<div class="table-container-wrapper">
-  <table class="table table-bordered">
-    <tr>
-      <th>Configuration</th>
-      <th>Type</th>
-      <th>Description</th>
-    </tr>
-    <tr>
-      <td>
-        <strong>table</strong>
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        Identifier of the Iceberg table.
+        The interval at which to poll for new snapshots. Defaults to 60 
seconds.
       </td>
     </tr>
     <tr>
       <td>
-        catalog_name
+        starting_strategy
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Name of the catalog containing the table.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        catalog_properties
-      </td>
-      <td>
-        <code>map[<span style="color: green;">str</span>, <span style="color: 
green;">str</span>]</code>
-      </td>
-      <td>
-        Properties used to set up the Iceberg catalog.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        config_properties
-      </td>
-      <td>
-        <code>map[<span style="color: green;">str</span>, <span style="color: 
green;">str</span>]</code>
-      </td>
-      <td>
-        Properties passed to the Hadoop Configuration.
+        The source's starting strategy. Valid options are: "earliest" or 
"latest". Can be overridden by setting a starting snapshot or timestamp. 
Defaults to earliest for batch, and latest for streaming.
       </td>
     </tr>
     <tr>
       <td>
-        drop
+        streaming
       </td>
       <td>
-        <code>list[<span style="color: green;">str</span>]</code>
+        <code style="color: orange">boolean</code>
       </td>
       <td>
-        A subset of column names to exclude from reading. If null or empty, 
all columns will be read.
+        Enables streaming reads, where the source continuously polls for 
snapshots forever.
       </td>
     </tr>
     <tr>
       <td>
-        filter
+        to_snapshot
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: #f54251">int64</code>
       </td>
       <td>
-        SQL-like predicate to filter data at scan time. Example: "id > 5 AND 
status = 'ACTIVE'". Uses Apache Calcite syntax: 
https://calcite.apache.org/docs/reference.html
+        Reads up to this snapshot ID (inclusive).
       </td>
     </tr>
     <tr>
       <td>
-        keep
+        to_timestamp
       </td>
       <td>
-        <code>list[<span style="color: green;">str</span>]</code>
+        <code style="color: #f54251">int64</code>
       </td>
       <td>
-        A subset of column names to read exclusively. If null or empty, all 
columns will be read.
+        Reads up to the latest snapshot (inclusive) created before this 
timestamp (in milliseconds).
       </td>
     </tr>
   </table>
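+
+A minimal Python sketch of a streaming CDC read with this configuration
+(the table and catalog names are placeholders, and the
+`managed.ICEBERG_CDC` constant is assumed to be available in your Beam
+version):
+
+```python
+import apache_beam as beam
+from apache_beam.transforms import managed
+
+with beam.Pipeline() as p:
+    cdc_rows = p | managed.Read(
+        managed.ICEBERG_CDC,  # assumed constant name
+        config={
+            "table": "db.users",             # required
+            "catalog_name": "demo_catalog",  # placeholder
+            "streaming": True,
+            "poll_interval_seconds": 60,
+            "starting_strategy": "latest",
+        })
+```
+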
@@ -747,7 +546,7 @@ For more information on table properties, please visit 
https://iceberg.apache.or
   </table>
 </div>
 
-### `ICEBERG_CDC` Read
+### `ICEBERG` Read
 
 <div class="table-container-wrapper">
   <table class="table table-bordered">
@@ -824,96 +623,108 @@ For more information on table properties, please visit 
https://iceberg.apache.or
     </tr>
     <tr>
       <td>
-        from_snapshot
+        keep
       </td>
       <td>
-        <code style="color: #f54251">int64</code>
+        <code>list[<span style="color: green;">str</span>]</code>
       </td>
       <td>
-        Starts reading from this snapshot ID (inclusive).
+        A subset of column names to read exclusively. If null or empty, all 
columns will be read.
       </td>
     </tr>
+  </table>
+</div>
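+
+For comparison, a batch read of the same table might look like this sketch
+(names are placeholders):
+
+```python
+import apache_beam as beam
+from apache_beam.transforms import managed
+
+with beam.Pipeline() as p:
+    rows = p | managed.Read(
+        managed.ICEBERG,
+        config={
+            "table": "db.users",             # required
+            "catalog_name": "demo_catalog",  # placeholder
+            "keep": ["id", "name"],          # read only these columns
+        })
+```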
+
+### `KAFKA` Write
+
+<div class="table-container-wrapper">
+  <table class="table table-bordered">
+    <tr>
+      <th>Configuration</th>
+      <th>Type</th>
+      <th>Description</th>
+    </tr>
     <tr>
       <td>
-        from_timestamp
+        <strong>bootstrap_servers</strong>
       </td>
       <td>
-        <code style="color: #f54251">int64</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        Starts reading from the first snapshot (inclusive) that was created 
after this timestamp (in milliseconds).
+        A list of host/port pairs to use for establishing the initial 
connection to the Kafka cluster. The client will make use of all servers 
irrespective of which servers are specified here for bootstrapping—this list 
only impacts the initial hosts used to discover the full set of servers. 
Format: host1:port1,host2:port2,...
       </td>
     </tr>
     <tr>
       <td>
-        keep
+        <strong>format</strong>
       </td>
       <td>
-        <code>list[<span style="color: green;">str</span>]</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        A subset of column names to read exclusively. If null or empty, all 
columns will be read.
+        The encoding format for the data stored in Kafka. Valid options are: 
RAW,JSON,AVRO,PROTO
       </td>
     </tr>
     <tr>
       <td>
-        poll_interval_seconds
+        <strong>topic</strong>
       </td>
       <td>
-        <code style="color: #f54251">int32</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        The interval at which to poll for new snapshots. Defaults to 60 
seconds.
+        n/a
       </td>
     </tr>
     <tr>
       <td>
-        starting_strategy
+        file_descriptor_path
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        The source's starting strategy. Valid options are: "earliest" or 
"latest". Can be overriden by setting a starting snapshot or timestamp. 
Defaults to earliest for batch, and latest for streaming.
+        The path to the Protocol Buffer File Descriptor Set file. This file is 
used for schema definition and message serialization.
       </td>
     </tr>
     <tr>
       <td>
-        streaming
+        message_name
       </td>
       <td>
-        <code style="color: orange">boolean</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        Enables streaming reads, where source continuously polls for snapshots 
forever.
+        The name of the Protocol Buffer message to be used for schema 
extraction and data conversion.
       </td>
     </tr>
     <tr>
       <td>
-        to_snapshot
+        producer_config_updates
       </td>
       <td>
-        <code style="color: #f54251">int64</code>
+        <code>map[<span style="color: green;">str</span>, <span style="color: 
green;">str</span>]</code>
       </td>
       <td>
-        Reads up to this snapshot ID (inclusive).
+        A list of key-value pairs that act as configuration parameters for 
Kafka producers. Most of these configurations will not be needed, but if you 
need to customize your Kafka producer, you may use this. See a detailed list: 
https://docs.confluent.io/platform/current/installation/configuration/producer-configs.html
       </td>
     </tr>
     <tr>
       <td>
-        to_timestamp
+        schema
       </td>
       <td>
-        <code style="color: #f54251">int64</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        Reads up to the latest snapshot (inclusive) created before this 
timestamp (in milliseconds).
+        n/a
       </td>
     </tr>
   </table>
 </div>
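+
+A short Python sketch of a Kafka write using the required fields above (the
+broker address and topic are placeholders):
+
+```python
+import apache_beam as beam
+from apache_beam.transforms import managed
+
+with beam.Pipeline() as p:
+    rows = p | beam.Create([beam.Row(id=1, name="a")])
+    _ = rows | managed.Write(
+        managed.KAFKA,
+        config={
+            "bootstrap_servers": "localhost:9092",  # placeholder
+            "topic": "demo-topic",                  # placeholder
+            "format": "JSON",
+        })
+```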
 
-### `BIGQUERY` Read
+### `KAFKA` Read
 
 <div class="table-container-wrapper">
   <table class="table table-bordered">
@@ -924,235 +735,212 @@ For more information on table properties, please visit 
https://iceberg.apache.or
     </tr>
     <tr>
       <td>
-        kms_key
+        <strong>bootstrap_servers</strong>
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Use this Cloud KMS key to encrypt your data
+        A list of host/port pairs to use for establishing the initial 
connection to the Kafka cluster. The client will make use of all servers 
irrespective of which servers are specified here for bootstrapping—this list 
only impacts the initial hosts used to discover the full set of servers. This 
list should be in the form `host1:port1,host2:port2,...`
       </td>
     </tr>
     <tr>
       <td>
-        query
+        <strong>topic</strong>
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        The SQL query to be executed to read from the BigQuery table.
+        n/a
       </td>
     </tr>
     <tr>
       <td>
-        row_restriction
+        allow_duplicates
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: orange">boolean</code>
       </td>
       <td>
-        Read only rows that match this filter, which must be compatible with 
Google standard SQL. This is not supported when reading via query.
+        If the Kafka read allows duplicates.
       </td>
     </tr>
     <tr>
       <td>
-        fields
+        confluent_schema_registry_subject
       </td>
       <td>
-        <code>list[<span style="color: green;">str</span>]</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        Read only the specified fields (columns) from a BigQuery table. Fields 
may not be returned in the order specified. If no value is specified, then all 
fields are returned. Example: "col1, col2, col3"
+        n/a
       </td>
     </tr>
     <tr>
       <td>
-        table
+        confluent_schema_registry_url
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        The fully-qualified name of the BigQuery table to read from. Format: 
[${PROJECT}:]${DATASET}.${TABLE}
+        n/a
       </td>
     </tr>
-  </table>
-</div>
-
-### `BIGQUERY` Write
-
-<div class="table-container-wrapper">
-  <table class="table table-bordered">
-    <tr>
-      <th>Configuration</th>
-      <th>Type</th>
-      <th>Description</th>
-    </tr>
     <tr>
       <td>
-        <strong>table</strong>
+        consumer_config_updates
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code>map[<span style="color: green;">str</span>, <span style="color: 
green;">str</span>]</code>
       </td>
       <td>
-        The bigquery table to write to. Format: 
[${PROJECT}:]${DATASET}.${TABLE}
+        A list of key-value pairs that act as configuration parameters for 
Kafka consumers. Most of these configurations will not be needed, but if you 
need to customize your Kafka consumer, you may use this. See a detailed list: 
https://docs.confluent.io/platform/current/installation/configuration/consumer-configs.html
       </td>
     </tr>
     <tr>
       <td>
-        drop
+        file_descriptor_path
       </td>
       <td>
-        <code>list[<span style="color: green;">str</span>]</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        A list of field names to drop from the input record before writing. Is 
mutually exclusive with 'keep' and 'only'.
+        The path to the Protocol Buffer File Descriptor Set file. This file is 
used for schema definition and message serialization.
       </td>
     </tr>
     <tr>
       <td>
-        keep
+        format
       </td>
       <td>
-        <code>list[<span style="color: green;">str</span>]</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        A list of field names to keep in the input record. All other fields 
are dropped before writing. Is mutually exclusive with 'drop' and 'only'.
+        The encoding format for the data stored in Kafka. Valid options are: 
RAW,STRING,AVRO,JSON,PROTO
       </td>
     </tr>
     <tr>
       <td>
-        kms_key
+        message_name
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Use this Cloud KMS key to encrypt your data
+        The name of the Protocol Buffer message to be used for schema 
extraction and data conversion.
       </td>
     </tr>
     <tr>
       <td>
-        only
+        offset_deduplication
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: orange">boolean</code>
       </td>
       <td>
-        The name of a single record field that should be written. Is mutually 
exclusive with 'keep' and 'drop'.
+        Whether the redistribute step uses offset deduplication mode.
       </td>
     </tr>
     <tr>
       <td>
-        triggering_frequency_seconds
+        redistribute_by_record_key
       </td>
       <td>
-        <code style="color: #f54251">int64</code>
+        <code style="color: orange">boolean</code>
       </td>
       <td>
-        Determines how often to 'commit' progress into BigQuery. Default is 
every 5 seconds.
+        Whether the redistribute step keys by the Kafka record key.
       </td>
     </tr>
-  </table>
-</div>
-
-### `POSTGRES` Write
-
-<div class="table-container-wrapper">
-  <table class="table table-bordered">
-    <tr>
-      <th>Configuration</th>
-      <th>Type</th>
-      <th>Description</th>
-    </tr>
     <tr>
       <td>
-        <strong>jdbc_url</strong>
+        redistribute_num_keys
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: #f54251">int32</code>
       </td>
       <td>
-        Connection URL for the JDBC sink.
+        The number of keys for redistributing Kafka inputs.
       </td>
     </tr>
     <tr>
       <td>
-        autosharding
+        redistributed
       </td>
       <td>
         <code style="color: orange">boolean</code>
       </td>
       <td>
-        If true, enables using a dynamically determined number of shards to 
write.
+        Whether the Kafka read should be redistributed.
       </td>
     </tr>
     <tr>
       <td>
-        batch_size
+        schema
       </td>
       <td>
-        <code style="color: #f54251">int64</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        n/a
+        The schema in which the data is encoded in the Kafka topic. For AVRO 
data, this is a schema defined with AVRO schema syntax 
(https://avro.apache.org/docs/1.10.2/spec.html#schemas). For JSON data, this is 
a schema defined with JSON-schema syntax (https://json-schema.org/). If a URL 
to Confluent Schema Registry is provided, then this field is ignored, and the 
schema is fetched from Confluent Schema Registry.
       </td>
     </tr>
+  </table>
+</div>
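+
+And the corresponding Kafka read, as a sketch (the inline JSON schema is a
+placeholder for your record type):
+
+```python
+import apache_beam as beam
+from apache_beam.transforms import managed
+
+with beam.Pipeline() as p:
+    rows = p | managed.Read(
+        managed.KAFKA,
+        config={
+            "bootstrap_servers": "localhost:9092",  # placeholder
+            "topic": "demo-topic",                  # placeholder
+            "format": "JSON",
+            "schema": '{"type": "object", "properties": '
+                      '{"id": {"type": "integer"}}}',
+        })
+```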
+
+### `POSTGRES` Write
+
+<div class="table-container-wrapper">
+  <table class="table table-bordered">
     <tr>
-      <td>
-        connection_init_sql
-      </td>
-      <td>
-        <code>list[<span style="color: green;">str</span>]</code>
-      </td>
-      <td>
-        Sets the connection init sql statements used by the Driver. Only MySQL 
and MariaDB support this.
-      </td>
+      <th>Configuration</th>
+      <th>Type</th>
+      <th>Description</th>
     </tr>
     <tr>
       <td>
-        connection_properties
+        <strong>jdbc_url</strong>
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Used to set connection properties passed to the JDBC driver not 
already defined as standalone parameter (e.g. username and password can be set 
using parameters above accordingly). Format of the string must be 
"key1=value1;key2=value2;".
+        Connection URL for the JDBC sink.
       </td>
     </tr>
     <tr>
       <td>
-        driver_class_name
+        autosharding
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: orange">boolean</code>
       </td>
       <td>
-        Name of a Java Driver class to use to connect to the JDBC source. For 
example, "com.mysql.jdbc.Driver".
+        If true, enables using a dynamically determined number of shards to 
write.
       </td>
     </tr>
     <tr>
       <td>
-        driver_jars
+        batch_size
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: #f54251">int64</code>
       </td>
       <td>
-        Comma separated path(s) for the JDBC driver jar(s). This can be a 
local path or GCS (gs://) path.
+        n/a
       </td>
     </tr>
     <tr>
       <td>
-        jdbc_type
+        connection_properties
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Type of JDBC source. When specified, an appropriate default Driver 
will be packaged with the transform. One of mysql, postgres, oracle, or mssql.
+        Used to set connection properties passed to the JDBC driver that are 
not already defined as standalone parameters (e.g. username and password can 
be set using the parameters above). The format of the string must be 
"key1=value1;key2=value2;".
       </td>
     </tr>
     <tr>
@@ -1224,123 +1012,168 @@ For more information on table properties, please 
visit https://iceberg.apache.or
     </tr>
     <tr>
       <td>
-        connection_init_sql
+        connection_properties
       </td>
       <td>
-        <code>list[<span style="color: green;">str</span>]</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        Sets the connection init sql statements used by the Driver. Only MySQL 
and MariaDB support this.
+        Used to set connection properties passed to the JDBC driver that are 
not already defined as standalone parameters (e.g. username and password can 
be set using the parameters above). The format of the string must be 
"key1=value1;key2=value2;".
       </td>
     </tr>
     <tr>
       <td>
-        connection_properties
+        fetch_size
+      </td>
+      <td>
+        <code style="color: #f54251">int32</code>
+      </td>
+      <td>
+        Overrides the size of the data that is fetched and loaded into 
memory per database call. It should ONLY be used if the default value causes 
memory errors.
+      </td>
+    </tr>
+    <tr>
+      <td>
+        location
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Used to set connection properties passed to the JDBC driver not 
already defined as standalone parameter (e.g. username and password can be set 
using parameters above accordingly). Format of the string must be 
"key1=value1;key2=value2;".
+        Name of the table to read from.
       </td>
     </tr>
     <tr>
       <td>
-        disable_auto_commit
+        num_partitions
+      </td>
+      <td>
+        <code style="color: #f54251">int32</code>
+      </td>
+      <td>
+        The number of partitions.
+      </td>
+    </tr>
+    <tr>
+      <td>
+        output_parallelization
       </td>
       <td>
         <code style="color: orange">boolean</code>
       </td>
       <td>
-        Whether to disable auto commit on read. Defaults to true if not 
provided. The need for this config varies depending on the database platform. 
Informix requires this to be set to false while Postgres requires this to be 
set to true.
+        Whether to reshuffle the resulting PCollection so results are 
distributed to all workers.
       </td>
     </tr>
     <tr>
       <td>
-        driver_class_name
+        partition_column
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Name of a Java Driver class to use to connect to the JDBC source. For 
example, "com.mysql.jdbc.Driver".
+        Name of a column of numeric type that will be used for partitioning.
       </td>
     </tr>
     <tr>
       <td>
-        driver_jars
+        password
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Comma separated path(s) for the JDBC driver jar(s). This can be a 
local path or GCS (gs://) path.
+        Password for the JDBC source.
       </td>
     </tr>
     <tr>
       <td>
-        fetch_size
+        read_query
       </td>
       <td>
-        <code style="color: #f54251">int32</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        This method is used to override the size of the data that is going to 
be fetched and loaded in memory per every database call. It should ONLY be used 
if the default value throws memory errors.
+        SQL query used to query the JDBC source.
       </td>
     </tr>
     <tr>
       <td>
-        jdbc_type
+        username
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Type of JDBC source. When specified, an appropriate default Driver 
will be packaged with the transform. One of mysql, postgres, oracle, or mssql.
+        Username for the JDBC source.
       </td>
     </tr>
+  </table>
+</div>
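+
+A hedged Python sketch of a Postgres read (the connection details are
+placeholders, and the `managed.POSTGRES` identifier is an assumption; check
+the managed module of your Beam version for the exact name):
+
+```python
+import apache_beam as beam
+from apache_beam.transforms import managed
+
+with beam.Pipeline() as p:
+    rows = p | managed.Read(
+        managed.POSTGRES,  # assumed identifier
+        config={
+            "jdbc_url": "jdbc:postgresql://localhost:5432/demo",
+            "location": "my_table",  # table to read from
+            "username": "user",      # placeholder
+            "password": "pass",      # placeholder
+        })
+```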
+
+### `SQLSERVER` Write
+
+<div class="table-container-wrapper">
+  <table class="table table-bordered">
+    <tr>
+      <th>Configuration</th>
+      <th>Type</th>
+      <th>Description</th>
+    </tr>
     <tr>
       <td>
-        location
+        <strong>jdbc_url</strong>
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Name of the table to read from.
+        Connection URL for the JDBC sink.
       </td>
     </tr>
     <tr>
       <td>
-        num_partitions
+        autosharding
       </td>
       <td>
-        <code style="color: #f54251">int32</code>
+        <code style="color: orange">boolean</code>
       </td>
       <td>
-        The number of partitions
+        If true, enables using a dynamically determined number of shards to 
write.
       </td>
     </tr>
     <tr>
       <td>
-        output_parallelization
+        batch_size
       </td>
       <td>
-        <code style="color: orange">boolean</code>
+        <code style="color: #f54251">int64</code>
       </td>
       <td>
-        Whether to reshuffle the resulting PCollection so results are 
distributed to all workers.
+        n/a
       </td>
     </tr>
     <tr>
       <td>
-        partition_column
+        connection_properties
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Name of a column of numeric type that will be used for partitioning.
+        Used to set connection properties passed to the JDBC driver that are 
not already defined as standalone parameters (e.g. username and password can 
be set using the parameters above). The format of the string must be 
"key1=value1;key2=value2;".
+      </td>
+    </tr>
+    <tr>
+      <td>
+        location
+      </td>
+      <td>
+        <code style="color: green">str</code>
+      </td>
+      <td>
+        Name of the table to write to.
       </td>
     </tr>
     <tr>
@@ -1356,24 +1189,24 @@ For more information on table properties, please visit 
https://iceberg.apache.or
     </tr>
     <tr>
       <td>
-        read_query
+        username
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        SQL query used to query the JDBC source.
+        Username for the JDBC source.
       </td>
     </tr>
     <tr>
       <td>
-        username
+        write_statement
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Username for the JDBC source.
+        SQL query used to insert records into the JDBC sink.
       </td>
     </tr>
   </table>
@@ -1399,17 +1232,6 @@ For more information on table properties, please visit 
https://iceberg.apache.or
         Connection URL for the JDBC source.
       </td>
     </tr>
-    <tr>
-      <td>
-        connection_init_sql
-      </td>
-      <td>
-        <code>list[<span style="color: green;">str</span>]</code>
-      </td>
-      <td>
-        Sets the connection init sql statements used by the Driver. Only MySQL 
and MariaDB support this.
-      </td>
-    </tr>
     <tr>
       <td>
         connection_properties
@@ -1432,28 +1254,6 @@ For more information on table properties, please visit 
https://iceberg.apache.or
         Whether to disable auto commit on read. Defaults to true if not 
provided. The need for this config varies depending on the database platform. 
Informix requires this to be set to false while Postgres requires this to be 
set to true.
       </td>
     </tr>
-    <tr>
-      <td>
-        driver_class_name
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        Name of a Java Driver class to use to connect to the JDBC source. For 
example, "com.mysql.jdbc.Driver".
-      </td>
-    </tr>
-    <tr>
-      <td>
-        driver_jars
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        Comma separated path(s) for the JDBC driver jar(s). This can be a 
local path or GCS (gs://) path.
-      </td>
-    </tr>
     <tr>
       <td>
         fetch_size
@@ -1465,17 +1265,6 @@ For more information on table properties, please visit 
https://iceberg.apache.or
         This method is used to override the size of the data that is going to 
be fetched and loaded in memory per every database call. It should ONLY be used 
if the default value throws memory errors.
       </td>
     </tr>
-    <tr>
-      <td>
-        jdbc_type
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        Type of JDBC source. When specified, an appropriate default Driver 
will be packaged with the transform. One of mysql, postgres, oracle, or mssql.
-      </td>
-    </tr>
     <tr>
       <td>
         location
@@ -1556,7 +1345,7 @@ For more information on table properties, please visit 
https://iceberg.apache.or
   </table>
 </div>
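+
+By analogy with the Postgres sketch above, a SQL Server read changes only
+the identifier and the JDBC URL (both placeholders here; `managed.SQLSERVER`
+is again an assumed name):
+
+```python
+import apache_beam as beam
+from apache_beam.transforms import managed
+
+with beam.Pipeline() as p:
+    rows = p | managed.Read(
+        managed.SQLSERVER,  # assumed identifier
+        config={
+            "jdbc_url": "jdbc:sqlserver://localhost;databaseName=demo",
+            "location": "my_table",
+            "username": "user",  # placeholder
+            "password": "pass",  # placeholder
+        })
+```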
 
-### `SQLSERVER` Write
+### `MYSQL` Read
 
 <div class="table-container-wrapper">
   <table class="table table-bordered">
@@ -1573,95 +1362,95 @@ For more information on table properties, please visit 
https://iceberg.apache.or
         <code style="color: green">str</code>
       </td>
       <td>
-        Connection URL for the JDBC sink.
+        Connection URL for the JDBC source.
       </td>
     </tr>
     <tr>
       <td>
-        autosharding
+        connection_init_sql
       </td>
       <td>
-        <code style="color: orange">boolean</code>
+        <code>list[<span style="color: green;">str</span>]</code>
       </td>
       <td>
-        If true, enables using a dynamically determined number of shards to 
write.
+        Sets the connection init SQL statements used by the driver. Only 
MySQL and MariaDB support this.
       </td>
     </tr>
     <tr>
       <td>
-        batch_size
+        connection_properties
       </td>
       <td>
-        <code style="color: #f54251">int64</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        n/a
+        Used to set connection properties passed to the JDBC driver that are 
not already defined as standalone parameters (e.g. username and password can 
be set using the parameters above). The format of the string must be 
"key1=value1;key2=value2;".
       </td>
     </tr>
     <tr>
       <td>
-        connection_init_sql
+        disable_auto_commit
       </td>
       <td>
-        <code>list[<span style="color: green;">str</span>]</code>
+        <code style="color: orange">boolean</code>
       </td>
       <td>
-        Sets the connection init sql statements used by the Driver. Only MySQL 
and MariaDB support this.
+        Whether to disable auto commit on read. Defaults to true if not 
provided. The need for this config varies depending on the database platform. 
Informix requires this to be set to false while Postgres requires this to be 
set to true.
       </td>
     </tr>
     <tr>
       <td>
-        connection_properties
+        fetch_size
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: #f54251">int32</code>
       </td>
       <td>
-        Used to set connection properties passed to the JDBC driver not 
already defined as standalone parameter (e.g. username and password can be set 
using parameters above accordingly). Format of the string must be 
"key1=value1;key2=value2;".
+        This method is used to override the size of the data that is going to 
be fetched and loaded in memory per every database call. It should ONLY be used 
if the default value throws memory errors.
       </td>
     </tr>
     <tr>
       <td>
-        driver_class_name
+        location
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Name of a Java Driver class to use to connect to the JDBC source. For 
example, "com.mysql.jdbc.Driver".
+        Name of the table to read from.
       </td>
     </tr>
     <tr>
       <td>
-        driver_jars
+        num_partitions
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: #f54251">int32</code>
       </td>
       <td>
-        Comma separated path(s) for the JDBC driver jar(s). This can be a 
local path or GCS (gs://) path.
+        The number of partitions.
       </td>
     </tr>
     <tr>
       <td>
-        jdbc_type
+        output_parallelization
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: orange">boolean</code>
       </td>
       <td>
-        Type of JDBC source. When specified, an appropriate default Driver 
will be packaged with the transform. One of mysql, postgres, oracle, or mssql.
+        Whether to reshuffle the resulting PCollection so results are 
distributed to all workers.
       </td>
     </tr>
     <tr>
       <td>
-        location
+        partition_column
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Name of the table to write to.
+        Name of a column of numeric type that will be used for partitioning.
       </td>
     </tr>
     <tr>
@@ -1677,30 +1466,30 @@ For more information on table properties, please visit 
https://iceberg.apache.or
     </tr>
     <tr>
       <td>
-        username
+        read_query
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Username for the JDBC source.
+        SQL query used to query the JDBC source.
       </td>
     </tr>
     <tr>
       <td>
-        write_statement
+        username
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        SQL query used to insert records into the JDBC sink.
+        Username for the JDBC source.
       </td>
     </tr>
   </table>
 </div>
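+
+A MySQL read can use `read_query` instead of `location`, as in this sketch
+(`managed.MYSQL` is an assumed name; all values are placeholders):
+
+```python
+import apache_beam as beam
+from apache_beam.transforms import managed
+
+with beam.Pipeline() as p:
+    rows = p | managed.Read(
+        managed.MYSQL,  # assumed identifier
+        config={
+            "jdbc_url": "jdbc:mysql://localhost:3306/demo",
+            "read_query": "SELECT id, name FROM my_table",
+            "username": "user",  # placeholder
+            "password": "pass",  # placeholder
+        })
+```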
 
-### `MYSQL` Read
+### `MYSQL` Write
 
 <div class="table-container-wrapper">
   <table class="table table-bordered">
@@ -1717,84 +1506,51 @@ For more information on table properties, please visit 
https://iceberg.apache.or
         <code style="color: green">str</code>
       </td>
       <td>
-        Connection URL for the JDBC source.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        connection_init_sql
-      </td>
-      <td>
-        <code>list[<span style="color: green;">str</span>]</code>
-      </td>
-      <td>
-        Sets the connection init sql statements used by the Driver. Only MySQL 
and MariaDB support this.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        connection_properties
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        Used to set connection properties passed to the JDBC driver not 
already defined as standalone parameter (e.g. username and password can be set 
using parameters above accordingly). Format of the string must be 
"key1=value1;key2=value2;".
+        Connection URL for the JDBC sink.
       </td>
     </tr>
     <tr>
       <td>
-        disable_auto_commit
+        autosharding
       </td>
       <td>
         <code style="color: orange">boolean</code>
       </td>
       <td>
-        Whether to disable auto commit on read. Defaults to true if not 
provided. The need for this config varies depending on the database platform. 
Informix requires this to be set to false while Postgres requires this to be 
set to true.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        driver_class_name
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        Name of a Java Driver class to use to connect to the JDBC source. For 
example, "com.mysql.jdbc.Driver".
+        If true, enables using a dynamically determined number of shards to 
write.
       </td>
     </tr>
     <tr>
       <td>
-        driver_jars
+        batch_size
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: #f54251">int64</code>
       </td>
       <td>
-        Comma separated path(s) for the JDBC driver jar(s). This can be a 
local path or GCS (gs://) path.
+        n/a
       </td>
     </tr>
     <tr>
       <td>
-        fetch_size
+        connection_init_sql
       </td>
       <td>
-        <code style="color: #f54251">int32</code>
+        <code>list[<span style="color: green;">str</span>]</code>
       </td>
       <td>
-        This method is used to override the size of the data that is going to 
be fetched and loaded in memory per every database call. It should ONLY be used 
if the default value throws memory errors.
+        Sets the connection init SQL statements used by the driver. Only 
MySQL and MariaDB support this.
       </td>
     </tr>
     <tr>
       <td>
-        jdbc_type
+        connection_properties
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Type of JDBC source. When specified, an appropriate default Driver 
will be packaged with the transform. One of mysql, postgres, oracle, or mssql.
+        Used to set connection properties passed to the JDBC driver that are 
not already defined as standalone parameters (e.g. username and password can 
be set using the parameters above). The format of the string must be 
"key1=value1;key2=value2;".
       </td>
     </tr>
     <tr>
@@ -1805,40 +1561,7 @@ For more information on table properties, please visit 
https://iceberg.apache.or
         <code style="color: green">str</code>
       </td>
       <td>
-        Name of the table to read from.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        num_partitions
-      </td>
-      <td>
-        <code style="color: #f54251">int32</code>
-      </td>
-      <td>
-        The number of partitions
-      </td>
-    </tr>
-    <tr>
-      <td>
-        output_parallelization
-      </td>
-      <td>
-        <code style="color: orange">boolean</code>
-      </td>
-      <td>
-        Whether to reshuffle the resulting PCollection so results are 
distributed to all workers.
-      </td>
-    </tr>
-    <tr>
-      <td>
-        partition_column
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        Name of a column of numeric type that will be used for partitioning.
+        Name of the table to write to.
       </td>
     </tr>
     <tr>
@@ -1854,30 +1577,30 @@ For more information on table properties, please visit 
https://iceberg.apache.or
     </tr>
     <tr>
       <td>
-        read_query
+        username
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        SQL query used to query the JDBC source.
+        Username for the JDBC source.
       </td>
     </tr>
     <tr>
       <td>
-        username
+        write_statement
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Username for the JDBC source.
+        SQL query used to insert records into the JDBC sink.
       </td>
     </tr>
   </table>
 </div>
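+
+And the corresponding MySQL write, sketched with the same assumed
+identifier and placeholder connection details:
+
+```python
+import apache_beam as beam
+from apache_beam.transforms import managed
+
+with beam.Pipeline() as p:
+    rows = p | beam.Create([beam.Row(id=1, name="a")])
+    _ = rows | managed.Write(
+        managed.MYSQL,  # assumed identifier
+        config={
+            "jdbc_url": "jdbc:mysql://localhost:3306/demo",
+            "location": "my_table",  # table to write to
+            "username": "user",      # placeholder
+            "password": "pass",      # placeholder
+        })
+```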
 
-### `MYSQL` Write
+### `BIGQUERY` Read
 
 <div class="table-container-wrapper">
   <table class="table table-bordered">
@@ -1888,134 +1611,135 @@ For more information on table properties, please 
visit https://iceberg.apache.or
     </tr>
     <tr>
       <td>
-        <strong>jdbc_url</strong>
+        kms_key
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Connection URL for the JDBC sink.
+        Use this Cloud KMS key to encrypt your data.
       </td>
     </tr>
     <tr>
       <td>
-        autosharding
+        query
       </td>
       <td>
-        <code style="color: orange">boolean</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        If true, enables using a dynamically determined number of shards to 
write.
+        The SQL query to be executed to read from the BigQuery table.
       </td>
     </tr>
     <tr>
       <td>
-        batch_size
+        row_restriction
       </td>
       <td>
-        <code style="color: #f54251">int64</code>
+        <code style="color: green">str</code>
       </td>
       <td>
-        n/a
+        Read only rows that match this filter, which must be compatible with 
Google standard SQL. This is not supported when reading via query.
       </td>
     </tr>
     <tr>
       <td>
-        connection_init_sql
+        fields
       </td>
       <td>
         <code>list[<span style="color: green;">str</span>]</code>
       </td>
       <td>
-        Sets the connection init sql statements used by the Driver. Only MySQL 
and MariaDB support this.
+        Read only the specified fields (columns) from a BigQuery table. Fields 
may not be returned in the order specified. If no value is specified, then all 
fields are returned. Example: "col1, col2, col3"
       </td>
     </tr>
     <tr>
       <td>
-        connection_properties
+        table
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Used to set connection properties passed to the JDBC driver not 
already defined as standalone parameter (e.g. username and password can be set 
using parameters above accordingly). Format of the string must be 
"key1=value1;key2=value2;".
+        The fully-qualified name of the BigQuery table to read from. Format: 
[${PROJECT}:]${DATASET}.${TABLE}
       </td>
     </tr>
+  </table>
+</div>
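+
+A minimal Python sketch of a BigQuery table read (the table name is a
+placeholder):
+
+```python
+import apache_beam as beam
+from apache_beam.transforms import managed
+
+with beam.Pipeline() as p:
+    rows = p | managed.Read(
+        managed.BIGQUERY,
+        config={
+            "table": "my_project:my_dataset.my_table",  # placeholder
+            "fields": ["col1", "col2"],  # optional projection
+        })
+```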
+
+### `BIGQUERY` Write
+
+<div class="table-container-wrapper">
+  <table class="table table-bordered">
     <tr>
-      <td>
-        driver_class_name
-      </td>
-      <td>
-        <code style="color: green">str</code>
-      </td>
-      <td>
-        Name of a Java Driver class to use to connect to the JDBC source. For 
example, "com.mysql.jdbc.Driver".
-      </td>
+      <th>Configuration</th>
+      <th>Type</th>
+      <th>Description</th>
     </tr>
     <tr>
       <td>
-        driver_jars
+        <strong>table</strong>
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Comma separated path(s) for the JDBC driver jar(s). This can be a 
local path or GCS (gs://) path.
+        The BigQuery table to write to. Format: 
[${PROJECT}:]${DATASET}.${TABLE}
       </td>
     </tr>
     <tr>
       <td>
-        jdbc_type
+        drop
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code>list[<span style="color: green;">str</span>]</code>
       </td>
       <td>
-        Type of JDBC source. When specified, an appropriate default Driver 
will be packaged with the transform. One of mysql, postgres, oracle, or mssql.
+        A list of field names to drop from the input record before writing. Is 
mutually exclusive with 'keep' and 'only'.
       </td>
     </tr>
     <tr>
       <td>
-        location
+        keep
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code>list[<span style="color: green;">str</span>]</code>
       </td>
       <td>
-        Name of the table to write to.
+        A list of field names to keep in the input record. All other fields 
are dropped before writing. Is mutually exclusive with 'drop' and 'only'.
       </td>
     </tr>
     <tr>
       <td>
-        password
+        kms_key
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Password for the JDBC source.
+        Use this Cloud KMS key to encrypt your data.
       </td>
     </tr>
     <tr>
       <td>
-        username
+        only
       </td>
       <td>
         <code style="color: green">str</code>
       </td>
       <td>
-        Username for the JDBC source.
+        The name of a single record field that should be written. Is mutually 
exclusive with 'keep' and 'drop'.
       </td>
     </tr>
     <tr>
       <td>
-        write_statement
+        triggering_frequency_seconds
       </td>
       <td>
-        <code style="color: green">str</code>
+        <code style="color: #f54251">int64</code>
       </td>
       <td>
-        SQL query used to insert records into the JDBC sink.
+        Determines how often to 'commit' progress into BigQuery. Default is 
every 5 seconds.
       </td>
     </tr>
   </table>
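+
+And the matching BigQuery write, again as a sketch with a placeholder table
+name:
+
+```python
+import apache_beam as beam
+from apache_beam.transforms import managed
+
+with beam.Pipeline() as p:
+    rows = p | beam.Create([beam.Row(col1="a", col2="b")])
+    _ = rows | managed.Write(
+        managed.BIGQUERY,
+        config={
+            "table": "my_project:my_dataset.my_table",  # placeholder
+        })
+```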
