This is an automated email from the ASF dual-hosted git repository.
Abacn pushed a commit to branch release-docs
in repository https://gitbox.apache.org/repos/asf/beam-site.git
The following commit(s) were added to refs/heads/release-docs by this push:
new de97ff5ec7 Fix yamldoc for Beam 2.74.0 (#706)
de97ff5ec7 is described below
commit de97ff5ec7420d83ae4c2f713085941d861d1a90
Author: Yi Hu <[email protected]>
AuthorDate: Tue Jun 9 13:56:42 2026 -0400
Fix yamldoc for Beam 2.74.0 (#706)
---
yamldoc/2.74.0/index.html | 1253 ++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 1188 insertions(+), 65 deletions(-)
diff --git a/yamldoc/2.74.0/index.html b/yamldoc/2.74.0/index.html
index 1f5f362a85..981876377b 100644
--- a/yamldoc/2.74.0/index.html
+++ b/yamldoc/2.74.0/index.html
@@ -630,15 +630,28 @@ in which case the fields will be named according to the
requested values.</p>
<h3 id="configuration_8">Configuration</h3>
<ul>
<li>
-<p><strong>keep</strong> <code>?</code> (Optional) : An expression evaluating
to true for those records that should be kept.</p>
+<p><strong>language</strong> <code>string</code> (Optional) </p>
</li>
<li>
-<p><strong>language</strong> <code>string</code> (Optional) : The language of
the above expression.
- Defaults to generic.</p>
+<p><strong>keep</strong> <code>Row</code> </p>
+<p>Row fields:</p>
+<ul>
+<li>
+<p><strong>callable</strong> <code>string</code> (Optional) : Source code of
a public class implementing Function<Row, T> for some schema-compatible T.</p>
</li>
<li>
-<p><strong>error_handling</strong> <code>Row</code> (Optional) : Whether and
where to output records that throw errors when
- the above expressions are evaluated. </p>
+<p><strong>expression</strong> <code>string</code> (Optional) : Source code
of a java expression in terms of the schema fields.</p>
+</li>
+<li>
+<p><strong>name</strong> <code>string</code> (Optional) : Fully qualified
name of either a class implementing Function<Row, T> (e.g. com.pkg.MyFunction),
or a method taking a single Row argument (e.g. com.pkg.MyClass::methodName). If
a method is passed, it must either be static or belong to a class with a public
nullary constructor.</p>
+</li>
+<li>
+<p><strong>path</strong> <code>string</code> (Optional) : Path to a jar file
implementing the function referenced in name.</p>
+</li>
+</ul>
+</li>
+<li>
+<p><strong>error_handling</strong> <code>Row</code> (Optional) : This option
specifies whether and where to output error rows. </p>
<p>Row fields:</p>
<ul>
<li><strong>output</strong> <code>string</code> : Name to use for the output
error collection</li>
@@ -649,8 +662,12 @@ in which case the fields will be named according to the
requested values.</p>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">Filter</span>
<span class="nt">input</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
<span class="nt">config</span><span class="p">:</span>
-<span class="w"> </span><span class="nt">keep</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">keep</span>
<span class="w"> </span><span class="nt">language</span><span
class="p">:</span><span class="w"> </span><span
class="s">"language"</span>
+<span class="w"> </span><span class="nt">keep</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">callable</span><span
class="p">:</span><span class="w"> </span><span
class="s">"callable"</span>
+<span class="w"> </span><span class="nt">expression</span><span
class="p">:</span><span class="w"> </span><span
class="s">"expression"</span>
+<span class="w"> </span><span class="nt">name</span><span
class="p">:</span><span class="w"> </span><span
class="s">"name"</span>
+<span class="w"> </span><span class="nt">path</span><span
class="p">:</span><span class="w"> </span><span
class="s">"path"</span>
<span class="w"> </span><span class="nt">error_handling</span><span
class="p">:</span>
<span class="w"> </span><span class="nt">output</span><span
class="p">:</span><span class="w"> </span><span
class="s">"output"</span>
</code></pre></div>
@@ -666,15 +683,76 @@ be implicitly flattened.</p>
<h3 id="usage_9">Usage</h3>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">Flatten</span>
<span class="nt">input</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
-<span class="nt">config</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
</code></pre></div>
<hr><h2 id="icebergaddfiles">IcebergAddFiles</h2>
<h3 id="configuration_10">Configuration</h3>
+<ul>
+<li>
+<p><strong>table</strong> <code>string</code> : A fully-qualified table
identifier.</p>
+</li>
+<li>
+<p><strong>catalog_properties</strong> <code>Map[string, string]</code>
(Optional) : Properties used to set up the Iceberg catalog.</p>
+</li>
+<li>
+<p><strong>config_properties</strong> <code>Map[string, string]</code>
(Optional) : Properties passed to the Hadoop </p>
+</li>
+<li>
+<p><strong>triggering_frequency_seconds</strong> <code>int32</code> (Optional)
: For a streaming pipeline, sets the frequency at which incoming files are
appended (default 600, or 10min).</p>
+</li>
+<li>
+<p><strong>location_prefix</strong> <code>string</code> (Optional) : The
prefix shared among all partitions. For example, a data file may have the
following
location:%n'gs://bucket/namespace/table/data/id=13/name=beam/data_file.parquet'%n%nThe
provided prefix should go up until the partition
information:%n'gs://bucket/namespace/table/data/'.%nIf not provided, will try
determining each DataFile's partition from its metrics metadata.</p>
+</li>
+<li>
+<p><strong>partition_fields</strong> <code>Array[string]</code> (Optional) :
Fields used to create a partition spec that is applied when tables are created.
For a field 'foo', the available partition transforms are:</p>
+<ul>
+<li><code>foo</code></li>
+<li><code>truncate(foo, N)</code></li>
+<li><code>bucket(foo, N)</code></li>
+<li><code>hour(foo)</code></li>
+<li><code>day(foo)</code></li>
+<li><code>month(foo)</code></li>
+<li><code>year(foo)</code></li>
+<li><code>void(foo)</code></li>
+</ul>
+<p>For more information on partition transforms, please visit
https://iceberg.apache.org/spec/#partition-transforms.</p>
+</li>
+<li>
+<p><strong>table_properties</strong> <code>Map[string, string]</code>
(Optional) : Iceberg table properties to be set on the table when it is
created.
+ For more information on table properties, please visit
https://iceberg.apache.org/docs/latest/configuration/#table-properties.</p>
+</li>
+<li>
+<p><strong>error_handling</strong> <code>Row</code> (Optional) : This option
specifies whether and where to output unwritable rows. </p>
+<p>Row fields:</p>
+<ul>
+<li><strong>output</strong> <code>string</code> : Name to use for the output
error collection</li>
+</ul>
+</li>
+</ul>
<h3 id="usage_10">Usage</h3>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">IcebergAddFiles</span>
-<span class="nt">input</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
-<span class="nt">config</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"table"</span>
+<span class="w"> </span><span class="nt">catalog_properties</span><span
class="p">:</span>
+<span class="w"> </span><span class="nt">a</span><span
class="p">:</span><span class="w"> </span><span
class="s">"catalog_properties_value_a"</span>
+<span class="w"> </span><span class="nt">b</span><span
class="p">:</span><span class="w"> </span><span
class="s">"catalog_properties_value_b"</span>
+<span class="w"> </span><span class="nt">c</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">...</span>
+<span class="w"> </span><span class="nt">config_properties</span><span
class="p">:</span>
+<span class="w"> </span><span class="nt">a</span><span
class="p">:</span><span class="w"> </span><span
class="s">"config_properties_value_a"</span>
+<span class="w"> </span><span class="nt">b</span><span
class="p">:</span><span class="w"> </span><span
class="s">"config_properties_value_b"</span>
+<span class="w"> </span><span class="nt">c</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">...</span>
+<span class="w"> </span><span
class="nt">triggering_frequency_seconds</span><span class="p">:</span><span
class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">triggering_frequency_seconds</span>
+<span class="w"> </span><span class="nt">location_prefix</span><span
class="p">:</span><span class="w"> </span><span
class="s">"location_prefix"</span>
+<span class="w"> </span><span class="nt">partition_fields</span><span
class="p">:</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="s">"partition_fields"</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="s">"partition_fields"</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="w"> </span><span class="nt">table_properties</span><span
class="p">:</span>
+<span class="w"> </span><span class="nt">a</span><span
class="p">:</span><span class="w"> </span><span
class="s">"table_properties_value_a"</span>
+<span class="w"> </span><span class="nt">b</span><span
class="p">:</span><span class="w"> </span><span
class="s">"table_properties_value_b"</span>
+<span class="w"> </span><span class="nt">c</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">...</span>
+<span class="w"> </span><span class="nt">error_handling</span><span
class="p">:</span>
+<span class="w"> </span><span class="nt">output</span><span
class="p">:</span><span class="w"> </span><span
class="s">"output"</span>
</code></pre></div>
<hr><h2 id="join">Join</h2>
@@ -841,20 +919,24 @@ chain-style pipelines.</p>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">MapToFields</span>
<span class="nt">input</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
<span class="nt">config</span><span class="p">:</span>
-<span class="w"> </span><span class="nt">fields</span><span class="p">:</span>
-<span class="w"> </span><span class="nt">a</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">fields_value_a</span>
-<span class="w"> </span><span class="nt">b</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">fields_value_b</span>
-<span class="w"> </span><span class="nt">c</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">...</span>
+<span class="w"> </span><span class="nt">language</span><span
class="p">:</span><span class="w"> </span><span
class="s">"language"</span>
<span class="w"> </span><span class="nt">append</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">true|false</span>
<span class="w"> </span><span class="nt">drop</span><span class="p">:</span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="s">"drop"</span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="s">"drop"</span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
-<span class="w"> </span><span class="nt">language</span><span
class="p">:</span><span class="w"> </span><span
class="s">"language"</span>
-<span class="w"> </span><span class="nt">dependencies</span><span
class="p">:</span>
-<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="s">"dependencies"</span>
-<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="s">"dependencies"</span>
-<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="w"> </span><span class="nt">fields</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">a</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">expression</span><span
class="p">:</span><span class="w"> </span><span
class="s">"expression"</span>
+<span class="w"> </span><span class="nt">callable</span><span
class="p">:</span><span class="w"> </span><span
class="s">"callable"</span>
+<span class="w"> </span><span class="nt">name</span><span
class="p">:</span><span class="w"> </span><span
class="s">"name"</span>
+<span class="w"> </span><span class="nt">path</span><span
class="p">:</span><span class="w"> </span><span
class="s">"path"</span>
+<span class="w"> </span><span class="nt">b</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">expression</span><span
class="p">:</span><span class="w"> </span><span
class="s">"expression"</span>
+<span class="w"> </span><span class="nt">callable</span><span
class="p">:</span><span class="w"> </span><span
class="s">"callable"</span>
+<span class="w"> </span><span class="nt">name</span><span
class="p">:</span><span class="w"> </span><span
class="s">"name"</span>
+<span class="w"> </span><span class="nt">path</span><span
class="p">:</span><span class="w"> </span><span
class="s">"path"</span>
+<span class="w"> </span><span class="nt">c</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">...</span>
<span class="w"> </span><span class="nt">error_handling</span><span
class="p">:</span>
<span class="w"> </span><span class="nt">output</span><span
class="p">:</span><span class="w"> </span><span
class="s">"output"</span>
</code></pre></div>
@@ -1293,54 +1375,128 @@ If query is set, neither row_restriction nor fields
should be set.</p>
<h3 id="configuration_25">Configuration</h3>
<ul>
<li>
-<p><strong>table</strong> <code>string</code> (Optional) : The table to read
from, specified as <code>DATASET.TABLE</code>
- or <code>PROJECT:DATASET.TABLE</code>.</p>
+<p><strong>query</strong> <code>string</code> (Optional) : The SQL query to
be executed to read from the BigQuery table.</p>
</li>
<li>
-<p><strong>query</strong> <code>string</code> (Optional) : A query to be used
instead of the table argument.</p>
+<p><strong>table</strong> <code>string</code> (Optional) : The
fully-qualified name of the BigQuery table to read from. Format:
[${PROJECT}:]${DATASET}.${TABLE}</p>
</li>
<li>
-<p><strong>row_restriction</strong> <code>string</code> (Optional) : Optional
SQL text filtering statement, similar to a
- WHERE clause in a query. Aggregates are not supported. Restricted to a
- maximum length for 1 MB.</p>
+<p><strong>fields</strong> <code>Array[string]</code> (Optional) : Read only
the specified fields (columns) from a BigQuery table. Fields may not be
returned in the order specified. If no value is specified, then all fields are
returned. Example: "col1, col2, col3"</p>
</li>
<li>
-<p><strong>fields</strong> <code>Array[string]</code> (Optional)</p>
+<p><strong>row_restriction</strong> <code>string</code> (Optional) : Read
only rows that match this filter, which must be compatible with Google standard
SQL. This is not supported when reading via query.</p>
</li>
</ul>
<h3 id="usage_25">Usage</h3>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">ReadFromBigQuery</span>
<span class="nt">config</span><span class="p">:</span>
-<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"table"</span>
<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">"query"</span>
-<span class="w"> </span><span class="nt">row_restriction</span><span
class="p">:</span><span class="w"> </span><span
class="s">"row_restriction"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"table"</span>
<span class="w"> </span><span class="nt">fields</span><span class="p">:</span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="s">"field"</span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="s">"field"</span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="w"> </span><span class="nt">row_restriction</span><span
class="p">:</span><span class="w"> </span><span
class="s">"row_restriction"</span>
</code></pre></div>
<hr><h2 id="writetobigquery">WriteToBigQuery</h2>
<h3 id="configuration_26">Configuration</h3>
+<ul>
+<li>
+<p><strong>table</strong> <code>string</code> : The bigquery table to write
to. Format: [${PROJECT}:]${DATASET}.${TABLE}</p>
+</li>
+<li>
+<p><strong>create_disposition</strong> <code>string</code> (Optional) :
Optional field that specifies whether the job is allowed to create new tables.
The following values are supported: CREATE_IF_NEEDED (the job may create the
table), CREATE_NEVER (the job must fail if the table does not exist
already).</p>
+</li>
+<li>
+<p><strong>write_disposition</strong> <code>string</code> (Optional) :
Specifies the action that occurs if the destination table already exists. The
following values are supported: WRITE_TRUNCATE (overwrites the table data),
WRITE_APPEND (append the data to the table), WRITE_EMPTY (job must fail if the
table is not empty).</p>
+</li>
+<li>
+<p><strong>error_handling</strong> <code>Row</code> (Optional) : This option
specifies whether and where to output unwritable rows. </p>
+<p>Row fields:</p>
+<ul>
+<li><strong>output</strong> <code>string</code> : Name to use for the output
error collection</li>
+</ul>
+</li>
+<li>
+<p><strong>num_streams</strong> <code>int32</code> (Optional) : Specifies the
number of write streams that the Storage API sink will use. This parameter is
only applicable when writing unbounded data.</p>
+</li>
+</ul>
<h3 id="usage_26">Usage</h3>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">WriteToBigQuery</span>
-<span class="nt">input</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
-<span class="nt">config</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"table"</span>
+<span class="w"> </span><span class="nt">create_disposition</span><span
class="p">:</span><span class="w"> </span><span
class="s">"create_disposition"</span>
+<span class="w"> </span><span class="nt">write_disposition</span><span
class="p">:</span><span class="w"> </span><span
class="s">"write_disposition"</span>
+<span class="w"> </span><span class="nt">error_handling</span><span
class="p">:</span>
+<span class="w"> </span><span class="nt">output</span><span
class="p">:</span><span class="w"> </span><span
class="s">"output"</span>
+<span class="w"> </span><span class="nt">num_streams</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">num_streams</span>
</code></pre></div>
<hr><h2 id="readfrombigtable">ReadFromBigTable</h2>
+<p>Reads data from a Google Cloud Bigtable table.
+The transform requires the project ID, instance ID, and table ID parameters.
+Optionally, the output can be flattened or nested rows.
+Example usage:
+ - type: ReadFromBigTable
+ config:
+ project: "my-gcp-project"
+ instance: "my-bigtable-instance"
+ table: "my-table"</p>
<h3 id="configuration_27">Configuration</h3>
+<ul>
+<li>
+<p><strong>project</strong> <code>string</code> : Google Cloud project ID
containing the Bigtable instance.</p>
+</li>
+<li>
+<p><strong>instance</strong> <code>string</code> : Bigtable instance ID to
connect to.</p>
+</li>
+<li>
+<p><strong>table</strong> <code>string</code> : Bigtable table ID to read
from.</p>
+</li>
+<li>
+<p><strong>flatten</strong> <code>boolean</code> (Optional) : If set to
false, output rows are nested; if true or omitted, output rows are
flattened.</p>
+</li>
+</ul>
<h3 id="usage_27">Usage</h3>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">ReadFromBigTable</span>
-<span class="nt">config</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">project</span><span
class="p">:</span><span class="w"> </span><span
class="s">"project"</span>
+<span class="w"> </span><span class="nt">instance</span><span
class="p">:</span><span class="w"> </span><span
class="s">"instance"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"table"</span>
+<span class="w"> </span><span class="nt">flatten</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">true|false</span>
</code></pre></div>
<hr><h2 id="writetobigtable">WriteToBigTable</h2>
+<p>Writes data to a Google Cloud Bigtable table.
+This transform requires the Google Cloud project ID, Bigtable instance ID, and
table ID.
+The input PCollection should be schema-compliant mutations or keyed rows.
+Example usage:
+ - type: WriteToBigTable
+ input: input
+ config:
+ project: "my-gcp-project"
+ instance: "my-bigtable-instance"
+ table: "my-table"</p>
<h3 id="configuration_28">Configuration</h3>
+<ul>
+<li>
+<p><strong>project</strong> <code>string</code> : Google Cloud project ID
containing the Bigtable instance.</p>
+</li>
+<li>
+<p><strong>instance</strong> <code>string</code> : Bigtable instance ID where
the table is located.</p>
+</li>
+<li>
+<p><strong>table</strong> <code>string</code> : Bigtable table ID to write
data into.</p>
+</li>
+</ul>
<h3 id="usage_28">Usage</h3>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">WriteToBigTable</span>
<span class="nt">input</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
-<span class="nt">config</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">project</span><span
class="p">:</span><span class="w"> </span><span
class="s">"project"</span>
+<span class="w"> </span><span class="nt">instance</span><span
class="p">:</span><span class="w"> </span><span
class="s">"instance"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"table"</span>
</code></pre></div>
<hr><h2 id="readfromcsv">ReadFromCsv</h2>
@@ -1390,8 +1546,8 @@ comma-separated values (csv) files.</p>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">WriteToCsv</span>
<span class="nt">input</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
<span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">delimiter</span><span
class="p">:</span><span class="w"> </span><span
class="s">"delimiter"</span>
<span class="w"> </span><span class="nt">path</span><span
class="p">:</span><span class="w"> </span><span
class="s">"path"</span>
-<span class="w"> </span><span class="nt">delimiter</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">delimiter</span>
</code></pre></div>
<hr><h2 id="readfromiceberg">ReadFromIceberg</h2>
@@ -1570,44 +1726,310 @@
https://iceberg.apache.org/spec/#partition-transforms.</li>
<hr><h2 id="readfromicebergcdc">ReadFromIcebergCDC</h2>
<h3 id="configuration_33">Configuration</h3>
+<ul>
+<li>
+<p><strong>table</strong> <code>string</code> : Identifier of the Iceberg
table.</p>
+</li>
+<li>
+<p><strong>catalog_name</strong> <code>string</code> (Optional) : Name of the
catalog containing the table.</p>
+</li>
+<li>
+<p><strong>catalog_properties</strong> <code>Map[string, string]</code>
(Optional) : Properties used to set up the Iceberg catalog.</p>
+</li>
+<li>
+<p><strong>config_properties</strong> <code>Map[string, string]</code>
(Optional) : Properties passed to the Hadoop Configuration.</p>
+</li>
+<li>
+<p><strong>drop</strong> <code>Array[string]</code> (Optional) : A subset of
column names to exclude from reading. If null or empty, all columns will be
read.</p>
+</li>
+<li>
+<p><strong>filter</strong> <code>string</code> (Optional) : SQL-like
predicate to filter data at scan time. Example: "id > 5 AND status =
'ACTIVE'". Uses Apache Calcite syntax:
https://calcite.apache.org/docs/reference.html</p>
+</li>
+<li>
+<p><strong>from_snapshot</strong> <code>int64</code> (Optional) : Starts
reading from this snapshot ID (inclusive).</p>
+</li>
+<li>
+<p><strong>from_timestamp</strong> <code>int64</code> (Optional) : Starts
reading from the first snapshot (inclusive) that was created after this
timestamp (in milliseconds).</p>
+</li>
+<li>
+<p><strong>keep</strong> <code>Array[string]</code> (Optional) : A subset of
column names to read exclusively. If null or empty, all columns will be
read.</p>
+</li>
+<li>
+<p><strong>poll_interval_seconds</strong> <code>int32</code> (Optional) : The
interval at which to poll for new snapshots. Defaults to 60 seconds.</p>
+</li>
+<li>
+<p><strong>starting_strategy</strong> <code>string</code> (Optional) : The
source's starting strategy. Valid options are: "earliest" or "latest". Can be
overriden by setting a starting snapshot or timestamp. Defaults to earliest for
batch, and latest for streaming.</p>
+</li>
+<li>
+<p><strong>streaming</strong> <code>boolean</code> (Optional) : Enables
streaming reads, where source continuously polls for snapshots forever.</p>
+</li>
+<li>
+<p><strong>to_snapshot</strong> <code>int64</code> (Optional) : Reads up to
this snapshot ID (inclusive).</p>
+</li>
+<li>
+<p><strong>to_timestamp</strong> <code>int64</code> (Optional) : Reads up to
the latest snapshot (inclusive) created before this timestamp (in
milliseconds).</p>
+</li>
+</ul>
<h3 id="usage_33">Usage</h3>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">ReadFromIcebergCDC</span>
-<span class="nt">config</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"table"</span>
+<span class="w"> </span><span class="nt">catalog_name</span><span
class="p">:</span><span class="w"> </span><span
class="s">"catalog_name"</span>
+<span class="w"> </span><span class="nt">catalog_properties</span><span
class="p">:</span>
+<span class="w"> </span><span class="nt">a</span><span
class="p">:</span><span class="w"> </span><span
class="s">"catalog_properties_value_a"</span>
+<span class="w"> </span><span class="nt">b</span><span
class="p">:</span><span class="w"> </span><span
class="s">"catalog_properties_value_b"</span>
+<span class="w"> </span><span class="nt">c</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">...</span>
+<span class="w"> </span><span class="nt">config_properties</span><span
class="p">:</span>
+<span class="w"> </span><span class="nt">a</span><span
class="p">:</span><span class="w"> </span><span
class="s">"config_properties_value_a"</span>
+<span class="w"> </span><span class="nt">b</span><span
class="p">:</span><span class="w"> </span><span
class="s">"config_properties_value_b"</span>
+<span class="w"> </span><span class="nt">c</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">...</span>
+<span class="w"> </span><span class="nt">drop</span><span class="p">:</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="s">"drop"</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="s">"drop"</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="w"> </span><span class="nt">filter</span><span
class="p">:</span><span class="w"> </span><span
class="s">"filter"</span>
+<span class="w"> </span><span class="nt">from_snapshot</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">from_snapshot</span>
+<span class="w"> </span><span class="nt">from_timestamp</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">from_timestamp</span>
+<span class="w"> </span><span class="nt">keep</span><span class="p">:</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="s">"keep"</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="s">"keep"</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="w"> </span><span class="nt">poll_interval_seconds</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">poll_interval_seconds</span>
+<span class="w"> </span><span class="nt">starting_strategy</span><span
class="p">:</span><span class="w"> </span><span
class="s">"starting_strategy"</span>
+<span class="w"> </span><span class="nt">streaming</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">true|false</span>
+<span class="w"> </span><span class="nt">to_snapshot</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">to_snapshot</span>
+<span class="w"> </span><span class="nt">to_timestamp</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">to_timestamp</span>
</code></pre></div>
<hr><h2 id="readfromjdbc">ReadFromJdbc</h2>
+<p>Read from a JDBC source using a SQL query or by directly accessing a single
table.</p>
+<p>This transform can be used to read from a JDBC source using either a given
JDBC driver jar and class name, or by using one of the default packaged drivers
given a <code>jdbc_type</code>.</p>
+<h4 id="using-a-default-driver">Using a default driver</h4>
+<p>This transform comes packaged with drivers for several popular JDBC
distributions. The following distributions can be declared as the
<code>jdbc_type</code>: mysql, oracle, postgres, mssql.</p>
+<p>For example, reading a MySQL source using a SQL query: </p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">ReadFromJdbc</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">jdbc_type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">mysql</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span
class="s">"jdbc:mysql://my-host:3306/database"</span>
+<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">"SELECT</span><span class="nv"> </span><span
class="s">*</span><span class="nv"> </span><span class="s">FROM</span><span
class="nv"> </span><span class="s">table"</span>
+</code></pre></div>
+
+<p><strong>Note</strong>: See the following transforms which are built on top
of this transform and simplify this logic for several popular JDBC
distributions:</p>
+<ul>
+<li><a href="#readfrommysql">ReadFromMySql</a></li>
+<li><a href="#readfrompostgres">ReadFromPostgres</a></li>
+<li><a href="#readfromoracle">ReadFromOracle</a></li>
+<li><a href="#readfromsqlserver">ReadFromSqlServer</a></li>
+</ul>
+<h4 id="declaring-custom-jdbc-drivers">Declaring custom JDBC drivers</h4>
+<p>If reading from a JDBC source not listed above, or if it is necessary to
use a custom driver not packaged with Beam, one must define a JDBC driver and
class name.</p>
+<p>For example, reading a MySQL source table: </p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">ReadFromJdbc</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">driver_jars</span><span
class="p">:</span><span class="w"> </span><span
class="s">"path/to/some/jdbc.jar"</span>
+<span class="w"> </span><span class="nt">driver_class_name</span><span
class="p">:</span><span class="w"> </span><span
class="s">"com.mysql.jdbc.Driver"</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span
class="s">"jdbc:mysql://my-host:3306/database"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"my-table"</span>
+</code></pre></div>
+
+<h4 id="connection-properties">Connection Properties</h4>
+<p>Connection properties are properties sent to the Driver used to connect to
the JDBC source. For example, to set the character encoding to UTF-8, one could
write: </p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">ReadFromJdbc</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">connectionProperties</span><span
class="p">:</span><span class="w"> </span><span
class="s">"characterEncoding=UTF-8;"</span>
+<span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">...</span>
+</code></pre></div>
+
+<p>All properties should be semi-colon-delimited (e.g.
"key1=value1;key2=value2;")</p>
<h3 id="configuration_34">Configuration</h3>
+<ul>
+<li>
+<p><strong>url</strong> <code>string</code> : Connection URL for the JDBC
source.</p>
+</li>
+<li>
+<p><strong>connection_init_sql</strong> <code>Array[string]</code> (Optional)
: Sets the connection init sql statements used by the Driver. Only MySQL and
MariaDB support this.</p>
+</li>
+<li>
+<p><strong>connection_properties</strong> <code>string</code> (Optional) :
Used to set connection properties passed to the JDBC driver not already defined
as standalone parameter (e.g. username and password can be set using parameters
above accordingly). Format of the string must be "key1=value1;key2=value2;".</p>
+</li>
+<li>
+<p><strong>disable_auto_commit</strong> <code>boolean</code> (Optional) :
Whether to disable auto commit on read. Defaults to true if not provided. The
need for this config varies depending on the database platform. Informix
requires this to be set to false while Postgres requires this to be set to
true.</p>
+</li>
+<li>
+<p><strong>driver_class_name</strong> <code>string</code> (Optional) : Name
of a Java Driver class to use to connect to the JDBC source. For example,
"com.mysql.jdbc.Driver".</p>
+</li>
+<li>
+<p><strong>driver_jars</strong> <code>string</code> (Optional) : Comma
separated path(s) for the JDBC driver jar(s). This can be a local path or GCS
(gs://) path.</p>
+</li>
+<li>
+<p><strong>fetch_size</strong> <code>int32</code> (Optional) : This method is
used to override the size of the data that is going to be fetched and loaded in
memory per every database call. It should ONLY be used if the default value
throws memory errors.</p>
+</li>
+<li>
+<p><strong>output_parallelization</strong> <code>boolean</code> (Optional) :
Whether to reshuffle the resulting PCollection so results are distributed to
all workers.</p>
+</li>
+<li>
+<p><strong>password</strong> <code>string</code> (Optional) : Password for
the JDBC source.</p>
+</li>
+<li>
+<p><strong>query</strong> <code>string</code> (Optional) : SQL query used to
query the JDBC source.</p>
+</li>
+<li>
+<p><strong>table</strong> <code>string</code> (Optional) : Name of the table
to read from.</p>
+</li>
+<li>
+<p><strong>partition_column</strong> <code>string</code> (Optional) : Name of
a column of numeric type that will be used for partitioning.</p>
+</li>
+<li>
+<p><strong>num_partitions</strong> <code>int32</code> (Optional) : The number
of partitions</p>
+</li>
+<li>
+<p><strong>type</strong> <code>string</code> (Optional) : Type of JDBC
source. When specified, an appropriate default Driver will be packaged with the
transform. One of mysql, postgres, oracle, or mssql.</p>
+</li>
+<li>
+<p><strong>username</strong> <code>string</code> (Optional) : Username for
the JDBC source.</p>
+</li>
+</ul>
<h3 id="usage_34">Usage</h3>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">ReadFromJdbc</span>
-<span class="nt">config</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span class="s">"url"</span>
+<span class="w"> </span><span class="nt">connection_init_sql</span><span
class="p">:</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="s">"connection_init_sql"</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="s">"connection_init_sql"</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="w"> </span><span class="nt">connection_properties</span><span
class="p">:</span><span class="w"> </span><span
class="s">"connection_properties"</span>
+<span class="w"> </span><span class="nt">disable_auto_commit</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">true|false</span>
+<span class="w"> </span><span class="nt">driver_class_name</span><span
class="p">:</span><span class="w"> </span><span
class="s">"driver_class_name"</span>
+<span class="w"> </span><span class="nt">driver_jars</span><span
class="p">:</span><span class="w"> </span><span
class="s">"driver_jars"</span>
+<span class="w"> </span><span class="nt">fetch_size</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">fetch_size</span>
+<span class="w"> </span><span class="nt">output_parallelization</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">true|false</span>
+<span class="w"> </span><span class="nt">password</span><span
class="p">:</span><span class="w"> </span><span
class="s">"password"</span>
+<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">"query"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"table"</span>
+<span class="w"> </span><span class="nt">partition_column</span><span
class="p">:</span><span class="w"> </span><span
class="s">"partition_column"</span>
+<span class="w"> </span><span class="nt">num_partitions</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">num_partitions</span>
+<span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span
class="s">"type"</span>
+<span class="w"> </span><span class="nt">username</span><span
class="p">:</span><span class="w"> </span><span
class="s">"username"</span>
</code></pre></div>
<hr><h2 id="writetojdbc">WriteToJdbc</h2>
-<h3 id="configuration_35">Configuration</h3>
-<h3 id="usage_35">Usage</h3>
-<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">WriteToJdbc</span>
-<span class="nt">input</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
-<span class="nt">config</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<p>Write to a JDBC sink using a SQL query or by directly accessing a single
table.</p>
+<p>This transform can be used to write to a JDBC sink using either a given
JDBC driver jar and class name, or by using one of the default packaged drivers
given a <code>jdbc_type</code>.</p>
+<h4 id="using-a-default-driver_1">Using a default driver</h4>
+<p>This transform comes packaged with drivers for several popular JDBC
distributions. The following distributions can be declared as the
<code>jdbc_type</code>: mysql, oracle, postgres, mssql.</p>
+<p>For example, writing to a MySQL sink using a SQL query: </p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">WriteToJdbc</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">jdbc_type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">mysql</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span
class="s">"jdbc:mysql://my-host:3306/database"</span>
+<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">"INSERT</span><span class="nv"> </span><span
class="s">INTO</span><span class="nv"> </span><span class="s">table</span><span
class="nv"> </span><span class="s">VALUES(?,</span><span class="nv">
</span><span class="s">?)"</span>
</code></pre></div>
-<hr><h2 id="readfromjson">ReadFromJson</h2>
-<p>A PTransform for reading json values from files into a PCollection.</p>
-<h3 id="configuration_36">Configuration</h3>
+<p><strong>Note</strong>: See the following transforms which are built on top
of this transform and simplify this logic for several popular JDBC
distributions:</p>
<ul>
-<li><strong>path</strong> <code>string</code> : The file path to read from.
The path can contain glob
- characters such as <code>*</code> and <code>?</code>.</li>
+<li><a href="#writetomysql">WriteToMySql</a></li>
+<li><a href="#writetopostgres">WriteToPostgres</a></li>
+<li><a href="#writetooracle">WriteToOracle</a></li>
+<li><a href="#writetosqlserver">WriteToSqlServer</a></li>
</ul>
-<h3 id="usage_36">Usage</h3>
-<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">ReadFromJson</span>
-<span class="nt">config</span><span class="p">:</span>
-<span class="w"> </span><span class="nt">path</span><span
class="p">:</span><span class="w"> </span><span
class="s">"path"</span>
+<h4 id="declaring-custom-jdbc-drivers_1">Declaring custom JDBC drivers</h4>
+<p>If writing to a JDBC sink not listed above, or if it is necessary to use a
custom driver not packaged with Beam, one must define a JDBC driver and class
name.</p>
+<p>For example, writing to a MySQL table: </p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">WriteToJdbc</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">driver_jars</span><span
class="p">:</span><span class="w"> </span><span
class="s">"path/to/some/jdbc.jar"</span>
+<span class="w"> </span><span class="nt">driver_class_name</span><span
class="p">:</span><span class="w"> </span><span
class="s">"com.mysql.jdbc.Driver"</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span
class="s">"jdbc:mysql://my-host:3306/database"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"my-table"</span>
</code></pre></div>
-<hr><h2 id="writetojson">WriteToJson</h2>
-<p>A PTransform for writing a PCollection as json values to files.</p>
-<h3 id="configuration_37">Configuration</h3>
+<h4 id="connection-properties_1">Connection Properties</h4>
+<p>Connection properties are properties sent to the Driver used to connect to
the JDBC source. For example, to set the character encoding to UTF-8, one could
write: </p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">WriteToJdbc</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">connectionProperties</span><span
class="p">:</span><span class="w"> </span><span
class="s">"characterEncoding=UTF-8;"</span>
+<span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">...</span>
+</code></pre></div>
+
+<p>All properties should be semi-colon-delimited (e.g.
"key1=value1;key2=value2;")</p>
+<h3 id="configuration_35">Configuration</h3>
<ul>
-<li><strong>path</strong> <code>string</code> : The file path to write to.
The files written will
+<li>
+<p><strong>url</strong> <code>string</code> : Connection URL for the JDBC
sink.</p>
+</li>
+<li>
+<p><strong>auto_sharding</strong> <code>boolean</code> (Optional) : If true,
enables using a dynamically determined number of shards to write.</p>
+</li>
+<li>
+<p><strong>connection_init_sql</strong> <code>Array[string]</code> (Optional)
: Sets the connection init sql statements used by the Driver. Only MySQL and
MariaDB support this.</p>
+</li>
+<li>
+<p><strong>connection_properties</strong> <code>string</code> (Optional) :
Used to set connection properties passed to the JDBC driver not already defined
as standalone parameter (e.g. username and password can be set using parameters
above accordingly). Format of the string must be "key1=value1;key2=value2;".</p>
+</li>
+<li>
+<p><strong>driver_class_name</strong> <code>string</code> (Optional) : Name
of a Java Driver class to use to connect to the JDBC source. For example,
"com.mysql.jdbc.Driver".</p>
+</li>
+<li>
+<p><strong>driver_jars</strong> <code>string</code> (Optional) : Comma
separated path(s) for the JDBC driver jar(s). This can be a local path or GCS
(gs://) path.</p>
+</li>
+<li>
+<p><strong>password</strong> <code>string</code> (Optional) : Password for
the JDBC source.</p>
+</li>
+<li>
+<p><strong>table</strong> <code>string</code> (Optional) : Name of the table
to write to.</p>
+</li>
+<li>
+<p><strong>batch_size</strong> <code>int64</code> (Optional) </p>
+</li>
+<li>
+<p><strong>type</strong> <code>string</code> (Optional) : Type of JDBC
source. When specified, an appropriate default Driver will be packaged with the
transform. One of mysql, postgres, oracle, or mssql.</p>
+</li>
+<li>
+<p><strong>username</strong> <code>string</code> (Optional) : Username for
the JDBC source.</p>
+</li>
+<li>
+<p><strong>query</strong> <code>string</code> (Optional) : SQL query used to
insert records into the JDBC sink.</p>
+</li>
+</ul>
+<h3 id="usage_35">Usage</h3>
+<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">WriteToJdbc</span>
+<span class="nt">input</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span class="s">"url"</span>
+<span class="w"> </span><span class="nt">auto_sharding</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">true|false</span>
+<span class="w"> </span><span class="nt">connection_init_sql</span><span
class="p">:</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="s">"connection_init_sql"</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="s">"connection_init_sql"</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="w"> </span><span class="nt">connection_properties</span><span
class="p">:</span><span class="w"> </span><span
class="s">"connection_properties"</span>
+<span class="w"> </span><span class="nt">driver_class_name</span><span
class="p">:</span><span class="w"> </span><span
class="s">"driver_class_name"</span>
+<span class="w"> </span><span class="nt">driver_jars</span><span
class="p">:</span><span class="w"> </span><span
class="s">"driver_jars"</span>
+<span class="w"> </span><span class="nt">password</span><span
class="p">:</span><span class="w"> </span><span
class="s">"password"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"table"</span>
+<span class="w"> </span><span class="nt">batch_size</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">batch_size</span>
+<span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span
class="s">"type"</span>
+<span class="w"> </span><span class="nt">username</span><span
class="p">:</span><span class="w"> </span><span
class="s">"username"</span>
+<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">"query"</span>
+</code></pre></div>
+
+<hr><h2 id="readfromjson">ReadFromJson</h2>
+<p>A PTransform for reading json values from files into a PCollection.</p>
+<h3 id="configuration_36">Configuration</h3>
+<ul>
+<li><strong>path</strong> <code>string</code> : The file path to read from.
The path can contain glob
+ characters such as <code>*</code> and <code>?</code>.</li>
+</ul>
+<h3 id="usage_36">Usage</h3>
+<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">ReadFromJson</span>
+<span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">path</span><span
class="p">:</span><span class="w"> </span><span
class="s">"path"</span>
+</code></pre></div>
+
+<hr><h2 id="writetojson">WriteToJson</h2>
+<p>A PTransform for writing a PCollection as json values to files.</p>
+<h3 id="configuration_37">Configuration</h3>
+<ul>
+<li><strong>path</strong> <code>string</code> : The file path to write to.
The files written will
begin with this prefix, followed by a shard identifier (see
<code>num_shards</code>) according to the <code>file_naming</code>
parameter.</li>
</ul>
@@ -1620,47 +2042,393 @@
https://iceberg.apache.org/spec/#partition-transforms.</li>
<hr><h2 id="readfromkafka">ReadFromKafka</h2>
<h3 id="configuration_38">Configuration</h3>
+<ul>
+<li>
+<p><strong>schema</strong> <code>string</code> (Optional) : The schema in
which the data is encoded in the Kafka topic. For AVRO data, this is a schema
defined with AVRO schema syntax
(https://avro.apache.org/docs/1.10.2/spec.html#schemas). For JSON data, this is
a schema defined with JSON-schema syntax (https://json-schema.org/). If a URL
to Confluent Schema Registry is provided, then this field is ignored, and the
schema is fetched from Confluent Schema Registry.</p>
+</li>
+<li>
+<p><strong>consumer_config</strong> <code>Map[string, string]</code>
(Optional) : A list of key-value pairs that act as configuration parameters
for Kafka consumers. Most of these configurations will not be needed, but if
you need to customize your Kafka consumer, you may use this. See a detailed
list:
https://docs.confluent.io/platform/current/installation/configuration/consumer-configs.html</p>
+</li>
+<li>
+<p><strong>format</strong> <code>string</code> (Optional) : The encoding
format for the data stored in Kafka. Valid options are:
RAW,STRING,AVRO,JSON,PROTO</p>
+</li>
+<li>
+<p><strong>topic</strong> <code>string</code> </p>
+</li>
+<li>
+<p><strong>bootstrap_servers</strong> <code>string</code> : A list of
host/port pairs to use for establishing the initial connection to the Kafka
cluster. The client will make use of all servers irrespective of which servers
are specified here for bootstrapping—this list only impacts the initial hosts
used to discover the full set of servers. This list should be in the form
<code>host1:port1,host2:port2,...</code></p>
+</li>
+<li>
+<p><strong>confluent_schema_registry_url</strong> <code>string</code>
(Optional) </p>
+</li>
+<li>
+<p><strong>confluent_schema_registry_subject</strong> <code>string</code>
(Optional) </p>
+</li>
+<li>
+<p><strong>auto_offset_reset_config</strong> <code>string</code> (Optional) :
What to do when there is no initial offset in Kafka or if the current offset
does not exist any more on the server. (1) earliest: automatically reset the
offset to the earliest offset. (2) latest: automatically reset the offset to
the latest offset (3) none: throw exception to the consumer if no previous
offset is found for the consumer’s group</p>
+</li>
+<li>
+<p><strong>error_handling</strong> <code>Row</code> (Optional) : This option
specifies whether and where to output unwritable rows. </p>
+<p>Row fields:</p>
+<ul>
+<li><strong>output</strong> <code>string</code> : Name to use for the output
error collection</li>
+</ul>
+</li>
+<li>
+<p><strong>file_descriptor_path</strong> <code>string</code> (Optional) : The
path to the Protocol Buffer File Descriptor Set file. This file is used for
schema definition and message serialization.</p>
+</li>
+<li>
+<p><strong>message_name</strong> <code>string</code> (Optional) : The name of
the Protocol Buffer message to be used for schema extraction and data
conversion.</p>
+</li>
+<li>
+<p><strong>max_read_time_seconds</strong> <code>int32</code> (Optional) :
Upper bound of how long to read from Kafka.</p>
+</li>
+</ul>
<h3 id="usage_38">Usage</h3>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">ReadFromKafka</span>
-<span class="nt">config</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">schema</span><span
class="p">:</span><span class="w"> </span><span
class="s">"schema"</span>
+<span class="w"> </span><span class="nt">consumer_config</span><span
class="p">:</span>
+<span class="w"> </span><span class="nt">a</span><span
class="p">:</span><span class="w"> </span><span
class="s">"consumer_config_value_a"</span>
+<span class="w"> </span><span class="nt">b</span><span
class="p">:</span><span class="w"> </span><span
class="s">"consumer_config_value_b"</span>
+<span class="w"> </span><span class="nt">c</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">...</span>
+<span class="w"> </span><span class="nt">format</span><span
class="p">:</span><span class="w"> </span><span
class="s">"format"</span>
+<span class="w"> </span><span class="nt">topic</span><span
class="p">:</span><span class="w"> </span><span
class="s">"topic"</span>
+<span class="w"> </span><span class="nt">bootstrap_servers</span><span
class="p">:</span><span class="w"> </span><span
class="s">"bootstrap_servers"</span>
+<span class="w"> </span><span
class="nt">confluent_schema_registry_url</span><span class="p">:</span><span
class="w"> </span><span
class="s">"confluent_schema_registry_url"</span>
+<span class="w"> </span><span
class="nt">confluent_schema_registry_subject</span><span
class="p">:</span><span class="w"> </span><span
class="s">"confluent_schema_registry_subject"</span>
+<span class="w"> </span><span class="nt">auto_offset_reset_config</span><span
class="p">:</span><span class="w"> </span><span
class="s">"auto_offset_reset_config"</span>
+<span class="w"> </span><span class="nt">error_handling</span><span
class="p">:</span>
+<span class="w"> </span><span class="nt">output</span><span
class="p">:</span><span class="w"> </span><span
class="s">"output"</span>
+<span class="w"> </span><span class="nt">file_descriptor_path</span><span
class="p">:</span><span class="w"> </span><span
class="s">"file_descriptor_path"</span>
+<span class="w"> </span><span class="nt">message_name</span><span
class="p">:</span><span class="w"> </span><span
class="s">"message_name"</span>
+<span class="w"> </span><span class="nt">max_read_time_seconds</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">max_read_time_seconds</span>
</code></pre></div>
<hr><h2 id="writetokafka">WriteToKafka</h2>
<h3 id="configuration_39">Configuration</h3>
+<ul>
+<li>
+<p><strong>format</strong> <code>string</code> : The encoding format for the
data stored in Kafka. Valid options are: RAW,JSON,AVRO,PROTO</p>
+</li>
+<li>
+<p><strong>topic</strong> <code>string</code> </p>
+</li>
+<li>
+<p><strong>bootstrap_servers</strong> <code>string</code> : A list of
host/port pairs to use for establishing the initial connection to the Kafka
cluster. The client will make use of all servers irrespective of which servers
are specified here for bootstrapping—this list only impacts the initial hosts
used to discover the full set of servers. | Format:
host1:port1,host2:port2,...</p>
+</li>
+<li>
+<p><strong>producer_config_updates</strong> <code>Map[string, string]</code>
(Optional) : A list of key-value pairs that act as configuration parameters
for Kafka producers. Most of these configurations will not be needed, but if
you need to customize your Kafka producer, you may use this. See a detailed
list:
https://docs.confluent.io/platform/current/installation/configuration/producer-configs.html</p>
+</li>
+<li>
+<p><strong>error_handling</strong> <code>Row</code> (Optional) : This option
specifies whether and where to output unwritable rows. </p>
+<p>Row fields:</p>
+<ul>
+<li><strong>output</strong> <code>string</code> : Name to use for the output
error collection</li>
+</ul>
+</li>
+<li>
+<p><strong>file_descriptor_path</strong> <code>string</code> (Optional) : The
path to the Protocol Buffer File Descriptor Set file. This file is used for
schema definition and message serialization.</p>
+</li>
+<li>
+<p><strong>message_name</strong> <code>string</code> (Optional) : The name of
the Protocol Buffer message to be used for schema extraction and data
conversion.</p>
+</li>
+<li>
+<p><strong>schema</strong> <code>string</code> (Optional)</p>
+</li>
+</ul>
<h3 id="usage_39">Usage</h3>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">WriteToKafka</span>
<span class="nt">input</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
-<span class="nt">config</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">format</span><span
class="p">:</span><span class="w"> </span><span
class="s">"format"</span>
+<span class="w"> </span><span class="nt">topic</span><span
class="p">:</span><span class="w"> </span><span
class="s">"topic"</span>
+<span class="w"> </span><span class="nt">bootstrap_servers</span><span
class="p">:</span><span class="w"> </span><span
class="s">"bootstrap_servers"</span>
+<span class="w"> </span><span class="nt">producer_config_updates</span><span
class="p">:</span>
+<span class="w"> </span><span class="nt">a</span><span
class="p">:</span><span class="w"> </span><span
class="s">"producer_config_updates_value_a"</span>
+<span class="w"> </span><span class="nt">b</span><span
class="p">:</span><span class="w"> </span><span
class="s">"producer_config_updates_value_b"</span>
+<span class="w"> </span><span class="nt">c</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">...</span>
+<span class="w"> </span><span class="nt">error_handling</span><span
class="p">:</span>
+<span class="w"> </span><span class="nt">output</span><span
class="p">:</span><span class="w"> </span><span
class="s">"output"</span>
+<span class="w"> </span><span class="nt">file_descriptor_path</span><span
class="p">:</span><span class="w"> </span><span
class="s">"file_descriptor_path"</span>
+<span class="w"> </span><span class="nt">message_name</span><span
class="p">:</span><span class="w"> </span><span
class="s">"message_name"</span>
+<span class="w"> </span><span class="nt">schema</span><span
class="p">:</span><span class="w"> </span><span
class="s">"schema"</span>
</code></pre></div>
<hr><h2 id="readfrommysql">ReadFromMySql</h2>
+<p>Read from a MySQL source using a SQL query or by directly accessing a
single table.</p>
+<p>This is a special case of <a href="#readfromjdbc">ReadFromJdbc</a> that
includes the necessary MySQL Driver and classes.</p>
+<p>An example of using ReadFromMySql with SQL query: </p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">ReadFromMySql</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span
class="s">"jdbc:mysql://my-host:3306/database"</span>
+<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">"SELECT</span><span class="nv"> </span><span
class="s">*</span><span class="nv"> </span><span class="s">FROM</span><span
class="nv"> </span><span class="s">table"</span>
+</code></pre></div>
+
+<p>It is also possible to read a table by specifying a table name. For
example, the following configuration will perform a read on an entire table:
</p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">ReadFromMySql</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span
class="s">"jdbc:mysql://my-host:3306/database"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"my-table"</span>
+</code></pre></div>
+
+<h4 id="advanced-usage">Advanced Usage</h4>
+<p>It might be necessary to use a custom JDBC driver that is not packaged with
this transform. If that is the case, see <a
href="#readfromjdbc">ReadFromJdbc</a> which allows for more custom
configuration.</p>
<h3 id="configuration_40">Configuration</h3>
+<ul>
+<li>
+<p><strong>url</strong> <code>string</code> : Connection URL for the JDBC
source.</p>
+</li>
+<li>
+<p><strong>connection_init_sql</strong> <code>Array[string]</code> (Optional)
: Sets the connection init sql statements used by the Driver. Only MySQL and
MariaDB support this.</p>
+</li>
+<li>
+<p><strong>connection_properties</strong> <code>string</code> (Optional) :
Used to set connection properties passed to the JDBC driver not already defined
as standalone parameter (e.g. username and password can be set using parameters
above accordingly). Format of the string must be "key1=value1;key2=value2;".</p>
+</li>
+<li>
+<p><strong>disable_auto_commit</strong> <code>boolean</code> (Optional) :
Whether to disable auto commit on read. Defaults to true if not provided. The
need for this config varies depending on the database platform. Informix
requires this to be set to false while Postgres requires this to be set to
true.</p>
+</li>
+<li>
+<p><strong>fetch_size</strong> <code>int32</code> (Optional) : This method is
used to override the size of the data that is going to be fetched and loaded in
memory per every database call. It should ONLY be used if the default value
throws memory errors.</p>
+</li>
+<li>
+<p><strong>output_parallelization</strong> <code>boolean</code> (Optional) :
Whether to reshuffle the resulting PCollection so results are distributed to
all workers.</p>
+</li>
+<li>
+<p><strong>password</strong> <code>string</code> (Optional) : Password for
the JDBC source.</p>
+</li>
+<li>
+<p><strong>query</strong> <code>string</code> (Optional) : SQL query used to
query the JDBC source.</p>
+</li>
+<li>
+<p><strong>table</strong> <code>string</code> (Optional) : Name of the table
to read from.</p>
+</li>
+<li>
+<p><strong>partition_column</strong> <code>string</code> (Optional) : Name of
a column of numeric type that will be used for partitioning.</p>
+</li>
+<li>
+<p><strong>num_partitions</strong> <code>int32</code> (Optional) : The number
of partitions</p>
+</li>
+<li>
+<p><strong>username</strong> <code>string</code> (Optional) : Username for
the JDBC source.</p>
+</li>
+</ul>
<h3 id="usage_40">Usage</h3>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">ReadFromMySql</span>
-<span class="nt">config</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span class="s">"url"</span>
+<span class="w"> </span><span class="nt">connection_init_sql</span><span
class="p">:</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="s">"connection_init_sql"</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="s">"connection_init_sql"</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="w"> </span><span class="nt">connection_properties</span><span
class="p">:</span><span class="w"> </span><span
class="s">"connection_properties"</span>
+<span class="w"> </span><span class="nt">disable_auto_commit</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">true|false</span>
+<span class="w"> </span><span class="nt">fetch_size</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">fetch_size</span>
+<span class="w"> </span><span class="nt">output_parallelization</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">true|false</span>
+<span class="w"> </span><span class="nt">password</span><span
class="p">:</span><span class="w"> </span><span
class="s">"password"</span>
+<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">"query"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"table"</span>
+<span class="w"> </span><span class="nt">partition_column</span><span
class="p">:</span><span class="w"> </span><span
class="s">"partition_column"</span>
+<span class="w"> </span><span class="nt">num_partitions</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">num_partitions</span>
+<span class="w"> </span><span class="nt">username</span><span
class="p">:</span><span class="w"> </span><span
class="s">"username"</span>
</code></pre></div>
<hr><h2 id="writetomysql">WriteToMySql</h2>
+<p>Write to a MySQL sink using a SQL query or by directly accessing a single
table.</p>
+<p>This is a special case of <a href="#writetojdbc">WriteToJdbc</a> that
includes the necessary MySQL Driver and classes.</p>
+<p>An example of using WriteToMySql with SQL query: </p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">WriteToMySql</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span
class="s">"jdbc:mysql://my-host:3306/database"</span>
+<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">"INSERT</span><span class="nv"> </span><span
class="s">INTO</span><span class="nv"> </span><span class="s">table</span><span
class="nv"> </span><span class="s">VALUES(?,</span><span class="nv">
</span><span class="s">?)"</span>
+</code></pre></div>
+
+<p>It is also possible to read a table by specifying a table name. For
example, the following configuration will perform a read on an entire table:
</p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">WriteToMySql</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span
class="s">"jdbc:mysql://my-host:3306/database"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"my-table"</span>
+</code></pre></div>
+
+<h4 id="advanced-usage_1">Advanced Usage</h4>
+<p>It might be necessary to use a custom JDBC driver that is not packaged with
this transform. If that is the case, see <a href="#writetojdbc">WriteToJdbc</a>
which allows for more custom configuration.</p>
<h3 id="configuration_41">Configuration</h3>
+<ul>
+<li>
+<p><strong>url</strong> <code>string</code> : Connection URL for the JDBC
sink.</p>
+</li>
+<li>
+<p><strong>auto_sharding</strong> <code>boolean</code> (Optional) : If true,
enables using a dynamically determined number of shards to write.</p>
+</li>
+<li>
+<p><strong>connection_init_sql</strong> <code>Array[string]</code> (Optional)
: Sets the connection init sql statements used by the Driver. Only MySQL and
MariaDB support this.</p>
+</li>
+<li>
+<p><strong>connection_properties</strong> <code>string</code> (Optional) :
Used to set connection properties passed to the JDBC driver not already defined
as standalone parameter (e.g. username and password can be set using parameters
above accordingly). Format of the string must be "key1=value1;key2=value2;".</p>
+</li>
+<li>
+<p><strong>password</strong> <code>string</code> (Optional) : Password for
the JDBC source.</p>
+</li>
+<li>
+<p><strong>table</strong> <code>string</code> (Optional) : Name of the table
to write to.</p>
+</li>
+<li>
+<p><strong>batch_size</strong> <code>int64</code> (Optional) </p>
+</li>
+<li>
+<p><strong>username</strong> <code>string</code> (Optional) : Username for
the JDBC source.</p>
+</li>
+<li>
+<p><strong>query</strong> <code>string</code> (Optional) : SQL query used to
insert records into the JDBC sink.</p>
+</li>
+</ul>
<h3 id="usage_41">Usage</h3>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">WriteToMySql</span>
<span class="nt">input</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
-<span class="nt">config</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span class="s">"url"</span>
+<span class="w"> </span><span class="nt">auto_sharding</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">true|false</span>
+<span class="w"> </span><span class="nt">connection_init_sql</span><span
class="p">:</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="s">"connection_init_sql"</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="s">"connection_init_sql"</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="w"> </span><span class="nt">connection_properties</span><span
class="p">:</span><span class="w"> </span><span
class="s">"connection_properties"</span>
+<span class="w"> </span><span class="nt">password</span><span
class="p">:</span><span class="w"> </span><span
class="s">"password"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"table"</span>
+<span class="w"> </span><span class="nt">batch_size</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">batch_size</span>
+<span class="w"> </span><span class="nt">username</span><span
class="p">:</span><span class="w"> </span><span
class="s">"username"</span>
+<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">"query"</span>
</code></pre></div>
<hr><h2 id="readfromoracle">ReadFromOracle</h2>
+<p>Read from a Oracle source using a SQL query or by directly accessing a
single table.</p>
+<p>This is a special case of <a href="#readfromjdbc">ReadFromJdbc</a> that
includes the necessary Oracle Driver and classes.</p>
+<p>An example of using ReadFromOracle with SQL query: </p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">ReadFromOracle</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span
class="s">"jdbc:oracle://my-host:1521/database"</span>
+<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">"SELECT</span><span class="nv"> </span><span
class="s">*</span><span class="nv"> </span><span class="s">FROM</span><span
class="nv"> </span><span class="s">table"</span>
+</code></pre></div>
+
+<p>It is also possible to read a table by specifying a table name. For
example, the following configuration will perform a read on an entire table:
</p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">ReadFromOracle</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span
class="s">"jdbc:oracle://my-host:1521/database"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"my-table"</span>
+</code></pre></div>
+
+<h4 id="advanced-usage_2">Advanced Usage</h4>
+<p>It might be necessary to use a custom JDBC driver that is not packaged with
this transform. If that is the case, see <a
href="#readfromjdbc">ReadFromJdbc</a> which allows for more custom
configuration.</p>
<h3 id="configuration_42">Configuration</h3>
+<ul>
+<li>
+<p><strong>url</strong> <code>string</code> : Connection URL for the JDBC
source.</p>
+</li>
+<li>
+<p><strong>connection_properties</strong> <code>string</code> (Optional) :
Used to set connection properties passed to the JDBC driver not already defined
as standalone parameter (e.g. username and password can be set using parameters
above accordingly). Format of the string must be "key1=value1;key2=value2;".</p>
+</li>
+<li>
+<p><strong>disable_auto_commit</strong> <code>boolean</code> (Optional) :
Whether to disable auto commit on read. Defaults to true if not provided. The
need for this config varies depending on the database platform. Informix
requires this to be set to false while Postgres requires this to be set to
true.</p>
+</li>
+<li>
+<p><strong>fetch_size</strong> <code>int32</code> (Optional) : This method is
used to override the size of the data that is going to be fetched and loaded in
memory per every database call. It should ONLY be used if the default value
throws memory errors.</p>
+</li>
+<li>
+<p><strong>output_parallelization</strong> <code>boolean</code> (Optional) :
Whether to reshuffle the resulting PCollection so results are distributed to
all workers.</p>
+</li>
+<li>
+<p><strong>password</strong> <code>string</code> (Optional) : Password for
the JDBC source.</p>
+</li>
+<li>
+<p><strong>query</strong> <code>string</code> (Optional) : SQL query used to
query the JDBC source.</p>
+</li>
+<li>
+<p><strong>table</strong> <code>string</code> (Optional) : Name of the table
to read from.</p>
+</li>
+<li>
+<p><strong>partition_column</strong> <code>string</code> (Optional) : Name of
a column of numeric type that will be used for partitioning.</p>
+</li>
+<li>
+<p><strong>num_partitions</strong> <code>int32</code> (Optional) : The number
of partitions</p>
+</li>
+<li>
+<p><strong>username</strong> <code>string</code> (Optional) : Username for
the JDBC source.</p>
+</li>
+</ul>
<h3 id="usage_42">Usage</h3>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">ReadFromOracle</span>
-<span class="nt">config</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span class="s">"url"</span>
+<span class="w"> </span><span class="nt">connection_properties</span><span
class="p">:</span><span class="w"> </span><span
class="s">"connection_properties"</span>
+<span class="w"> </span><span class="nt">disable_auto_commit</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">true|false</span>
+<span class="w"> </span><span class="nt">fetch_size</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">fetch_size</span>
+<span class="w"> </span><span class="nt">output_parallelization</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">true|false</span>
+<span class="w"> </span><span class="nt">password</span><span
class="p">:</span><span class="w"> </span><span
class="s">"password"</span>
+<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">"query"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"table"</span>
+<span class="w"> </span><span class="nt">partition_column</span><span
class="p">:</span><span class="w"> </span><span
class="s">"partition_column"</span>
+<span class="w"> </span><span class="nt">num_partitions</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">num_partitions</span>
+<span class="w"> </span><span class="nt">username</span><span
class="p">:</span><span class="w"> </span><span
class="s">"username"</span>
</code></pre></div>
<hr><h2 id="writetooracle">WriteToOracle</h2>
+<p>Write to a Oracle sink using a SQL query or by directly accessing a single
table.</p>
+<p>This is a special case of <a href="#writetojdbc">WriteToJdbc</a> that
includes the necessary Oracle Driver and classes.</p>
+<p>An example of using WriteToOracle with SQL query: </p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">WriteToOracle</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span
class="s">"jdbc:oracle://my-host:1521/database"</span>
+<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">"INSERT</span><span class="nv"> </span><span
class="s">INTO</span><span class="nv"> </span><span class="s">table</span><span
class="nv"> </span><span class="s">VALUES(?,</span><span class="nv">
</span><span class="s">?)"</span>
+</code></pre></div>
+
+<p>It is also possible to read a table by specifying a table name. For
example, the following configuration will perform a read on an entire table:
</p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">WriteToOracle</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span
class="s">"jdbc:oracle://my-host:1521/database"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"my-table"</span>
+</code></pre></div>
+
+<h4 id="advanced-usage_3">Advanced Usage</h4>
+<p>It might be necessary to use a custom JDBC driver that is not packaged with
this transform. If that is the case, see <a href="#writetojdbc">WriteToJdbc</a>
which allows for more custom configuration.</p>
<h3 id="configuration_43">Configuration</h3>
+<ul>
+<li>
+<p><strong>url</strong> <code>string</code> : Connection URL for the JDBC
sink.</p>
+</li>
+<li>
+<p><strong>auto_sharding</strong> <code>boolean</code> (Optional) : If true,
enables using a dynamically determined number of shards to write.</p>
+</li>
+<li>
+<p><strong>connection_properties</strong> <code>string</code> (Optional) :
Used to set connection properties passed to the JDBC driver not already defined
as standalone parameter (e.g. username and password can be set using parameters
above accordingly). Format of the string must be "key1=value1;key2=value2;".</p>
+</li>
+<li>
+<p><strong>password</strong> <code>string</code> (Optional) : Password for
the JDBC source.</p>
+</li>
+<li>
+<p><strong>table</strong> <code>string</code> (Optional) : Name of the table
to write to.</p>
+</li>
+<li>
+<p><strong>batch_size</strong> <code>int64</code> (Optional) </p>
+</li>
+<li>
+<p><strong>username</strong> <code>string</code> (Optional) : Username for
the JDBC source.</p>
+</li>
+<li>
+<p><strong>query</strong> <code>string</code> (Optional) : SQL query used to
insert records into the JDBC sink.</p>
+</li>
+</ul>
<h3 id="usage_43">Usage</h3>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">WriteToOracle</span>
<span class="nt">input</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
-<span class="nt">config</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span class="s">"url"</span>
+<span class="w"> </span><span class="nt">auto_sharding</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">true|false</span>
+<span class="w"> </span><span class="nt">connection_properties</span><span
class="p">:</span><span class="w"> </span><span
class="s">"connection_properties"</span>
+<span class="w"> </span><span class="nt">password</span><span
class="p">:</span><span class="w"> </span><span
class="s">"password"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"table"</span>
+<span class="w"> </span><span class="nt">batch_size</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">batch_size</span>
+<span class="w"> </span><span class="nt">username</span><span
class="p">:</span><span class="w"> </span><span
class="s">"username"</span>
+<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">"query"</span>
</code></pre></div>
<hr><h2 id="readfromparquet">ReadFromParquet</h2>
@@ -1689,18 +2457,134 @@
https://iceberg.apache.org/spec/#partition-transforms.</li>
</code></pre></div>
<hr><h2 id="readfrompostgres">ReadFromPostgres</h2>
+<p>Read from a Postgres source using a SQL query or by directly accessing a
single table.</p>
+<p>This is a special case of <a href="#readfromjdbc">ReadFromJdbc</a> that
includes the necessary Postgres Driver and classes.</p>
+<p>An example of using ReadFromPostgres with SQL query: </p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">ReadFromPostgres</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span
class="s">"jdbc:postgresql://my-host:5432/database"</span>
+<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">"SELECT</span><span class="nv"> </span><span
class="s">*</span><span class="nv"> </span><span class="s">FROM</span><span
class="nv"> </span><span class="s">table"</span>
+</code></pre></div>
+
+<p>It is also possible to read a table by specifying a table name. For
example, the following configuration will perform a read on an entire table:
</p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">ReadFromPostgres</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span
class="s">"jdbc:postgresql://my-host:5432/database"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"my-table"</span>
+</code></pre></div>
+
+<h4 id="advanced-usage_4">Advanced Usage</h4>
+<p>It might be necessary to use a custom JDBC driver that is not packaged with
this transform. If that is the case, see <a
href="#readfromjdbc">ReadFromJdbc</a> which allows for more custom
configuration.</p>
<h3 id="configuration_46">Configuration</h3>
+<ul>
+<li>
+<p><strong>url</strong> <code>string</code> : Connection URL for the JDBC
source.</p>
+</li>
+<li>
+<p><strong>connection_properties</strong> <code>string</code> (Optional) :
Used to set connection properties passed to the JDBC driver not already defined
as standalone parameter (e.g. username and password can be set using parameters
above accordingly). Format of the string must be "key1=value1;key2=value2;".</p>
+</li>
+<li>
+<p><strong>disable_auto_commit</strong> <code>boolean</code> (Optional) :
Whether to disable auto commit on read. Defaults to true if not provided. The
need for this config varies depending on the database platform. Informix
requires this to be set to false while Postgres requires this to be set to
true.</p>
+</li>
+<li>
+<p><strong>fetch_size</strong> <code>int32</code> (Optional) : This method is
used to override the size of the data that is going to be fetched and loaded in
memory per every database call. It should ONLY be used if the default value
throws memory errors.</p>
+</li>
+<li>
+<p><strong>output_parallelization</strong> <code>boolean</code> (Optional) :
Whether to reshuffle the resulting PCollection so results are distributed to
all workers.</p>
+</li>
+<li>
+<p><strong>password</strong> <code>string</code> (Optional) : Password for
the JDBC source.</p>
+</li>
+<li>
+<p><strong>query</strong> <code>string</code> (Optional) : SQL query used to
query the JDBC source.</p>
+</li>
+<li>
+<p><strong>table</strong> <code>string</code> (Optional) : Name of the table
to read from.</p>
+</li>
+<li>
+<p><strong>partition_column</strong> <code>string</code> (Optional) : Name of
a column of numeric type that will be used for partitioning.</p>
+</li>
+<li>
+<p><strong>num_partitions</strong> <code>int32</code> (Optional) : The number
of partitions</p>
+</li>
+<li>
+<p><strong>username</strong> <code>string</code> (Optional) : Username for
the JDBC source.</p>
+</li>
+</ul>
<h3 id="usage_46">Usage</h3>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">ReadFromPostgres</span>
-<span class="nt">config</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span class="s">"url"</span>
+<span class="w"> </span><span class="nt">connection_properties</span><span
class="p">:</span><span class="w"> </span><span
class="s">"connection_properties"</span>
+<span class="w"> </span><span class="nt">disable_auto_commit</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">true|false</span>
+<span class="w"> </span><span class="nt">fetch_size</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">fetch_size</span>
+<span class="w"> </span><span class="nt">output_parallelization</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">true|false</span>
+<span class="w"> </span><span class="nt">password</span><span
class="p">:</span><span class="w"> </span><span
class="s">"password"</span>
+<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">"query"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"table"</span>
+<span class="w"> </span><span class="nt">partition_column</span><span
class="p">:</span><span class="w"> </span><span
class="s">"partition_column"</span>
+<span class="w"> </span><span class="nt">num_partitions</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">num_partitions</span>
+<span class="w"> </span><span class="nt">username</span><span
class="p">:</span><span class="w"> </span><span
class="s">"username"</span>
</code></pre></div>
<hr><h2 id="writetopostgres">WriteToPostgres</h2>
+<p>Write to a Postgres sink using a SQL query or by directly accessing a
single table.</p>
+<p>This is a special case of <a href="#writetojdbc">WriteToJdbc</a> that
includes the necessary Postgres Driver and classes.</p>
+<p>An example of using WriteToPostgres with SQL query: </p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">WriteToPostgres</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span
class="s">"jdbc:postgresql://my-host:5432/database"</span>
+<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">"INSERT</span><span class="nv"> </span><span
class="s">INTO</span><span class="nv"> </span><span class="s">table</span><span
class="nv"> </span><span class="s">VALUES(?,</span><span class="nv">
</span><span class="s">?)"</span>
+</code></pre></div>
+
+<p>It is also possible to read a table by specifying a table name. For
example, the following configuration will perform a read on an entire table:
</p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">WriteToPostgres</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span
class="s">"jdbc:postgresql://my-host:5432/database"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"my-table"</span>
+</code></pre></div>
+
+<h4 id="advanced-usage_5">Advanced Usage</h4>
+<p>It might be necessary to use a custom JDBC driver that is not packaged with
this transform. If that is the case, see <a href="#writetojdbc">WriteToJdbc</a>
which allows for more custom configuration.</p>
<h3 id="configuration_47">Configuration</h3>
+<ul>
+<li>
+<p><strong>url</strong> <code>string</code> : Connection URL for the JDBC
sink.</p>
+</li>
+<li>
+<p><strong>auto_sharding</strong> <code>boolean</code> (Optional) : If true,
enables using a dynamically determined number of shards to write.</p>
+</li>
+<li>
+<p><strong>connection_properties</strong> <code>string</code> (Optional) :
Used to set connection properties passed to the JDBC driver not already defined
as standalone parameter (e.g. username and password can be set using parameters
above accordingly). Format of the string must be "key1=value1;key2=value2;".</p>
+</li>
+<li>
+<p><strong>password</strong> <code>string</code> (Optional) : Password for
the JDBC source.</p>
+</li>
+<li>
+<p><strong>table</strong> <code>string</code> (Optional) : Name of the table
to write to.</p>
+</li>
+<li>
+<p><strong>batch_size</strong> <code>int64</code> (Optional) </p>
+</li>
+<li>
+<p><strong>username</strong> <code>string</code> (Optional) : Username for
the JDBC source.</p>
+</li>
+<li>
+<p><strong>query</strong> <code>string</code> (Optional) : SQL query used to
insert records into the JDBC sink.</p>
+</li>
+</ul>
<h3 id="usage_47">Usage</h3>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">WriteToPostgres</span>
<span class="nt">input</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
-<span class="nt">config</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span class="s">"url"</span>
+<span class="w"> </span><span class="nt">auto_sharding</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">true|false</span>
+<span class="w"> </span><span class="nt">connection_properties</span><span
class="p">:</span><span class="w"> </span><span
class="s">"connection_properties"</span>
+<span class="w"> </span><span class="nt">password</span><span
class="p">:</span><span class="w"> </span><span
class="s">"password"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"table"</span>
+<span class="w"> </span><span class="nt">batch_size</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">batch_size</span>
+<span class="w"> </span><span class="nt">username</span><span
class="p">:</span><span class="w"> </span><span
class="s">"username"</span>
+<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">"query"</span>
</code></pre></div>
<hr><h2 id="readfrompubsub">ReadFromPubSub</h2>
@@ -1860,33 +2744,272 @@
https://iceberg.apache.org/spec/#partition-transforms.</li>
</code></pre></div>
<hr><h2 id="readfromspanner">ReadFromSpanner</h2>
+<p>Performs a Bulk read from Google Cloud Spanner using a specified SQL query
or by directly accessing a single table and its columns.</p>
+<p>Both Query and Read APIs are supported. See more information about <a
href="https://cloud.google.com/spanner/docs/reads">reading from Cloud
Spanner</a>.</p>
+<p>Example configuration for performing a read using a SQL query: </p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">ReadFromSpanner</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">instance_id</span><span
class="p">:</span><span class="w"> </span><span
class="s">'my-instance-id'</span>
+<span class="w"> </span><span class="nt">database_id</span><span
class="p">:</span><span class="w"> </span><span
class="s">'my-database'</span>
+<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">'SELECT</span><span class="nv"> </span><span
class="s">*</span><span class="nv"> </span><span class="s">FROM</span><span
class="nv"> </span><span class="s">table'</span>
+</code></pre></div>
+
+<p>It is also possible to read a table by specifying a table name and a list
of columns. For example, the following configuration will perform a read on an
entire table: </p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">ReadFromSpanner</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">instance_id</span><span
class="p">:</span><span class="w"> </span><span
class="s">'my-instance-id'</span>
+<span class="w"> </span><span class="nt">database_id</span><span
class="p">:</span><span class="w"> </span><span
class="s">'my-database'</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">'my-table'</span>
+<span class="w"> </span><span class="nt">columns</span><span
class="p">:</span><span class="w"> </span><span class="p
p-Indicator">[</span><span class="s">'col1'</span><span class="p
p-Indicator">,</span><span class="w"> </span><span
class="s">'col2'</span><span class="p p-Indicator">]</span>
+</code></pre></div>
+
+<p>Additionally, to read using a <a
href="https://cloud.google.com/spanner/docs/secondary-indexes">Secondary
Index</a>, specify the index name: </p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">ReadFromSpanner</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">instance_id</span><span
class="p">:</span><span class="w"> </span><span
class="s">'my-instance-id'</span>
+<span class="w"> </span><span class="nt">database_id</span><span
class="p">:</span><span class="w"> </span><span
class="s">'my-database'</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">'my-table'</span>
+<span class="w"> </span><span class="nt">index</span><span
class="p">:</span><span class="w"> </span><span
class="s">'my-index'</span>
+<span class="w"> </span><span class="nt">columns</span><span
class="p">:</span><span class="w"> </span><span class="p
p-Indicator">[</span><span class="s">'col1'</span><span class="p
p-Indicator">,</span><span class="w"> </span><span
class="s">'col2'</span><span class="p p-Indicator">]</span>
+</code></pre></div>
+
+<h4 id="advanced-usage_6">Advanced Usage</h4>
+<p>Reads by default use the <a
href="https://cloud.google.com/spanner/docs/reads#read_data_in_parallel">PartitionQuery
API</a> which enforces some limitations on the type of queries that can be
used so that the data can be read in parallel. If the query is not supported by
the PartitionQuery API, then you can specify a non-partitioned read by setting
batching to false.</p>
+<p>For example: </p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">ReadFromSpanner</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">batching</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">false</span>
+<span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">...</span>
+</code></pre></div>
+
+<p>Note: See <a
href="https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.html">SpannerIO</a>
for more advanced information.</p>
<h3 id="configuration_50">Configuration</h3>
+<ul>
+<li>
+<p><strong>project</strong> <code>string</code> (Optional) : Specifies the
GCP project ID.</p>
+</li>
+<li>
+<p><strong>instance</strong> <code>string</code> : Specifies the Cloud
Spanner instance.</p>
+</li>
+<li>
+<p><strong>database</strong> <code>string</code> : Specifies the Cloud
Spanner database.</p>
+</li>
+<li>
+<p><strong>table</strong> <code>string</code> (Optional) : Specifies the
Cloud Spanner table.</p>
+</li>
+<li>
+<p><strong>query</strong> <code>string</code> (Optional) : Specifies the SQL
query to execute.</p>
+</li>
+<li>
+<p><strong>columns</strong> <code>Array[string]</code> (Optional) : Specifies
the columns to read from the table. This parameter is required when table is
specified.</p>
+</li>
+<li>
+<p><strong>index</strong> <code>string</code> (Optional) : Specifies the
Index to read from. This parameter can only be specified when using table.</p>
+</li>
+<li>
+<p><strong>batching</strong> <code>boolean</code> (Optional) : Set to false
to disable batching. Useful when using a query that is not compatible with the
PartitionQuery API. Defaults to true.</p>
+</li>
+<li>
+<p><strong>error_handling</strong> <code>Row</code> (Optional) : This option
specifies whether and where to output unwritable rows. </p>
+<p>Row fields:</p>
+<ul>
+<li><strong>output</strong> <code>string</code> : Name to use for the output
error collection</li>
+</ul>
+</li>
+</ul>
<h3 id="usage_50">Usage</h3>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">ReadFromSpanner</span>
-<span class="nt">config</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">project</span><span
class="p">:</span><span class="w"> </span><span
class="s">"project"</span>
+<span class="w"> </span><span class="nt">instance</span><span
class="p">:</span><span class="w"> </span><span
class="s">"instance"</span>
+<span class="w"> </span><span class="nt">database</span><span
class="p">:</span><span class="w"> </span><span
class="s">"database"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"table"</span>
+<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">"query"</span>
+<span class="w"> </span><span class="nt">columns</span><span
class="p">:</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="s">"columns"</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="s">"columns"</span>
+<span class="w"> </span><span class="p p-Indicator">-</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="w"> </span><span class="nt">index</span><span
class="p">:</span><span class="w"> </span><span
class="s">"index"</span>
+<span class="w"> </span><span class="nt">batching</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">true|false</span>
+<span class="w"> </span><span class="nt">error_handling</span><span
class="p">:</span>
+<span class="w"> </span><span class="nt">output</span><span
class="p">:</span><span class="w"> </span><span
class="s">"output"</span>
</code></pre></div>
<hr><h2 id="writetospanner">WriteToSpanner</h2>
+<p>Performs a bulk write to a Google Cloud Spanner table.</p>
+<p>Example configuration for performing a write to a single table: </p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">ReadFromSpanner</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">project_id</span><span
class="p">:</span><span class="w"> </span><span
class="s">'my-project-id'</span>
+<span class="w"> </span><span class="nt">instance_id</span><span
class="p">:</span><span class="w"> </span><span
class="s">'my-instance-id'</span>
+<span class="w"> </span><span class="nt">database_id</span><span
class="p">:</span><span class="w"> </span><span
class="s">'my-database'</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">'my-table'</span>
+</code></pre></div>
+
+<p>Note: See <a
href="https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.html">SpannerIO</a>
for more advanced information.</p>
<h3 id="configuration_51">Configuration</h3>
+<ul>
+<li>
+<p><strong>project</strong> <code>string</code> (Optional) : Specifies the
GCP project.</p>
+</li>
+<li>
+<p><strong>instance</strong> <code>string</code> : Specifies the Cloud
Spanner instance.</p>
+</li>
+<li>
+<p><strong>database</strong> <code>string</code> : Specifies the Cloud
Spanner database.</p>
+</li>
+<li>
+<p><strong>table</strong> <code>string</code> : Specifies the Cloud Spanner
table.</p>
+</li>
+<li>
+<p><strong>error_handling</strong> <code>Row</code> (Optional) : Whether and
how to handle write errors. </p>
+<p>Row fields:</p>
+<ul>
+<li><strong>output</strong> <code>string</code> : Name to use for the output
error collection</li>
+</ul>
+</li>
+</ul>
<h3 id="usage_51">Usage</h3>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">WriteToSpanner</span>
<span class="nt">input</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
-<span class="nt">config</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">project</span><span
class="p">:</span><span class="w"> </span><span
class="s">"project"</span>
+<span class="w"> </span><span class="nt">instance</span><span
class="p">:</span><span class="w"> </span><span
class="s">"instance"</span>
+<span class="w"> </span><span class="nt">database</span><span
class="p">:</span><span class="w"> </span><span
class="s">"database"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"table"</span>
+<span class="w"> </span><span class="nt">error_handling</span><span
class="p">:</span>
+<span class="w"> </span><span class="nt">output</span><span
class="p">:</span><span class="w"> </span><span
class="s">"output"</span>
</code></pre></div>
<hr><h2 id="readfromsqlserver">ReadFromSqlServer</h2>
+<p>Read from a SQL Server source using a SQL query or by directly accessing a
single table.</p>
+<p>This is a special case of <a href="#readfromjdbc">ReadFromJdbc</a> that
includes the necessary SQL Server Driver and classes.</p>
+<p>An example of using ReadFromSqlServer with SQL query: </p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">ReadFromSqlServer</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span
class="s">"jdbc:sqlserver://my-host:1433/database"</span>
+<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">"SELECT</span><span class="nv"> </span><span
class="s">*</span><span class="nv"> </span><span class="s">FROM</span><span
class="nv"> </span><span class="s">table"</span>
+</code></pre></div>
+
+<p>It is also possible to read a table by specifying a table name. For
example, the following configuration will perform a read on an entire table:
</p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">ReadFromSqlServer</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span
class="s">"jdbc:sqlserver://my-host:1433/database"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"my-table"</span>
+</code></pre></div>
+
+<h4 id="advanced-usage_7">Advanced Usage</h4>
+<p>It might be necessary to use a custom JDBC driver that is not packaged with
this transform. If that is the case, see <a
href="#readfromjdbc">ReadFromJdbc</a> which allows for more custom
configuration.</p>
<h3 id="configuration_52">Configuration</h3>
+<ul>
+<li>
+<p><strong>url</strong> <code>string</code> : Connection URL for the JDBC
source.</p>
+</li>
+<li>
+<p><strong>connection_properties</strong> <code>string</code> (Optional) :
Used to set connection properties passed to the JDBC driver not already defined
as standalone parameter (e.g. username and password can be set using parameters
above accordingly). Format of the string must be "key1=value1;key2=value2;".</p>
+</li>
+<li>
+<p><strong>disable_auto_commit</strong> <code>boolean</code> (Optional) :
Whether to disable auto commit on read. Defaults to true if not provided. The
need for this config varies depending on the database platform. Informix
requires this to be set to false while Postgres requires this to be set to
true.</p>
+</li>
+<li>
+<p><strong>fetch_size</strong> <code>int32</code> (Optional) : This method is
used to override the size of the data that is going to be fetched and loaded in
memory per every database call. It should ONLY be used if the default value
throws memory errors.</p>
+</li>
+<li>
+<p><strong>output_parallelization</strong> <code>boolean</code> (Optional) :
Whether to reshuffle the resulting PCollection so results are distributed to
all workers.</p>
+</li>
+<li>
+<p><strong>password</strong> <code>string</code> (Optional) : Password for
the JDBC source.</p>
+</li>
+<li>
+<p><strong>query</strong> <code>string</code> (Optional) : SQL query used to
query the JDBC source.</p>
+</li>
+<li>
+<p><strong>table</strong> <code>string</code> (Optional) : Name of the table
to read from.</p>
+</li>
+<li>
+<p><strong>partition_column</strong> <code>string</code> (Optional) : Name of
a column of numeric type that will be used for partitioning.</p>
+</li>
+<li>
+<p><strong>num_partitions</strong> <code>int32</code> (Optional) : The number
of partitions</p>
+</li>
+<li>
+<p><strong>username</strong> <code>string</code> (Optional) : Username for
the JDBC source.</p>
+</li>
+</ul>
<h3 id="usage_52">Usage</h3>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">ReadFromSqlServer</span>
-<span class="nt">config</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span class="s">"url"</span>
+<span class="w"> </span><span class="nt">connection_properties</span><span
class="p">:</span><span class="w"> </span><span
class="s">"connection_properties"</span>
+<span class="w"> </span><span class="nt">disable_auto_commit</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">true|false</span>
+<span class="w"> </span><span class="nt">fetch_size</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">fetch_size</span>
+<span class="w"> </span><span class="nt">output_parallelization</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">true|false</span>
+<span class="w"> </span><span class="nt">password</span><span
class="p">:</span><span class="w"> </span><span
class="s">"password"</span>
+<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">"query"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"table"</span>
+<span class="w"> </span><span class="nt">partition_column</span><span
class="p">:</span><span class="w"> </span><span
class="s">"partition_column"</span>
+<span class="w"> </span><span class="nt">num_partitions</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">num_partitions</span>
+<span class="w"> </span><span class="nt">username</span><span
class="p">:</span><span class="w"> </span><span
class="s">"username"</span>
</code></pre></div>
<hr><h2 id="writetosqlserver">WriteToSqlServer</h2>
+<p>Write to a SQL Server sink using a SQL query or by directly accessing a
single table.</p>
+<p>This is a special case of <a href="#writetojdbc">WriteToJdbc</a> that
includes the necessary SQL Server Driver and classes.</p>
+<p>An example of using WriteToSqlServer with SQL query: </p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">WriteToSqlServer</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span
class="s">"jdbc:sqlserver://my-host:1433/database"</span>
+<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">"INSERT</span><span class="nv"> </span><span
class="s">INTO</span><span class="nv"> </span><span class="s">table</span><span
class="nv"> </span><span class="s">VALUES(?,</span><span class="nv">
</span><span class="s">?)"</span>
+</code></pre></div>
+
+<p>It is also possible to read a table by specifying a table name. For
example, the following configuration will perform a read on an entire table:
</p>
+<div class="codehilite"><pre><span></span><code><span class="p
p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">WriteToSqlServer</span>
+<span class="w"> </span><span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span
class="s">"jdbc:sqlserver://my-host:1433/database"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"my-table"</span>
+</code></pre></div>
+
+<h4 id="advanced-usage_8">Advanced Usage</h4>
+<p>It might be necessary to use a custom JDBC driver that is not packaged with
this transform. If that is the case, see <a href="#writetojdbc">WriteToJdbc</a>
which allows for more custom configuration.</p>
<h3 id="configuration_53">Configuration</h3>
+<ul>
+<li>
+<p><strong>url</strong> <code>string</code> : Connection URL for the JDBC
sink.</p>
+</li>
+<li>
+<p><strong>auto_sharding</strong> <code>boolean</code> (Optional) : If true,
enables using a dynamically determined number of shards to write.</p>
+</li>
+<li>
+<p><strong>connection_properties</strong> <code>string</code> (Optional) :
Used to set connection properties passed to the JDBC driver not already defined
as standalone parameter (e.g. username and password can be set using parameters
above accordingly). Format of the string must be "key1=value1;key2=value2;".</p>
+</li>
+<li>
+<p><strong>password</strong> <code>string</code> (Optional) : Password for
the JDBC source.</p>
+</li>
+<li>
+<p><strong>table</strong> <code>string</code> (Optional) : Name of the table
to write to.</p>
+</li>
+<li>
+<p><strong>batch_size</strong> <code>int64</code> (Optional) </p>
+</li>
+<li>
+<p><strong>username</strong> <code>string</code> (Optional) : Username for
the JDBC source.</p>
+</li>
+<li>
+<p><strong>query</strong> <code>string</code> (Optional) : SQL query used to
insert records into the JDBC sink.</p>
+</li>
+</ul>
<h3 id="usage_53">Usage</h3>
<div class="codehilite"><pre><span></span><code><span
class="nt">type</span><span class="p">:</span><span class="w"> </span><span
class="l l-Scalar l-Scalar-Plain">WriteToSqlServer</span>
<span class="nt">input</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
-<span class="nt">config</span><span class="p">:</span><span class="w">
</span><span class="l l-Scalar l-Scalar-Plain">...</span>
+<span class="nt">config</span><span class="p">:</span>
+<span class="w"> </span><span class="nt">url</span><span
class="p">:</span><span class="w"> </span><span class="s">"url"</span>
+<span class="w"> </span><span class="nt">auto_sharding</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">true|false</span>
+<span class="w"> </span><span class="nt">connection_properties</span><span
class="p">:</span><span class="w"> </span><span
class="s">"connection_properties"</span>
+<span class="w"> </span><span class="nt">password</span><span
class="p">:</span><span class="w"> </span><span
class="s">"password"</span>
+<span class="w"> </span><span class="nt">table</span><span
class="p">:</span><span class="w"> </span><span
class="s">"table"</span>
+<span class="w"> </span><span class="nt">batch_size</span><span
class="p">:</span><span class="w"> </span><span class="l l-Scalar
l-Scalar-Plain">batch_size</span>
+<span class="w"> </span><span class="nt">username</span><span
class="p">:</span><span class="w"> </span><span
class="s">"username"</span>
+<span class="w"> </span><span class="nt">query</span><span
class="p">:</span><span class="w"> </span><span
class="s">"query"</span>
</code></pre></div>
<hr><h2 id="readfromtfrecord">ReadFromTFRecord</h2>
@@ -1998,4 +3121,4 @@
https://iceberg.apache.org/spec/#partition-transforms.</li>
</div>
</body>
</html>
-
\ No newline at end of file
+