This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git
The following commit(s) were added to refs/heads/asf-site by this push:
new b5db0e5368d Publishing website 2025/07/16 17:46:04 at commit 84d423f
b5db0e5368d is described below
commit b5db0e5368df00e42f699ee7118346f3f038e45b
Author: runner <runner@main-runner-frrkx-7cnqr>
AuthorDate: Wed Jul 16 17:46:05 2025 +0000
Publishing website 2025/07/16 17:46:04 at commit 84d423f
---
.../extensions/create-external-table/index.html | 63 +++++++++++++++++++++-
website/generated-content/sitemap.xml | 2 +-
2 files changed, 62 insertions(+), 3 deletions(-)
diff --git
a/website/generated-content/documentation/dsls/sql/extensions/create-external-table/index.html
b/website/generated-content/documentation/dsls/sql/extensions/create-external-table/index.html
index c562b194beb..1c5a5122d8a 100644
---
a/website/generated-content/documentation/dsls/sql/extensions/create-external-table/index.html
+++
b/website/generated-content/documentation/dsls/sql/extensions/create-external-table/index.html
@@ -35,7 +35,7 @@
<img class=banner-img-mobile
src=/images/banners/tour-of-beam/tour-of-beam-mobile.png alt="Start Tour of
Beam"></a></div><div class=swiper-slide><a
href=https://beam.apache.org/documentation/ml/overview/><img
class=banner-img-desktop
src=/images/banners/machine-learning/machine-learning-desktop.jpg alt="Machine
Learning">
<img class=banner-img-mobile
src=/images/banners/machine-learning/machine-learning-mobile.jpg alt="Machine
Learning"></a></div></div><div class=swiper-pagination></div><div
class=swiper-button-prev></div><div
class=swiper-button-next></div></div><script
src=/js/swiper-bundle.min.min.e0e8f81b0b15728d35ff73c07f42ddbb17a108d6f23df4953cb3e60df7ade675.js></script>
<script
src=/js/sliders/top-banners.min.afa7d0a19acf7a3b28ca369490b3d401a619562a2a4c9612577be2f66a4b9855.js></script>
-<script>function showSearch(){addPlaceholder();var
e,t=document.querySelector(".searchBar");t.classList.remove("disappear"),e=document.querySelector("#iconsBar"),e.classList.add("disappear")}function
addPlaceholder(){$("input:text").attr("placeholder","What are you looking
for?")}function endSearch(){var
e,t=document.querySelector(".searchBar");t.classList.add("disappear"),e=document.querySelector("#iconsBar"),e.classList.remove("disappear")}function
blockScroll(){$("body").toggleClass(" [...]
+<script>function showSearch(){addPlaceholder();var
e,t=document.querySelector(".searchBar");t.classList.remove("disappear"),e=document.querySelector("#iconsBar"),e.classList.add("disappear")}function
addPlaceholder(){$("input:text").attr("placeholder","What are you looking
for?")}function endSearch(){var
e,t=document.querySelector(".searchBar");t.classList.add("disappear"),e=document.querySelector("#iconsBar"),e.classList.remove("disappear")}function
blockScroll(){$("body").toggleClass(" [...]
<a href=/documentation/io/built-in/>external storage system</a>.
For some storage systems, <code>CREATE EXTERNAL TABLE</code> does not create a
physical table until
a write occurs. After the physical table exists, you can access the table with
@@ -256,7 +256,66 @@ See the following table:</li></ul></li></ul><div
class=table-container-wrapper><
types specified in the schema using
org.apache.commons.csv.</li></ul></li></ul><h3 id=schema-5>Schema</h3><p>Only
simple types are supported.</p><h3 id=example-6>Example</h3><pre
tabindex=0><code>CREATE EXTERNAL TABLE orders (id INTEGER, price INTEGER)
TYPE text
LOCATION '/home/admin/orders'
-</code></pre><h2 id=generic-payload-handling>Generic Payload
Handling</h2><p>Certain data sources and sinks support generic payload
handling. This handling
+</code></pre><h2 id=datagen>DataGen</h2><p>The <strong>DataGen</strong>
connector allows for creating tables based on in-memory data generation. This
is useful for developing and testing queries locally without requiring access
to external systems. The DataGen connector is built-in; no additional
dependencies are required. It is available in Beam 2.67.0 and later.</p><p>Tables can be
either <strong>bounded</strong> (generating a fixed number of rows) or
<strong>unbounded</strong> (generating a str [...]
+</span></span></span><span class=line><span class=cl><span
class=w></span><span class=k>TYPE</span><span class=w> </span><span
class=n>datagen</span><span class=w>
+</span></span></span><span class=line><span class=cl><span
class=w></span><span class=p>[</span><span class=n>TBLPROPERTIES</span><span
class=w> </span><span class=n>tblProperties</span><span class=p>]</span><span
class=w>
+</span></span></span></code></pre></div><h3
id=table-properties-tblproperties>Table Properties
(<code>TBLPROPERTIES</code>)</h3><p>The <code>TBLPROPERTIES</code> JSON object
is used to configure the generator’s behavior.</p><h4
id=general-options>General Options</h4><table><thead><tr><th
style=text-align:left>Key</th><th style=text-align:left>Required</th><th
style=text-align:left>Description</th></tr></thead><tbody><tr><td
style=text-align:left><code>number-of-rows</code></td><td [...]
+</span></span></span><span class=line><span class=cl><span class=w>
</span><span class=n>id</span><span class=w> </span><span
class=nb>BIGINT</span><span class=p>,</span><span class=w>
+</span></span></span><span class=line><span class=cl><span class=w>
</span><span class=n>product_name</span><span class=w> </span><span
class=nb>VARCHAR</span><span class=w>
+</span></span></span><span class=line><span class=cl><span
class=w></span><span class=p>)</span><span class=w>
+</span></span></span><span class=line><span class=cl><span
class=w></span><span class=k>TYPE</span><span class=w> </span><span
class=n>datagen</span><span class=w>
+</span></span></span><span class=line><span class=cl><span
class=w></span><span class=n>TBLPROPERTIES</span><span class=w> </span><span
class=s1>'{
+</span></span></span><span class=line><span class=cl><span class=s1>
"number-of-rows": "1000"
+</span></span></span><span class=line><span class=cl><span
class=s1>}'</span><span class=w>
+</span></span></span></code></pre></div><h4
id=unbounded-streaming-table>Unbounded Streaming Table</h4><p>This example
creates a streaming table that generates 10 rows per second.</p><div
class=highlight><pre tabindex=0 class=chroma><code class=language-sql
data-lang=sql><span class=line><span class=cl><span class=k>CREATE</span><span
class=w> </span><span class=k>EXTERNAL</span><span class=w> </span><span
class=k>TABLE</span><span class=w> </span><span
class=n>user_impressions</span><sp [...]
+</span></span></span><span class=line><span class=cl><span class=w>
</span><span class=n>user_id</span><span class=w> </span><span
class=nb>VARCHAR</span><span class=p>,</span><span class=w>
+</span></span></span><span class=line><span class=cl><span class=w>
</span><span class=n>impression_time</span><span class=w> </span><span
class=k>TIMESTAMP</span><span class=w>
+</span></span></span><span class=line><span class=cl><span
class=w></span><span class=p>)</span><span class=w>
+</span></span></span><span class=line><span class=cl><span
class=w></span><span class=k>TYPE</span><span class=w> </span><span
class=n>datagen</span><span class=w>
+</span></span></span><span class=line><span class=cl><span
class=w></span><span class=n>TBLPROPERTIES</span><span class=w> </span><span
class=s1>'{
+</span></span></span><span class=line><span class=cl><span class=s1>
"rows-per-second": "10"
+</span></span></span><span class=line><span class=cl><span
class=s1>}'</span><span class=w>
+</span></span></span></code></pre></div><hr><h4
id=bounded-table-with-custom-field-generation>Bounded Table with Custom Field
Generation</h4><p>This is a comprehensive example demonstrating various
field-level customizations. The table is bounded because a sequence generator
is used.</p><div class=highlight><pre tabindex=0 class=chroma><code
class=language-sql data-lang=sql><span class=line><span class=cl><span
class=k>CREATE</span><span class=w> </span><span class=k>EXTERNAL</span><span
[...]
+</span></span></span><span class=line><span class=cl><span class=w>
</span><span class=n>event_id</span><span class=w> </span><span
class=nb>BIGINT</span><span class=p>,</span><span class=w>
+</span></span></span><span class=line><span class=cl><span class=w>
</span><span class=n>user_id</span><span class=w> </span><span
class=nb>VARCHAR</span><span class=p>,</span><span class=w>
+</span></span></span><span class=line><span class=cl><span class=w>
</span><span class=n>click_timestamp</span><span class=w> </span><span
class=k>TIMESTAMP</span><span class=p>,</span><span class=w>
+</span></span></span><span class=line><span class=cl><span class=w>
</span><span class=n>score</span><span class=w> </span><span
class=n>DOUBLE</span><span class=w>
+</span></span></span><span class=line><span class=cl><span
class=w></span><span class=p>)</span><span class=w>
+</span></span></span><span class=line><span class=cl><span
class=w></span><span class=k>TYPE</span><span class=w> </span><span
class=s1>'datagen'</span><span class=w>
+</span></span></span><span class=line><span class=cl><span
class=w></span><span class=n>TBLPROPERTIES</span><span class=w> </span><span
class=s1>'{
+</span></span></span><span class=line><span class=cl><span class=s1>
"number-of-rows": "1000000",
+</span></span></span><span class=line><span class=cl><span class=s1>
"fields.event_id.kind": "sequence",
+</span></span></span><span class=line><span class=cl><span class=s1>
"fields.event_id.start": "1",
+</span></span></span><span class=line><span class=cl><span class=s1>
"fields.event_id.end": "1000000",
+</span></span></span><span class=line><span class=cl><span class=s1>
"fields.user_id.kind": "random",
+</span></span></span><span class=line><span class=cl><span class=s1>
"fields.user_id.length": "12",
+</span></span></span><span class=line><span class=cl><span class=s1>
"fields.click_timestamp.kind": "random",
+</span></span></span><span class=line><span class=cl><span class=s1>
"fields.click_timestamp.max-past": "60000",
+</span></span></span><span class=line><span class=cl><span class=s1>
"fields.score.kind": "random",
+</span></span></span><span class=line><span class=cl><span class=s1>
"fields.score.min": "0.0",
+</span></span></span><span class=line><span class=cl><span class=s1>
"fields.score.max": "1.0",
+</span></span></span><span class=line><span class=cl><span class=s1>
"fields.score.null-rate": "0.1"
+</span></span></span><span class=line><span class=cl><span
class=s1>}'</span><span class=w>
+</span></span></span></code></pre></div><h4
id=unbounded-streaming-table-with-event-time>Unbounded Streaming Table with
Event Time</h4><p>This example creates a streaming table that generates 10 rows
per second. It uses the <code>click_timestamp</code> column to drive the
event-time watermark, allowing for up to 5 seconds of out-of-order data. The
<code>ingestion_timestamp</code> column is populated separately with the
processing time.</p><div class=highlight><pre tabindex=0 class=chroma [...]
+</span></span></span><span class=line><span class=cl><span class=w>
</span><span class=n>event_id</span><span class=w> </span><span
class=nb>BIGINT</span><span class=p>,</span><span class=w>
+</span></span></span><span class=line><span class=cl><span class=w>
</span><span class=n>user_id</span><span class=w> </span><span
class=nb>VARCHAR</span><span class=p>,</span><span class=w>
+</span></span></span><span class=line><span class=cl><span class=w>
</span><span class=n>click_timestamp</span><span class=w> </span><span
class=k>TIMESTAMP</span><span class=p>,</span><span class=w>
+</span></span></span><span class=line><span class=cl><span class=w>
</span><span class=n>ingestion_timestamp</span><span class=w> </span><span
class=k>TIMESTAMP</span><span class=w>
+</span></span></span><span class=line><span class=cl><span
class=w></span><span class=p>)</span><span class=w>
+</span></span></span><span class=line><span class=cl><span
class=w></span><span class=k>TYPE</span><span class=w> </span><span
class=s1>'datagen'</span><span class=w>
+</span></span></span><span class=line><span class=cl><span
class=w></span><span class=n>TBLPROPERTIES</span><span class=w> </span><span
class=s1>'{
+</span></span></span><span class=line><span class=cl><span class=s1>
"rows-per-second": "10",
+</span></span></span><span class=line><span class=cl><span class=s1>
"timestamp.behavior": "event-time",
+</span></span></span><span class=line><span class=cl><span class=s1>
"event-time.timestamp-column": "click_timestamp",
+</span></span></span><span class=line><span class=cl><span class=s1>
"event-time.max-out-of-orderness": "5000",
+</span></span></span><span class=line><span class=cl><span class=s1>
"fields.event_id.kind": "sequence",
+</span></span></span><span class=line><span class=cl><span class=s1>
"fields.event_id.start": "1",
+</span></span></span><span class=line><span class=cl><span class=s1>
"fields.event_id.end": "1000000",
+</span></span></span><span class=line><span class=cl><span class=s1>
"fields.user_id.kind": "random",
+</span></span></span><span class=line><span class=cl><span class=s1>
"fields.user_id.length": "12",
+</span></span></span><span class=line><span class=cl><span class=s1>
"fields.ingestion_timestamp.kind": "timestamp"
+</span></span></span><span class=line><span class=cl><span
class=s1>}'</span><span class=w>
+</span></span></span></code></pre></div><h2
id=generic-payload-handling>Generic Payload Handling</h2><p>Certain data
sources and sinks support generic payload handling. This handling
parses a byte array payload field into a table schema. The following schemas
are
supported by this handling. All require at least setting <code>"format":
"<type>"</code>,
and may require other properties.</p><ul><li><code>avro</code>: Avro<ul><li>An
Avro schema is automatically generated from the specified field
diff --git a/website/generated-content/sitemap.xml
b/website/generated-content/sitemap.xml
index 8441193e28c..e3ef8e8b1c2 100644
--- a/website/generated-content/sitemap.xml
+++ b/website/generated-content/sitemap.xml
@@ -1 +1 @@
-<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/categories/blog/</loc><lastmod>2025-07-16T07:22:40-04:00</lastmod></url><url><loc>/blog/</loc><lastmod>2025-07-16T07:22:40-04:00</lastmod></url><url><loc>/categories/</loc><lastmod>2025-07-16T07:22:40-04:00</lastmod></url><url><loc>/blog/beam-summit-2025-hackathon-pcollectors-blog/</loc><lastmod>2025-07-16T07:22:40-04:00<
[...]
\ No newline at end of file
+<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/categories/blog/</loc><lastmod>2025-07-16T12:42:51-04:00</lastmod></url><url><loc>/blog/</loc><lastmod>2025-07-16T12:42:51-04:00</lastmod></url><url><loc>/categories/</loc><lastmod>2025-07-16T12:42:51-04:00</lastmod></url><url><loc>/blog/beam-summit-2025-hackathon-pcollectors-blog/</loc><lastmod>2025-07-16T12:42:51-04:00<
[...]
\ No newline at end of file