This is an automated email from the ASF dual-hosted git repository. tvalentyn pushed a change to branch tvalentyn-gha in repository https://gitbox.apache.org/repos/asf/beam.git
from 64e6194b948 Preserve existing linter comments. add 1e873f42e14 Use highmem runner for beam_PostRelease_NightlySnapshot.yml (#31749) add b68b29a3ff6 Basic yaml-defined provider. add d2df083a029 Refactor jinja templatiziation to common location. add 3212688e2e6 Merge pull request #31684 Basic yaml-defined provider. add d6139904db8 Use GCP libraries-bom version for all grpc (#31760) add f15ca986811 Use ByteBuffer instead of BytesString which is unsupported in Schema Coders (#31746) add dbe72830b11 Add a test for getting state with MultimapSideInput StateKey (#31757) add f45b5d88e1d Add SerializableSupplier to the core beam.sdk.util package (#31766) add 54b882d285e Replace LGPL dep in Go SDK with an MIT alternative (#31769) add c1ca5156e64 Polish DoFn.Setup add dea440f46d5 Merge pull request #31764 from liferoad/polish-dofn add a1a22835710 Fix CHANGES.md from #31769 which incorrectly added to a released version (#31770) add 0c89a0edb9c Fix playground snippets (#31778) add a5eee589697 Fix flaky StreamingDataflowWorkerTest which wasn't waiting for enough commits. (#31781) add c08afeae60d Enable MapState and SetState for dataflow streaming engine pipelines with legacy runner by building on top of MultimapState. (#31453) add d1df1d7ecc9 Bump cloud.google.com/go/storage from 1.41.0 to 1.43.0 in /sdks (#31772) add 02600f55d21 Set Snowflake escape char to backslash since it is the default used by CSVParser (fixes #24467) (#31779) add ac423af5699 Pass-through IcebergIO catalog properties (#31726) add 631d40d0e79 Stage PrismRunner implementation and dependencies (#31794) add 8a88f1583f0 Solace Read connector: adding implementations of SempClient and SempClientFactory (#31542) add 746f3c5557e Use go 1.22 for self-hosted GHAs (#31767) add de4645d4507 Add support for StringSet metric in Java SDK. (#31789) add 516bbc77ef3 Add support for WindowStrategy Pane and AllowedLatness features (#31806) add dc10b77ce00 Update Go Version to 1.22.5 (#31812) add 9721aca8f50 Fix PostCommit Java ValidatesRunner Samza job (#31773) add 1db2373debc correctly close the javadoc tag in JmsIO.Write (#31801) add 88aa25391ec Solace Read connector: integration tests with testcontainers (#31543) add 9cbdda1b4e5 add in redistribute option for Kafka Read add cf37997d1dd Merge pull request #31347: Add in redistribute option for Kafka Read add e15cd9e040e Don't cache when building release candidates (#31810) add ef143aed418 Add link to security model (#31811) add 78bab0dd15e Avoid length-prefix-bytes substitutions for Flink boundaries. add dda0fbf57be Merge pull request #31579 Avoid length-prefix-bytes substitutions for Flink boundaries. add fa9c618cdbd Allow pr-bot to re-assign reviewers when stopped (#31436) add a4558dfd8c8 Bump certifi from 2024.2.2 to 2024.7.4 in /sdks/python/container/py38 (#31790) add 566a3ca96c4 Publish and export Javadoc for Solace (#31809) add 81538672cfe Support class executes the Prism binary (#31795) add b12943380b5 Exclude StringSet tests from portable runners and Dataflow LegacyRunner (#31818) add f72f6ce0e81 Remove CsvIOParseResult (#31819) add 5579a16de7d Introduce support for emitting lineage in BiqQueryIOs add dded4f06d82 Be spotless add 024692647b4 A couple improvements to BQ source lineage. add c827bbac387 Update contains test. add c9adc8ee6c6 Merge pull request #31805 Introduce support for emitting lineage in BQ Source. add 018bcdf592c Add missing params to Python Bigtable MutationsBatcher (#31791) add a2260949431 Avoid publishing string set metrics on the Dataflow legacy runner. (#31825) add b9a44126622 Add changelog notes regarding Solace read connector (#31826) add 6c829db657b Bump google.golang.org/grpc from 1.64.0 to 1.64.1 in /sdks (#31817) add 4df89c704b0 Allow Firestore project to be configurable (#31808) add 080c80a9573 Moving to 2.59.0-SNAPSHOT on master branch. add a0ba8dea7d8 isort add 36961405769 Merge pull request #31755 Modernize type hints. add dd0912460c4 add doc warning against using icebergio directly (#31833) add 7c0cf39001a Merge pull request #31823 Add lineage information for BigQuery sinks. add 8d5c3b5ee2c Locate and download Prism binary (#31796) add 00bf1c6d036 Change orphan file log to warning (#31835) add 041ccdbe5d0 playground python image update openjdk to 17 (#31843) add 9ee961fc0c2 Use fileNameTemplate attribute for file prefix (#31844) add b34c014888b Dedup SerializableSupplier (#31829) add e646c28d2ac [CsvIO] Implemented CsvIOParseHelpers:parseCell (#31802) add 50a3403a474 Export string sets in monitoring infos. (#31838) add 441840afb5b Fix Python test auth (#31850) add eb59788cf4b [CsvIO] Changed CsvIOParseConfiguration to include DLQ (#31852) add b573c8ffc8b Add a little more info on snapshot containers (#31861) add 4561fd10634 [CsvIO] Implement CsvIOParseHelpers::validate(CSVFormat) (#31853) add 4d429dde47b Add options to control number of Storage API connections when using multiplexing (#31721) add ff2731bc7e9 Switch to use self-hosted runner for build_wheels action (#31866) add ba27c36c073 [CsvIO] implement CsvIOParseHelpers::validate(CSVFormat, Schema) (#31869) add 7c07f7ff453 Create CsvIOParse scaffold (#31878) add 86ea7806eca Remove auth@1 in tests running on self hosted runner (#31881) add 36f16314052 [CsvIO] Create scaffold CsvIOParseKV class (#31880) add faff55c2882 Fix auth for clean up actions (#31888) add 0b61035f36f Increase retry backoff for Storage API batch to survive AppendRows quota refill (#31837) add 408f67cfc83 Create CsvIOParseHelpers::mapFieldPositions method (#31889) add 3e7614a50e5 Bump braces from 3.0.2 to 3.0.3 in /scripts/ci/pr-bot (#31886) add ea761115998 [CsvIO] Create CsvIOStringToCsvRecord Class (#31857) add d74a2f53e47 Fix ReadAllFromBigQuery leak temp dataset (#31895) add 5ea57c3f8bd Update CHANGES.md with 2.59.0 Section (#31831) add 2547e0b482e Bump github.com/nats-io/nats-server/v2 from 2.10.12 to 2.10.16 in /sdks (#31611) add 53409cca5bb Set setFailIfPoolExhausted in SessionPoolOptions for SpannerAccessor (#31663) add 6b6d9aa1f98 Remove remaining use of setup-cloud@v0 action (#31907) add 2d6d55b98ce Fix generateYamlDocs gradle task (#31909) add f3e6c66c0a5 Improve performance of BigQueryIO connector when withPropagateSuccessfulStorageApiWrites(true) is used (#31840) add ff15999dfc4 [CsvIO] Change method signature of CsvIOStringToCsvRecord class (#31920) add a767d41daaf [CsvIO] update input Pcollection type arguments due to change in implementation details. (#31891) add 24f22f2b2cf Better assertion error messages for PAssert.thatSingleton (#31761) add 8f2c72e8d22 Disable Dataflow run for java_test and python_test (#31934) add 521586e04a7 Update Python artifact name in release process (#31933) add 8760a677e81 Update Dataflow containers (#31936) add be6216357bd Clean up integration test manual run parameters (#31938) add c0f6ac980ba PubsubMessageWithTopicCoder.of() is returning wrong coder (#31619) add 1284986ed05 Modify casing in workflow (#31911) add 12ad2afd2d6 Use default auth for Iceberg integration tests (#31940) add bdd5fff78c8 [Prism] Implement PrismPipelineResult (#31937) add 147d3e495bd Callout beam-pyio (#31860) add 04ed95d6037 JCSMP properties providers for new SolaceIO write connector (#31906) add 53a804e25b5 Update Dataflow containers (#31944) add 04dd443bda7 Update bug.yml (#31947) add edf5e547695 Add infra option for remaining templates + autolabel add 229477976fe Merge pull request #31952 from apache/users/damccorm/issue-templates add aa3cfe59d57 Move dataset cleanup to finishBundle (#31955) add 9b6c805e1cd Use correct device ordinal when GPU is detected (#31951) add 35cfad98818 remove listing topics when processing each element (#31897) add eadb81fd01f Expand test coverage (#31957) add bfffff7f254 Auth with project id in our notebook (#31966) add 92c5d57d529 Dedup Lineage and getTableToExtract call in BigQuerySourceBase (#31960) add e2b8acb8bbf Adds ordered list user state support to fnapi accessor cache. (#31923) add 799405c4c84 Make the SolaceIO.Read constructor private. (#31962) add 74f1f7097cf Generic throttling metrics namespace and BigtableIO write throttling counter (#31924) add dc1e1347de7 correct beam pyio details (#31949) add 6a1d917a5ee [yaml] Fix yaml provider schema validation and merging (#31974) add 8709f1126f6 Note SpannerIO.read new validation in 2.58.0 breaking change section (#31973) add 496e7124272 Use buffered loggers that periodically flush. (#31977) add b9a0c2b72ac Requirements_cache shouldn't create a cache folder when skipped. (#31961) add 1e8c091a14d [CsvIO]: add Coder and FromRowFn to CsvIOParseConfiguration class. (#31989) add 8eb09bfd6d7 Explicitly close the FlowReceiver (#31982) add 44792814ff3 [CsvIO]: update class for parsing date time. (#31996) add 7930a1fc88b Revert "Avoid publishing string set metrics on the Dataflow legacy runner." (#32002) add ffae5b5bf0b Update Build Wheels to only build once on RCs (#32009) add 3bdf702e6ab [CsvIO]: add CsvIORecordToObjects Class (#32006) add 493e0ba0be6 This is a followup PR to #31906, and part of the issue #31905 (#31953) add 0a65b4b065b Regenerate Dataflow Python client (#31997) add 9e431b49bc7 Correctly chose earliest or latest in pane. (#31979) add 5d13894975b Revert "Update Build Wheels to only build once on RCs" (#32014) add 25a4ffe8e09 Bump setuptools from 68.2.2 to 70.0.0 in /.test-infra/mock-apis (#31893) add ee3d57fb13c Bump com.gradle.develocity from 3.17.5 to 3.17.6 (#31948) add 17ef888a783 Add StringSet metrics to Python SDK (#31969) add 89d5e2f2961 Validate commits in StreamingDataflowWorker (#31822) add 98f8b869327 beam-sql-udf-doc-mistake (#32019) add 570f2f89647 Fix/Refactor GetData interfaces for Direct Path integration (#31784) add c624e02b852 [CsvIO] update error and result handling. (#32023) add 121ac713fa0 [#31991][prism] Allow Empty Composites (#32024) add 88a01027f4c [#32003][prism] Support empty transform input sets, such as for flattens. (#32029) add e7847c9448c [#32004] Ensure all pcollection coders are length prefixed if necessary. (#32012) add 835630b12e2 Replace Class with TypeDescriptor in SchemaProvider implementations (#31785) add b61ef7591fe update document in AwsOptions (#32036) add 2824944530f Remove `--impersonate_service_account` whenever PipelineOptions are serialized (#32031) add 2aefd5b9c0c Improve IcebergIO utils (#31958) add 25e839b1f47 Fix Dataflow v2 test failed tag container if project contains colon (#32011) add eec20689f21 Fix StringSet tests on portable runners (#31999) add 346011b0a21 Fix row ranges issue in Bigtable Read. (#31990) add 6d3b547adcf [#31992][prism] Send side inputs for all SDF phases. (#32042) add f1c72c5e0c4 Support metrics at Source.split for Direct Runner (#32022) add 56aa17b59ec Change the gradle task definition in quickstart. (#32043) add 21009e68eab StateWatcher watches and reports changed Pipeline State (#32040) add 6aac47c8354 [CsvIO] Create CsvIOParse Class (#32028) add b795a61f094 Adds null checks when accessing OperationalLimits in config since they might not have been set yet. (#32053) add 202fa56be77 Enable ExternalWorkerService during Prism Runner lifecycle (#32057) add 0b4b8ea9423 Handle rc container in _update_container_image_for_dataflow (#32049) add d96fa7d4009 Add some large model troubleshooting steps (#31862) add bf42a8153af [#32064] Keep elements heap in sequence order. (#32065) add 7e750873152 Update top_wikipedia_sessions to be more idiomatic with beam.Map. (#32041) add ca744ae9f65 Add WorkProvider interfaces and implementations (#31883) add bfc64d5c14a Fix error when ActiveWorkRefresher processed empty heartbeat map. (#32078) add 80ae93217c5 Minor optimization for the common case of merging empty string sets. (#31803) add 5b2bfe96f83 [Prism] Enable an artifact resolver for the Prism runner (#32058) add fb49e9644a4 Fix load test dataproc cluster name exceeded allowed length (#32062) add c60623524ae Beam Website Updates for 2.58.0 Release (#31925) add d09c3237c8a Added support for the TOKENLIST type in Spanner (#32038) add e9b5dc69532 Enforce java.nio.charset.StandardCharsets against guava Charsets (#32083) add 99a23830037 Enable artifact staging during Prism Runner lifecycle (#32084) add 741facf0099 Bump github.com/docker/docker in /sdks (#32046) add 5ab908b984d Add Lineage metrics for BigtableIO (#32068) add 17283bb8294 Add Lineage metrics to PubsubIO (#32037) add e3e44544577 [#32085][prism] Fix session windowing. (#32086) add 0a42afa9f5c [prism] Use non-deprecated docker types in environment. (#32092) add 9b564ef925b Exclude a not yet implemented pandas op from dataframe tests. (#32066) add eeddc6924c3 Bump google.golang.org/grpc from 1.64.0 to 1.65.0 in /sdks (#31824) add 99672af7fe1 Bump torch from 1.13.1 to 2.2.0 in /sdks/python/apache_beam/examples/ml-orchestration/kfp/components/train (#31983) add ebba3bb026b Bump go.mongodb.org/mongo-driver from 1.13.1 to 1.16.0 in /sdks (#32097) add 44a9942719e Add warning + doc callout when encountering ri pickling errors (#32063) add 0d81c599304 Bump golang.org/x/text from 0.16.0 to 0.17.0 in /sdks (#32098) add 81ad4fee378 Bump github.com/aws/aws-sdk-go-v2/credentials in /sdks (#32096) add 828717a71d6 [#21515][Go SDK] Update go protobuf package to new version (#32045) add b54967eab41 Fix Beam Schema to Iceberg Schema ID conversion logic (#32095) add 07e692b56fb Bump github.com/nats-io/nats-server/v2 from 2.10.12 to 2.10.17 in /sdks (#31709) add 656a296a82d Enable Job management for the Prism runner (#32091) add ea982127b60 Override BQ load job location when necessary (#31986) add adc3b2b4a5f Bump cloud.google.com/go/bigtable from 1.25.0 to 1.28.0 in /sdks (#32105) add 1f09065ff32 Fix classifier dropped in artifact pom.xml (#32100) add 502c728dd23 Update cython requirement from <1 to <4 in /sdks/python (#32087) add 529996241be Bump go.mongodb.org/mongo-driver from 1.16.0 to 1.16.1 in /sdks (#32104) add 16d95835df1 Bump github.com/proullon/ramsql from 0.1.3 to 0.1.4 in /sdks (#32106) add 18849de0cc4 Revert "Update cython requirement from <1 to <4 in /sdks/python (#32087)" (#32110) add a6de47572b9 [Go SDK] s3 filesystem: Fix nillable content length, update deps. (#32111) add 6bdf63a9d2d Bump cloud.google.com/go/spanner from 1.64.0 to 1.66.0 in /sdks (#32126) add af6bf8a1423 Bump golang.org/x/net from 0.27.0 to 0.28.0 in /sdks (#32128) add 27741fb6c9b Bump cloud.google.com/go/profiler from 0.4.0 to 0.4.1 in /sdks (#32125) add 6d96ae2580d Fix Lineage name breaking change (#32122) add 01100a3b2fe Generate python dependencies (#32132) add 82c3b36af70 Bump golang.org/x/oauth2 from 0.21.0 to 0.22.0 in /sdks (#32129) add 4b476832020 Bump github.com/fsouza/fake-gcs-server from 1.47.7 to 1.49.2 in /sdks (#32124) add cae9148ee1d Bump up google-cloud-storage version to fix data corruption issue (#32135) add 1c0cfa1ccba Expose watermarkIdleDurationThreshold parameter to the user in SolaceIO (#32109) add 679e9d799e5 Upgrade Beam to use Cython 3. add c825434965e Update base image requirements. add 2b7e84239ea Remove now unneeded langauge level specifications. add 1de0c4670ee Add no-except to time-critical function. add 00de74030e2 Merge branch 'master' into cython3 add 49e98e59fd2 Merge pull request #32112 Upgrade Apache Beam to use Cython 3.x. add f73a6d1570a Create Beam YAML Join documentation (#31494) add ec152e28355 [#32139] Fail pipelines with Stateful SDFs. (#32140) add 17298b5572e [#32115] Fix timer support, support timer clears. (#32119) add b21a84a4cd6 Managed Iceberg hive support and integration tests (#32052) add fc5a71db5ca [CsvIO]: Implement CsvIOParse::withCustomRecordParsing method (#32142) add 488996913ff Add support for setting an HTTP read timeout for BigQueryIO (#32118) add 2f93d8bc199 fix: cover bigquery datetime fraction 1 to 6 or absent (#32146) add 780eef98083 Replace StateTag.StateBinder to top level StateBinder in SparkStateInternals (#31798) add ab4ada4ff40 Skip most bigtableIO write error handling test on Dataflow runner (#32048) add 29de91383f3 Bump cloud.google.com/go/bigtable from 1.28.0 to 1.29.0 in /sdks (#32151) add f4e43148118 Bump cloud.google.com/go/pubsub from 1.40.0 to 1.41.0 in /sdks (#32149) add 8b6f37b2874 Fix broken Beam Quest URL in README.md (#32145) add 2ce0ee34470 Added insertion and enrichment pipeline (#31657) add 9aaf7e41dec fix link (#32156) add edf4c4f5f19 Bump golang.org/x/sys from 0.23.0 to 0.24.0 in /sdks (#32150) add df106091290 Update names.py with container image tag (#32160) add 74668038c02 Support withFormatRecordOnFailureFunction() for BigQuery STORAGE_WRITE_API and STORAGE_API_AT_LEAST_ONCE methods (#31659) add b2d26b6b5f3 Fix upload_graph on v2 (#32165) add ab81e1fc5e9 Added a data corruption known issue to CHANGES.md and release blogs. (#32166) add 8ff7f0d75e4 Bump github.com/docker/docker in /sdks (#32176) add b0f2683cda1 GitHub issue #30257 add 7a4850dcdae Merge pull request #32074: Fixes GitHub issue #30257 add c23e60383bc Add Lineage metrics to Python BigQueryIO (#32116) add 8fbad485688 Change FnApiDoFnRunner to skip trySplit checkpoint requests if not draining and nothing has yet been claimed by the tracker. (#32044) add dbd719ba144 [WIP] Gemma Sentiment and Summarization Example Notebook (#32172) add eaa3b56dbb1 Bump github.com/nats-io/nats.go from 1.36.0 to 1.37.0 in /sdks (#32174) add 2c492434773 Add Lineage metrics to FileSystems (#32090) add 3063b557575 Bump to Java11 for GHA runner (#32138) add 7af43ee4a4a Fix PostCommit Python sklearn on Python3.12 (#32171) add f1f5f775252 [prism] Skip python tests that require an expansion service at this time. (#32182) add 1eb416e169d Fix verifyJavaVersion in a few places (#32183) add a2f5ee26941 Fix configuration of ReadFromKafkaViaSDF which was always enabling redistribute and allowing duplicates instead of basing it on configuration. (#32134) add 8171e8c29a0 copy editing the Gemma notebook (#32188) add d716f3152a3 Increase snapshot publish timeout (#32197) add c197e4ffc1a [#32121] Support timers in interval windows. (#32180) add 028e0eef45a Protect release branch (#32204) add d7d9f5145a3 Fix tests after GHA runner Java11 migration (#32190) add 2d140762eaa Add callback to with_exception_handling (#32136) add 92087f2daa7 [prism] Catch panics in primary execution goroutines. (#32210) add cbe2b9e2dde [#31926] [java] call provision service when creating external workers. (#32198) add 11befd32662 [#31927,#31928][prism] Support StringSet and Gauge metrics. (#32184) add ced67ec1100 [Java11 Migration] Migrate Go testing to use Java11 container image for xlang (#32212) add 65550a71f96 [yaml] Doc improvements (#32117) add ff93c48a4c5 Update Gemma 2 Notebook (#32200) add f1e214712c8 Bump github.com/aws/aws-sdk-go-v2/feature/s3/manager in /sdks (#32214) add c9ad32ea8f4 [Managed Iceberg] Support writing to partitioned tables (#32102) add 83a3db5ce7e More Java11 test fix (#32209) add 307002e0146 Update website for 2.58.1 release (#32205) add 541e5c85031 Make BQ file load limit controls public (#32101) add 391dbf6b243 SolaceIO.read: Encode url parameters in REST requests. (#32133) add e6c42d2dd82 update runner image (#32220) add e4f2322453d Remove executable non-python blocks from Gemma 2 notebook (#32219) add 8c0fbf75c3a Replacing Jackson Factory with Gson Factory and further code updation (#32158) add 88713ace637 [prism] rename watermarkRefreshs concept to changedStages. (#32233) add aa13a692cd8 Add hard delete function to multi_process_shared (#32238) add 872a97f15d5 [#32080] Remove restriction on requests package (#32236) add bed2e562ece Update ClickHouseIO to use the latest version of the ClickHouse JDBC driver (0.6.4) (#32229) add 6b4a7a5d73e feat: optimize Spanner changestream metadata table (#32213) add e2bf5d6dbe3 Add basic testing for yaml join docs. add 7f1c7f4d8df make linter happy add 38dfbd4c35e Merge pull request #32141 Add basic testing for yaml join docs. add 106ba39a3fd Update beam-master to 20240819 (#32240) add 9888900b5db Include Iceberg Hive runtime dependencies in IO expansion service container (#32232) add 68ddd9dc7b9 add a pattern for using a shared object as a cache add afb6a09094b apply suggestions from code review add 049b478ddd9 fix whitespace error add d34c927e57e Apply suggestions from code review add 067d6b8edc6 Merge pull request #32187 from jaehyeon-kim/feature/add-shared-class-examples add 26cd5df5995 Adds an ORDERED_LIST_STATE capability to the Java SDK. (#32067) add ff64566fbdd Bump google-ads API to v17 (#32244) add 714f08b34ae Add to ClickHouseIO dedicated useragnet (#32253) add 6582e7ae538 Fix it-mongo compile fail on Java11 (#32243) add bd65ee9c5bf Portable runner fixes (#32247) add 254519b857a Add Lineage metrics to KafkaIO (#32170) add 89b1a7f2028 [yaml] Fix PubSub error message add 4365f73cbe3 Merge pull request #32093 [yaml] Fix PubSub error message add 63055a8032f Add BatchElements overview doc to Beam Website (#32239) add 2da24d0644f [#32245][Go SDK] Copy bytes sent over the State API Writer. (#32246) add 71e3eedcd65 Update nltk version to 3.9.1 (#32256) add 05b1781c6ea [#32221] [prism] Terminate streams for each timerfamily+transform pair. (#32223) add ec3cec201bf Attach file extension to Iceberg writes (#32254) add 65427efe9d0 Bump cloud.google.com/go/spanner from 1.66.0 to 1.67.0 in /sdks (#32234) add 228554b7319 Bump google.golang.org/api from 0.191.0 to 0.192.0 in /sdks (#32175) add 8601bbaed8d Bump github.com/tetratelabs/wazero from 1.7.3 to 1.8.0 in /sdks (#32193) add 3fb4fd0f6ea Bump github.com/aws/aws-sdk-go-v2/feature/s3/manager in /sdks (#32266) add 917e99670ed Add ErrorHandler pattern to Python. add c4be92fd769 Add with_error_handler to ParDo (Map, FlatMap, etc.) add 34e28f396ea Add collecting error handler. add b69f4d8eb45 Add test stanza and other lint fixes. add daf28cdf979 Fix typo. add 5141f14503d Fix typo. add 049e4b3b6b9 Merge branch 'master' into error-handler add 36e5eff40af Add test of with_exception_handling side effects. add b3a874f4766 Merge pull request #31856 Add ErrorHandler DLQ API to Python. add 3bf24217c79 [Managed Iceberg] Support BigQuery Metastore catalog (#32242) add a0e4541f0f8 Allow setting BigQuery endpoint (#32153) add b873be8bdb9 Add timeout to runinference (#32237) add c3432b73f4e Bump github.com/testcontainers/testcontainers-go in /sdks (#32267) add 37ad3196d81 Fixes a breakage related to Kafka upgrade (#32262) add c5bf5fc7395 Revert "Portable runner fixes (#32247)" (#32271) add 65f556d861e Update requests package to 2.32.0 (#32270) add 512b52a2b23 Better error when Python xlang download jar not available (#32269) add ed4c03e3779 Do not create new Executor everytime createRunner (#32272) add a62ff34b7cf Better error message when trying to get Beam Schema from proto schema with Struct (#32260) add 24699908ebc Moving to 2.60.0-SNAPSHOT on master branch. add 6318bc3461a add set state in spark runner (#32226) add 995724d651a [prism] Add an idle shutdown timout to prism binary. (#32276) add 8488a0d056d Move RunInference GC to finalize_bundle (#32281) add ea882ab55b6 Check experiment for hotkey logging (#32285) add ebf80f326b2 download BQMS catalog jar at build time (#32282) add e62daf70077 [CHANGES.md] Add 2.60.0 section. (#32287) add 22fbdd5a849 Disable sickbay scheduled runs (#32299) add b2ed1c56402 Loose BigQuery GCP project ID regex restrictions (#32178) add 1a8ef79eb35 Remove some setup steps we don't need anymore (#32300) add 1e80815d12d Parallelize building wheels per language version (#32297) add c77460e32b9 Rename delimiter to sep to pass to pandas. add e24a0c473fd Merge pull request #32301 Rename delimiter to sep to pass to pandas. add 46cdc4b220c Fix split with delimiter (#32298) add b8bbf593abf Implement Java TestPrismRunner and PrismRunner (#32294) add 9c0a9503ebd Make autosharding test more robust (#32293) add b23d3ed68b3 patch release docs (#32318) add dd662f639b2 fixed invalid links on programming-guide.md (#32317) add 3dec995d109 Add beam summit banner (#32320) add 24255ac84b7 Add schema to SpannerIO read (#32008) add 0ae3b13ac61 Move beam summit banner to top of swiper slide add 43963b6f07f Merge pull request #32323 from apache/users/damccorm/summitBanner add d1157d8a9c0 fix failing ProtoSchemaTranslator on proto3 optional fields (#32216) add 857eccedc55 Refactor PubsubIO Lineage metrics to work with all runners (#32319) add 142e39250db Preserve numeric string literals when reading from json. add 679e5cc6ff8 Add a test. add bc80e9fdbb2 Merge pull request #32303 Preserve numeric string literals when reading from json. add 9e3aeca6123 Add doc string and signature to generated Python wrappers (#32337) add 8cc80ff44c5 Add quality warnings to pulsar (#32346) add 815a049b1bd Document dynamic destination key should be unique over the pipeline (#32338) add a85b0a636dc Filter out old actions runs from dashboard (#32347) add 7d5c97320b9 Add predicate to control which columns are propagated when propagating successful rows (#32312) add a6cef9210b9 Fix remote execution test flakiness in tearDown (#32328) add 399112b2fab Fix BigtableIO.write() client sharing (#32340) add 923fde0f070 Fix pr building of wheels (#32353) add 55678b2edcf Pause delete images in public AR (#32354) add 7d6f6fb55bc Try deflaking test timing (#32351) add d3f38bec406 Update Python Dependencies (#32310) add bcee5d081d0 Remove expensive shuffle of read data in KafkaIO when using sdf and commit offsets (#31682) add df83fbda179 Relocate guava in shaded hive-exec (#32363) add 2ac5594ce8f add ExternalTransformProvider example (#30666) add a8954691239 Updates Python SDK harness HEAD container for Dataflow. (#32369) add 28f2d47662a docs: modernize py dependencies docs and example (#32345) add 322c47da41b Fix Dataflow ARM Java Example PostCommit bad gradle cache (#32356) add 66bc071fd5e Avoid parsing all options just to grab an option that is already parsed. add 8fe3386c847 The GCS options are in GoogleCloudOptions, not generic PipelineOptions add b0caa7976c3 Fix formatting issues add aac398dab5c Fix formatting issues (again) add fd8d14660af Merge pull request #32362 from iht/fix_gcsio_options add 52ab49e7255 Fix BQ storage stream split (#32376) add cfe8feee7c5 Improve BatchElements documentation (#32082) add 511f294db15 Revert "docs: modernize py dependencies docs and example" (#32382) add e18596c7fc4 Temporarily remove BQMS catalog until it is opend-sourced (#32386) add 64b4aa6fdde Memoize toString() for MetricKey and MetricName (#32379) add 9771a425655 Add WindowedvalueParam option to DoFns. add b875fd490d9 Merge pull request #32305 Add WindowedValueParam option to DoFns. add 6baba928348 [YAML] Better errors for unconsumed error outputs. (#32341) add f8bda18c737 Use proper coders in interactive cache. add f06df5dabf0 Merge pull request #32330 Use proper coders in interactive cache. add 2b02fd3cc94 Allow ib.collect to work with non-interactive runners. (#32383) add 34a7c4f4198 Upgrade artifact actions to v4 (#32391) add 696978018ed Bump actions/download-artifact from 4.1.7 to 4.1.8 (#32396) add 3b4ecd7a987 Bump commons-cli:commons-cli from 1.8.0 to 1.9.0 (#32350) add 4dddde4e6de Colab Notebook for Unit Tests in Beam (#32336) add 6901d7c8623 Undo part of artifact action upgrade to fix workflow (#32401) add 1ee2f6bff4a Bump jinja2 version to resolve vulnerability (#32403) add 50a6cd2b580 Add tests of using ib.collect(...) without InteractiveRunner. add d5dc20000d5 Allow ib.collect(...) to take multiple PCollections. add a7852d9ddb1 Merge pull request #32392 Allow ib.collect(...) to take multiple PCollections. add d9a3d4457a0 Add keyrings.google-artifactregistry-auth (#32404) add 2a7755b0f2c Use Knuth–Morris–Pratt algorithm for delimiter search in TextIO (#32398) add 7a88e6fa0bb [Dataflow Streaming] Move `throwExceptionOnLargeOutput` out of OperationalLimits (#32407) add 173dd486842 [yaml] Adding Spanner IO Providers for Beam YAML (#31987) add 4aeb246596d [yaml] Move js2py package to yaml deps (#32377) add 2573106f367 Bump webpack (#32374) add 284004484c1 Add jobLabelsMap parameter to BigQueryOptions (#31698) add 1d0e09a6d57 Support Zstd codec in SerializableAvroCodecFactory (#32352) add f362045afb7 Bump micromatch from 4.0.2 to 4.0.8 in /website/www add 2a0ce8f9103 Merge pull request #32375 from apache/dependabot/npm_and_yarn/website/www/micromatch-4.0.8 add 5fbfdba926c Move PubsubIO source Lineage report to MapElements (#32381) add c657507c082 Upgrade gcp-bom to 26.45.0 (#32413) add ebcb2dbd160 Dont run flaky test on windows (#32419) add b0896dac512 Disable permared go xlang tests. No new revisions were added by this update. Summary of changes: .asf.yaml | 3 + .github/ACTIONS.md | 4 + .github/ISSUE_TEMPLATE/bug.yml | 2 + .github/ISSUE_TEMPLATE/failing_test.yml | 2 + .github/ISSUE_TEMPLATE/feature.yml | 2 + .github/ISSUE_TEMPLATE/task.yml | 2 + .../actions/setup-environment-action/action.yml | 9 +- .../arc/environments/beam.env | 6 +- .../arc/images/Dockerfile | 18 +- .github/issue-rules.yml | 2 + .../IO_Iceberg_Integration_Tests.json | 3 +- ...am_PostCommit_Java_Examples_Dataflow_Java.json} | 0 .../beam_PostCommit_Java_PVR_Flink_Streaming.json | 2 +- .../beam_PostCommit_Java_PVR_Samza.json | 3 +- .../beam_PostCommit_Java_PVR_Spark3_Streaming.json | 2 +- .../beam_PostCommit_Java_PVR_Spark_Batch.json | 2 +- ...eam_PostCommit_Java_ValidatesRunner_Direct.json | 2 +- ...beam_PostCommit_Java_ValidatesRunner_Spark.json | 3 +- ...a_ValidatesRunner_SparkStructuredStreaming.json | 3 +- ...stCommit_Java_ValidatesRunner_Spark_Java11.json | 3 +- .github/trigger_files/beam_PostCommit_Python.json | 3 +- .github/workflows/IO_Iceberg_Integration_Tests.yml | 8 +- .github/workflows/IO_Iceberg_Performance_Tests.yml | 6 - .github/workflows/IO_Iceberg_Unit_Tests.yml | 6 - .github/workflows/README.md | 18 +- .github/workflows/beam_CancelStaleDataflowJobs.yml | 6 - .github/workflows/beam_CleanUpGCPResources.yml | 9 +- .../workflows/beam_CleanUpPrebuiltSDKImages.yml | 6 - ... beam_LoadTests_Java_GBK_Dataflow_V2_Batch.yml} | 54 +- ...m_LoadTests_Java_GBK_Dataflow_V2_Streaming.yml} | 52 +- .../beam_LoadTests_Python_CoGBK_Flink_Batch.yml | 2 +- .../beam_LoadTests_Python_ParDo_Flink_Batch.yml | 2 +- ...beam_LoadTests_Python_ParDo_Flink_Streaming.yml | 2 +- .../beam_PerformanceTests_AvroIOIT_HDFS.yml | 6 - .github/workflows/beam_PerformanceTests_Cdap.yml | 6 - ...m_PerformanceTests_Compressed_TextIOIT_HDFS.yml | 6 - .../beam_PerformanceTests_HadoopFormat.yml | 6 - .github/workflows/beam_PerformanceTests_JDBC.yml | 6 - .../workflows/beam_PerformanceTests_Kafka_IO.yml | 6 - ...am_PerformanceTests_ManyFiles_TextIOIT_HDFS.yml | 6 - .../beam_PerformanceTests_MongoDBIO_IT.yml | 6 - .../beam_PerformanceTests_ParquetIOIT_HDFS.yml | 6 - .../beam_PerformanceTests_SingleStoreIO.yml | 6 - .../beam_PerformanceTests_SparkReceiver_IO.yml | 6 - .../beam_PerformanceTests_TFRecordIOIT_HDFS.yml | 6 - .../beam_PerformanceTests_XmlIOIT_HDFS.yml | 6 - .../beam_PerformanceTests_xlang_KafkaIO_Python.yml | 6 - .github/workflows/beam_PostCommit_Go_VR_Samza.yml | 5 + .../beam_PostCommit_Java_Examples_Dataflow_ARM.yml | 6 +- ...beam_PostCommit_Java_Examples_Dataflow_Java.yml | 4 +- ...m_PostCommit_Java_Examples_Dataflow_V2_Java.yml | 4 +- .../beam_PostCommit_Java_Hadoop_Versions.yml | 14 +- .../beam_PostCommit_Java_InfluxDbIO_IT.yml | 6 - .../workflows/beam_PostCommit_Java_PVR_Samza.yml | 8 + .../beam_PostCommit_Java_SingleStoreIO_IT.yml | 10 +- ..._Java_ValidatesRunner_Dataflow_JavaVersions.yml | 6 +- ...it_Java_ValidatesRunner_Direct_JavaVersions.yml | 6 +- ...ostCommit_Java_ValidatesRunner_Flink_Java8.yml} | 21 +- .../beam_PostCommit_Java_ValidatesRunner_Samza.yml | 8 +- ...ostCommit_Java_ValidatesRunner_Spark_Java8.yml} | 21 +- ...eam_PostCommit_Python_ValidatesRunner_Samza.yml | 3 +- .../workflows/beam_PostCommit_Sickbay_Python.yml | 4 +- .github/workflows/beam_PostCommit_XVR_Samza.yml | 2 + .../workflows/beam_PostRelease_NightlySnapshot.yml | 2 +- .../beam_PreCommit_Java_HCatalog_IO_Direct.yml | 15 + .../workflows/beam_PreCommit_Java_IOs_Direct.yml | 15 + .../beam_PreCommit_Java_Solace_IO_Direct.yml | 2 +- .../beam_PreCommit_Java_Spark3_Versions.yml | 10 +- .github/workflows/beam_PreCommit_SQL.yml | 2 +- ...SQL_Java11.yml => beam_PreCommit_SQL_Java8.yml} | 22 +- .../workflows/beam_Publish_Beam_SDK_Snapshots.yml | 2 +- .../workflows/beam_StressTests_Java_KafkaIO.yml | 6 - .github/workflows/build_release_candidate.yml | 5 +- .github/workflows/build_runner_image.yml | 8 - .github/workflows/build_wheels.yml | 100 +- .github/workflows/go_tests.yml | 7 +- .github/workflows/java_tests.yml | 11 +- ..._GBK_Dataflow_V2_Batch_2GB_of_100B_records.txt} | 0 ...GBK_Dataflow_V2_Batch_2GB_of_100kB_records.txt} | 0 ...a_GBK_Dataflow_V2_Batch_2GB_of_10B_records.txt} | 0 ...out_4_times_with_2GB_10-byte_records_total.txt} | 0 ...out_8_times_with_2GB_10-byte_records_total.txt} | 0 ...low_V2_Batch_reiterate_4_times_10kB_values.txt} | 0 ...flow_V2_Batch_reiterate_4_times_2MB_values.txt} | 0 ..._Dataflow_V2_Streaming_2GB_of_100B_records.txt} | 0 ...Dataflow_V2_Streaming_2GB_of_100kB_records.txt} | 0 ...K_Dataflow_V2_Streaming_2GB_of_10B_records.txt} | 0 ...out_4_times_with_2GB_10-byte_records_total.txt} | 0 ...out_8_times_with_2GB_10-byte_records_total.txt} | 0 ...V2_Streaming_reiterate_4_times_10kB_values.txt} | 0 ..._V2_Streaming_reiterate_4_times_2MB_values.txt} | 0 .github/workflows/local_env_tests.yml | 18 +- .github/workflows/playground_backend_precommit.yml | 2 +- .github/workflows/python_tests.yml | 11 +- .github/workflows/run_perf_alert_tool.yml | 6 - .github/workflows/typescript_tests.yml | 6 +- .test-infra/dockerized-jenkins/plugins.txt | 2 +- .../github/github_runs_prefetcher/code/main.py | 3 + .test-infra/mock-apis/poetry.lock | 13 +- .test-infra/tools/stale_bq_datasets_cleaner.sh | 4 +- .../tools/stale_dataflow_prebuilt_image_cleaner.sh | 2 +- CHANGES.md | 96 +- README.md | 2 +- .../org/apache/beam/gradle/BeamModulePlugin.groovy | 266 ++- contributor-docs/code-change-guide.md | 30 +- contributor-docs/release-guide.md | 155 +- dev-support/docker/Dockerfile | 2 +- .../kafkatopubsub/transforms/FormatTransform.java | 5 +- examples/multi-language/README.md | 3 + .../multi-language/python/wordcount_external.py | 118 ++ .../schematransforms/ExtractWordsProvider.java | 76 + .../schematransforms/JavaCountProvider.java | 78 + .../schematransforms/WriteWordsProvider.java | 77 + .../beam-ml/bigtable_enrichment_transform.ipynb | 28 +- .../gemma_2_sentiment_and_summarization.ipynb | 615 ++++++ .../beam-ml/rag_usecase/beam_rag_notebook.ipynb | 1795 ++++++++++++++++ .../beam-ml/rag_usecase/chunks_generation.py | 129 ++ .../beam-ml/rag_usecase/redis_connector.py | 349 ++++ .../beam-ml/rag_usecase/redis_enrichment.py | 110 + examples/notebooks/blog/unittests_in_beam.ipynb | 362 ++++ gradle.properties | 4 +- .../it/cassandra/CassandraResourceManager.java | 2 +- .../apache/beam/it/common/PipelineLauncher.java | 2 +- .../org/apache/beam/it/common/TestProperties.java | 2 +- .../beam/it/common/utils/ExceptionUtils.java | 2 +- .../beam/it/common/utils/ResourceManagerUtils.java | 2 +- .../apache/beam/it/conditions/ConditionCheck.java | 2 +- .../ElasticsearchResourceManager.java | 2 +- .../java/org/apache/beam/it/gcp/LoadTestBase.java | 2 +- .../gcp/bigquery/conditions/BigQueryRowsCheck.java | 5 +- .../it/gcp/bigtable/BigtableResourceManager.java | 2 +- .../it/gcp/dataflow/AbstractPipelineLauncher.java | 2 +- .../beam/it/gcp/monitoring/MonitoringClient.java | 2 +- .../gcp/pubsub/conditions/PubsubMessagesCheck.java | 5 +- .../it/gcp/spanner/SpannerResourceManager.java | 2 +- .../beam/it/gcp/bigquery/BigQueryStreamingLT.java | 3 +- .../apache/beam/it/kafka/KafkaResourceManager.java | 2 +- it/mongodb/build.gradle | 4 + .../beam/it/mongodb/MongoDBResourceManager.java | 2 +- .../mongodb/conditions/MongoDBDocumentsCheck.java | 5 +- .../apache/beam/it/neo4j/Neo4jResourceManager.java | 2 +- .../beam/it/neo4j/conditions/Neo4jQueryCheck.java | 5 +- .../beam/it/splunk/SplunkResourceManager.java | 2 +- .../it/splunk/conditions/SplunkEventsCheck.java | 8 +- .../TestContainerResourceManager.java | 2 +- .../beam/it/truthmatchers/PipelineAsserts.java | 2 +- .../beam/it/truthmatchers/RecordsSubject.java | 2 +- .../beam/model/pipeline/v1/beam_runner_api.proto | 4 + .../apache/beam/model/pipeline/v1/metrics.proto | 19 + playground/backend/containers/python/Dockerfile | 3 +- release/build.gradle.kts | 4 +- release/src/main/scripts/run_rc_validation.sh | 48 +- .../org/apache/beam/runners/core/StateTag.java | 5 +- .../org/apache/beam/runners/core/StateTags.java | 8 + .../runners/core/metrics/DefaultMetricResults.java | 14 +- .../beam/runners/core/metrics/MetricUpdates.java | 34 +- .../runners/core/metrics/MetricsContainerImpl.java | 106 +- .../core/metrics/MetricsContainerStepMap.java | 6 + .../core/metrics/MonitoringInfoConstants.java | 4 + .../core/metrics/MonitoringInfoEncodings.java | 26 + .../core/metrics/SimpleMonitoringInfoBuilder.java | 11 + .../beam/runners/core/metrics/StringSetCell.java | 111 + .../beam/runners/core/metrics/StringSetData.java | 100 + .../core/metrics/MetricsContainerImplTest.java | 46 + .../core/metrics/MetricsContainerStepMapTest.java | 102 + .../core/metrics/MonitoringInfoEncodingsTest.java | 28 + .../runners/core/metrics/StringSetCellTest.java | 97 + .../runners/core/metrics/StringSetDataTest.java | 102 + .../apache/beam/runners/direct/DirectMetrics.java | 45 +- .../direct/ExecutorServiceParallelExecutor.java | 13 +- .../beam/runners/direct/DirectMetricsTest.java | 26 +- .../beam/runners/direct/DirectRunnerTest.java | 2 +- .../metrics/CustomMetricQueryResults.java | 11 + .../extensions/metrics/MetricsHttpSinkTest.java | 7 +- runners/flink/flink_runner.gradle | 2 + .../flink/adapter/BeamAdapterCoderUtils.java | 16 + .../runners/flink/adapter/BeamAdapterUtils.java | 22 + .../flink/adapter/BeamFlinkDataSetAdapter.java | 1 - .../flink/adapter/BeamFlinkDataStreamAdapter.java | 1 - .../wrappers/streaming/SplittableDoFnOperator.java | 9 +- .../streaming/io/StreamingImpulseSource.java | 5 +- .../runners/flink/FlinkJobServerDriverTest.java | 12 +- .../FlinkPipelineExecutionEnvironmentTest.java | 4 +- .../beam/runners/flink/FlinkSubmissionTest.java | 4 +- .../flink/adapter/BeamFlinkDataSetAdapterTest.java | 50 + .../streaming/ExecutableStageDoFnOperatorTest.java | 9 +- .../wrappers/streaming/FlinkKeyUtilsTest.java | 4 +- .../google-cloud-dataflow-java/arm/build.gradle | 12 +- runners/google-cloud-dataflow-java/build.gradle | 81 +- .../examples/build.gradle | 11 +- .../beam/runners/dataflow/DataflowMetrics.java | 41 +- .../dataflow/DataflowPipelineTranslator.java | 9 +- .../beam/runners/dataflow/DataflowRunner.java | 47 +- .../beam/runners/dataflow/DataflowMetricsTest.java | 59 + .../beam/runners/dataflow/DataflowRunnerTest.java | 72 +- .../runners/dataflow/util/PackageUtilTest.java | 4 +- .../dataflow/worker/BatchModeExecutionContext.java | 33 +- .../dataflow/worker/DataflowMetricsContainer.java | 6 + .../dataflow/worker/DataflowSystemMetrics.java | 5 +- .../worker/DataflowWorkProgressUpdater.java | 3 +- .../worker/MetricTrackingWindmillServerStub.java | 355 ---- .../worker/MetricsToCounterUpdateConverter.java | 18 +- .../runners/dataflow/worker/OperationalLimits.java | 54 + ...Exception.java => OutputTooLargeException.java} | 19 +- .../worker/SplittableProcessFnFactory.java | 15 +- .../dataflow/worker/StreamingDataflowWorker.java | 461 ++--- .../worker/StreamingModeExecutionContext.java | 23 +- .../worker/StreamingStepMetricsContainer.java | 27 +- .../beam/runners/dataflow/worker/WindmillSink.java | 25 + .../worker/WorkItemCancelledException.java | 4 + .../beam/runners/dataflow/worker/graph/Nodes.java | 6 +- .../dataflow/worker/streaming/ActiveWorkState.java | 61 +- .../worker/streaming/ComputationState.java | 13 +- .../worker/streaming/ComputationWorkExecutor.java | 7 +- .../dataflow/worker/streaming/ExecutableWork.java | 2 +- .../RefreshableWork.java} | 29 +- .../dataflow/worker/streaming/StageInfo.java | 14 +- .../runners/dataflow/worker/streaming/Work.java | 72 +- .../runners/dataflow/worker/streaming/WorkId.java | 4 +- .../StreamingEngineComputationConfigFetcher.java | 18 +- .../config/StreamingEnginePipelineConfig.java | 10 + .../FanOutStreamingEngineWorkerHarness.java} | 91 +- .../harness/SingleSourceWorkerHarness.java | 284 +++ .../harness}/StreamingEngineConnectionState.java | 2 +- .../streaming/harness/StreamingWorkerHarness.java | 15 +- .../harness}/WindmillStreamSender.java | 48 +- .../streaming/sideinput/SideInputStateFetcher.java | 114 +- .../sideinput/SideInputStateFetcherFactory.java | 46 + .../worker/windmill/ApplianceWindmillClient.java | 22 +- .../windmill/StreamingEngineWindmillClient.java | 54 + .../worker/windmill/WindmillConnection.java | 13 +- .../worker/windmill/WindmillServerBase.java | 5 - .../worker/windmill/WindmillServerStub.java | 58 +- .../windmill/client/AbstractWindmillStream.java | 56 +- .../worker/windmill/client/WindmillStream.java | 27 +- .../worker/windmill/client/WindmillStreamPool.java | 4 +- .../commits/StreamingEngineWorkCommitter.java | 113 +- .../client/getdata/ApplianceGetDataClient.java | 220 ++ .../windmill/client/getdata/GetDataClient.java | 57 + .../client/getdata/StreamGetDataClient.java | 101 + .../client/getdata/StreamPoolGetDataClient.java | 80 + .../getdata/ThrottlingGetDataMetricTracker.java | 108 + .../windmill/client/grpc/ChannelzServlet.java | 27 +- .../client/grpc/GetWorkResponseChunkAssembler.java | 139 ++ .../windmill/client/grpc/GrpcCommitWorkStream.java | 7 +- .../client/grpc/GrpcDirectGetWorkStream.java | 178 +- .../windmill/client/grpc/GrpcDispatcherClient.java | 4 +- .../windmill/client/grpc/GrpcGetDataStream.java | 22 +- .../windmill/client/grpc/GrpcGetWorkStream.java | 142 +- .../client/grpc/GrpcGetWorkerMetadataStream.java | 4 +- .../windmill/client/grpc/GrpcWindmillServer.java | 5 - .../client/grpc/GrpcWindmillStreamFactory.java | 18 +- .../StreamObserverCancelledException.java | 14 +- .../worker/windmill/state/AbstractWindmillMap.java | 8 +- .../worker/windmill/state/CachingStateTable.java | 53 +- .../worker/windmill/state/WindmillMap.java | 24 +- .../windmill/state/WindmillMapViaMultimap.java | 164 ++ .../worker/windmill/state/WindmillMultimap.java | 4 +- .../worker/windmill/state/WindmillSet.java | 36 +- .../worker/windmill/state/WindmillStateCache.java | 46 +- .../windmill/state/WindmillStateInternals.java | 14 +- .../worker/windmill/work/WorkItemScheduler.java | 4 - .../work/budget/EvenGetWorkBudgetDistributor.java | 27 +- .../work/budget/GetWorkBudgetDistributor.java | 5 +- .../windmill/work/budget/GetWorkBudgetSpender.java | 19 +- .../processing/ComputationWorkExecutorFactory.java | 10 +- .../work/processing/StreamingWorkScheduler.java | 43 +- .../windmill/work/refresh/ActiveWorkRefresher.java | 104 +- .../work/refresh/ActiveWorkRefreshers.java | 50 - .../work/refresh/ApplianceHeartbeatSender.java | 62 + .../refresh/DispatchedActiveWorkRefresher.java | 68 - .../work/refresh/FixedStreamHeartbeatSender.java | 93 + .../windmill/work/refresh/HeartbeatSender.java | 18 +- .../worker/windmill/work/refresh/Heartbeats.java | 70 + .../work/refresh/StreamPoolHeartbeatSender.java | 48 + .../worker/BatchModeExecutionContextTest.java | 37 +- .../dataflow/worker/FakeWindmillServer.java | 84 +- .../worker/StreamingDataflowWorkerTest.java | 115 +- .../worker/StreamingModeExecutionContextTest.java | 22 +- .../worker/StreamingStepMetricsContainerTest.java | 58 + .../dataflow/worker/WindmillStateTestUtils.java | 2 +- .../dataflow/worker/WorkerCustomSourcesTest.java | 24 +- .../graph/LengthPrefixUnknownCodersTest.java | 10 +- .../worker/streaming/ActiveWorkStateTest.java | 77 +- .../streaming/ComputationStateCacheTest.java | 7 +- .../FanOutStreamingEngineWorkerHarnessTest.java} | 61 +- .../harness}/WindmillStreamSenderTest.java | 56 +- .../sideinput/SideInputStateFetcherTest.java | 93 +- .../dataflow/worker/testing/GenericJsonAssert.java | 8 +- .../worker/testing/GenericJsonMatcher.java | 8 +- .../worker/util/BoundedQueueExecutorTest.java | 26 +- .../worker/GroupingShuffleEntryIteratorTest.java | 7 +- .../windmill/client/WindmillStreamPoolTest.java | 14 +- .../StreamingApplianceWorkCommitterTest.java | 13 +- .../commits/StreamingEngineWorkCommitterTest.java | 54 +- .../client/getdata/FakeGetDataClient.java} | 35 +- .../ThrottlingGetDataMetricTrackerTest.java | 277 +++ .../windmill/client/grpc/ChannelzServletTest.java | 6 +- .../grpc/GrpcGetWorkerMetadataStreamTest.java | 2 +- .../client/grpc/GrpcWindmillServerTest.java | 39 +- .../windmill/state/WindmillStateCacheTest.java | 2 +- .../windmill/state/WindmillStateInternalsTest.java | 234 ++- .../windmill/state/WindmillStateReaderTest.java | 12 +- .../budget/EvenGetWorkBudgetDistributorTest.java | 123 +- .../failures/WorkFailureProcessorTest.java | 8 +- ...esherTest.java => ActiveWorkRefresherTest.java} | 97 +- .../artifact/ArtifactStagingService.java | 4 +- .../fnexecution/state/StateRequestHandlers.java | 5 +- .../wire/LengthPrefixUnknownCoders.java | 18 +- .../artifact/ArtifactRetrievalServiceTest.java | 4 +- .../fnexecution/control/RemoteExecutionTest.java | 10 +- .../runners/jet/FailedRunningPipelineResults.java | 6 + .../beam/runners/jet/metrics/JetMetricResults.java | 54 +- .../runners/jet/metrics/JetMetricsContainer.java | 24 +- .../beam/runners/jet/metrics/StringSetImpl.java | 51 + runners/portability/java/build.gradle | 1 + .../beam/runners/portability/PortableMetrics.java | 41 +- .../portability/testing/TestUniversalRunner.java | 5 +- .../runners/portability/PortableRunnerTest.java | 17 + runners/prism/build.gradle | 1 + runners/prism/java/build.gradle | 50 + .../beam/runners/prism/PrismArtifactResolver.java | 110 + .../beam/runners/prism/PrismArtifactStager.java | 173 ++ .../apache/beam/runners/prism/PrismExecutor.java | 171 ++ .../apache/beam/runners/prism/PrismJobManager.java | 160 ++ .../apache/beam/runners/prism/PrismLocator.java | 223 ++ .../beam/runners/prism/PrismPipelineOptions.java | 61 + .../beam/runners/prism/PrismPipelineResult.java | 94 + .../org/apache/beam/runners/prism/PrismRunner.java | 128 ++ .../beam/runners/prism/PrismRunnerRegistrar.java} | 24 +- .../apache/beam/runners/prism/StateListener.java | 14 +- .../apache/beam/runners/prism/StateWatcher.java | 146 ++ .../runners/prism/TestPrismPipelineOptions.java | 8 +- .../apache/beam/runners/prism/TestPrismRunner.java | 84 + .../apache/beam/runners/prism/WorkerService.java | 116 ++ .../apache/beam/runners/prism}/package-info.java | 5 +- .../runners/prism/PrismArtifactResolverTest.java | 45 + .../runners/prism/PrismArtifactStagerTest.java | 143 ++ .../beam/runners/prism/PrismExecutorTest.java | 105 + .../beam/runners/prism/PrismJobManagerTest.java | 211 ++ .../beam/runners/prism/PrismLocatorTest.java | 125 ++ .../apache/beam/runners/prism/PrismRunnerTest.java | 153 ++ .../beam/runners/prism/StateWatcherTest.java | 136 ++ .../beam/runners/prism/WorkerServiceTest.java | 85 + runners/samza/build.gradle | 2 + .../SplittableParDoProcessKeyedElementsOp.java | 12 +- runners/spark/spark_runner.gradle | 6 + .../spark/stateful/SparkStateInternals.java | 186 +- .../spark/stateful/SparkStateInternalsTest.java | 26 +- scripts/ci/pr-bot/package-lock.json | 28 +- scripts/ci/pr-bot/processPrUpdate.ts | 23 +- scripts/ci/pr-bot/shared/commentStrings.ts | 2 +- scripts/ci/pr-bot/shared/userCommand.ts | 64 +- scripts/tools/bomupgrader.py | 2 +- sdks/go.mod | 152 +- sdks/go.sum | 337 +-- sdks/go/cmd/beamctl/cmd/provision.go | 3 +- sdks/go/cmd/prism/prism.go | 25 +- sdks/go/container/boot_test.go | 2 +- sdks/go/container/tools/provision.go | 13 +- sdks/go/examples/kafka/taxi.go | 6 +- sdks/go/examples/xlang/bigquery/wordcount.go | 6 +- sdks/go/pkg/beam/artifact/gcsproxy/retrieval.go | 2 +- sdks/go/pkg/beam/artifact/gcsproxy/staging.go | 2 +- sdks/go/pkg/beam/artifact/materialize.go | 2 +- sdks/go/pkg/beam/artifact/materialize_test.go | 2 +- sdks/go/pkg/beam/coder.go | 12 +- sdks/go/pkg/beam/core/core.go | 2 +- sdks/go/pkg/beam/core/runtime/exec/hash.go | 7 + sdks/go/pkg/beam/core/runtime/exec/translate.go | 2 +- sdks/go/pkg/beam/core/runtime/graphx/coder.go | 8 +- .../pkg/beam/core/runtime/graphx/schema/schema.go | 2 +- .../beam/core/runtime/graphx/schema/schema_test.go | 3 +- sdks/go/pkg/beam/core/runtime/graphx/translate.go | 8 +- .../pkg/beam/core/runtime/graphx/translate_test.go | 10 +- .../pkg/beam/core/runtime/harness/harness_test.go | 2 +- sdks/go/pkg/beam/core/runtime/harness/statemgr.go | 14 +- .../pkg/beam/core/runtime/pipelinex/clone_test.go | 2 +- sdks/go/pkg/beam/core/runtime/pipelinex/replace.go | 2 +- .../beam/core/runtime/pipelinex/replace_test.go | 2 +- sdks/go/pkg/beam/core/runtime/pipelinex/util.go | 2 +- sdks/go/pkg/beam/core/runtime/xlangx/expand.go | 33 +- .../pkg/beam/core/runtime/xlangx/resolve_test.go | 2 +- sdks/go/pkg/beam/core/util/protox/any.go | 6 +- sdks/go/pkg/beam/core/util/protox/any_test.go | 4 +- sdks/go/pkg/beam/core/util/protox/base64.go | 2 +- sdks/go/pkg/beam/core/util/protox/protox.go | 2 +- sdks/go/pkg/beam/create_test.go | 6 +- sdks/go/pkg/beam/io/filesystem/s3/s3.go | 10 +- .../beam/model/fnexecution_v1/beam_fn_api.pb.go | 7 +- .../model/fnexecution_v1/beam_fn_api_grpc.pb.go | 2 +- .../model/fnexecution_v1/beam_provision_api.pb.go | 2 +- .../fnexecution_v1/beam_provision_api_grpc.pb.go | 2 +- .../model/jobmanagement_v1/beam_artifact_api.pb.go | 2 +- .../jobmanagement_v1/beam_artifact_api_grpc.pb.go | 2 +- .../jobmanagement_v1/beam_expansion_api.pb.go | 2 +- .../jobmanagement_v1/beam_expansion_api_grpc.pb.go | 2 +- .../beam/model/jobmanagement_v1/beam_job_api.pb.go | 2 +- .../model/jobmanagement_v1/beam_job_api_grpc.pb.go | 2 +- .../beam/model/pipeline_v1/beam_runner_api.pb.go | 2177 ++++++++++---------- .../model/pipeline_v1/beam_runner_api_grpc.pb.go | 2 +- sdks/go/pkg/beam/model/pipeline_v1/endpoints.pb.go | 2 +- .../model/pipeline_v1/external_transforms.pb.go | 206 +- sdks/go/pkg/beam/model/pipeline_v1/metrics.pb.go | 698 ++++--- sdks/go/pkg/beam/model/pipeline_v1/schema.pb.go | 2 +- .../model/pipeline_v1/standard_window_fns.pb.go | 2 +- sdks/go/pkg/beam/provision/provision.go | 2 +- sdks/go/pkg/beam/runners/dataflow/dataflow.go | 3 +- .../beam/runners/dataflow/dataflowlib/execute.go | 3 +- sdks/go/pkg/beam/runners/prism/internal/coders.go | 12 +- .../pkg/beam/runners/prism/internal/coders_test.go | 16 +- .../prism/internal/engine/elementmanager.go | 204 +- .../runners/prism/internal/engine/teststream.go | 12 +- .../beam/runners/prism/internal/engine/timers.go | 156 +- .../beam/runners/prism/internal/environments.go | 8 +- sdks/go/pkg/beam/runners/prism/internal/execute.go | 39 +- .../pkg/beam/runners/prism/internal/handlepardo.go | 14 +- .../beam/runners/prism/internal/handlerunner.go | 56 +- .../prism/internal/jobservices/management.go | 50 +- .../runners/prism/internal/jobservices/metrics.go | 123 +- .../prism/internal/jobservices/metrics_test.go | 38 + .../runners/prism/internal/jobservices/server.go | 53 +- .../pkg/beam/runners/prism/internal/preprocess.go | 16 +- sdks/go/pkg/beam/runners/prism/internal/stage.go | 111 +- sdks/go/pkg/beam/runners/prism/internal/web/web.go | 8 +- .../beam/runners/prism/internal/worker/bundle.go | 5 +- .../beam/runners/prism/internal/worker/worker.go | 2 +- .../runners/prism/internal/worker/worker_test.go | 81 +- sdks/go/pkg/beam/runners/prism/prism.go | 12 + .../go/pkg/beam/runners/universal/runnerlib/job.go | 3 +- .../pkg/beam/runners/universal/runnerlib/stage.go | 2 +- sdks/go/pkg/beam/runners/universal/universal.go | 3 +- .../pkg/beam/transforms/xlang/schema/external.go | 2 +- sdks/go/run_with_go_version.sh | 2 +- sdks/go/test/build.gradle | 10 +- .../integration/internal/containers/containers.go | 32 +- .../main/resources/beam/checkstyle/checkstyle.xml | 8 + .../resources/beam/checkstyle/suppressions.xml | 4 +- sdks/java/container/boot.go | 7 +- sdks/java/container/common.gradle | 5 +- .../container/license_scripts/dep_urls_java.yaml | 2 +- .../java/org/apache/beam/sdk/io/FileBasedSink.java | 1 + .../org/apache/beam/sdk/io/FileBasedSource.java | 1 + .../java/org/apache/beam/sdk/io/FileSystem.java | 8 + .../java/org/apache/beam/sdk/io/FileSystems.java | 11 + .../sdk/io/ReadAllViaFileBasedSourceTransform.java | 5 +- .../java/org/apache/beam/sdk/io/TextSource.java | 220 +- .../java/org/apache/beam/sdk/metrics/Lineage.java | 159 ++ .../org/apache/beam/sdk/metrics/MetricKey.java | 4 +- .../org/apache/beam/sdk/metrics/MetricName.java | 2 + .../beam/sdk/metrics/MetricQueryResults.java | 13 +- .../org/apache/beam/sdk/metrics/MetricResult.java | 2 +- .../java/org/apache/beam/sdk/metrics/Metrics.java | 47 + .../apache/beam/sdk/metrics/MetricsContainer.java | 6 + .../beam/sdk/metrics/MetricsEnvironment.java | 11 +- .../org/apache/beam/sdk/metrics/StringSet.java} | 22 +- .../apache/beam/sdk/metrics/StringSetResult.java | 61 + .../apache/beam/sdk/schemas/AutoValueSchema.java | 35 +- .../apache/beam/sdk/schemas/CachingFactory.java | 11 +- .../java/org/apache/beam/sdk/schemas/Factory.java | 3 +- .../beam/sdk/schemas/FromRowUsingCreator.java | 38 +- .../sdk/schemas/GetterBasedSchemaProvider.java | 68 +- .../sdk/schemas/GetterBasedSchemaProviderV2.java | 56 + .../apache/beam/sdk/schemas/JavaBeanSchema.java | 59 +- .../apache/beam/sdk/schemas/JavaFieldSchema.java | 44 +- .../sdk/schemas/SetterBasedCreatorFactory.java | 7 +- .../sdk/schemas/annotations/DefaultSchema.java | 2 +- .../beam/sdk/schemas/transforms/CoGroup.java | 7 +- .../transforms/providers/ErrorHandling.java | 3 +- .../schemas/transforms/providers/JavaRowUdf.java | 2 +- .../providers/LoggingTransformProvider.java | 4 +- .../beam/sdk/schemas/utils/AutoValueUtils.java | 30 +- .../sdk/schemas/utils/FieldValueTypeSupplier.java | 7 +- .../beam/sdk/schemas/utils/JavaBeanUtils.java | 57 +- .../apache/beam/sdk/schemas/utils/POJOUtils.java | 72 +- .../beam/sdk/schemas/utils/ReflectUtils.java | 13 + .../sdk/schemas/utils/StaticSchemaInference.java | 28 +- .../java/org/apache/beam/sdk/state/StateSpecs.java | 23 + .../java/org/apache/beam/sdk/testing/PAssert.java | 15 +- .../beam/sdk/testing/SerializableMatchers.java | 14 +- .../UsesStringSetMetrics.java} | 12 +- .../sdk/transforms/SerializableComparator.java | 21 +- .../java/org/apache/beam/sdk/transforms/Watch.java | 11 +- .../sdk/transforms/errorhandling/BadRecord.java | 6 +- .../reflect/ByteBuddyOnTimerInvokerFactory.java | 6 +- .../transforms/resourcehints/ResourceHints.java | 8 +- .../beam/sdk/util}/SerializableSupplier.java | 2 +- .../java/org/apache/beam/sdk/util/StringUtils.java | 35 + .../beam/sdk/util/construction/Environments.java | 1 + .../org/apache/beam/sdk/values/RowWithGetters.java | 2 +- .../beam/sdk/coders/StructuralByteArrayTest.java | 10 +- .../org/apache/beam/sdk/io/FileBasedSinkTest.java | 2 +- .../java/org/apache/beam/sdk/io/FileIOTest.java | 6 +- .../org/apache/beam/sdk/io/TFRecordIOTest.java | 7 +- .../org/apache/beam/sdk/io/TextIOReadTest.java | 83 +- .../org/apache/beam/sdk/io/TextIOWriteTest.java | 6 +- .../beam/sdk/io/TextRowCountEstimatorTest.java | 10 +- .../org/apache/beam/sdk/io/TextSourceTest.java | 156 ++ .../org/apache/beam/sdk/io/WriteFilesTest.java | 5 +- .../org/apache/beam/sdk/metrics/LineageTest.java | 59 + .../beam/sdk/metrics/MetricResultsMatchers.java | 18 +- .../org/apache/beam/sdk/metrics/MetricsTest.java | 234 ++- .../beam/sdk/metrics/StringSetResultTest.java | 64 + .../sdk/options/PipelineOptionsFactoryTest.java | 24 +- .../beam/sdk/schemas/SchemaTranslationTest.java | 5 +- .../beam/sdk/schemas/utils/JavaBeanUtilsTest.java | 33 +- .../beam/sdk/schemas/utils/POJOUtilsTest.java | 40 +- .../org/apache/beam/sdk/testing/PAssertTest.java | 31 + .../org/apache/beam/sdk/transforms/ParDoTest.java | 28 +- .../sdk/transforms/SerializableComparatorTest.java | 63 + .../BufferedElementCountingOutputStreamTest.java | 6 +- .../sdk/util/ExposedByteArrayInputStreamTest.java | 6 +- .../sdk/util/ExposedByteArrayOutputStreamTest.java | 4 +- .../beam/sdk/util/SerializableUtilsTest.java | 4 +- .../org/apache/beam/sdk/util/StringUtilsTest.java | 23 + .../sdk/util/construction/EnvironmentsTest.java | 3 + .../expansion/service/ExpansionServiceTest.java | 4 +- .../beam/sdk/extensions/arrow/ArrowConversion.java | 3 +- .../avro/AvroGenericCoderTranslator.java | 6 +- .../avro/io/SerializableAvroCodecFactory.java | 12 +- .../extensions/avro/schemas/AvroRecordSchema.java | 18 +- .../extensions/avro/schemas/utils/AvroUtils.java | 45 +- .../beam/sdk/extensions/avro/io/AvroIOTest.java | 6 +- .../avro/io/SerializableAvroCodecFactoryTest.java | 48 +- .../google-cloud-platform-core/build.gradle | 2 +- .../sdk/extensions/gcp/storage/GcsFileSystem.java | 11 + .../gcp/util/RetryHttpRequestInitializer.java | 2 +- .../beam/sdk/extensions/gcp/util/Transport.java | 4 +- .../beam/sdk/extensions/gcp/util/GcsUtilTest.java | 10 +- ...LatencyRecordingHttpRequestInitializerTest.java | 4 +- .../gcp/util/RetryHttpRequestInitializerTest.java | 4 +- .../extensions/protobuf/ProtoByteBuddyUtils.java | 5 +- .../extensions/protobuf/ProtoMessageSchema.java | 27 +- .../extensions/protobuf/ProtoSchemaTranslator.java | 12 +- .../protobuf/ProtoSchemaTranslatorTest.java | 15 + .../sdk/extensions/protobuf/TestProtoSchemas.java | 22 + .../src/test/proto/proto3_schema_messages.proto | 14 +- .../src/test/proto/proto3_schema_options.proto | 2 + .../extensions/python/PythonExternalTransform.java | 5 +- .../beam/sdk/extensions/python/PythonService.java | 5 +- sdks/java/extensions/sql/jdbc/build.gradle | 2 +- .../beam/sdk/extensions/sql/jdbc/BeamSqlLine.java | 6 +- .../meta/provider/text/TextTableProviderTest.java | 20 +- .../beam/fn/harness/ExternalWorkerService.java | 44 +- .../apache/beam/fn/harness/FnApiDoFnRunner.java | 57 +- .../fn/harness/control/ExecutionStateSampler.java | 9 + .../beam/fn/harness/state/FnApiStateAccessor.java | 8 +- .../beam/fn/harness/FnApiDoFnRunnerTest.java | 465 ++++- .../harness/control/ExecutionStateSamplerTest.java | 22 + .../apache/beam/sdk/io/aws/s3/S3FileSystem.java | 6 + .../beam/sdk/io/aws2/options/AwsOptions.java | 8 +- .../apache/beam/sdk/io/aws2/s3/S3FileSystem.java | 6 + .../sdk/io/aws2/schemas/AwsSchemaProvider.java | 40 +- .../apache/beam/sdk/io/aws2/schemas/AwsTypes.java | 4 +- .../azure/blobstore/AzureBlobStoreFileSystem.java | 13 + sdks/java/io/clickhouse/build.gradle | 2 +- .../beam/sdk/io/clickhouse/ClickHouseIO.java | 9 +- .../beam/sdk/io/clickhouse/ClickHouseWriter.java | 10 +- .../beam/sdk/io/clickhouse/ClickHouseIOTest.java | 29 +- .../beam/sdk/io/common/SchemaAwareJavaBeans.java | 8 +- .../io/contextualtextio/ContextualTextIOTest.java | 16 +- .../java/org/apache/beam/sdk/io/csv/CsvIO.java | 228 ++ .../org/apache/beam/sdk/io/csv/CsvIOParse.java | 106 + .../beam/sdk/io/csv/CsvIOParseConfiguration.java | 45 +- .../apache/beam/sdk/io/csv/CsvIOParseError.java | 24 + .../apache/beam/sdk/io/csv/CsvIOParseHelpers.java | 129 +- .../org/apache/beam/sdk/io/csv/CsvIOParseKV.java | 46 + .../apache/beam/sdk/io/csv/CsvIOParseResult.java | 34 +- .../org/apache/beam/sdk/io/csv/CsvIOReadFiles.java | 24 +- .../beam/sdk/io/csv/CsvIORecordToObjects.java | 129 ++ .../beam/sdk/io/csv/CsvIOStringToCsvRecord.java | 114 + .../beam/sdk/io/csv/CsvIOParseHelpersTest.java | 608 ++++++ .../apache/beam/sdk/io/csv/CsvIOParseKVTest.java} | 16 +- .../org/apache/beam/sdk/io/csv/CsvIOParseTest.java | 316 +++ .../beam/sdk/io/csv/CsvIORecordToObjectsTest.java | 422 ++++ .../sdk/io/csv/CsvIOStringToCsvRecordTest.java | 565 +++++ .../java/org/apache/beam/sdk/io/csv/CsvIOTest.java | 300 +++ sdks/java/io/expansion-service/build.gradle | 7 +- .../java/org/apache/beam/sdk/io/text/TextIOIT.java | 5 + sdks/java/io/google-ads/build.gradle | 2 +- .../apache/beam/sdk/io/googleads/GoogleAdsIO.java | 8 +- .../{GoogleAdsV14.java => GoogleAdsV17.java} | 100 +- .../sdk/io/googleads/DummyRateLimitPolicy.java | 11 +- ...GoogleAdsV14Test.java => GoogleAdsV17Test.java} | 90 +- .../io/googleads/MockGoogleAdsClientFactory.java | 10 +- sdks/java/io/google-cloud-platform/build.gradle | 7 +- .../beam/sdk/io/gcp/bigquery/AppendClientInfo.java | 14 +- .../beam/sdk/io/gcp/bigquery/BigQueryHelpers.java | 21 + .../beam/sdk/io/gcp/bigquery/BigQueryIO.java | 93 +- .../sdk/io/gcp/bigquery/BigQueryIOTranslation.java | 11 + .../beam/sdk/io/gcp/bigquery/BigQueryOptions.java | 54 +- .../sdk/io/gcp/bigquery/BigQueryServicesImpl.java | 106 +- .../sdk/io/gcp/bigquery/BigQuerySourceBase.java | 5 +- .../io/gcp/bigquery/BigQueryStorageSourceBase.java | 12 +- .../gcp/bigquery/BigQueryStorageStreamSource.java | 8 +- .../beam/sdk/io/gcp/bigquery/BigQueryUtils.java | 4 +- .../beam/sdk/io/gcp/bigquery/CreateTables.java | 9 +- .../sdk/io/gcp/bigquery/DynamicDestinations.java | 4 + .../sdk/io/gcp/bigquery/SplittingIterable.java | 18 +- .../io/gcp/bigquery/StorageApiConvertMessages.java | 8 +- .../bigquery/StorageApiDynamicDestinations.java | 2 +- .../StorageApiDynamicDestinationsBeamRow.java | 16 +- ...StorageApiDynamicDestinationsGenericRecord.java | 20 +- .../StorageApiDynamicDestinationsProto.java | 33 +- .../StorageApiDynamicDestinationsTableRow.java | 16 +- .../beam/sdk/io/gcp/bigquery/StorageApiLoads.java | 7 + .../io/gcp/bigquery/StorageApiWritePayload.java | 22 +- .../StorageApiWriteRecordsInconsistent.java | 7 + .../bigquery/StorageApiWriteUnshardedRecords.java | 137 +- .../bigquery/StorageApiWritesShardedRecords.java | 46 +- .../io/gcp/bigquery/TableRowToStorageApiProto.java | 32 +- .../beam/sdk/io/gcp/bigquery/WriteRename.java | 7 + .../beam/sdk/io/gcp/bigquery/WriteTables.java | 7 + .../beam/sdk/io/gcp/bigtable/BigtableIO.java | 196 +- .../beam/sdk/io/gcp/bigtable/BigtableService.java | 6 + .../io/gcp/bigtable/BigtableServiceFactory.java | 9 + .../sdk/io/gcp/bigtable/BigtableServiceImpl.java | 39 +- .../sdk/io/gcp/bigtable/BigtableWriteOptions.java | 5 + .../dofn/ReadChangeStreamPartitionDoFn.java | 1 + .../beam/sdk/io/gcp/datastore/DatastoreV1.java | 2 +- .../sdk/io/gcp/datastore/RampupThrottlingFn.java | 3 +- .../sdk/io/gcp/firestore/FirestoreOptions.java | 11 + .../sdk/io/gcp/firestore/FirestoreV1ReadFn.java | 9 +- .../sdk/io/gcp/firestore/FirestoreV1WriteFn.java | 9 +- .../io/gcp/healthcare/HttpHealthcareApiClient.java | 4 +- .../sdk/io/gcp/pubsub/PreparePubsubWriteDoFn.java | 10 + .../beam/sdk/io/gcp/pubsub/PubsubClient.java | 29 + .../apache/beam/sdk/io/gcp/pubsub/PubsubIO.java | 60 +- .../io/gcp/pubsub/PubsubMessageWithTopicCoder.java | 4 +- .../internal/SubscriptionPartitionLoader.java | 2 +- .../beam/sdk/io/gcp/spanner/MutationUtils.java | 60 + .../beam/sdk/io/gcp/spanner/SpannerAccessor.java | 5 +- .../apache/beam/sdk/io/gcp/spanner/SpannerIO.java | 66 +- .../sdk/io/gcp/spanner/SpannerQuerySourceDef.java | 56 + .../SpannerReadSchemaTransformProvider.java | 235 +++ .../beam/sdk/io/gcp/spanner/SpannerSchema.java | 3 + .../spanner/SpannerSchemaRetrievalException.java} | 16 +- .../beam/sdk/io/gcp/spanner/SpannerSourceDef.java} | 20 +- .../sdk/io/gcp/spanner/SpannerTableSourceDef.java | 63 + .../SpannerWriteSchemaTransformProvider.java | 160 +- .../beam/sdk/io/gcp/spanner/StructUtils.java | 67 + .../dao/PartitionMetadataAdminDao.java | 83 +- .../sdk/io/gcp/testing/FakeDatasetService.java | 5 +- .../sdk/io/gcp/bigquery/BigQueryHelpersTest.java | 1 + .../sdk/io/gcp/bigquery/BigQueryIOReadTest.java | 12 +- .../io/gcp/bigquery/BigQueryIOTranslationTest.java | 3 + .../sdk/io/gcp/bigquery/BigQueryIOWriteTest.java | 462 ++++- .../io/gcp/bigquery/BigQueryServicesImplTest.java | 33 +- .../io/gcp/bigquery/BigQuerySinkMetricsTest.java | 4 +- .../sdk/io/gcp/bigquery/BigQueryUtilsTest.java | 103 +- .../beam/sdk/io/gcp/bigtable/BigtableIOTest.java | 15 + .../beam/sdk/io/gcp/bigtable/BigtableReadIT.java | 24 +- .../io/gcp/bigtable/BigtableServiceImplTest.java | 62 + .../io/gcp/bigtable/BigtableSharedClientTest.java | 271 +++ .../beam/sdk/io/gcp/bigtable/BigtableWriteIT.java | 20 +- .../dofn/ReadChangeStreamPartitionDoFnTest.java | 1 + .../gcp/firestore/it/FirestoreTestingHelper.java | 5 +- .../beam/sdk/io/gcp/healthcare/FhirIOTestUtil.java | 5 +- .../beam/sdk/io/gcp/pubsub/PubsubClientTest.java | 3 + .../beam/sdk/io/gcp/pubsub/PubsubIOTest.java | 9 +- .../PubsubReadSchemaTransformProviderTest.java | 6 +- .../sdk/io/gcp/pubsub/PubsubUnboundedSinkTest.java | 2 +- .../gcp/pubsublite/internal/FakeSerializable.java | 2 +- .../internal/SubscriptionPartitionLoaderTest.java | 2 +- .../sdk/io/gcp/spanner/SpannerAccessorTest.java | 2 + .../beam/sdk/io/gcp/spanner/SpannerIOReadTest.java | 26 + .../beam/sdk/io/gcp/spanner/SpannerReadIT.java | 34 + .../beam/sdk/io/gcp/spanner/SpannerSchemaTest.java | 4 +- .../beam/sdk/io/gcp/spanner/StructUtilsTest.java | 66 + .../dao/PartitionMetadataAdminDaoTest.java | 29 +- sdks/java/io/hcatalog/build.gradle | 13 +- sdks/java/io/iceberg/build.gradle | 5 + sdks/java/io/iceberg/hive/build.gradle | 80 + sdks/java/io/iceberg/hive/exec/build.gradle | 58 + .../sdk/io/iceberg/hive/IcebergHiveCatalogIT.java | 280 +++ .../hive/testutils/HiveMetastoreExtension.java | 68 + .../io/iceberg/hive/testutils/ScriptRunner.java | 203 ++ .../iceberg/hive/testutils/TestHiveMetastore.java | 273 +++ .../src/test/resources/hive-schema-3.1.0.derby.sql | 267 +++ .../beam/sdk/io/iceberg/IcebergCatalogConfig.java | 214 +- .../org/apache/beam/sdk/io/iceberg/IcebergIO.java | 10 +- .../IcebergReadSchemaTransformProvider.java | 56 +- .../IcebergSchemaTransformCatalogConfig.java | 107 - .../apache/beam/sdk/io/iceberg/IcebergUtils.java | 381 ++++ .../IcebergWriteSchemaTransformProvider.java | 60 +- .../apache/beam/sdk/io/iceberg/RecordWriter.java | 77 +- .../beam/sdk/io/iceberg/RecordWriterManager.java | 298 +++ .../org/apache/beam/sdk/io/iceberg/ScanSource.java | 2 +- .../apache/beam/sdk/io/iceberg/ScanTaskReader.java | 4 +- .../sdk/io/iceberg/SchemaAndRowConversions.java | 275 --- .../io/iceberg/SchemaTransformConfiguration.java | 69 + .../sdk/io/iceberg/WriteGroupedRowsToFiles.java | 49 +- .../sdk/io/iceberg/WriteUngroupedRowsToFiles.java | 164 +- .../apache/beam/sdk/io/iceberg/IcebergIOIT.java | 289 +-- .../beam/sdk/io/iceberg/IcebergIOReadTest.java | 21 +- .../beam/sdk/io/iceberg/IcebergIOWriteTest.java | 45 +- .../IcebergReadSchemaTransformProviderTest.java | 46 +- .../IcebergSchemaTransformTranslationTest.java | 55 +- .../beam/sdk/io/iceberg/IcebergUtilsTest.java | 677 ++++++ .../IcebergWriteSchemaTransformProviderTest.java | 44 +- .../sdk/io/iceberg/RecordWriterManagerTest.java | 266 +++ .../apache/beam/sdk/io/iceberg/ScanSourceTest.java | 41 +- .../io/iceberg/SchemaAndRowConversionsTest.java | 268 --- .../beam/sdk/io/iceberg/TestDataWarehouse.java | 8 +- .../apache/beam/sdk/io/iceberg/TestFixtures.java | 2 +- .../java/org/apache/beam/sdk/io/jms/JmsIO.java | 9 +- .../java/org/apache/beam/sdk/io/jms/CommonJms.java | 5 +- .../beam/sdk/io/kafka/KafkaCommitOffset.java | 83 +- .../beam/sdk/io/kafka/KafkaExactlyOnceSink.java | 16 + .../java/org/apache/beam/sdk/io/kafka/KafkaIO.java | 234 ++- .../KafkaIOReadImplementationCompatibility.java | 18 + .../beam/sdk/io/kafka/KafkaUnboundedSource.java | 13 + .../org/apache/beam/sdk/io/kafka/KafkaWriter.java | 18 +- .../beam/sdk/io/kafka/ReadFromKafkaDoFn.java | 62 +- .../beam/sdk/io/kafka/KafkaCommitOffsetTest.java | 169 +- .../beam/sdk/io/kafka/KafkaIOExternalTest.java | 17 +- ...KafkaIOReadImplementationCompatibilityTest.java | 27 +- .../org/apache/beam/sdk/io/kafka/KafkaIOTest.java | 131 +- .../beam/sdk/io/kafka/ReadFromKafkaDoFnTest.java | 8 +- .../sdk/io/kafka/upgrade/KafkaIOTranslation.java | 24 + .../io/kafka/upgrade/KafkaIOTranslationTest.java | 1 + .../org/apache/beam/sdk/io/pulsar/PulsarIO.java | 15 + .../beam/sdk/io/pulsar/ReadFromPulsarDoFn.java | 5 + .../beam/sdk/io/pulsar/WriteToPulsarDoFn.java | 5 + .../apache/beam/sdk/io/pulsar/package-info.java | 6 +- .../org/apache/beam/io/requestresponse/Call.java | 1 + .../DefaultSerializableBackoffSupplier.java | 1 + .../beam/io/requestresponse/RequestResponseIO.java | 1 + .../requestresponse/WindowedCallShouldBackoff.java | 1 + .../io/requestresponse/RequestResponseIOTest.java | 1 + .../WindowedCallShouldBackoffTest.java | 1 + .../apache/beam/sdk/io/snowflake/SnowflakeIO.java | 6 +- .../services/SnowflakeBatchServiceImpl.java | 6 +- sdks/java/io/solace/build.gradle | 14 +- .../org/apache/beam/sdk/io/solace/SolaceIO.java | 467 ++++- .../broker/BasicAuthJcsmpSessionService.java | 38 +- .../BasicAuthJcsmpSessionServiceFactory.java | 3 +- .../sdk/io/solace/broker/BasicAuthSempClient.java | 102 + .../solace/broker/BasicAuthSempClientFactory.java | 82 + .../beam/sdk/io/solace/broker/BrokerResponse.java | 62 + .../broker/GCPSecretSessionServiceFactory.java | 169 ++ .../beam/sdk/io/solace/broker/MessageReceiver.java | 3 + .../solace/broker/SempBasicAuthClientExecutor.java | 228 ++ .../beam/sdk/io/solace/broker/SessionService.java | 196 +- .../io/solace/broker/SessionServiceFactory.java | 3 +- .../io/solace/broker/SolaceMessageReceiver.java | 7 + .../org/apache/beam/sdk/io/solace/data/Semp.java | 74 + .../org/apache/beam/sdk/io/solace/data/Solace.java | 114 +- .../sdk/io/solace/read/UnboundedSolaceReader.java | 4 +- .../sdk/io/solace/read/UnboundedSolaceSource.java | 9 + .../sdk/io/solace/read/WatermarkParameters.java | 26 +- .../beam/sdk/io/solace/read/WatermarkPolicy.java | 10 +- .../beam/sdk/io/solace/write/SolaceOutput.java | 104 + .../beam/sdk/io/solace/write}/package-info.java | 5 +- .../sdk/io/solace/MockEmptySessionService.java | 8 +- .../beam/sdk/io/solace/MockSessionService.java | 31 +- .../apache/beam/sdk/io/solace/SolaceIOTest.java | 3 +- .../solace/broker/BasicAuthWriterSessionTest.java | 106 + .../broker/OverrideWriterPropertiesTest.java | 56 + .../broker/SempBasicAuthClientExecutorTest.java | 283 +++ .../beam/sdk/io/solace/data/SolaceDataUtils.java | 5 +- .../sdk/io/solace/it/SolaceContainerManager.java | 188 ++ .../apache/beam/sdk/io/solace/it/SolaceIOIT.java | 129 ++ .../beam/sdk/io/synthetic/SyntheticStep.java | 2 +- .../apache/beam/sdk/io/thrift/ThriftSchema.java | 23 +- sdks/java/testing/test-utils/build.gradle | 10 +- .../testutils/jvmverification/JvmVerification.java | 6 + .../org/apache/beam/sdk/tpcds/QueryReader.java | 4 +- .../apache/beam/sdk/tpcds/SqlTransformRunner.java | 4 +- .../beam/sdk/tpcds/TableSchemaJSONLoader.java | 4 +- .../launcher/TransformServiceLauncherTest.java | 10 +- sdks/python/apache_beam/coders/coder_impl.py | 2 - sdks/python/apache_beam/coders/stream.pyx | 2 - .../apache_beam/dataframe/pandas_doctests_test.py | 1 + .../examples/complete/top_wikipedia_sessions.py | 34 +- .../complete/top_wikipedia_sessions_test.py | 2 + .../inference/sklearn_examples_requirements.txt | 7 +- .../kfp/components/train/requirements.txt | 2 +- sdks/python/apache_beam/io/fileio.py | 14 +- sdks/python/apache_beam/io/gcp/bigquery.py | 11 + .../apache_beam/io/gcp/bigquery_file_loads.py | 30 +- .../apache_beam/io/gcp/bigquery_file_loads_test.py | 20 + .../apache_beam/io/gcp/bigquery_read_internal.py | 56 +- .../apache_beam/io/gcp/bigquery_read_it_test.py | 4 +- .../io/gcp/bigquery_schema_tools_test.py | 18 +- sdks/python/apache_beam/io/gcp/bigquery_test.py | 54 +- sdks/python/apache_beam/io/gcp/bigquery_tools.py | 8 +- sdks/python/apache_beam/io/gcp/bigtableio.py | 5 +- sdks/python/apache_beam/io/gcp/gcsio.py | 11 +- sdks/python/apache_beam/io/gcp/spanner_wrapper.py | 76 + sdks/python/apache_beam/io/kafka.py | 38 +- sdks/python/apache_beam/io/textio.py | 14 +- sdks/python/apache_beam/io/textio_test.py | 30 + sdks/python/apache_beam/metrics/cells.pxd | 6 + sdks/python/apache_beam/metrics/cells.py | 77 +- sdks/python/apache_beam/metrics/cells_test.py | 24 + sdks/python/apache_beam/metrics/execution.py | 22 +- sdks/python/apache_beam/metrics/execution_test.py | 9 + sdks/python/apache_beam/metrics/metric.py | 114 +- sdks/python/apache_beam/metrics/metric_test.py | 22 + sdks/python/apache_beam/metrics/metricbase.py | 16 +- .../python/apache_beam/metrics/monitoring_infos.py | 55 +- .../apache_beam/metrics/monitoring_infos_test.py | 25 + sdks/python/apache_beam/ml/inference/base.py | 231 ++- sdks/python/apache_beam/ml/inference/base_test.py | 216 ++ .../ml/inference/huggingface_inference.py | 2 +- .../python/apache_beam/options/pipeline_options.py | 14 + sdks/python/apache_beam/pipeline.py | 7 + sdks/python/apache_beam/pipeline_test.py | 14 + sdks/python/apache_beam/runners/common.py | 7 +- .../runners/dataflow/dataflow_metrics.py | 15 +- .../runners/dataflow/internal/apiclient.py | 6 + .../runners/dataflow/internal/apiclient_test.py | 37 + .../clients/dataflow/dataflow_v1b3_client.py | 81 +- .../clients/dataflow/dataflow_v1b3_messages.py | 1497 +++++++++++--- .../apache_beam/runners/dataflow/internal/names.py | 2 +- .../apache_beam/runners/direct/direct_metrics.py | 15 +- .../runners/direct/direct_runner_test.py | 9 + .../runners/direct/watermark_manager.py | 2 +- .../runners/interactive/cache_manager.py | 50 +- .../runners/interactive/cache_manager_test.py | 7 +- .../apache-beam-jupyterlab-sidepanel/yarn.lock | 480 +++-- .../runners/interactive/interactive_beam.py | 98 +- .../runners/interactive/interactive_runner_test.py | 46 +- .../interactive/non_interactive_runner_test.py | 262 +++ .../runners/interactive/pipeline_fragment.py | 50 +- .../runners/interactive/pipeline_fragment_test.py | 8 - .../runners/interactive/recording_manager.py | 9 +- .../runners/interactive/recording_manager_test.py | 2 +- .../apache_beam/runners/interactive/utils.py | 3 + .../python/apache_beam/runners/pipeline_context.py | 4 +- .../runners/portability/flink_runner_test.py | 3 + .../runners/portability/fn_api_runner/execution.py | 4 +- .../runners/portability/fn_api_runner/fn_runner.py | 11 +- .../portability/fn_api_runner/fn_runner_test.py | 9 +- .../runners/portability/portable_metrics.py | 13 +- .../runners/portability/portable_runner.py | 5 +- .../runners/portability/prism_runner_test.py | 99 +- .../apache_beam/runners/portability/stager.py | 3 +- sdks/python/apache_beam/runners/worker/logger.py | 2 - .../apache_beam/runners/worker/opcounters.py | 2 - .../apache_beam/runners/worker/operations.py | 2 - .../runners/worker/statesampler_fast.pyx | 4 +- .../python/apache_beam/testing/fast_test_utils.pyx | 2 - sdks/python/apache_beam/transforms/core.py | 57 +- sdks/python/apache_beam/transforms/core_test.py | 71 + sdks/python/apache_beam/transforms/cy_combiners.py | 2 - .../cy_dataflow_distribution_counter.pyx | 2 - .../apache_beam/transforms/error_handling.py | 126 ++ .../apache_beam/transforms/error_handling_test.py | 148 ++ .../transforms/external_transform_provider.py | 50 +- .../external_transform_provider_it_test.py | 11 +- sdks/python/apache_beam/transforms/stats.py | 2 - sdks/python/apache_beam/transforms/util.py | 14 + sdks/python/apache_beam/typehints/schemas.py | 8 +- sdks/python/apache_beam/utils/counters.py | 1 - .../apache_beam/utils/multi_process_shared.py | 23 + .../apache_beam/utils/multi_process_shared_test.py | 43 + sdks/python/apache_beam/utils/subprocess_server.py | 3 +- sdks/python/apache_beam/utils/windowed_value.py | 2 - sdks/python/apache_beam/version.py | 2 +- sdks/python/apache_beam/yaml/integration_tests.py | 17 + sdks/python/apache_beam/yaml/json_utils.py | 3 +- sdks/python/apache_beam/yaml/main.py | 15 +- sdks/python/apache_beam/yaml/pipeline.schema.yaml | 6 +- sdks/python/apache_beam/yaml/readme_test.py | 39 +- sdks/python/apache_beam/yaml/standard_io.yaml | 31 +- sdks/python/apache_beam/yaml/tests/spanner.yaml | 95 + .../{version.py => yaml/tests/tsv.yaml} | 33 +- sdks/python/apache_beam/yaml/yaml_combine.py | 2 +- sdks/python/apache_beam/yaml/yaml_io.py | 12 +- sdks/python/apache_beam/yaml/yaml_join.py | 3 +- sdks/python/apache_beam/yaml/yaml_provider.py | 57 + .../apache_beam/yaml/yaml_provider_unit_test.py | 58 + sdks/python/apache_beam/yaml/yaml_transform.py | 38 +- .../apache_beam/yaml/yaml_transform_unit_test.py | 3 +- sdks/python/build.gradle | 2 +- .../container/base_image_requirements_manual.txt | 6 +- sdks/python/container/boot.go | 12 +- .../container/license_scripts/dep_urls_py.yaml | 2 + sdks/python/container/piputil.go | 13 +- .../container/py310/base_image_requirements.txt | 162 +- .../container/py311/base_image_requirements.txt | 160 +- .../container/py312/base_image_requirements.txt | 148 +- .../container/py38/base_image_requirements.txt | 160 +- .../container/py39/base_image_requirements.txt | 164 +- sdks/python/gen_xlang_wrappers.py | 3 +- sdks/python/pyproject.toml | 2 +- sdks/python/setup.py | 30 +- sdks/python/test-suites/dataflow/common.gradle | 8 +- sdks/python/test-suites/direct/common.gradle | 2 +- sdks/python/test-suites/direct/xlang/build.gradle | 2 +- sdks/python/test-suites/portable/common.gradle | 5 +- sdks/standard_expansion_services.yaml | 4 + sdks/typescript/package.json | 2 +- settings.gradle.kts | 7 +- website/www/site/config.toml | 2 +- website/www/site/content/en/blog/beam-2.49.0.md | 1 + website/www/site/content/en/blog/beam-2.53.0.md | 1 + website/www/site/content/en/blog/beam-2.54.0.md | 1 + website/www/site/content/en/blog/beam-2.55.0.md | 3 + website/www/site/content/en/blog/beam-2.56.0.md | 6 + website/www/site/content/en/blog/beam-2.57.0.md | 4 + website/www/site/content/en/blog/beam-2.58.0.md | 137 ++ website/www/site/content/en/blog/beam-2.58.1.md | 45 + .../dsls/sql/extensions/user-defined-functions.md | 2 +- .../site/content/en/documentation/io/connectors.md | 16 + .../en/documentation/ml/large-language-modeling.md | 50 +- .../en/documentation/patterns/batch-elements.md | 45 + .../content/en/documentation/patterns/overview.md | 5 + .../en/documentation/patterns/shared-class.md | 93 + .../content/en/documentation/programming-guide.md | 5 +- .../content/en/documentation/sdks/yaml-combine.md | 6 +- .../content/en/documentation/sdks/yaml-errors.md | 5 +- .../en/documentation/sdks/yaml-inline-python.md | 2 - .../content/en/documentation/sdks/yaml-join.md | 182 ++ .../www/site/content/en/documentation/sdks/yaml.md | 2 +- .../transforms/python/elementwise/flatmap.md | 2 +- .../transforms/python/elementwise/map.md | 4 +- .../www/site/content/en/get-started/downloads.md | 20 +- .../site/content/en/get-started/quickstart-java.md | 2 +- website/www/site/layouts/partials/header.html | 6 + .../partials/section-menu/en/documentation.html | 1 + .../layouts/partials/section-menu/en/sdks.html | 1 + website/www/site/static/images/banner_desktop.png | Bin 79738 -> 216408 bytes website/www/site/static/images/banner_mobile.png | Bin 64923 -> 581537 bytes website/www/yarn.lock | 33 +- 927 files changed, 36092 insertions(+), 9219 deletions(-) copy .github/trigger_files/{beam_PostCommit_Python_Dependency.json => beam_PostCommit_Java_Examples_Dataflow_Java.json} (100%) rename .github/workflows/{beam_LoadTests_Java_GBK_Dataflow_V2_Batch_Java11.yml => beam_LoadTests_Java_GBK_Dataflow_V2_Batch.yml} (78%) rename .github/workflows/{beam_LoadTests_Java_GBK_Dataflow_V2_Streaming_Java11.yml => beam_LoadTests_Java_GBK_Dataflow_V2_Streaming.yml} (79%) rename .github/workflows/{beam_PostCommit_Java_ValidatesRunner_Flink_Java11.yml => beam_PostCommit_Java_ValidatesRunner_Flink_Java8.yml} (89%) rename .github/workflows/{beam_PostCommit_Java_ValidatesRunner_Spark_Java11.yml => beam_PostCommit_Java_ValidatesRunner_Spark_Java8.yml} (89%) rename .github/workflows/{beam_PreCommit_SQL_Java11.yml => beam_PreCommit_SQL_Java8.yml} (89%) rename .github/workflows/load-tests-pipeline-options/{java_GBK_Dataflow_V2_Batch_Java11_2GB_of_100B_records.txt => java_GBK_Dataflow_V2_Batch_2GB_of_100B_records.txt} (100%) rename .github/workflows/load-tests-pipeline-options/{java_GBK_Dataflow_V2_Batch_Java11_2GB_of_100kB_records.txt => java_GBK_Dataflow_V2_Batch_2GB_of_100kB_records.txt} (100%) rename .github/workflows/load-tests-pipeline-options/{java_GBK_Dataflow_V2_Batch_Java11_2GB_of_10B_records.txt => java_GBK_Dataflow_V2_Batch_2GB_of_10B_records.txt} (100%) rename .github/workflows/load-tests-pipeline-options/{java_GBK_Dataflow_V2_Batch_Java11_fanout_4_times_with_2GB_10-byte_records_total.txt => java_GBK_Dataflow_V2_Batch_fanout_4_times_with_2GB_10-byte_records_total.txt} (100%) rename .github/workflows/load-tests-pipeline-options/{java_GBK_Dataflow_V2_Batch_Java11_fanout_8_times_with_2GB_10-byte_records_total.txt => java_GBK_Dataflow_V2_Batch_fanout_8_times_with_2GB_10-byte_records_total.txt} (100%) rename .github/workflows/load-tests-pipeline-options/{java_GBK_Dataflow_V2_Batch_Java11_reiterate_4_times_10kB_values.txt => java_GBK_Dataflow_V2_Batch_reiterate_4_times_10kB_values.txt} (100%) rename .github/workflows/load-tests-pipeline-options/{java_GBK_Dataflow_V2_Batch_Java11_reiterate_4_times_2MB_values.txt => java_GBK_Dataflow_V2_Batch_reiterate_4_times_2MB_values.txt} (100%) rename .github/workflows/load-tests-pipeline-options/{java_GBK_Dataflow_V2_Streaming_Java11_2GB_of_100B_records.txt => java_GBK_Dataflow_V2_Streaming_2GB_of_100B_records.txt} (100%) rename .github/workflows/load-tests-pipeline-options/{java_GBK_Dataflow_V2_Streaming_Java11_2GB_of_100kB_records.txt => java_GBK_Dataflow_V2_Streaming_2GB_of_100kB_records.txt} (100%) rename .github/workflows/load-tests-pipeline-options/{java_GBK_Dataflow_V2_Streaming_Java11_2GB_of_10B_records.txt => java_GBK_Dataflow_V2_Streaming_2GB_of_10B_records.txt} (100%) rename .github/workflows/load-tests-pipeline-options/{java_GBK_Dataflow_V2_Streaming_Java11_fanout_4_times_with_2GB_10-byte_records_total.txt => java_GBK_Dataflow_V2_Streaming_fanout_4_times_with_2GB_10-byte_records_total.txt} (100%) rename .github/workflows/load-tests-pipeline-options/{java_GBK_Dataflow_V2_Streaming_Java11_fanout_8_times_with_2GB_10-byte_records_total.txt => java_GBK_Dataflow_V2_Streaming_fanout_8_times_with_2GB_10-byte_records_total.txt} (100%) rename .github/workflows/load-tests-pipeline-options/{java_GBK_Dataflow_V2_Streaming_Java11_reiterate_4_times_10kB_values.txt => java_GBK_Dataflow_V2_Streaming_reiterate_4_times_10kB_values.txt} (100%) rename .github/workflows/load-tests-pipeline-options/{java_GBK_Dataflow_V2_Streaming_Java11_reiterate_4_times_2MB_values.txt => java_GBK_Dataflow_V2_Streaming_reiterate_4_times_2MB_values.txt} (100%) create mode 100644 examples/multi-language/python/wordcount_external.py create mode 100644 examples/multi-language/src/main/java/org/apache/beam/examples/multilanguage/schematransforms/ExtractWordsProvider.java create mode 100644 examples/multi-language/src/main/java/org/apache/beam/examples/multilanguage/schematransforms/JavaCountProvider.java create mode 100644 examples/multi-language/src/main/java/org/apache/beam/examples/multilanguage/schematransforms/WriteWordsProvider.java create mode 100644 examples/notebooks/beam-ml/gemma_2_sentiment_and_summarization.ipynb create mode 100644 examples/notebooks/beam-ml/rag_usecase/beam_rag_notebook.ipynb create mode 100644 examples/notebooks/beam-ml/rag_usecase/chunks_generation.py create mode 100644 examples/notebooks/beam-ml/rag_usecase/redis_connector.py create mode 100644 examples/notebooks/beam-ml/rag_usecase/redis_enrichment.py create mode 100644 examples/notebooks/blog/unittests_in_beam.ipynb create mode 100644 runners/core-java/src/main/java/org/apache/beam/runners/core/metrics/StringSetCell.java create mode 100644 runners/core-java/src/main/java/org/apache/beam/runners/core/metrics/StringSetData.java create mode 100644 runners/core-java/src/test/java/org/apache/beam/runners/core/metrics/StringSetCellTest.java create mode 100644 runners/core-java/src/test/java/org/apache/beam/runners/core/metrics/StringSetDataTest.java delete mode 100644 runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/MetricTrackingWindmillServerStub.java create mode 100644 runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/OperationalLimits.java copy runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/{WorkItemCancelledException.java => OutputTooLargeException.java} (63%) copy runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/{windmill/work/budget/GetWorkBudgetDistributor.java => streaming/RefreshableWork.java} (58%) rename runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/{windmill/client/grpc/StreamingEngineClient.java => streaming/harness/FanOutStreamingEngineWorkerHarness.java} (86%) create mode 100644 runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/streaming/harness/SingleSourceWorkerHarness.java rename runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/{windmill/client/grpc => streaming/harness}/StreamingEngineConnectionState.java (97%) rename sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/changestreams/dofn/SerializableSupplier.java => runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/streaming/harness/StreamingWorkerHarness.java (71%) rename runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/{windmill/client/grpc => streaming/harness}/WindmillStreamSender.java (81%) create mode 100644 runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/streaming/sideinput/SideInputStateFetcherFactory.java copy sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Factory.java => runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/ApplianceWindmillClient.java (51%) create mode 100644 runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/StreamingEngineWindmillClient.java create mode 100644 runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/client/getdata/ApplianceGetDataClient.java create mode 100644 runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/client/getdata/GetDataClient.java create mode 100644 runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/client/getdata/StreamGetDataClient.java create mode 100644 runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/client/getdata/StreamPoolGetDataClient.java create mode 100644 runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/client/getdata/ThrottlingGetDataMetricTracker.java create mode 100644 runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/client/grpc/GetWorkResponseChunkAssembler.java copy sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Factory.java => runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/client/grpc/observers/StreamObserverCancelledException.java (71%) copy sdks/java/io/pulsar/src/main/java/org/apache/beam/sdk/io/pulsar/package-info.java => runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/state/AbstractWindmillMap.java (78%) create mode 100644 runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/state/WindmillMapViaMultimap.java copy sdks/java/io/google-ads/src/main/java/org/apache/beam/sdk/io/googleads/GoogleAdsIO.java => runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/work/budget/GetWorkBudgetSpender.java (65%) delete mode 100644 runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/work/refresh/ActiveWorkRefreshers.java create mode 100644 runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/work/refresh/ApplianceHeartbeatSender.java delete mode 100644 runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/work/refresh/DispatchedActiveWorkRefresher.java create mode 100644 runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/work/refresh/FixedStreamHeartbeatSender.java copy sdks/java/io/rrio/src/main/java/org/apache/beam/io/requestresponse/SerializableSupplier.java => runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/work/refresh/HeartbeatSender.java (66%) create mode 100644 runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/work/refresh/Heartbeats.java create mode 100644 runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/work/refresh/StreamPoolHeartbeatSender.java rename runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/{windmill/client/grpc/StreamingEngineClientTest.java => streaming/harness/FanOutStreamingEngineWorkerHarnessTest.java} (90%) rename runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/{windmill/client/grpc => streaming/harness}/WindmillStreamSenderTest.java (81%) copy runners/google-cloud-dataflow-java/worker/src/{main/java/org/apache/beam/runners/dataflow/worker/streaming/ExecutableWork.java => test/java/org/apache/beam/runners/dataflow/worker/windmill/client/getdata/FakeGetDataClient.java} (55%) create mode 100644 runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/windmill/client/getdata/ThrottlingGetDataMetricTrackerTest.java rename runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/windmill/work/refresh/{DispatchedActiveWorkRefresherTest.java => ActiveWorkRefresherTest.java} (75%) create mode 100644 runners/jet/src/main/java/org/apache/beam/runners/jet/metrics/StringSetImpl.java create mode 100644 runners/prism/java/build.gradle create mode 100644 runners/prism/java/src/main/java/org/apache/beam/runners/prism/PrismArtifactResolver.java create mode 100644 runners/prism/java/src/main/java/org/apache/beam/runners/prism/PrismArtifactStager.java create mode 100644 runners/prism/java/src/main/java/org/apache/beam/runners/prism/PrismExecutor.java create mode 100644 runners/prism/java/src/main/java/org/apache/beam/runners/prism/PrismJobManager.java create mode 100644 runners/prism/java/src/main/java/org/apache/beam/runners/prism/PrismLocator.java create mode 100644 runners/prism/java/src/main/java/org/apache/beam/runners/prism/PrismPipelineOptions.java create mode 100644 runners/prism/java/src/main/java/org/apache/beam/runners/prism/PrismPipelineResult.java create mode 100644 runners/prism/java/src/main/java/org/apache/beam/runners/prism/PrismRunner.java copy runners/{google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/work/budget/GetWorkBudgetDistributor.java => prism/java/src/main/java/org/apache/beam/runners/prism/PrismRunnerRegistrar.java} (59%) copy sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Factory.java => runners/prism/java/src/main/java/org/apache/beam/runners/prism/StateListener.java (69%) create mode 100644 runners/prism/java/src/main/java/org/apache/beam/runners/prism/StateWatcher.java copy sdks/java/io/pulsar/src/main/java/org/apache/beam/sdk/io/pulsar/package-info.java => runners/prism/java/src/main/java/org/apache/beam/runners/prism/TestPrismPipelineOptions.java (73%) create mode 100644 runners/prism/java/src/main/java/org/apache/beam/runners/prism/TestPrismRunner.java create mode 100644 runners/prism/java/src/main/java/org/apache/beam/runners/prism/WorkerService.java copy {sdks/java/io/pulsar/src/main/java/org/apache/beam/sdk/io/pulsar => runners/prism/java/src/main/java/org/apache/beam/runners/prism}/package-info.java (88%) create mode 100644 runners/prism/java/src/test/java/org/apache/beam/runners/prism/PrismArtifactResolverTest.java create mode 100644 runners/prism/java/src/test/java/org/apache/beam/runners/prism/PrismArtifactStagerTest.java create mode 100644 runners/prism/java/src/test/java/org/apache/beam/runners/prism/PrismExecutorTest.java create mode 100644 runners/prism/java/src/test/java/org/apache/beam/runners/prism/PrismJobManagerTest.java create mode 100644 runners/prism/java/src/test/java/org/apache/beam/runners/prism/PrismLocatorTest.java create mode 100644 runners/prism/java/src/test/java/org/apache/beam/runners/prism/PrismRunnerTest.java create mode 100644 runners/prism/java/src/test/java/org/apache/beam/runners/prism/StateWatcherTest.java create mode 100644 runners/prism/java/src/test/java/org/apache/beam/runners/prism/WorkerServiceTest.java create mode 100644 sdks/java/core/src/main/java/org/apache/beam/sdk/metrics/Lineage.java copy sdks/java/{io/google-ads/src/main/java/org/apache/beam/sdk/io/googleads/GoogleAdsIO.java => core/src/main/java/org/apache/beam/sdk/metrics/StringSet.java} (65%) create mode 100644 sdks/java/core/src/main/java/org/apache/beam/sdk/metrics/StringSetResult.java create mode 100644 sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/GetterBasedSchemaProviderV2.java copy sdks/java/core/src/main/java/org/apache/beam/sdk/{schemas/Factory.java => testing/UsesStringSetMetrics.java} (75%) rename sdks/java/{io/rrio/src/main/java/org/apache/beam/io/requestresponse => core/src/main/java/org/apache/beam/sdk/util}/SerializableSupplier.java (96%) create mode 100644 sdks/java/core/src/test/java/org/apache/beam/sdk/io/TextSourceTest.java create mode 100644 sdks/java/core/src/test/java/org/apache/beam/sdk/metrics/LineageTest.java create mode 100644 sdks/java/core/src/test/java/org/apache/beam/sdk/metrics/StringSetResultTest.java create mode 100644 sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/SerializableComparatorTest.java create mode 100644 sdks/java/io/csv/src/main/java/org/apache/beam/sdk/io/csv/CsvIOParse.java create mode 100644 sdks/java/io/csv/src/main/java/org/apache/beam/sdk/io/csv/CsvIOParseKV.java create mode 100644 sdks/java/io/csv/src/main/java/org/apache/beam/sdk/io/csv/CsvIORecordToObjects.java create mode 100644 sdks/java/io/csv/src/main/java/org/apache/beam/sdk/io/csv/CsvIOStringToCsvRecord.java create mode 100644 sdks/java/io/csv/src/test/java/org/apache/beam/sdk/io/csv/CsvIOParseHelpersTest.java copy sdks/java/{core/src/main/java/org/apache/beam/sdk/schemas/Factory.java => io/csv/src/test/java/org/apache/beam/sdk/io/csv/CsvIOParseKVTest.java} (73%) create mode 100644 sdks/java/io/csv/src/test/java/org/apache/beam/sdk/io/csv/CsvIOParseTest.java create mode 100644 sdks/java/io/csv/src/test/java/org/apache/beam/sdk/io/csv/CsvIORecordToObjectsTest.java create mode 100644 sdks/java/io/csv/src/test/java/org/apache/beam/sdk/io/csv/CsvIOStringToCsvRecordTest.java create mode 100644 sdks/java/io/csv/src/test/java/org/apache/beam/sdk/io/csv/CsvIOTest.java rename sdks/java/io/google-ads/src/main/java/org/apache/beam/sdk/io/googleads/{GoogleAdsV14.java => GoogleAdsV17.java} (89%) rename sdks/java/io/google-ads/src/test/java/org/apache/beam/sdk/io/googleads/{GoogleAdsV14Test.java => GoogleAdsV17Test.java} (90%) create mode 100644 sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerQuerySourceDef.java create mode 100644 sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerReadSchemaTransformProvider.java copy sdks/java/{core/src/main/java/org/apache/beam/sdk/schemas/Factory.java => io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerSchemaRetrievalException.java} (69%) copy sdks/java/{core/src/main/java/org/apache/beam/sdk/transforms/SerializableComparator.java => io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerSourceDef.java} (62%) create mode 100644 sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerTableSourceDef.java create mode 100644 sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableSharedClientTest.java create mode 100644 sdks/java/io/iceberg/hive/build.gradle create mode 100644 sdks/java/io/iceberg/hive/exec/build.gradle create mode 100644 sdks/java/io/iceberg/hive/src/test/java/org/apache/beam/sdk/io/iceberg/hive/IcebergHiveCatalogIT.java create mode 100644 sdks/java/io/iceberg/hive/src/test/java/org/apache/beam/sdk/io/iceberg/hive/testutils/HiveMetastoreExtension.java create mode 100644 sdks/java/io/iceberg/hive/src/test/java/org/apache/beam/sdk/io/iceberg/hive/testutils/ScriptRunner.java create mode 100644 sdks/java/io/iceberg/hive/src/test/java/org/apache/beam/sdk/io/iceberg/hive/testutils/TestHiveMetastore.java create mode 100644 sdks/java/io/iceberg/hive/src/test/resources/hive-schema-3.1.0.derby.sql delete mode 100644 sdks/java/io/iceberg/src/main/java/org/apache/beam/sdk/io/iceberg/IcebergSchemaTransformCatalogConfig.java create mode 100644 sdks/java/io/iceberg/src/main/java/org/apache/beam/sdk/io/iceberg/IcebergUtils.java create mode 100644 sdks/java/io/iceberg/src/main/java/org/apache/beam/sdk/io/iceberg/RecordWriterManager.java delete mode 100644 sdks/java/io/iceberg/src/main/java/org/apache/beam/sdk/io/iceberg/SchemaAndRowConversions.java create mode 100644 sdks/java/io/iceberg/src/main/java/org/apache/beam/sdk/io/iceberg/SchemaTransformConfiguration.java create mode 100644 sdks/java/io/iceberg/src/test/java/org/apache/beam/sdk/io/iceberg/IcebergUtilsTest.java create mode 100644 sdks/java/io/iceberg/src/test/java/org/apache/beam/sdk/io/iceberg/RecordWriterManagerTest.java delete mode 100644 sdks/java/io/iceberg/src/test/java/org/apache/beam/sdk/io/iceberg/SchemaAndRowConversionsTest.java create mode 100644 sdks/java/io/solace/src/main/java/org/apache/beam/sdk/io/solace/broker/BasicAuthSempClient.java create mode 100644 sdks/java/io/solace/src/main/java/org/apache/beam/sdk/io/solace/broker/BasicAuthSempClientFactory.java create mode 100644 sdks/java/io/solace/src/main/java/org/apache/beam/sdk/io/solace/broker/BrokerResponse.java create mode 100644 sdks/java/io/solace/src/main/java/org/apache/beam/sdk/io/solace/broker/GCPSecretSessionServiceFactory.java create mode 100644 sdks/java/io/solace/src/main/java/org/apache/beam/sdk/io/solace/broker/SempBasicAuthClientExecutor.java create mode 100644 sdks/java/io/solace/src/main/java/org/apache/beam/sdk/io/solace/data/Semp.java create mode 100644 sdks/java/io/solace/src/main/java/org/apache/beam/sdk/io/solace/write/SolaceOutput.java copy sdks/java/io/{pulsar/src/main/java/org/apache/beam/sdk/io/pulsar => solace/src/main/java/org/apache/beam/sdk/io/solace/write}/package-info.java (88%) create mode 100644 sdks/java/io/solace/src/test/java/org/apache/beam/sdk/io/solace/broker/BasicAuthWriterSessionTest.java create mode 100644 sdks/java/io/solace/src/test/java/org/apache/beam/sdk/io/solace/broker/OverrideWriterPropertiesTest.java create mode 100644 sdks/java/io/solace/src/test/java/org/apache/beam/sdk/io/solace/broker/SempBasicAuthClientExecutorTest.java create mode 100644 sdks/java/io/solace/src/test/java/org/apache/beam/sdk/io/solace/it/SolaceContainerManager.java create mode 100644 sdks/java/io/solace/src/test/java/org/apache/beam/sdk/io/solace/it/SolaceIOIT.java create mode 100644 sdks/python/apache_beam/io/gcp/spanner_wrapper.py create mode 100644 sdks/python/apache_beam/runners/interactive/non_interactive_runner_test.py create mode 100644 sdks/python/apache_beam/transforms/error_handling.py create mode 100644 sdks/python/apache_beam/transforms/error_handling_test.py create mode 100644 sdks/python/apache_beam/yaml/tests/spanner.yaml copy sdks/python/apache_beam/{version.py => yaml/tests/tsv.yaml} (50%) create mode 100644 website/www/site/content/en/blog/beam-2.58.0.md create mode 100644 website/www/site/content/en/blog/beam-2.58.1.md create mode 100644 website/www/site/content/en/documentation/patterns/batch-elements.md create mode 100644 website/www/site/content/en/documentation/patterns/shared-class.md create mode 100644 website/www/site/content/en/documentation/sdks/yaml-join.md