AbgarSim opened a new pull request, #37145: URL: https://github.com/apache/beam/pull/37145
[[BEAM-34076](https://issues.apache.org/jira/browse/BEAM-34076)] Added TTL-based caching for BigQuery table definitions

------------------------

### Description

This change implements a thread-safe caching mechanism for BigQuery table definitions in the BigQueryWrapper class to address issue [BEAM-34076](https://issues.apache.org/jira/browse/BEAM-34076). The implementation caches table metadata with a time-to-live (TTL) so that repeated lookups of the same table reuse the stored result, which reduces the number of BigQuery API calls.

### Changes in the codebase

The solution separates the table metadata lookup into two distinct responsibilities:

1. An uncached lookup method that performs the actual tables.get call and is protected by retry logic with exponential backoff.
2. A cached, thread-safe wrapper method that stores table metadata in a TTL cache and reuses it for subsequent requests.

Caching is implemented using the `cachetools` library.

The cached method is now the primary entry point for callers. If the requested table metadata is already present in the cache and has not expired, it is returned immediately. Otherwise, the uncached method is invoked, and the result is stored in the cache.

Cache configurations added:

- _cache maxsize_: 1024
- _ttl seconds_: 300 (5 minutes)

### Additional information

- This change is intentionally minimal and non-breaking.
- The cache reduces API traffic and improves performance in hot paths without altering existing behavior.
- Retry logic is isolated to the uncached method to avoid masking persistent errors while still handling transient failures gracefully.
- Thread safety is preserved to support concurrent access scenarios.
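For reviewers who want the shape of the cached/uncached split without opening the diff, here is a minimal sketch. It is not the code in this PR: the class name `TableMetadataCache`, the `fetch` callable, and the `clock` parameter are hypothetical, and a small standard-library TTL store stands in for `cachetools` so the snippet runs without extra dependencies. Only the two settings (maxsize 1024, TTL 300 seconds) come from the description above.

```python
import threading
import time
from collections import OrderedDict

CACHE_MAXSIZE = 1024      # value taken from the PR description
CACHE_TTL_SECONDS = 300   # value taken from the PR description


class TableMetadataCache:
    """Hypothetical illustration; not the actual BigQueryWrapper code."""

    def __init__(self, fetch, maxsize=CACHE_MAXSIZE, ttl=CACHE_TTL_SECONDS,
                 clock=time.monotonic):
        self._fetch = fetch      # the uncached lookup (assumed to retry internally)
        self._maxsize = maxsize
        self._ttl = ttl
        self._clock = clock      # injectable so expiry can be tested
        self._lock = threading.Lock()
        self._entries = OrderedDict()  # key -> (expires_at, value)

    def get(self, project, dataset, table):
        key = (project, dataset, table)
        now = self._clock()
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None and entry[0] > now:
                return entry[1]  # cache hit that has not expired
        # Miss or expired entry: call the uncached lookup outside the lock
        # so a slow API call does not block other threads.
        value = self._fetch(project, dataset, table)
        with self._lock:
            self._entries[key] = (self._clock() + self._ttl, value)
            self._entries.move_to_end(key)
            # Evict the oldest-written entries once the size limit is exceeded.
            while len(self._entries) > self._maxsize:
                self._entries.popitem(last=False)
        return value
```

In this sketch the lock guards only the dictionary, so two threads that miss on the same key at the same time may both call `fetch`; the result is still consistent, at the cost of one duplicate request. Whether the PR makes the same trade-off is not stated in the description.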
