Strikerrx01 commented on code in PR #34135:
URL: https://github.com/apache/beam/pull/34135#discussion_r1979334359


##########
sdks/python/apache_beam/io/gcp/bigquery_tools.py:
##########
@@ -386,8 +390,28 @@ def __init__(self, client=None, temp_dataset_id=None, temp_table_ref=None):
       self._temporary_table_suffix = uuid.uuid4().hex
       self.temp_dataset_id = temp_dataset_id or self._get_temp_dataset()
 
+    # Initialize table definition cache with default TTL of 1 hour
+    # Cache entries are invalidated after TTL expires to ensure fresh metadata
+    self._table_cache = {}

Review Comment:
   @stankiewicz Thanks for catching these important points about cache 
management. You're right - we should:
   
   1. Implement proper cache invalidation with TTL expiration
   2. Add a size limit to prevent unbounded growth
   3. Use LRU (Least Recently Used) strategy for cache eviction
   
   I'll use `cachetools` as you suggested. `cachetools.TTLCache` covers all 
three points:
   - Entries expire automatically once their TTL elapses
   - `maxsize` caps the number of cached entries
   - When the cache is full and no expired entries remain, the least recently 
used entry is evicted first
   
   Does implementing this with `cachetools.TTLCache`, with both size and TTL 
limits, sound good to you?
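
To make the intended behavior concrete, here is a rough, stdlib-only sketch of the semantics I have in mind (size bound, TTL expiry, LRU eviction). The class name `TtlLruCache`, its `get`/`put` methods, the injected `timer`, and the table keys are all hypothetical and only for illustration; the actual change would rely on `cachetools.TTLCache` rather than a hand-rolled class.

```python
import time
from collections import OrderedDict


class TtlLruCache:
    """Illustrative only: a bounded cache with per-entry TTL and LRU eviction."""

    def __init__(self, maxsize, ttl, timer=time.monotonic):
        if maxsize < 1:
            raise ValueError("maxsize must be at least 1")
        self._maxsize = maxsize
        self._ttl = ttl
        self._timer = timer
        # key -> (expires_at, value); order runs from least to most recently used.
        self._data = OrderedDict()

    def __len__(self):
        return len(self._data)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._timer() >= expires_at:
            # Stale metadata: drop it so the caller fetches a fresh copy.
            del self._data[key]
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        now = self._timer()
        # Purge expired entries first so live ones are not evicted needlessly.
        for stale in [k for k, (exp, _) in self._data.items() if now >= exp]:
            del self._data[stale]
        if key in self._data:
            del self._data[key]
        elif len(self._data) >= self._maxsize:
            self._data.popitem(last=False)  # evict the least recently used entry
        self._data[key] = (now + self._ttl, value)


# Usage with a fake clock so expiry can be shown without sleeping.
clock = [0.0]
cache = TtlLruCache(maxsize=2, ttl=3600, timer=lambda: clock[0])
cache.put("project.dataset.a", {"fields": ["id"]})
cache.put("project.dataset.b", {"fields": ["name"]})
cache.get("project.dataset.a")  # touch "a", so "b" becomes least recently used
cache.put("project.dataset.c", {"fields": ["ts"]})  # cache is full: "b" is evicted
```

The injected timer is there only to keep the sketch testable; a 1 hour TTL matches the default mentioned in the diff comment, and the size limit is a placeholder until we agree on a value.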



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
