liferoad opened a new pull request, #35049:
URL: https://github.com/apache/beam/pull/35049

   **Testing Jules**
   
   Fixes https://github.com/apache/beam/issues/24776
   
   This change addresses Apache Beam GitHub Issue #24776, where race conditions 
could occur during the collection of monitoring information in the Python SDK 
Harness, leading to errors such as:
   - SystemError: returned NULL without setting an error
   - RuntimeError: dictionary changed size during iteration
   - AttributeError: 'bytes' object has no attribute 'payload'
   - ValueError: non-UTF-8 strings
   
   The primary cause was concurrent access to metric data structures 
(specifically `MetricsContainer` and its underlying `MetricCell`s) by the DoFn 
execution thread (updating metrics) and the thread responsible for reporting 
bundle progress.
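
One of the listed errors can be illustrated without threads at all. The sketch below is not Beam code: `cells` is a hypothetical stand-in for a container's name-to-cell dictionary, and the insert inside the loop plays the role of the DoFn thread creating a new metric while the reporting thread is iterating.

```python
def iterate_while_inserting():
    # Hypothetical stand-in for a container's name -> cell mapping.
    cells = {"a": 1, "b": 2}
    try:
        for name in cells:
            # Simulates another thread registering a new metric mid-iteration.
            cells[name + "_new"] = 0
    except RuntimeError as exc:
        return str(exc)
    return None


message = iterate_while_inserting()
print(message)
```

In the real race the insert comes from a different thread, so the failure is intermittent rather than deterministic, but the underlying mechanism is the same.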
   
   The fix introduces the following:
   1.  A `threading.Lock` is added to the `MetricsContainer` class. This lock 
is acquired before any access or modification of the internal dictionaries that 
store metric cells (`self.counters`, `self.distributions`, `self.gauges`). This 
protection is applied during metric cell retrieval/creation (`get_metric_cell`) 
and when all monitoring information is collected for reporting 
(`to_runner_api_monitoring_infos`).
   2.  The `MetricsContainer`'s lock is passed to individual `MetricCell` 
instances (`CounterCell`, `DistributionCell`, `GaugeCell`) upon their creation.
   3.  Metric update methods within `CounterCell`, `DistributionCell`, and 
`GaugeCell` (e.g., `update()`, `set()`, `add_data()`) now acquire this 
container-level lock before modifying their internal state. This ensures that 
updates are atomic with respect to the collection process in 
`MetricsContainer.to_runner_api_monitoring_infos`.
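
The three steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this pull request: `SketchCounterCell`, `SketchMetricsContainer`, and `snapshot()` are hypothetical names, with `snapshot()` standing in for `to_runner_api_monitoring_infos`.

```python
import threading


class SketchCounterCell:
    """Hypothetical simplified cell; not Beam's CounterCell."""

    def __init__(self, lock):
        self._lock = lock  # container-level lock shared by every cell
        self.value = 0

    def update(self, n):
        # Step 3: updates take the container's lock.
        with self._lock:
            self.value += n


class SketchMetricsContainer:
    """Hypothetical simplified container; not Beam's MetricsContainer."""

    def __init__(self):
        self._lock = threading.Lock()  # step 1
        self.counters = {}

    def get_metric_cell(self, name):
        with self._lock:
            cell = self.counters.get(name)
            if cell is None:
                # Step 2: pass the container's lock to the new cell.
                cell = self.counters[name] = SketchCounterCell(self._lock)
            return cell

    def snapshot(self):
        # Stand-in for to_runner_api_monitoring_infos.
        with self._lock:
            # Read cell state directly: threading.Lock is not reentrant, so
            # calling a cell method that takes the same lock would deadlock.
            return {name: cell.value for name, cell in self.counters.items()}


container = SketchMetricsContainer()


def worker(i):
    for j in range(1000):
        container.get_metric_cell(f"c{i}_{j % 50}").update(1)


threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
# Collect snapshots while the workers are still creating and updating cells.
snapshots = [container.snapshot() for _ in range(100)]
for t in threads:
    t.join()
final = container.snapshot()
```

One design point the sketch makes visible: because the cells and the container share a single non-reentrant lock, the collection path must not call a locked cell method while it already holds that lock. Whether the actual change avoids this by reading state directly, by using a reentrant lock, or by some other means is not stated in the description.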
   
   These changes ensure that metric data is read and updated in a thread-safe 
manner, preventing the previously observed errors caused by concurrent access 
and modification of shared metric state.
   

