tvalentyn opened a new issue, #31607:
URL: https://github.com/apache/beam/issues/31607

   ### What happened?
   
   The following error might occur in some pipelines, possibly non-deterministically:
   
   ```
   Exception serializing message!
   Traceback (most recent call last):
     File "/usr/local/lib/python3.10/site-packages/grpc/_common.py", line 89, in _transform
       return transformer(message)
   ValueError: Message org.apache.beam.model.fn_execution.v1.Elements exceeds maximum protobuf size of 2GB: 2887086320
   
   Traceback (most recent call last):
     File "/usr/local/lib/python3.10/site-packages/apache_beam/runners/worker/data_plane.py", line 700, in _read_inputs
       for elements in elements_iterator:
     File "/usr/local/lib/python3.10/site-packages/grpc/_channel.py", line 542, in __next__
       return self._next()
     File "/usr/local/lib/python3.10/site-packages/grpc/_channel.py", line 968, in _next
       raise self
   ```
   
   This issue is caused by large elements in a Beam pipeline. If you see this error, upgrade to Apache Beam 2.57.0 or later. Apache Beam 2.57.0 improves a codepath that could suboptimally combine multiple large elements together. It also adds better logging when large elements are detected. If you run the pipeline on 2.57.0 or later and failures persist, look for warnings like:
   
   ```
   Data output stream buffer size ... exceeds 536870912 bytes. This is likely due to a large element in a PCollection.
   ```
   
   or errors like:
   ```
   Buffer size ... exceeds GRPC limit 2147483548. This is likely due to a single element that is too large.
   ```
   
   If you see these warnings or errors, inspect the logs to find which pipeline step emits them, and consider reducing the size of the individual elements in the PCollections at those steps.
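
One way to reduce element size is to split a large grouped element into several smaller ones before it crosses the data channel. The following is a minimal sketch, not code from Beam itself: `approx_size` and `split_element` are hypothetical helper names, and `pickle` is used only as a rough size estimate, since the coder Beam actually applies to your elements may produce a different byte count.

```python
import pickle
from typing import Any, Iterable, Iterator, List, Tuple

# Warning threshold quoted in the log message above (512 MiB).
WARN_BYTES = 536870912


def approx_size(value: Any) -> int:
    """Rough size estimate (assumption: pickle size is close enough to the
    encoded size; the coder used by the pipeline may differ)."""
    return len(pickle.dumps(value))


def split_element(
    key: Any, values: Iterable[Any], max_bytes: int = WARN_BYTES
) -> Iterator[Tuple[Any, List[Any]]]:
    """Yield (key, batch) pairs so that each batch stays under max_bytes.

    A single value larger than max_bytes is still emitted alone; it has to
    be made smaller by other means (for example, by storing it externally).
    """
    batch: List[Any] = []
    batch_bytes = 0
    for value in values:
        size = approx_size(value)
        # Close the current batch before it would exceed the budget.
        if batch and batch_bytes + size > max_bytes:
            yield key, batch
            batch, batch_bytes = [], 0
        batch.append(value)
        batch_bytes += size
    if batch:
        yield key, batch
```

In a pipeline, a helper like this could be applied with something like `beam.FlatMap(lambda kv: split_element(kv[0], kv[1]))` on the step that the log messages point to. Downstream steps would then need to tolerate several records per key.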
   
   ### Issue Priority
   
   Priority: 2 (default / most bugs should be filed as P2)
   
   ### Issue Components
   
   - [X] Component: Python SDK
   - [ ] Component: Java SDK
   - [ ] Component: Go SDK
   - [ ] Component: Typescript SDK
   - [ ] Component: IO connector
   - [ ] Component: Beam YAML
   - [ ] Component: Beam examples
   - [ ] Component: Beam playground
   - [ ] Component: Beam katas
   - [ ] Component: Website
   - [ ] Component: Spark Runner
   - [ ] Component: Flink Runner
   - [ ] Component: Samza Runner
   - [ ] Component: Twister2 Runner
   - [ ] Component: Hazelcast Jet Runner
   - [ ] Component: Google Cloud Dataflow Runner


