[
https://issues.apache.org/jira/browse/BEAM-11075?focusedWorklogId=521815&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-521815
]
ASF GitHub Bot logged work on BEAM-11075:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 08/Dec/20 19:04
Start Date: 08/Dec/20 19:04
Worklog Time Spent: 10m
Work Description: lostluck edited a comment on pull request #13436:
URL: https://github.com/apache/beam/pull/13436#issuecomment-740863368
WRT the investigation we've been doing into the Flink failure with large
settings:
In my own investigation of the problem on the Google-internal runner, the
"big" configuration produces a single ~10GB StateResponse, which is too large
for protocol buffer serialization and causes the failure. Protos have a hard
cap of 2GB in serialized size.
From the #beam-go Slack discussion and other research, Java and Python also
fail with large configurations, since paging through large side inputs is not
yet implemented anywhere.
`--input_options='{"num_records": 2000000, "key_size":100, "value_size":900}' --access_percentage=1`
and
`--input_options='{"num_records": 10000000, "key_size":100, "value_size":180}' --access_percentage=1`
both work, because they cut the total data down to 1/5th.
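For a rough sense of scale, the raw size of each working configuration can be estimated as `num_records * (key_size + value_size)`. The sketch below is only a back-of-the-envelope calculation: the helper name `raw_payload_bytes` is hypothetical, the ~10GB figure is the approximate StateResponse size quoted above, and the estimate ignores encoding overhead and whatever the runner actually serializes into a single response.

```python
import json

# Approximate size of the failing "big" StateResponse quoted in the comment.
# This is an assumption taken from the "~10GB" figure, not a measured value.
BIG_RESPONSE_BYTES = 10 * 10**9


def raw_payload_bytes(input_options: str) -> int:
    """Estimate raw key+value bytes for a synthetic-source configuration.

    Hypothetical helper: multiplies the record count by the per-record
    key and value sizes, ignoring any encoding or framing overhead.
    """
    opts = json.loads(input_options)
    return opts["num_records"] * (opts["key_size"] + opts["value_size"])


# The two --input_options values reported as working.
configs = [
    '{"num_records": 2000000, "key_size":100, "value_size":900}',
    '{"num_records": 10000000, "key_size":100, "value_size":180}',
]

for cfg in configs:
    size = raw_payload_bytes(cfg)
    fraction = size / BIG_RESPONSE_BYTES
    print(f"{size / 10**9:.1f} GB raw, {fraction:.2f} of the ~10 GB response")
```

By this estimate the first configuration holds about 2.0 GB of raw key and value bytes and the second about 2.8 GB. How that maps onto the serialized size of any single proto message depends on the runner and the coder, which this sketch does not model.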
This will affect all portable runners, but not any of the legacy ones,
because the legacy runners don't pass data around through protos. A JIRA is
being filed about it.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 521815)
Time Spent: 15.5h (was: 15h 20m)
> Load Tests for Go SDK
> ---------------------
>
> Key: BEAM-11075
> URL: https://issues.apache.org/jira/browse/BEAM-11075
> Project: Beam
> Issue Type: Test
> Components: sdk-go, testing
> Reporter: Kamil Wasilewski
> Assignee: Kamil Wasilewski
> Priority: P3
> Time Spent: 15.5h
> Remaining Estimate: 0h
>
> We have Load Tests for Python and Java SDKs[1], but we are missing the ones
> for Go SDK.
> Tests to be done:
> * ParDo
> * Combine
> * coGBK
> * GBK
> * Side Input
> The tests should run on Dataflow and Flink. The tests should be using
> synthetic source and be running in batch mode.
> [1]
> [http://metrics.beam.apache.org/dashboards/f/OtXje1iGz/performance-tests-metrics]