[ https://issues.apache.org/jira/browse/BEAM-8825?focusedWorklogId=360398&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-360398 ]
ASF GitHub Bot logged work on BEAM-8825:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 16/Dec/19 17:44
            Start Date: 16/Dec/19 17:44
    Worklog Time Spent: 10m
      Work Description: udim commented on issue #10380: [BEAM-8825] Add limit on number of mutated rows to batching/sorting stages.
URL: https://github.com/apache/beam/pull/10380#issuecomment-566166634

   Will ignore Java_Examples_Dataflow (WindowedWordCountIT) test failure since this is a release branch

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 360398)
    Time Spent: 1h 40m  (was: 1.5h)

> OOM when writing large numbers of 'narrow' rows
> -----------------------------------------------
>
>                 Key: BEAM-8825
>                 URL: https://issues.apache.org/jira/browse/BEAM-8825
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-gcp
>    Affects Versions: 2.9.0, 2.10.0, 2.11.0, 2.12.0, 2.13.0, 2.14.0, 2.15.0,
>                      2.16.0, 2.17.0
>            Reporter: Niel Markwick
>            Assignee: Niel Markwick
>            Priority: Major
>             Fix For: 2.18.0
>
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> SpannerIO can OOM when writing large numbers of 'narrow' rows.
>
> SpannerIO puts input mutation elements into batches for efficient writing.
> These batches are limited by number of cells mutated, and size of data
> written (5000 cells, 1MB data). SpannerIO groups enough mutations to build
> 1000 of these groups (5M cells, 1GB data), then sorts and batches them.
> When the number of cells and size of data is very small (<5 cells, <100
> bytes), the memory overhead of storing millions of mutations for batching is
> significant, and can lead to OOMs.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
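
The arithmetic in the issue description above can be sketched as follows. This is only an illustration in Python, not the SpannerIO implementation (which is Java): the Mutation class, the function names, and the 500-row limit are hypothetical; only the 5000-cell, 1MB, and 1000-group figures come from the issue text.

```python
from dataclasses import dataclass

MAX_CELLS_PER_BATCH = 5000          # from the issue description
MAX_BYTES_PER_BATCH = 1024 * 1024   # 1MB, from the issue description
GROUPING_FACTOR = 1000              # batches gathered before sorting, from the issue
MAX_ROWS_PER_BATCH = 500            # hypothetical row limit, not taken from the source


@dataclass(frozen=True)
class Mutation:
    """Hypothetical stand-in for one mutated row."""
    cells: int
    size_bytes: int


def buffered_mutations(cells_per_row, bytes_per_row, max_rows=None):
    """Upper bound on mutations held in memory before sorting and batching."""
    per_batch = min(MAX_CELLS_PER_BATCH // cells_per_row,
                    MAX_BYTES_PER_BATCH // bytes_per_row)
    if max_rows is not None:
        per_batch = min(per_batch, max_rows)
    return per_batch * GROUPING_FACTOR


def batch(mutations, max_rows=None):
    """Greedily pack mutations into batches under the cell, byte, and row limits."""
    batches, current, cells, size = [], [], 0, 0
    for m in mutations:
        full = (cells + m.cells > MAX_CELLS_PER_BATCH
                or size + m.size_bytes > MAX_BYTES_PER_BATCH
                or (max_rows is not None and len(current) >= max_rows))
        if current and full:
            batches.append(current)
            current, cells, size = [], 0, 0
        current.append(m)
        cells += m.cells
        size += m.size_bytes
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    # 'Narrow' rows: 4 cells and 80 bytes each.
    print(buffered_mutations(4, 80))                      # cell/byte limits only
    print(buffered_mutations(4, 80, MAX_ROWS_PER_BATCH))  # with the assumed row limit
```

With only the cell and byte limits, narrow rows are bounded by the cell count, so the number of buffered mutation objects grows as rows get narrower. A per-batch row limit caps that number regardless of row width, which is the idea named in the pull request title.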