dejii commented on PR #38149: URL: https://github.com/apache/beam/pull/38149#issuecomment-4233075506
@ahmedabu98 following up on #37782: that fix correctly moved the FileIO close from `RecordWriter` to `RecordWriterManager`, but there is a deeper issue that only manifests under high write volume to dynamic destinations (many bundles per worker).

The root cause: the catalog is a `@MonotonicNonNull` field on the DoFn and is reused across all bundles processed by the same instance. `RecordWriterManager.close()` is called once per bundle (from `@FinishBundle`), so closing FileIO there, even deduplicated, shuts down the catalog's shared connection pool for all subsequent bundles on that DoFn.

This PR removes the FileIO close from `RecordWriterManager` entirely and adds `@Teardown` to all four IcebergIO write DoFns, so the catalog (and its FileIO) is closed exactly once, when the DoFn instance is destroyed. Would appreciate your review here.
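To make the lifecycle problem concrete, here is a minimal, Beam-free sketch of the failure mode described above. It is not the code in this PR: `SketchCatalog`, `SketchWriteFn`, and `LifecycleDemo` are hypothetical stand-ins, and the plain methods `setup()`, `processBundle()`, and `teardown()` only mimic where Beam's `@Setup`, `@FinishBundle`, and `@Teardown` hooks would run on a single DoFn instance.

```java
// Hypothetical driver: simulates one DoFn instance processing several bundles.
class LifecycleDemo {
    /** Returns how many bundles completed before the shared catalog became unusable. */
    static int runBundles(boolean closePerBundle, int bundles) {
        SketchWriteFn fn = new SketchWriteFn(closePerBundle);
        fn.setup();
        int completed = 0;
        try {
            for (int i = 0; i < bundles; i++) {
                try {
                    fn.processBundle();
                    completed++;
                } catch (IllegalStateException e) {
                    break; // the catalog was already closed by an earlier bundle
                }
            }
        } finally {
            fn.teardown();
        }
        return completed;
    }

    public static void main(String[] args) {
        System.out.println("close per bundle:  " + runBundles(true, 3));
        System.out.println("close at teardown: " + runBundles(false, 3));
    }
}

// Stand-in (assumption) for a catalog whose FileIO owns a shared connection pool.
class SketchCatalog implements AutoCloseable {
    private boolean closed = false;

    void write(String record) {
        if (closed) {
            throw new IllegalStateException("connection pool shut down");
        }
        // A real catalog would hand the record to its FileIO here.
    }

    @Override
    public void close() {
        closed = true;
    }
}

// Stand-in (assumption) for a write DoFn that lazily creates and reuses its catalog.
class SketchWriteFn {
    private final boolean closePerBundle;
    private SketchCatalog catalog; // created once, reused across bundles

    SketchWriteFn(boolean closePerBundle) {
        this.closePerBundle = closePerBundle;
    }

    // Mimics @Setup: create the catalog only if this instance has none yet.
    void setup() {
        if (catalog == null) {
            catalog = new SketchCatalog();
        }
    }

    // Mimics one bundle, ending where @FinishBundle would run.
    void processBundle() {
        catalog.write("row");
        if (closePerBundle) {
            catalog.close(); // problematic: the next bundle reuses a closed catalog
        }
    }

    // Mimics @Teardown: close exactly once, when the instance is destroyed.
    void teardown() {
        if (catalog != null) {
            catalog.close();
            catalog = null;
        }
    }
}
```

With three simulated bundles, closing per bundle lets only the first bundle succeed, while closing in `teardown()` lets all three complete. The sketch deliberately ignores deduplication and per-destination writers; it only illustrates why a resource owned by the DoFn instance should be released at instance scope rather than bundle scope.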
