dramaticlly commented on code in PR #15426:
URL: https://github.com/apache/iceberg/pull/15426#discussion_r2849250788
##########
core/src/main/java/org/apache/iceberg/ManifestGroup.java:
##########
@@ -222,13 +222,23 @@ public <T extends ScanTask> CloseableIterable<T>
plan(CreateTasksFunction<T> cre
/**
* Returns an iterable for manifest entries in the set of manifests.
*
- * <p>Entries are not copied and it is the caller's responsibility to make
defensive copies if
- * adding these entries to a collection.
+ * <p>When parallel execution is enabled via {@link
#planWith(ExecutorService)}, entries are
+ * defensively copied since the underlying {@link ManifestReader} reuses
entry objects during
+ * iteration. Otherwise, entries are not copied and it is the caller's
responsibility to make
+ * defensive copies if adding these entries to a collection.
*
* @return a CloseableIterable of manifest entries.
*/
public CloseableIterable<ManifestEntry<DataFile>> entries() {
- return CloseableIterable.concat(entries((manifest, entries) -> entries));
+ if (executorService != null) {
+ // copy entries to avoid object reuse issues when scanning manifests in
parallel,
+ // as ManifestReader reuses entry objects during iteration
+ Iterable<CloseableIterable<ManifestEntry<DataFile>>> entryIterables =
+ entries((manifest, entries) -> CloseableIterable.transform(entries,
ManifestEntry::copy));
Review Comment:
Hey Steven, I think this is due to we reuse containers to read manifest
entry using ManifestReader, mentioned in comment above as well
https://github.com/apache/iceberg/pull/15426#issuecomment-3948664727.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]