findepi commented on code in PR #10691:
URL: https://github.com/apache/iceberg/pull/10691#discussion_r1677648977
##########
core/src/main/java/org/apache/iceberg/util/ParallelIterable.java:
##########
@@ -20,65 +20,69 @@
import java.io.Closeable;
import java.io.IOException;
+import java.util.ArrayDeque;
+import java.util.Deque;
import java.util.Iterator;
import java.util.NoSuchElementException;
+import java.util.Optional;
+import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
-import org.apache.iceberg.exceptions.RuntimeIOException;
+import java.util.concurrent.atomic.AtomicBoolean;
import org.apache.iceberg.io.CloseableGroup;
import org.apache.iceberg.io.CloseableIterable;
import org.apache.iceberg.io.CloseableIterator;
import org.apache.iceberg.relocated.com.google.common.base.Preconditions;
import org.apache.iceberg.relocated.com.google.common.collect.Iterables;
+import org.apache.iceberg.relocated.com.google.common.io.Closer;
public class ParallelIterable<T> extends CloseableGroup implements
CloseableIterable<T> {
+
+ private static final int DEFAULT_MAX_QUEUE_SIZE = 10_000;
Review Comment:
Good call. Admittedly, this value was not tuned. What value would be best
here?
Also, per
https://github.com/apache/iceberg/pull/10691#issuecomment-2225641596, if
yielding actually occurs, the ParallelIterator consumer isn't keeping up with
the incoming items, so the yield itself doesn't add much cost. However,
resuming is not instantaneous, so we pay the resume cost every 10k elements.
A low water mark would eliminate this: resume the producers before the queue
is exhausted.
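
To make the low water mark idea concrete, here is a minimal sketch. It is
not code from this PR: `WatermarkBuffer`, `highWaterMark`, and `lowWaterMark`
are hypothetical names, and a real implementation would resubmit the paused
task to the executor instead of just flipping a flag.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper (not part of Iceberg): decides when a producer yields and resumes. */
class WatermarkBuffer<T> {
  private final Deque<T> queue = new ArrayDeque<>();
  private final int highWaterMark; // producer yields once the queue reaches this size
  private final int lowWaterMark; // producer resumes once the queue drains to this size
  private boolean producerPaused = false;
  private int resumeCount = 0;

  WatermarkBuffer(int highWaterMark, int lowWaterMark) {
    if (lowWaterMark < 0 || lowWaterMark >= highWaterMark) {
      throw new IllegalArgumentException("require 0 <= lowWaterMark < highWaterMark");
    }
    this.highWaterMark = highWaterMark;
    this.lowWaterMark = lowWaterMark;
  }

  /** Adds an item and returns true if the producer should yield afterwards. */
  synchronized boolean offer(T item) {
    queue.addLast(item);
    if (queue.size() >= highWaterMark) {
      producerPaused = true;
    }
    return producerPaused;
  }

  /** Removes the next item (null if empty) and resumes the producer at the low water mark. */
  synchronized T poll() {
    T item = queue.pollFirst();
    if (producerPaused && queue.size() <= lowWaterMark) {
      // Assumption: a real implementation would resubmit the producer task here,
      // while the consumer still has lowWaterMark items left to work through.
      producerPaused = false;
      resumeCount++;
    }
    return item;
  }

  synchronized boolean isProducerPaused() {
    return producerPaused;
  }

  synchronized int resumeCount() {
    return resumeCount;
  }

  synchronized int size() {
    return queue.size();
  }
}
```

With a low water mark of 0 this degenerates to resuming only once the queue is
empty, which is the case where the consumer waits out the resume latency. Any
positive low water mark lets that latency overlap with draining the remaining
items.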
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]