[GitHub] [iceberg] danielcweeks commented on a diff in pull request #7731: Core: Implement adaptive split planning in core.

via GitHub Mon, 29 May 2023 10:12:32 -0700


danielcweeks commented on code in PR #7731:
URL: https://github.com/apache/iceberg/pull/7731#discussion_r1209471374



##########
core/src/main/java/org/apache/iceberg/util/TableScanUtil.java:
##########
@@ -35,16 +38,22 @@
 import org.apache.iceberg.ScanTaskGroup;
 import org.apache.iceberg.SplittableScanTask;
 import org.apache.iceberg.StructLike;
+import org.apache.iceberg.io.CloseableGroup;
 import org.apache.iceberg.io.CloseableIterable;
+import org.apache.iceberg.io.CloseableIterator;
 import org.apache.iceberg.relocated.com.google.common.base.Preconditions;
 import org.apache.iceberg.relocated.com.google.common.collect.FluentIterable;
 import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
 import org.apache.iceberg.relocated.com.google.common.collect.Iterables;
 import org.apache.iceberg.relocated.com.google.common.collect.Lists;
 import org.apache.iceberg.relocated.com.google.common.collect.Maps;
 import org.apache.iceberg.types.Types;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 public class TableScanUtil {
+  private static final Logger LOG = LoggerFactory.getLogger(TableScanUtil.class);
+  private static final long MIN_SPLIT_SIZE = 16 * 1024 * 1024; // 16 MB

Review Comment:
   This seems too large for a minimum split size. Alternatively, we should 
take the configured row group size into account. For example, if the row group 
size were set to 8 MB, we still wouldn't be able to achieve maximum parallelism 
because of this arbitrary 16 MB default.
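
   To make the concern concrete, here is a minimal sketch of one possible
approach, not the PR's implementation: cap the minimum split size at the
configured row group size. The class name and both helper methods are
hypothetical, and the sketch assumes the row group size is already available
as a byte count.

```java
public class MinSplitSizeSketch {
  // Mirrors the 16 MB constant from the diff above.
  static final long DEFAULT_MIN_SPLIT_SIZE = 16L * 1024 * 1024;

  // Hypothetical helper: never let the floor exceed the configured row group
  // size; fall back to the default when the row group size is unknown (<= 0).
  static long effectiveMinSplitSize(long rowGroupSizeBytes) {
    if (rowGroupSizeBytes <= 0) {
      return DEFAULT_MIN_SPLIT_SIZE;
    }
    return Math.min(DEFAULT_MIN_SPLIT_SIZE, rowGroupSizeBytes);
  }

  // Number of splits a file yields at a given split size (ceiling division).
  static long splitCount(long fileSizeBytes, long splitSizeBytes) {
    return (fileSizeBytes + splitSizeBytes - 1) / splitSizeBytes;
  }

  public static void main(String[] args) {
    long mb = 1024L * 1024;
    long fileSize = 64 * mb;     // example file size (assumption)
    long rowGroupSize = 8 * mb;  // the 8 MB row group from the comment

    // With the fixed 16 MB floor, the file yields 4 splits.
    System.out.println(splitCount(fileSize, DEFAULT_MIN_SPLIT_SIZE));
    // With the floor capped at the row group size, it yields 8 splits.
    System.out.println(splitCount(fileSize, effectiveMinSplitSize(rowGroupSize)));
  }
}
```

   With 8 MB row groups, a 64 MB file could be read as 8 independent splits,
but the fixed 16 MB floor limits it to 4.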



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

