rdblue commented on a change in pull request #3292:
URL: https://github.com/apache/iceberg/pull/3292#discussion_r780817433



##########
File path: core/src/main/java/org/apache/iceberg/BaseFileScanTask.java
##########
@@ -224,5 +217,51 @@ public Expression residual() {
     public Iterable<FileScanTask> split(long splitSize) {
       throw new UnsupportedOperationException("Cannot split a task which is already split");
     }
+
+    public boolean isAdjacent(SplitScanTask other) {
+      return (other != null) &&
+          (this.file().equals(other.file())) &&
+          (this.offset + this.len == other.offset);
+    }
+  }
+
+  static FileScanTask[] combineSimilarTasks(List<FileScanTask> tasks) {
+    if (tasks.isEmpty()) {
+      return new FileScanTask[0];
+    }
+
+    List<FileScanTask> combinedScans = Lists.newArrayList();
+    SplitScanTask lastSplit = null;
+
+    for (FileScanTask fileScanTask : tasks) {
+      if (!(fileScanTask instanceof SplitScanTask)) {
+        // We do not know how to combine anything but SplitScanTasks

Review comment:
       Nit: personal pronouns don't usually add much to comments or 
documentation. It's unclear who "we" is and you can usually be more direct. 
Here I think it is sufficient to say "skip tasks that aren't SplitScanTask 
produced by split".

##########
File path: core/src/main/java/org/apache/iceberg/BaseFileScanTask.java
##########
@@ -224,5 +217,51 @@ public Expression residual() {
     public Iterable<FileScanTask> split(long splitSize) {
       throw new UnsupportedOperationException("Cannot split a task which is already split");
     }
+
+    public boolean isAdjacent(SplitScanTask other) {
+      return (other != null) &&
+          (this.file().equals(other.file())) &&
+          (this.offset + this.len == other.offset);
+    }
+  }
+
+  static FileScanTask[] combineSimilarTasks(List<FileScanTask> tasks) {
+    if (tasks.isEmpty()) {
+      return new FileScanTask[0];
+    }
+
+    List<FileScanTask> combinedScans = Lists.newArrayList();
+    SplitScanTask lastSplit = null;
+
+    for (FileScanTask fileScanTask : tasks) {
+      if (!(fileScanTask instanceof SplitScanTask)) {
+        // We do not know how to combine anything but SplitScanTasks

Review comment:
       Nit: personal pronouns don't usually add much to comments or 
documentation. It's unclear who "we" is and you can usually be more direct. 
Here I think it is sufficient to say "pass through tasks that aren't 
SplitScanTask produced by split".
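
For illustration only, here is a standalone sketch of the merge loop with the comment worded as suggested above. It is not the code from this PR: `Task`, `Split`, `WholeFile`, and `combine` are hypothetical stand-ins for `FileScanTask`, `SplitScanTask`, and `combineSimilarTasks`, and the adjacency check simply mirrors the `isAdjacent` method shown in the diff.

```java
import java.util.ArrayList;
import java.util.List;

final class AdjacentSplits {
  // Hypothetical stand-in for FileScanTask.
  interface Task {}

  // Hypothetical stand-in for a task that is not a split.
  static final class WholeFile implements Task {
    final String file;

    WholeFile(String file) {
      this.file = file;
    }
  }

  // Hypothetical stand-in for SplitScanTask: a byte range within one file.
  static final class Split implements Task {
    final String file;
    final long offset;
    final long len;

    Split(String file, long offset, long len) {
      this.file = file;
      this.offset = offset;
      this.len = len;
    }

    // Mirrors isAdjacent in the diff: same file, and this range ends where other starts.
    boolean isAdjacent(Split other) {
      return other != null && file.equals(other.file) && offset + len == other.offset;
    }
  }

  static List<Task> combine(List<Task> tasks) {
    List<Task> combined = new ArrayList<>();
    Split last = null;

    for (Task task : tasks) {
      if (!(task instanceof Split)) {
        // pass through tasks that aren't Split, after emitting any pending range
        if (last != null) {
          combined.add(last);
          last = null;
        }
        combined.add(task);
        continue;
      }

      Split split = (Split) task;
      if (last != null && last.isAdjacent(split)) {
        // extend the pending range instead of emitting a second task
        last = new Split(last.file, last.offset, last.len + split.len);
      } else {
        if (last != null) {
          combined.add(last);
        }
        last = split;
      }
    }

    if (last != null) {
      combined.add(last);
    }
    return combined;
  }
}
```

In this sketch the comment names the action taken ("pass through") rather than who takes it, which is the point of the nit.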




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


