[GitHub] [iceberg] rdblue commented on a change in pull request #3400: Spark: Support dynamic partition filtering in 3.2

GitBox Sun, 31 Oct 2021 15:47:53 -0700


rdblue commented on a change in pull request #3400:
URL: https://github.com/apache/iceberg/pull/3400#discussion_r739887211




##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkBatchQueryScan.java
##########
@@ -74,17 +100,31 @@
       throw new IllegalArgumentException("Cannot only specify option end-snapshot-id to do incremental scan");
     }
 
-    // look for split behavior overrides in options
-    this.splitSize = Spark3Util.propertyAsLong(options, SparkReadOptions.SPLIT_SIZE, null);
-    this.splitLookback = Spark3Util.propertyAsInt(options, SparkReadOptions.LOOKBACK, null);
-    this.splitOpenFileCost = Spark3Util.propertyAsLong(options, SparkReadOptions.FILE_OPEN_COST, null);
+    this.splitSize = table instanceof BaseMetadataTable ? readConf.metadataSplitSize() : readConf.splitSize();

Review comment:
Better idea: I think that `TableScan` should report these settings instead of
forcing us to pass them in. That way you can configure the scan with the read
options if they are present, then get the scan's split settings without needing
to forward/rewrite settings through `properties` or go around the API.

That fits with a refactor I want to get to that removes all of the settings
from here anyway and creates the `TableScan` in the `ScanBuilder`. There's no
need to create a scan in more than one place.
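
A minimal sketch of the shape this suggestion points at, not the actual Iceberg API. `SplitAwareScan`, `SimpleScan`, and `configure` are hypothetical names invented for illustration, and the property keys and default values are assumptions for this sketch. The caller applies only the read options that are present, then asks the scan what it will use:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these types are Iceberg classes.
class ScanSplitSettingsSketch {

  // Illustrative keys, assumed for this sketch only.
  static final String SPLIT_SIZE = "read.split.target-size";
  static final String LOOKBACK = "read.split.planning-lookback";
  static final String OPEN_FILE_COST = "read.split.open-file-cost";

  /** Stand-in for a scan that reports the split settings it will plan with. */
  interface SplitAwareScan {
    SplitAwareScan option(String key, String value);
    long targetSplitSize();
    int splitLookback();
    long splitOpenFileCost();
  }

  /** Immutable scan: per-scan options win over table properties, then defaults. */
  static final class SimpleScan implements SplitAwareScan {
    private final Map<String, String> tableProps;
    private final Map<String, String> options;

    SimpleScan(Map<String, String> tableProps) {
      this(tableProps, new HashMap<>());
    }

    private SimpleScan(Map<String, String> tableProps, Map<String, String> options) {
      this.tableProps = tableProps;
      this.options = options;
    }

    @Override
    public SplitAwareScan option(String key, String value) {
      // Copy the options so the original scan is left unchanged.
      Map<String, String> updated = new HashMap<>(options);
      updated.put(key, value);
      return new SimpleScan(tableProps, updated);
    }

    @Override
    public long targetSplitSize() {
      return Long.parseLong(resolve(SPLIT_SIZE, "134217728"));
    }

    @Override
    public int splitLookback() {
      return Integer.parseInt(resolve(LOOKBACK, "10"));
    }

    @Override
    public long splitOpenFileCost() {
      return Long.parseLong(resolve(OPEN_FILE_COST, "4194304"));
    }

    private String resolve(String key, String defaultValue) {
      String fromOptions = options.get(key);
      if (fromOptions != null) {
        return fromOptions;
      }
      return tableProps.getOrDefault(key, defaultValue);
    }
  }

  /** Applies only the read options that are present; the rest is left to the scan. */
  static SplitAwareScan configure(SplitAwareScan scan, Map<String, String> readOptions) {
    SplitAwareScan configured = scan;
    for (String key : new String[] {SPLIT_SIZE, LOOKBACK, OPEN_FILE_COST}) {
      String value = readOptions.get(key);
      if (value != null) {
        configured = configured.option(key, value);
      }
    }
    return configured;
  }

  public static void main(String[] args) {
    Map<String, String> tableProps = new HashMap<>();
    tableProps.put(SPLIT_SIZE, "67108864");
    Map<String, String> readOptions = new HashMap<>();
    readOptions.put(LOOKBACK, "5");

    // The caller never resolves defaults itself; it reads them back from the scan.
    SplitAwareScan scan = configure(new SimpleScan(tableProps), readOptions);
    System.out.println(scan.targetSplitSize());
    System.out.println(scan.splitLookback());
    System.out.println(scan.splitOpenFileCost());
  }
}
```

With this shape the Spark side would hold no copies of the split settings: the precedence between read options, table properties, and defaults lives in one place, and a scan built once in the builder can be queried wherever the values are needed.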



