rdblue commented on a change in pull request #3400:
URL: https://github.com/apache/iceberg/pull/3400#discussion_r739887211
##########
File path:
spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkBatchQueryScan.java
##########
@@ -74,17 +100,31 @@
throw new IllegalArgumentException("Cannot only specify option end-snapshot-id to do incremental scan");
}
- // look for split behavior overrides in options
- this.splitSize = Spark3Util.propertyAsLong(options, SparkReadOptions.SPLIT_SIZE, null);
- this.splitLookback = Spark3Util.propertyAsInt(options, SparkReadOptions.LOOKBACK, null);
- this.splitOpenFileCost = Spark3Util.propertyAsLong(options, SparkReadOptions.FILE_OPEN_COST, null);
+ this.splitSize = table instanceof BaseMetadataTable ? readConf.metadataSplitSize() : readConf.splitSize();
Review comment:
Better idea: I think that `TableScan` should report these settings
instead of forcing us to pass them in. That way you can configure the scan with
the read options if they are present, then read the split settings back from
the scan, without forwarding or rewriting settings through `properties` or
going around the API.
That fits with a refactor I want to get to that removes all of the settings
from here anyway and creates the `TableScan` in the `ScanBuilder`. There's no
need to create a scan in more than one place.
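
A rough sketch of that idea, to make the shape concrete. `SketchScan` and `configure` are hypothetical names invented for illustration, not Iceberg's `TableScan` or any existing helper, and the default values are placeholders standing in for whatever the table would supply. The option keys are assumed to match the `SparkReadOptions` constants referenced in the diff.

```java
import java.util.HashMap;
import java.util.Map;

public class SplitSettingsSketch {

  // Hypothetical stand-in for a scan; this is NOT Iceberg's TableScan.
  static final class SketchScan {
    // Placeholder defaults standing in for table-level configuration (assumed values).
    private static final long DEFAULT_SPLIT_SIZE = 128L * 1024 * 1024;
    private static final int DEFAULT_LOOKBACK = 10;
    private static final long DEFAULT_OPEN_FILE_COST = 4L * 1024 * 1024;

    private final Map<String, String> options;

    SketchScan() {
      this(new HashMap<>());
    }

    private SketchScan(Map<String, String> options) {
      this.options = options;
    }

    // Returns a new scan with the override applied; the original is left unchanged.
    SketchScan option(String key, String value) {
      Map<String, String> copy = new HashMap<>(options);
      copy.put(key, value);
      return new SketchScan(copy);
    }

    // The scan reports its effective settings, so callers never re-derive them.
    long targetSplitSize() {
      String value = options.get("split-size");
      return value != null ? Long.parseLong(value) : DEFAULT_SPLIT_SIZE;
    }

    int splitLookback() {
      String value = options.get("lookback");
      return value != null ? Integer.parseInt(value) : DEFAULT_LOOKBACK;
    }

    long splitOpenFileCost() {
      String value = options.get("file-open-cost");
      return value != null ? Long.parseLong(value) : DEFAULT_OPEN_FILE_COST;
    }
  }

  // Forward only the read options that are actually present.
  static SketchScan configure(SketchScan scan, Map<String, String> readOptions) {
    SketchScan configured = scan;
    for (String key : new String[] {"split-size", "lookback", "file-open-cost"}) {
      String value = readOptions.get(key);
      if (value != null) {
        configured = configured.option(key, value);
      }
    }
    return configured;
  }

  public static void main(String[] args) {
    Map<String, String> readOptions = new HashMap<>();
    readOptions.put("split-size", "1048576");

    SketchScan scan = configure(new SketchScan(), readOptions);

    // The Spark side would ask the scan instead of tracking its own copies.
    System.out.println("split size: " + scan.targetSplitSize());
    System.out.println("lookback: " + scan.splitLookback());
    System.out.println("open file cost: " + scan.splitOpenFileCost());
  }
}
```

With something like this, the Spark scan class would not need `splitSize`, `splitLookback`, or `splitOpenFileCost` fields of its own, and the metadata-table case could be handled by whichever scan the table returns.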
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]