mccheah commented on a change in pull request #6: Support customizing the location where data is written in Spark
URL: https://github.com/apache/incubator-iceberg/pull/6#discussion_r236869168
########## File path: spark/src/main/java/com/netflix/iceberg/spark/source/IcebergSource.java ##########

@@ -89,7 +92,11 @@ public DataSourceReader createReader(DataSourceOptions options) {
           .toUpperCase(Locale.ENGLISH));
     }

-    return Optional.of(new Writer(table, lazyConf(), format));
+    String dataLocation = options.get(TableProperties.WRITE_NEW_DATA_LOCATION)
+        .orElse(table.properties().getOrDefault(
+            TableProperties.WRITE_NEW_DATA_LOCATION,
+            new Path(new Path(table.location()), "data").toString()));
+    return Optional.of(new Writer(table, lazyConf(), format, dataLocation));

Review comment:
   I think processing options from a `Map<String, String>` inside a constructor is a bit of an antipattern. Consider, for example, writing a unit test for this class in the future: if we pass the `Writer` constructor only a `HashMap`, the test would have to construct that `HashMap` in a specific way, i.e. know exactly which key-value pairs the constructor expects.

   Perhaps we can have a builder object that acts as a factory: it accepts the `Map` and returns the `Writer`. The `Writer` constructor then accepts the builder and copies the builder's set fields into its own fields.
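A minimal sketch of the builder-as-factory shape the comment proposes. All names here are hypothetical (the real `Writer` also takes a table, a Hadoop conf, and a file format, which are elided for brevity); the point is only that the `Map`-keyed option parsing is confined to one builder method, so a unit test can set typed fields directly instead of assembling a map with magic keys:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the review's suggestion: the Writer
// constructor never sees the options Map; it only copies builder fields.
final class Writer {
    // Hypothetical stand-in for TableProperties.WRITE_NEW_DATA_LOCATION.
    static final String WRITE_NEW_DATA_LOCATION = "write.folder-storage.path";

    private final String dataLocation;

    private Writer(Builder builder) {
        this.dataLocation = builder.dataLocation;
    }

    String dataLocation() {
        return dataLocation;
    }

    static final class Builder {
        // Hypothetical default; the real code derives it from table.location().
        private String dataLocation = "/tmp/table/data";

        // All Map-based option processing is confined to this one method.
        Builder fromOptions(Map<String, String> options) {
            String location = options.get(WRITE_NEW_DATA_LOCATION);
            if (location != null) {
                this.dataLocation = location;
            }
            return this;
        }

        // Tests can bypass the Map entirely and set typed fields directly.
        Builder dataLocation(String location) {
            this.dataLocation = location;
            return this;
        }

        Writer build() {
            return new Writer(this);
        }
    }
}
```

A caller holding a `DataSourceOptions`-style map would do `new Writer.Builder().fromOptions(options.asMap()).build()`, while a test would simply call `new Writer.Builder().dataLocation("/test/path").build()`.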