singhpk234 commented on code in PR #5437:
URL: https://github.com/apache/iceberg/pull/5437#discussion_r939650527
##########
aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java:
##########
@@ -210,25 +211,24 @@ void initialize(
   protected TableOperations newTableOps(TableIdentifier tableIdentifier) {
     if (catalogProperties != null) {
       Map<String, String> tableSpecificCatalogProperties =
-          ImmutableMap.<String, String>builder()
-              .putAll(catalogProperties)
-              .put(
-                  AwsProperties.LAKE_FORMATION_DB_NAME,
-                  IcebergToGlueConverter.getDatabaseName(
-                      tableIdentifier, awsProperties.glueCatalogSkipNameValidation()))
-              .put(
-                  AwsProperties.LAKE_FORMATION_TABLE_NAME,
-                  IcebergToGlueConverter.getTableName(
-                      tableIdentifier, awsProperties.glueCatalogSkipNameValidation()))
-              .build();
+          Maps.newHashMapWithExpectedSize(catalogProperties.size() + 2);
+      tableSpecificCatalogProperties.putAll(catalogProperties);
+      tableSpecificCatalogProperties.put(
+          AwsProperties.LAKE_FORMATION_DB_NAME,
+          IcebergToGlueConverter.getDatabaseName(
+              tableIdentifier, awsProperties.glueCatalogSkipNameValidation()));
+      tableSpecificCatalogProperties.put(
+          AwsProperties.LAKE_FORMATION_TABLE_NAME,
+          IcebergToGlueConverter.getTableName(
+              tableIdentifier, awsProperties.glueCatalogSkipNameValidation()));
       // FileIO initialization depends on tableSpecificCatalogProperties, so a new FileIO is
       // initialized each time
       return new GlueTableOperations(
           glue,
           lockManager,
           catalogName,
           awsProperties,
-          initializeFileIO(tableSpecificCatalogProperties),
+          initializeFileIO(Collections.unmodifiableMap(tableSpecificCatalogProperties)),
Review Comment:
+1, having a UT will certainly help us catch this if it happens again.
> To test this, you could do newTableOps on a GlueCatalog and then ensure that the resulting FileIO can be round-trip serialized through Kryo (which there's a helper method for, I believe).

+1, I mostly did the same and just wrote a new helper for it (still thinking about other alternatives), since at present the Kryo helper depends on Spark, which in the background pulls in twitter/chill serializers. newTableOps currently has protected access, so I ran into issues writing a Spark UT.
The new helper takes a dependency on twitter-chill, which Spark uses and, I think, Flink does as well (I may be wrong here; ref
[link](https://flink.apache.org/news/2020/04/15/flink-serialization-tuning-vol-1.html#kryo)).
All suggestions are welcome here :) !
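For illustration only: a minimal, self-contained sketch of what such a round-trip helper could look like, using plain Java serialization as a stand-in for the Kryo/twitter-chill helper discussed above. The class name, helper signature, and property keys are all hypothetical placeholders, not the PR's actual helper; the point is only that a `HashMap` wrapped in `Collections.unmodifiableMap` (as in the patch) survives a serialize/deserialize round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class RoundTripCheck {

  // Hypothetical helper: serializes an object to bytes and reads it back.
  // The PR's helper would do the equivalent through a Kryo instance instead.
  @SuppressWarnings("unchecked")
  static <T extends Serializable> T roundTrip(T obj) throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(obj);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      return (T) in.readObject();
    }
  }

  @SuppressWarnings("unchecked")
  public static void main(String[] args) throws Exception {
    // Mirror the map construction from the patch: a plain HashMap wrapped in
    // Collections.unmodifiableMap, instead of a Guava ImmutableMap.
    // The keys/values here are placeholders, not real AwsProperties constants.
    Map<String, String> props = new HashMap<>();
    props.put("lake-formation.db-name", "db");
    props.put("lake-formation.table-name", "tbl");

    Map<String, String> copy =
        (Map<String, String>) roundTrip((Serializable) Collections.unmodifiableMap(props));
    System.out.println(copy.equals(props)); // prints true
  }
}
```

A real UT in the repo would instead call newTableOps on a GlueCatalog and round-trip the resulting FileIO, as suggested in the quoted comment.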
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]