[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341415701

## File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/task/HiveConverterUtils.java ##
@@ -151,6 +154,30 @@ public static String generateCreateDuplicateTableDDL(
         dbName, tblName, inputDbName, inputTblName, tblLocation);
   }
+  public static String generateAlterSchemaDML(
+      String tableName,
+      Optional optionalDbName,

Review comment:
+1 for consistency.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341415129

## File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/query/HiveAvroORCQueryGenerator.java ##
@@ -202,7 +204,21 @@ public static String generateCreateTableDDL(Schema schema,
     //.. use columns from destination schema
     if (isEvolutionEnabled || !destinationTableMeta.isPresent()) {
       log.info("Generating DDL using source schema");
+      System.out.println("Generating DDL using source schema");
       ddl.append(generateAvroToHiveColumnMapping(schema, Optional.of(hiveColumns), true, dbName + "." + tblName));
+      try {

Review comment:
Fair enough. It is worth checking whether production has schema evolution enabled.
[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341414676

## File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/materializer/HiveMaterializerFromEntityQueryGenerator.java ##
@@ -118,11 +119,19 @@ public QueryBasedHivePublishEntity generatePublishQueries() throws DataConversio
     Map publishDirectories = publishEntity.getPublishDirectories();
     List cleanupQueries = publishEntity.getCleanupQueries();
     List cleanupDirectories = publishEntity.getCleanupDirectories();
+Optional avroSchema = Optional.absent();

Review comment:
Your call. I am OK with both.
[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341358581

## File path: gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/orc/HiveOrcSerDeManager.java ##
@@ -61,8 +61,6 @@
  */
 @Slf4j
 public class HiveOrcSerDeManager extends HiveSerDeManager {
-  // Schema is in the format of TypeDescriptor
-  public static final String SCHEMA_LITERAL = "orc.schema.literal";

Review comment:
Please rebase and push with the force option so that old commits are not included.
[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341361806

## File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/query/HiveAvroORCQueryGenerator.java ##
@@ -202,7 +204,21 @@ public static String generateCreateTableDDL(Schema schema,
     //.. use columns from destination schema
     if (isEvolutionEnabled || !destinationTableMeta.isPresent()) {
       log.info("Generating DDL using source schema");
+      System.out.println("Generating DDL using source schema");
       ddl.append(generateAvroToHiveColumnMapping(schema, Optional.of(hiveColumns), true, dbName + "." + tblName));
+      try {

Review comment:
Why are table properties added only in this branch?
[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341357486

## File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/query/HiveAvroORCQueryGenerator.java ##
@@ -202,7 +204,21 @@ public static String generateCreateTableDDL(Schema schema,
     //.. use columns from destination schema
     if (isEvolutionEnabled || !destinationTableMeta.isPresent()) {
       log.info("Generating DDL using source schema");
+      System.out.println("Generating DDL using source schema");

Review comment:
Do you still need it?
[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341356298

## File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/converter/AbstractAvroToOrcConverter.java ##
@@ -85,6 +86,7 @@
    * Subdirectory within destination ORC table directory to publish data
    */
   private static final String PUBLISHED_TABLE_SUBDIRECTORY = "final";
+  public static final String OUTPUT_AVRO_SCHEMA_KEY = "output.avro.schema";

Review comment:
Does it need to be public static? Limit the access modifier so that it won't be accidentally touched by unrelated code in the future.
[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341358264

## File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/task/HiveConverterUtils.java ##
@@ -151,6 +154,30 @@ public static String generateCreateDuplicateTableDDL(
         dbName, tblName, inputDbName, inputTblName, tblLocation);
   }
+  public static String generateAlterSchemaDML(

Review comment:
Shall we use `generateAlterSerDePropsDML`?
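
For readers following the naming discussion above, here is a minimal sketch of what a helper with the suggested name might look like. It is not the code from the pull request: the class name, the parameter list, the use of `java.util.Optional` (the diff appears to use Guava's `Optional`), the backtick quoting, and the example property values are all assumptions made for illustration. The only thing taken as given is Hive's `ALTER TABLE ... SET SERDEPROPERTIES` statement form.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.StringJoiner;

// Hypothetical illustration only; not the implementation from PR #2790.
public class AlterSerDePropsSketch {

  // Builds an ALTER TABLE ... SET SERDEPROPERTIES statement.
  // When dbName is absent, the table name is left unqualified.
  public static String generateAlterSerDePropsDML(
      String tableName, Optional<String> dbName, Map<String, String> props) {
    if (props.isEmpty()) {
      throw new IllegalArgumentException("At least one SerDe property is required");
    }
    String qualified = dbName
        .map(db -> "`" + db + "`.`" + tableName + "`")
        .orElse("`" + tableName + "`");
    StringJoiner joiner = new StringJoiner(", ", "(", ")");
    for (Map.Entry<String, String> e : props.entrySet()) {
      joiner.add("'" + escape(e.getKey()) + "'='" + escape(e.getValue()) + "'");
    }
    return "ALTER TABLE " + qualified + " SET SERDEPROPERTIES " + joiner;
  }

  // Escapes backslashes and single quotes inside a quoted literal.
  public static String escape(String s) {
    return s.replace("\\", "\\\\").replace("'", "\\'");
  }

  public static void main(String[] args) {
    // Example values are made up; the case-preserving column names are the point.
    Map<String, String> props = new LinkedHashMap<>();
    props.put("columns", "memberId,firstName");
    props.put("columns.types", "bigint:string");
    System.out.println(generateAlterSerDePropsDML("tbl1", Optional.of("db1"), props));
  }
}
```

Keeping the database name optional in the sketch mirrors the signature under review; whether it should be mandatory is the open question raised in the thread.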
[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341363858

## File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/materializer/HiveMaterializerFromEntityQueryGenerator.java ##
@@ -118,11 +119,19 @@ public QueryBasedHivePublishEntity generatePublishQueries() throws DataConversio
     Map publishDirectories = publishEntity.getPublishDirectories();
     List cleanupQueries = publishEntity.getCleanupQueries();
     List cleanupDirectories = publishEntity.getCleanupDirectories();
+Optional avroSchema = Optional.absent();

Review comment:
I might be wrong, but is this method called by the conversion jobs? From a brief look at the code base, this `generatePublishQueries` method is only called in `org.apache.gobblin.data.management.conversion.hive.materializer.HiveMaterializer#generatePublishQueries`, which is not related to Avro2ORC. I just want to confirm.
[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341357961

## File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/task/HiveConverterUtils.java ##
@@ -151,6 +154,30 @@ public static String generateCreateDuplicateTableDDL(
         dbName, tblName, inputDbName, inputTblName, tblLocation);
   }
+  public static String generateAlterSchemaDML(
+      String tableName,
+      Optional optionalDbName,

Review comment:
I am not sure why dbName is optional.