[GitHub] [incubator-gobblin] ZihanLi58 commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
ZihanLi58 commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341419272

## File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/query/HiveAvroORCQueryGenerator.java ##
@@ -202,7 +204,21 @@ public static String generateCreateTableDDL(Schema schema,
       //.. use columns from destination schema
       if (isEvolutionEnabled || !destinationTableMeta.isPresent()) {
         log.info("Generating DDL using source schema");
+        System.out.println("Generating DDL using source schema");
         ddl.append(generateAvroToHiveColumnMapping(schema, Optional.of(hiveColumns), true, dbName + "." + tblName));
+        try {

Review comment:
  Yes, at least it's enabled in scoreevent. BTW, can you look at the new PR for this commit? I have sent you an email. Thank you!

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services

ZihanLi58 commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341370332

## File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/materializer/HiveMaterializerFromEntityQueryGenerator.java ##
@@ -118,11 +119,19 @@ public QueryBasedHivePublishEntity generatePublishQueries() throws DataConversio
     Map publishDirectories = publishEntity.getPublishDirectories();
     List cleanupQueries = publishEntity.getCleanupQueries();
     List cleanupDirectories = publishEntity.getCleanupDirectories();
+    Optional avroSchema = Optional.absent();

Review comment:
  This is not called by Avro2Orc. I added it only to keep this class consistent with that code path, but I can remove it if needed.

ZihanLi58 commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341369711

## File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/query/HiveAvroORCQueryGenerator.java ##
@@ -202,7 +204,21 @@ public static String generateCreateTableDDL(Schema schema,
       //.. use columns from destination schema
       if (isEvolutionEnabled || !destinationTableMeta.isPresent()) {
         log.info("Generating DDL using source schema");
+        System.out.println("Generating DDL using source schema");
         ddl.append(generateAvroToHiveColumnMapping(schema, Optional.of(hiveColumns), true, dbName + "." + tblName));
+        try {

Review comment:
  I want to make sure we set the columns only when schema evolution is enabled or there is no existing table, because setting them overwrites the schema.
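
  For illustration only, a minimal sketch of the guard described in this comment. The class and method names are hypothetical and not part of the pull request, the table metadata is stood in for by a plain string, and `java.util.Optional` is used instead of the `Optional` type in the diff so that the example is self-contained.

```java
import java.util.Optional;

// Hypothetical illustration, not code from the pull request.
class ColumnOverwriteGuard {

  // Setting the columns overwrites the table schema, so allow it only when
  // schema evolution is enabled or the destination table does not exist yet.
  public static boolean shouldSetColumns(boolean isEvolutionEnabled,
                                         Optional<String> destinationTableMeta) {
    return isEvolutionEnabled || !destinationTableMeta.isPresent();
  }

  public static void main(String[] args) {
    // Existing table, evolution disabled: keep the destination schema.
    System.out.println(shouldSetColumns(false, Optional.of("existing_table"))); // false
    // No existing table: the source schema may be used.
    System.out.println(shouldSetColumns(false, Optional.empty()));              // true
    // Evolution enabled: the source schema may overwrite the existing one.
    System.out.println(shouldSetColumns(true, Optional.of("existing_table")));  // true
  }
}
```

  Keeping the condition in one place would make it harder for a later change to set the columns outside the guarded branch.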

ZihanLi58 commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341366880

## File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/task/HiveConverterUtils.java ##
@@ -151,6 +154,30 @@ public static String generateCreateDuplicateTableDDL(
         dbName, tblName, inputDbName, inputTblName, tblLocation);
   }
 
+  public static String generateAlterSchemaDML(
+      String tableName,
+      Optional optionalDbName,

Review comment:
  I tried to make it consistent with the method generateCreateDuplicateTableDDL, in which dbName is optional. I guess that is for testing, but I'm not sure.
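
  To make the optional database name concrete, here is a rough sketch of how a helper might qualify a table name only when a database is supplied. The class, the method, and the backtick quoting are assumptions for illustration; this is not the body of generateAlterSchemaDML or generateCreateDuplicateTableDDL, and it uses `java.util.Optional` rather than the `Optional` type in the diff.

```java
import java.util.Optional;

// Hypothetical illustration, not code from the pull request.
class QualifiedTableName {

  // Prefix the table with its database only when a database name is present;
  // otherwise leave the table unqualified so the session's current database applies.
  public static String qualify(String tableName, Optional<String> optionalDbName) {
    String prefix = optionalDbName.map(db -> "`" + db + "`.").orElse("");
    return prefix + "`" + tableName + "`";
  }

  public static void main(String[] args) {
    System.out.println(qualify("events", Optional.of("tracking"))); // `tracking`.`events`
    System.out.println(qualify("events", Optional.empty()));        // `events`
  }
}
```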

ZihanLi58 commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341366268

## File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/converter/AbstractAvroToOrcConverter.java ##
@@ -85,6 +86,7 @@
    * Subdirectory within destination ORC table directory to publish data
    */
   private static final String PUBLISHED_TABLE_SUBDIRECTORY = "final";
+  public static final String OUTPUT_AVRO_SCHEMA_KEY = "output.avro.schema";

Review comment:
  This one will be accessed in HiveMaterializerFromEntityQueryGenerator when it tries to get the schema, so I made it public static.