[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema

2019-10-31 Thread GitBox
autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance 
DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341415701
 
 

 ##
 File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/task/HiveConverterUtils.java
 ##
 @@ -151,6 +154,30 @@ public static String generateCreateDuplicateTableDDL(
 dbName, tblName, inputDbName, inputTblName, tblLocation);
   }
 
+  public static String generateAlterSchemaDML(
+  String tableName,
+  Optional optionalDbName,
 
 Review comment:
   +1 for consistency. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema

2019-10-31 Thread GitBox
autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance 
DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341415129
 
 

 ##
 File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/query/HiveAvroORCQueryGenerator.java
 ##
 @@ -202,7 +204,21 @@ public static String generateCreateTableDDL(Schema schema,
 //.. use columns from destination schema
 if (isEvolutionEnabled || !destinationTableMeta.isPresent()) {
   log.info("Generating DDL using source schema");
+  System.out.println("Generating DDL using source schema");
   ddl.append(generateAvroToHiveColumnMapping(schema, 
Optional.of(hiveColumns), true, dbName + "." + tblName));
+  try {
 
 Review comment:
   Fair enough. It is worth checking whether production has schema evolution enabled.




[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema

2019-10-31 Thread GitBox
autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance 
DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341414676
 
 

 ##
 File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/materializer/HiveMaterializerFromEntityQueryGenerator.java
 ##
 @@ -118,11 +119,19 @@ public QueryBasedHivePublishEntity 
generatePublishQueries() throws DataConversio
 Map publishDirectories = 
publishEntity.getPublishDirectories();
 List cleanupQueries = publishEntity.getCleanupQueries();
 List cleanupDirectories = publishEntity.getCleanupDirectories();
+Optional avroSchema = Optional.absent();
 
 Review comment:
   Your call. I am OK with either.




[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema

2019-10-31 Thread GitBox
autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance 
DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341358581
 
 

 ##
 File path: 
gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/orc/HiveOrcSerDeManager.java
 ##
 @@ -61,8 +61,6 @@
  */
 @Slf4j
 public class HiveOrcSerDeManager extends HiveSerDeManager {
-  // Schema is in the format of TypeDescriptor
-  public static final String SCHEMA_LITERAL = "orc.schema.literal";
 
 Review comment:
   Please rebase and force-push so that the old commits are dropped from this pull request.




[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema

2019-10-31 Thread GitBox
autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance 
DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341361806
 
 

 ##
 File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/query/HiveAvroORCQueryGenerator.java
 ##
 @@ -202,7 +204,21 @@ public static String generateCreateTableDDL(Schema schema,
 //.. use columns from destination schema
 if (isEvolutionEnabled || !destinationTableMeta.isPresent()) {
   log.info("Generating DDL using source schema");
+  System.out.println("Generating DDL using source schema");
   ddl.append(generateAvroToHiveColumnMapping(schema, 
Optional.of(hiveColumns), true, dbName + "." + tblName));
+  try {
 
 Review comment:
   Why are table properties added only in this branch?




[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema

2019-10-31 Thread GitBox
autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance 
DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341357486
 
 

 ##
 File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/query/HiveAvroORCQueryGenerator.java
 ##
 @@ -202,7 +204,21 @@ public static String generateCreateTableDDL(Schema schema,
 //.. use columns from destination schema
 if (isEvolutionEnabled || !destinationTableMeta.isPresent()) {
   log.info("Generating DDL using source schema");
+  System.out.println("Generating DDL using source schema");
 
 Review comment:
   Do you still need this `System.out.println`?




[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema

2019-10-31 Thread GitBox
autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance 
DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341356298
 
 

 ##
 File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/converter/AbstractAvroToOrcConverter.java
 ##
 @@ -85,6 +86,7 @@
* Subdirectory within destination ORC table directory to publish data
*/
   private static final String PUBLISHED_TABLE_SUBDIRECTORY = "final";
+  public static final String OUTPUT_AVRO_SCHEMA_KEY = "output.avro.schema";
 
 Review comment:
   Does it need to be `public static`? Restrict the access modifier so that 
unrelated code cannot accidentally depend on it in the future.
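
   As a hedged illustration of this suggestion (not code from the PR): if the 
key is only needed by the converter and a few nearby callers, it could be kept 
`private` and exposed through a narrow accessor. The class name 
`ConverterSketch` and the accessor `outputAvroSchemaKey()` are hypothetical.

```java
public class ConverterSketch {
  // Same key string as in the diff; visibility narrowed from public to private.
  private static final String OUTPUT_AVRO_SCHEMA_KEY = "output.avro.schema";

  // Hypothetical package-private accessor for the callers that need the key.
  static String outputAvroSchemaKey() {
    return OUTPUT_AVRO_SCHEMA_KEY;
  }
}
```

   Whether `private` or package-private is the right level depends on which 
classes actually read the key.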




[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema

2019-10-31 Thread GitBox
autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance 
DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341358264
 
 

 ##
 File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/task/HiveConverterUtils.java
 ##
 @@ -151,6 +154,30 @@ public static String generateCreateDuplicateTableDDL(
 dbName, tblName, inputDbName, inputTblName, tblLocation);
   }
 
+  public static String generateAlterSchemaDML(
 
 Review comment:
   Shall we use `generateAlterSerDePropsDML`? 




[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema

2019-10-31 Thread GitBox
autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance 
DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341363858
 
 

 ##
 File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/materializer/HiveMaterializerFromEntityQueryGenerator.java
 ##
 @@ -118,11 +119,19 @@ public QueryBasedHivePublishEntity 
generatePublishQueries() throws DataConversio
 Map publishDirectories = 
publishEntity.getPublishDirectories();
 List cleanupQueries = publishEntity.getCleanupQueries();
 List cleanupDirectories = publishEntity.getCleanupDirectories();
+Optional avroSchema = Optional.absent();
 
 Review comment:
   I might be wrong, but is this method actually called by the conversion 
jobs? From a brief look at the code base, `generatePublishQueries` is only 
called in 
`org.apache.gobblin.data.management.conversion.hive.materializer.HiveMaterializer#generatePublishQueries`,
 which is not related to Avro2ORC. I just want to confirm.




[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance DDL to add column and column.types with case-preserving schema

2019-10-31 Thread GitBox
autumnust commented on a change in pull request #2790: [GOBBLIN-941] Enhance 
DDL to add column and column.types with case-preserving schema
URL: https://github.com/apache/incubator-gobblin/pull/2790#discussion_r341357961
 
 

 ##
 File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/conversion/hive/task/HiveConverterUtils.java
 ##
 @@ -151,6 +154,30 @@ public static String generateCreateDuplicateTableDDL(
 dbName, tblName, inputDbName, inputTblName, tblLocation);
   }
 
+  public static String generateAlterSchemaDML(
+  String tableName,
+  Optional optionalDbName,
 
 Review comment:
   I am not sure why `dbName` is optional.
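
   For context, a minimal sketch of what an optional database name implies 
for the generated statement. This is not the PR's implementation: the class 
`AlterDdlSketch`, the method `generateAlterSerDePropsDdl`, and its parameters 
are hypothetical, and it uses `java.util.Optional` instead of the Guava 
`Optional` seen in the diff so that it stays self-contained.

```java
import java.util.Optional;

public class AlterDdlSketch {
  // Hypothetical helper: builds a Hive ALTER TABLE ... SET SERDEPROPERTIES statement.
  // When dbName is absent, the table name is left unqualified, so Hive resolves it
  // against the session's current database.
  static String generateAlterSerDePropsDdl(
      String tableName, Optional<String> dbName, String key, String value) {
    String qualifiedName = dbName
        .map(db -> "`" + db + "`.`" + tableName + "`")
        .orElse("`" + tableName + "`");
    // Note: this sketch does not escape quotes inside key or value.
    return String.format(
        "ALTER TABLE %s SET SERDEPROPERTIES ('%s'='%s')", qualifiedName, key, value);
  }
}
```

   If every caller already knows the database, taking a plain `String dbName` 
would avoid the implicit dependency on the session's current database.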

