lcspinter commented on a change in pull request #2316:
URL: https://github.com/apache/hive/pull/2316#discussion_r639492832
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java
##########
@@ -736,6 +740,21 @@ protected void initializeOp(Configuration hconf) throws HiveException {
}
}
+  private boolean skipPartitionCheck() {
+    return Optional.ofNullable(conf).map(FileSinkDesc::getTableInfo)
+        .map(TableDesc::getProperties)
+        .map(props -> props.getProperty(hive_metastoreConstants.META_TABLE_STORAGE))
+        .map(handler -> {
+          try {
+            return HiveUtils.getStorageHandler(hconf, handler);
+          } catch (HiveException e) {
+            return null;
+          }
+        })
+        .map(HiveStorageHandler::alwaysUnpartitioned)
Review comment:
Wouldn't this end up in a NullPointerException when we have a HiveException?
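For what it's worth, `Optional.map` wraps the mapper's result in `Optional.ofNullable`, so returning `null` from the catch block should make the chain short-circuit to an empty `Optional` rather than throw. A minimal standalone sketch of that behavior (the names below are illustrative, not Hive code):

```java
import java.util.Optional;

public class OptionalNullDemo {

    // Mirrors the pattern in skipPartitionCheck(): a mapper that returns
    // null when a checked exception would be swallowed.
    static boolean skipCheck(String handlerName) {
        return Optional.ofNullable(handlerName)
                // Returning null here yields Optional.empty(), not an NPE,
                // because Optional.map wraps the result in Optional.ofNullable.
                .map(h -> h.startsWith("bad") ? null : h)
                // This mapper is simply skipped when the Optional is empty.
                .map(h -> Boolean.TRUE)
                .orElse(false);
    }

    public static void main(String[] args) {
        System.out.println(skipCheck("goodHandler")); // true
        System.out.println(skipCheck("badHandler"));  // false: chain short-circuited
        System.out.println(skipCheck(null));          // false
    }
}
```

So the failure mode here is a silent `false` rather than an NPE, which may or may not be the intended behavior.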
##########
File path: iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java
##########
@@ -328,6 +345,7 @@ static void overlayTableProperties(Configuration configuration, TableDesc tableD
     map.put(InputFormatConfig.TABLE_IDENTIFIER, props.getProperty(Catalogs.NAME));
     map.put(InputFormatConfig.TABLE_LOCATION, table.location());
     map.put(InputFormatConfig.TABLE_SCHEMA, schemaJson);
+    props.put(InputFormatConfig.PARTITION_SPEC, PartitionSpecParser.toJson(table.spec()));
Review comment:
It is not related to this change, but it seems to me that the javadoc and the naming of the method are not in sync. Maybe we should separate the logic that is strictly related to storing serializable table data from the code that updates table properties.
##########
File path: iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergSerDe.java
##########
@@ -151,7 +152,23 @@ public void initialize(@Nullable Configuration configuration, Properties serDePr
   private void createTableForCTAS(Configuration configuration, Properties serDeProperties) {
     serDeProperties.setProperty(TableProperties.ENGINE_HIVE_ENABLED, "true");
     serDeProperties.setProperty(InputFormatConfig.TABLE_SCHEMA, SchemaParser.toJson(tableSchema));
+
+    // build partition spec, if any
+    if (serDeProperties.getProperty(serdeConstants.LIST_PARTITION_COLUMNS) != null) {
+      String[] partCols = serDeProperties.getProperty(serdeConstants.LIST_PARTITION_COLUMNS).split(",");
Review comment:
Are we certain that the partition column name cannot contain `,`?
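The concern is easy to demonstrate: `String.split(",")` breaks on every comma, including one that appears (escaped or not) inside a column name. A minimal sketch, assuming the column names are joined with plain commas as in `serdeConstants.LIST_PARTITION_COLUMNS`:

```java
public class SplitDemo {

    // Naive comma split, the same call used in createTableForCTAS().
    static String[] splitCols(String partColNames) {
        return partColNames.split(",");
    }

    public static void main(String[] args) {
        // Two logical column names, the second containing an escaped comma
        // (a legal, if unusual, identifier).
        String partCols = "id,name\\,alias";

        String[] parts = splitCols(partCols);

        // split(",") ignores the backslash escape, so we get three
        // fragments instead of the intended two column names.
        System.out.println(parts.length); // prints 3
    }
}
```

If such names are legal at this point in the pipeline, the split would need to honor escaping or use a different delimiter.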
##########
File path: iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerWithEngine.java
##########
@@ -540,6 +540,43 @@ public void testCTASFromHiveTable() {
     Assert.assertArrayEquals(new Object[]{2L, "Linda", "Finance"}, objects.get(1));
}
+  @Test
+  public void testCTASPartitionedFromHiveTable() throws TException, InterruptedException {
+    Assume.assumeTrue("CTAS target table is supported fully only for HiveCatalog tables." +
Review comment:
Can we do a similar check in production code as well? It would be good to warn the end user about this limitation.
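A hedged sketch of what such a production-side guard might look like. The catalog-type values and class name below are illustrative assumptions, not the actual Iceberg handler API:

```java
public class CtasCatalogGuard {

    // Hypothetical helper: returns true when the CTAS target catalog is not
    // a HiveCatalog and the user should be warned about partial support.
    static boolean shouldWarn(String catalogType) {
        return !"hive".equalsIgnoreCase(catalogType);
    }

    public static void main(String[] args) {
        if (shouldWarn("hadoop")) {
            // In real code this would go through the handler's logger
            // instead of stdout.
            System.out.println(
                "WARN: CTAS target table is supported fully only for HiveCatalog tables.");
        }
    }
}
```

The same predicate the test uses for `Assume.assumeTrue` could back both the test skip and the production warning, so the two cannot drift apart.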
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]