rdblue commented on code in PR #13219:
URL: https://github.com/apache/iceberg/pull/13219#discussion_r2229566081


##########
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveSchemaUtil.java:
##########
@@ -184,6 +184,8 @@ private static String convertToTypeString(Type type) {
         final Types.MapType mapType = type.asMapType();
         return String.format(
             "map<%s,%s>", convert(mapType.keyType()), 
convert(mapType.valueType()));
+      case VARIANT:
+        return "variant";

Review Comment:
   Okay, I understand why this was originally part of the changes. This PR 
updates `TestIcebergSourceTablesBase`, which tests the Spark `IcebergSource` 
that is used by `DataFrameReader` and `DataFrameWriter`:
   
   ```java
   // write a DF using IcebergSource
   df.write().format("iceberg").mode(SaveMode.Append).save(location);
   // read a DF using IcebergSource
   df = spark.read().format("iceberg").load(location);
   ```
   
   Those tests are run by two suites, `TestIcebergSourceHadoopTables` that 
tests path locations (like `file:/path/to/table` and 
`file:/path/to/table#manifests`) and `TestIcebergSourceHiveTables` that tests 
catalog identifier cases (like `db.table` and `db.table.manifests`). The Hive 
tests use the `HiveCatalog` and will fail at the line below when trying to 
create the table because schema conversion fails ("variant is not supported").
   
   With this change the test _also_ fails, but with a different error message:
   
   ```
   Invalid column type: variant
   InvalidObjectException(message:Invalid column type: variant)
   ```
   
   I was able to get the tests working by changing this string to a generic 
variant equivalent: `"struct<metadata:binary,value:binary>"`. That's a valid 
type in Hive and is a reasonable way to return a variant through Hive.
   
   I'm undecided about whether to make the `struct` type change to fix this or 
to remove the new test from the `TestIcebergSourceTablesBase` test suites, but 
I'm leaning toward removing the new test. Variant support is orthogonal to how 
a table is loaded, so I think it makes little sense to add a special Variant 
test to those suites. However, this did catch that Variant tables cannot be 
stored in `HiveCatalog` right now. That's valuable, but I think the right way 
to fix `HiveCatalog` is to build a `DataTest` that checks its compatibility 
with each Iceberg type since it has to perform conversion.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to