rdblue commented on code in PR #13219:
URL: https://github.com/apache/iceberg/pull/13219#discussion_r2229566081
##########
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveSchemaUtil.java:
##########
@@ -184,6 +184,8 @@ private static String convertToTypeString(Type type) {
final Types.MapType mapType = type.asMapType();
return String.format(
"map<%s,%s>", convert(mapType.keyType()),
convert(mapType.valueType()));
+ case VARIANT:
+ return "variant";
Review Comment:
Okay, I understand why this was originally part of the changes. This PR
updates `TestIcebergSourceTablesBase`, which tests the Spark `IcebergSource`
that is used by `DataFrameReader` and `DataFrameWriter`:
```java
// write a DF using IcebergSource
df.write().format("iceberg").mode(SaveMode.Append).save(location);
// read a DF using IcebergSource
df = spark.read().format("iceberg").load(location);
```
Those tests are run by two suites, `TestIcebergSourceHadoopTables` that
tests path locations (like `file:/path/to/table` and
`file:/path/to/table#manifests`) and `TestIcebergSourceHiveTables` that tests
catalog identifier cases (like `db.table` and `db.table.manifests`). The Hive
tests use the `HiveCatalog` and will fail at the line below when trying to
create the table because schema conversion fails ("variant is not supported").
With this change the test _also_ fails, but with a different error message:
```
Invalid column type: variant
InvalidObjectException(message:Invalid column type: variant)
```
I was able to get the tests working by changing this string to a generic
variant equivalent: `"struct<metadata:binary,value:binary>"`. That's a valid
type in Hive and is a reasonable way to return a variant through Hive.
I'm undecided about whether to make the `struct` type change to fix this or
to remove the new test from the `TestIcebergSourceTablesBase` test suites, but
I'm leaning toward removing the new test. Variant support is orthogonal to how
a table is loaded, so I think it makes little sense to add a special Variant
test to those suites. However, this did catch that Variant tables cannot be
stored in `HiveCatalog` right now. That's valuable, but I think the right way
to fix `HiveCatalog` is to build a `DataTest` that checks its compatibility
with each Iceberg type since it has to perform conversion.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]