Hi. I am using Hive 4.0.0 to read Iceberg tables and have run into some problems. If someone could guide me, that would be great.
Env:
hadoop 3.3.6
hive 4.0.0
tez 0.10.2
iceberg 1.4.3
iceberg-table: hadoop-catalog-table/location_based_table

Question 1: How does tez.mrreader.config.update.properties work?

I'm testing hive-iceberg. My current problem is that, with vectorization turned on, I cannot read all of the non-partition columns of a partitioned table.

Reading through the code, I found that vectorized reads depend on the value of "hive.io.file.readcolumn.ids". When vectorization is turned on, the Tez map task relies on the values of the following two properties:

hive.io.file.readcolumn.names
hive.io.file.readcolumn.ids

Currently, these two values are set dynamically in the Tez driver, depending on the SQL submitted by the user. According to https://issues.apache.org/jira/browse/TEZ-4248, the authors seem to expect that both values can be passed to the Tez worker. However, I found that in TezChild I cannot get the value of hive.io.file.readcolumn.ids that was set in the Tez ApplicationMaster.

When I set "hive.io.file.readcolumn.ids" directly from the console, the partitioned Iceberg table is read just fine. But I can't do this in a production environment.

So how should I troubleshoot this problem?

Question 2: Does reading a non-partitioned Iceberg table in Hive depend on "hive.io.file.readcolumn.ids"?

For non-partitioned tables, I found that even when I could not get the value of "hive.io.file.readcolumn.ids", or when its value was wrong, I could still read the non-partitioned Iceberg tables just fine. But judging from the code, both cases use the same code path. So why does it work for non-partitioned tables?

I'm very confused at the moment and would be grateful if someone could help me.

Thank you.