leedarHawk opened a new issue, #6235:
URL: https://github.com/apache/iceberg/issues/6235
### Apache Iceberg version
0.14.0
### Query engine
Hive
### Please describe the bug 🐞
When I insert data using Hive on Tez, the data files are written to HDFS
successfully, but a subsequent SELECT returns no rows. If I switch the
execution engine to MR, it works: the data is written and read back successfully.
hive version: 3.1.3
tez: 0.10.2
hadoop: Hadoop 3.1.1.3.1.5.0-152
set hive.execution.engine=tez;
add jar /tmp/iceberg-hive-runtime-0.14.1.jar;
describe formatted test.x6;
+-------------------------------+----------------------------------------------------+----------------------------------------------------+
| col_name                      | data_type                                          | comment                                            |
+-------------------------------+----------------------------------------------------+----------------------------------------------------+
| # col_name                    | data_type                                          | comment                                            |
| user_name                     | string                                             | from deserializer                                  |
| area                          | string                                             | from deserializer                                  |
|                               | NULL                                               | NULL                                               |
| # Detailed Table Information  | NULL                                               | NULL                                               |
| Database:                     | test                                               | NULL                                               |
| OwnerType:                    | USER                                               | NULL                                               |
| Owner:                        | hive                                               | NULL                                               |
| CreateTime:                   | Mon Nov 21 13:31:53 CST 2022                       | NULL                                               |
| LastAccessTime:               | UNKNOWN                                            | NULL                                               |
| Retention:                    | 0                                                  | NULL                                               |
| Location:                     | hdfs://hdp46:8020/user/hive/warehouse/test.db/x6   | NULL                                               |
| Table Type:                   | MANAGED_TABLE                                      | NULL                                               |
| Table Parameters:             | NULL                                               | NULL                                               |
|                               | bucketing_version                                  | 2                                                  |
|                               | current-schema                                     | {\"type\":\"struct\",\"schema-id\":0,\"fields\":[{\"id\":1,\"name\":\"user_name\",\"required\":false,\"type\":\"string\"},{\"id\":2,\"name\":\"area\",\"required\":false,\"type\":\"string\"}]} |
|                               | engine.hive.enabled                                | true                                               |
|                               | external.table.purge                               | TRUE                                               |
|                               | metadata_location                                  | hdfs://hdp46:8020/user/hive/warehouse/test.db/x6/metadata/00000-362d4f5e-6575-4be7-aad1-a9c027d1ff43.metadata.json |
|                               | numFiles                                           | 0                                                  |
|                               | numRows                                            | 0                                                  |
|                               | rawDataSize                                        | 0                                                  |
|                               | snapshot-count                                     | 0                                                  |
|                               | storage_handler                                    | org.apache.iceberg.mr.hive.HiveIcebergStorageHandler |
|                               | table_type                                         | ICEBERG                                            |
|                               | totalSize                                          | 0                                                  |
|                               | transient_lastDdlTime                              | 1669008713                                         |
|                               | uuid                                               | 5fe91beb-0cfc-4903-96e6-268b933eca73               |
|                               | NULL                                               | NULL                                               |
| # Storage Information         | NULL                                               | NULL                                               |
| SerDe Library:                | org.apache.iceberg.mr.hive.HiveIcebergSerDe        | NULL                                               |
| InputFormat:                  | org.apache.iceberg.mr.hive.HiveIcebergInputFormat  | NULL                                               |
| OutputFormat:                 | org.apache.iceberg.mr.hive.HiveIcebergOutputFormat | NULL                                               |
| Compressed:                   | No                                                 | NULL                                               |
| Num Buckets:                  | 0                                                  | NULL                                               |
| Bucket Columns:               | []                                                 | NULL                                               |
| Sort Columns:                 | []                                                 | NULL                                               |
+-------------------------------+----------------------------------------------------+----------------------------------------------------+
=================================insert data===================
insert into test.x6 values ('a1', 'b1');
=================================insert data log start================
INFO : Compiling command(queryId=root_20221121133308_796f75d3-e891-4f39-8b48-ac2dcbb8d433): insert into test.x6 values ('a1', 'b1')
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:col1, type:string, comment:null), FieldSchema(name:col2, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=root_20221121133308_796f75d3-e891-4f39-8b48-ac2dcbb8d433); Time taken: 0.782 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Executing command(queryId=root_20221121133308_796f75d3-e891-4f39-8b48-ac2dcbb8d433): insert into test.x6 values ('a1', 'b1')
INFO : Query ID = root_20221121133308_796f75d3-e891-4f39-8b48-ac2dcbb8d433
INFO : Total jobs = 1
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Starting task [Stage-1:DDL] in serial mode
INFO : Launching Job 1 out of 1
INFO : Starting task [Stage-2:MAPRED] in serial mode
INFO : Subscribed to counters: [] for queryId: root_20221121133308_796f75d3-e891-4f39-8b48-ac2dcbb8d433
INFO : Session is already open
INFO : Dag name: insert into test.x6 values ('a1', 'b1') (Stage-2)
INFO : Tez session was closed. Reopening...
INFO : Session re-established.
INFO : Session re-established.
INFO : Status: Running (Executing on YARN cluster with App id application_1668952240853_0007)
INFO : Starting task [Stage-4:DDL] in serial mode
INFO : Completed executing command(queryId=root_20221121133308_796f75d3-e891-4f39-8b48-ac2dcbb8d433); Time taken: 17.825 seconds
INFO : OK
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Compiling command(queryId=root_20221121133504_3b4e5da8-db8c-4494-b94d-98aa991a33f3): select * from test.x6
INFO : Concurrency mode is disabled, not creating a lock manager
=================================insert data log end================
=================================select data======================
select * from test.x6
=================================select data log start===================
INFO : Compiling command(queryId=root_20221121133504_3b4e5da8-db8c-4494-b94d-98aa991a33f3): select * from test.x6
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:x6.user_name, type:string, comment:null), FieldSchema(name:x6.area, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=root_20221121133504_3b4e5da8-db8c-4494-b94d-98aa991a33f3); Time taken: 0.441 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Executing command(queryId=root_20221121133504_3b4e5da8-db8c-4494-b94d-98aa991a33f3): select * from test.x6
INFO : Completed executing command(queryId=root_20221121133504_3b4e5da8-db8c-4494-b94d-98aa991a33f3); Time taken: 0.001 seconds
INFO : OK
INFO : Concurrency mode is disabled, not creating a lock manager
+---------------+----------+
| x6.user_name | x6.area |
+---------------+----------+
+---------------+----------+
=================================select data log end===================
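For comparison, the MR run mentioned at the top can be sketched as the session below. This is only a sketch: the statements are assumed to mirror the Tez session above, with the engine setting as the only difference, and no MR log was captured for this report.

```sql
-- Sketch of the MR comparison run (assumed to mirror the Tez session above).
-- Only hive.execution.engine differs; jar path and table are the same.
set hive.execution.engine=mr;
add jar /tmp/iceberg-hive-runtime-0.14.1.jar;

insert into test.x6 values ('a1', 'b1');

-- Under MR the inserted row is reported to be readable;
-- under Tez the same select returns an empty result set.
select * from test.x6;
```

Running the same statements after `set hive.execution.engine=tez;` should reproduce the empty result shown in the select log.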
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]