Re: show create table return empty after change column name in hive
I am using impala 2.10.0 and hadoop-2.6.0-cdh5.13.0-SNAPSHOT hive-1.1.0-cdh5.13.0-SNAPSHOT. I use apache-hive-1.2.1 to do alter table in hive cli. Here is catalogd log : I1013 16:04:51.927434 31504 CatalogServiceCatalog.java:1081] Invalidating table metadata: impala_test.sales_fact_1997 I1013 16:04:51.927593 31504 PerfLogger.java:127] I1013 16:04:51.927728 31504 HiveMetaStore.java:835] 17: get_table : db=impala_test tbl=sales_fact_1997 I1013 16:04:51.927846 31504 HiveMetaStore.java:404] ugi=impala/ db-87.photo.163@hadoop.hz.netease.com ip=unknown-ip-addr cmd=get_table : db=impala_test tbl=sales_fact_1997 I1013 16:04:51.935062 31504 PerfLogger.java:168] I1013 16:04:51.935269 31113 TableLoadingMgr.java:285] Loading next table from queue: impala_test.sales_fact_1997 I1013 16:04:51.935329 31113 TableLoadingMgr.java:287] Remaining items in queue: 0. Loads in progress: 0 I1013 16:04:51.935458 31123 TableLoader.java:58] Loading metadata for: impala_test.sales_fact_1997 I1013 16:04:51.935539 31123 PerfLogger.java:127] I1013 16:04:51.935638 31123 HiveMetaStore.java:835] 1: get_table : db=impala_test tbl=sales_fact_1997 I1013 16:04:51.935750 31123 HiveMetaStore.java:404] ugi=impala/ db-87.photo.163@hadoop.hz.netease.com ip=unknown-ip-addr cmd=get_table : db=impala_test tbl=sales_fact_1997 I1013 16:04:51.940784 31123 PerfLogger.java:168] I1013 16:04:51.940942 31123 PerfLogger.java:127] I1013 16:04:51.941056 31123 HiveMetaStore.java:835] 1: get_config_value: name=hive.exec.default.partition.name defaultValue=__HIVE_DEFAULT_PARTITION__ I1013 16:04:51.941190 31123 HiveMetaStore.java:404] ugi=impala/ db-87.photo.163@hadoop.hz.netease.com ip=unknown-ip-addr cmd=get_config_value: name=hive.exec.default.partition.name defaultValue=__HIVE_DEFAULT_PARTITION__ I1013 16:04:51.941308 31123 PerfLogger.java:168] I1013 16:04:51.941398 31123 PerfLogger.java:127] I1013 16:04:51.941498 31123 HiveMetaStore.java:835] 1: get_table_statistics_req: db=impala_test table=sales_fact_1997 I1013 16:04:51.941601 31123 HiveMetaStore.java:404] ugi=impala/ db-87.photo.163@hadoop.hz.netease.com ip=unknown-ip-addr cmd=get_table_statistics_req: db=impala_test table=sales_fact_1997 I1013 16:04:51.946686 31123 PerfLogger.java:168] I1013 16:04:51.946748 31123 HdfsTable.java:984] Fetching partition metadata from the Metastore: impala_test.sales_fact_1997 I1013 16:04:51.946826 31123 PerfLogger.java:127] I1013 16:04:51.946928 31123 HiveMetaStore.java:835] 1: get_partition_names : db=impala_test tbl=sales_fact_1997 I1013 16:04:51.947038 31123 HiveMetaStore.java:404] ugi=impala/ db-87.photo.163@hadoop.hz.netease.com ip=unknown-ip-addr cmd=get_partition_names : db=impala_test tbl=sales_fact_1997 I1013 16:04:51.948602 31123 PerfLogger.java:168] I1013 16:04:51.948664 31123 HdfsTable.java:988] Fetched partition metadata from the Metastore: impala_test.sales_fact_1997 I1013 16:04:51.957386 31123 HdfsTable.java:674] Loading file and block metadata for 1 paths: impala_test.sales_fact_1997 I1013 16:04:51.958546 31123 HdfsTable.java:679] Loaded file and block metadata for 1 paths: impala_test.sales_fact_1997 I1013 16:04:51.958627 31123 TableLoader.java:97] Loaded metadata for: impala_test.sales_fact_1997 I1013 16:04:52.328815 31139 catalog-server.cc:320] Publishing update: TABLE:impala_test.sales_fact_1997@1364 I1013 16:04:52.328985 31139 catalog-server.cc:320] Publishing update: CATALOG:b11b2a23ca4a496d:8e99307f267d4d54@1364 I trace the code in TableLoader.load, msTbl = msClient.getHiveClient().getTable(db.getName(), tblName); return the table, but msTbl.getSd().getCols() is empty, and I had tested in my code which get table's metastore twice and between them, I alter table in hive-cli, the result of the second operation is incorrect, So, I think it is a bug of hive MetaStoreClient . 2017-10-13 13:10 GMT+08:00 Jeszy : > Hey Yu, > > I tried to reproduce on a CDH5.13 cluster, but your exact commands > work as expected for me. Are you using Impala 2.10 on a CDH5.13 > cluster, or something else? Can you share your catalog and Hive > metastore logs? > > Thanks. > > On 12 October 2017 at 19:39, yu feng wrote: > > I try to use ' invalidate metadata' for the whole catalog, But the > modified > > table is still empty. I am doubt the only way is restart catalogd. > > > > BTW, I test with the newest version(2.10.0) > > > > 2017-10-13 0:17 GMT+08:00 Jeszy : > > > >> This does sound like a bug. What version are you using? Do you see any > >> errors in the catalog logs? > >> I think a global invalidate metadata should work, and it's a bit less > >> intrusive than a catalog restart. In general, it is a good idea to do > >> all metadata operations from Impala if you are using Impala at all, it > >> helps a lot in making metadata operations seamless. > >> > >> On 12 October 2017 at 02:53, yu feng wrote: > >> > In our scene, users always do metadata modifications in hive, and do > some > >> > query in impala. > >> > > >> > 2017-10-12 16:31 G
Re: show create table return empty after change column name in hive
Hey Yu, I tried to reproduce on a CDH5.13 cluster, but your exact commands work as expected for me. Are you using Impala 2.10 on a CDH5.13 cluster, or something else? Can you share your catalog and Hive metastore logs? Thanks. On 12 October 2017 at 19:39, yu feng wrote: > I try to use ' invalidate metadata' for the whole catalog, But the modified > table is still empty. I am doubt the only way is restart catalogd. > > BTW, I test with the newest version(2.10.0) > > 2017-10-13 0:17 GMT+08:00 Jeszy : > >> This does sound like a bug. What version are you using? Do you see any >> errors in the catalog logs? >> I think a global invalidate metadata should work, and it's a bit less >> intrusive than a catalog restart. In general, it is a good idea to do >> all metadata operations from Impala if you are using Impala at all, it >> helps a lot in making metadata operations seamless. >> >> On 12 October 2017 at 02:53, yu feng wrote: >> > In our scene, users always do metadata modifications in hive, and do some >> > query in impala. >> > >> > 2017-10-12 16:31 GMT+08:00 sky : >> > >> >> Why is the second step performed in hive, not impala? >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> At 2017-10-12 15:12:38, "yu feng" wrote: >> >> >I open impala-shell and hive-cli. >> >> >1、execute 'show create table impala_test.sales_fact_1997' in >> impala-shell >> >> , >> >> >return : >> >> > >> >> >+-- >> >> -+ >> >> >| result >> >> > | >> >> >+-- >> >> -+ >> >> >| CREATE TABLE impala_test.sales_fact_1997 ( >> >> > | >> >> >| product_id INT, >> >> >| >> >> >| time_id INT, >> >> > | >> >> >| customer_id INT, >> >> > | >> >> >| promotion_id INT, >> >> >| >> >> >| store_id INT, >> >> >| >> >> >| store_sales DOUBLE, >> >> >| >> >> >| store_cost DOUBLE, >> >> > | >> >> >| unit_sales DOUBLE >> >> >| >> >> >| ) >> >> >| >> >> >| COMMENT 'Imported by sqoop on 2017/06/09 20:25:40' >> >> >| >> >> >| ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u0001' LINES TERMINATED >> BY >> >> >'\n' | >> >> >| WITH SERDEPROPERTIES ('field.delim'='\u0001', 'line.delim'='\n', >> >> >'serialization.format'='\u0001') | >> >> >| STORED AS PARQUET >> >> >| >> >> >| LOCATION >> >> >'hdfs://hz-cluster1/user/nrpt/hive-server/impala_test. >> db/sales_fact_1997' >> >> > | >> >> >| TBLPROPERTIES ('COLUMN_STATS_ACCURATE'='true', 'numFiles'='3', >> >> >'numRows'='10', 'rawDataSize'='80', 'totalSize'='1619937') | >> >> >+-- >> >> -+ >> >> > >> >> >2、execute 'alter table impala_test.sales_fact_1997 change column >> >> product_id >> >> >pproduct_id int;' in hive -cli, return OK. >> >> >3、execute 'invalidate metadata impala_test.sales_fact_1997 '. >> >> >4、execute 'show create table impala_test.sales_fact_1997' again in >> >> >impala-shell, return : >> >> > >> >> >+-- >> >> -+ >> >> >| result >> >> > | >> >> >+-- >> >> -+ >> >> >| CREATE TABLE impala_test.sales_fact_1997 >> >> > | >> >> >| COMMENT 'Imported by sqoop on 2017/06/09 20:25:40' >> >> >| >> >> >| ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u0001' LINES TERMINATED >> BY >> >> >'\n' | >> >> >| WITH SERDEPROPERTIES ('field.delim'='\u0001', 'line.delim'='\n', >> >> >'serialization.format'='\u0001') | >> >> >| STORED AS PARQUET >> >> >| >> >> >| LOCATION >> >> >'hdfs://hz-cluster1/user/nrpt/hive-server/impala_test. >> db/sales_fact_1997' >> >> > | >> >> >| TBLPROPERTIES ('COLUMN_STATS_ACCU
Re: show create table return empty after change column name in hive
I try to use ' invalidate metadata' for the whole catalog, But the modified table is still empty. I am doubt the only way is restart catalogd. BTW, I test with the newest version(2.10.0) 2017-10-13 0:17 GMT+08:00 Jeszy : > This does sound like a bug. What version are you using? Do you see any > errors in the catalog logs? > I think a global invalidate metadata should work, and it's a bit less > intrusive than a catalog restart. In general, it is a good idea to do > all metadata operations from Impala if you are using Impala at all, it > helps a lot in making metadata operations seamless. > > On 12 October 2017 at 02:53, yu feng wrote: > > In our scene, users always do metadata modifications in hive, and do some > > query in impala. > > > > 2017-10-12 16:31 GMT+08:00 sky : > > > >> Why is the second step performed in hive, not impala? > >> > >> > >> > >> > >> > >> > >> > >> > >> At 2017-10-12 15:12:38, "yu feng" wrote: > >> >I open impala-shell and hive-cli. > >> >1、execute 'show create table impala_test.sales_fact_1997' in > impala-shell > >> , > >> >return : > >> > > >> >+-- > >> -+ > >> >| result > >> > | > >> >+-- > >> -+ > >> >| CREATE TABLE impala_test.sales_fact_1997 ( > >> > | > >> >| product_id INT, > >> >| > >> >| time_id INT, > >> > | > >> >| customer_id INT, > >> > | > >> >| promotion_id INT, > >> >| > >> >| store_id INT, > >> >| > >> >| store_sales DOUBLE, > >> >| > >> >| store_cost DOUBLE, > >> > | > >> >| unit_sales DOUBLE > >> >| > >> >| ) > >> >| > >> >| COMMENT 'Imported by sqoop on 2017/06/09 20:25:40' > >> >| > >> >| ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u0001' LINES TERMINATED > BY > >> >'\n' | > >> >| WITH SERDEPROPERTIES ('field.delim'='\u0001', 'line.delim'='\n', > >> >'serialization.format'='\u0001') | > >> >| STORED AS PARQUET > >> >| > >> >| LOCATION > >> >'hdfs://hz-cluster1/user/nrpt/hive-server/impala_test. > db/sales_fact_1997' > >> > | > >> >| TBLPROPERTIES ('COLUMN_STATS_ACCURATE'='true', 'numFiles'='3', > >> >'numRows'='10', 'rawDataSize'='80', 'totalSize'='1619937') | > >> >+-- > >> -+ > >> > > >> >2、execute 'alter table impala_test.sales_fact_1997 change column > >> product_id > >> >pproduct_id int;' in hive -cli, return OK. > >> >3、execute 'invalidate metadata impala_test.sales_fact_1997 '. > >> >4、execute 'show create table impala_test.sales_fact_1997' again in > >> >impala-shell, return : > >> > > >> >+-- > >> -+ > >> >| result > >> > | > >> >+-- > >> -+ > >> >| CREATE TABLE impala_test.sales_fact_1997 > >> > | > >> >| COMMENT 'Imported by sqoop on 2017/06/09 20:25:40' > >> >| > >> >| ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u0001' LINES TERMINATED > BY > >> >'\n' | > >> >| WITH SERDEPROPERTIES ('field.delim'='\u0001', 'line.delim'='\n', > >> >'serialization.format'='\u0001') | > >> >| STORED AS PARQUET > >> >| > >> >| LOCATION > >> >'hdfs://hz-cluster1/user/nrpt/hive-server/impala_test. > db/sales_fact_1997' > >> > | > >> >| TBLPROPERTIES ('COLUMN_STATS_ACCURATE'='true', 'numFiles'='3', > >> >'numRows'='10', 'rawDataSize'='80', 'totalSize'='1619937') | > >> >+-- > >> -+ > >> > > >> >all columns disappear, the column change will correct if I restart > >> >catalogd, I think it is a BUG caused by hive metastore client, It i
Re: show create table return empty after change column name in hive
This does sound like a bug. What version are you using? Do you see any errors in the catalog logs? I think a global invalidate metadata should work, and it's a bit less intrusive than a catalog restart. In general, it is a good idea to do all metadata operations from Impala if you are using Impala at all, it helps a lot in making metadata operations seamless. On 12 October 2017 at 02:53, yu feng wrote: > In our scene, users always do metadata modifications in hive, and do some > query in impala. > > 2017-10-12 16:31 GMT+08:00 sky : > >> Why is the second step performed in hive, not impala? >> >> >> >> >> >> >> >> >> At 2017-10-12 15:12:38, "yu feng" wrote: >> >I open impala-shell and hive-cli. >> >1、execute 'show create table impala_test.sales_fact_1997' in impala-shell >> , >> >return : >> > >> >+-- >> -+ >> >| result >> > | >> >+-- >> -+ >> >| CREATE TABLE impala_test.sales_fact_1997 ( >> > | >> >| product_id INT, >> >| >> >| time_id INT, >> > | >> >| customer_id INT, >> > | >> >| promotion_id INT, >> >| >> >| store_id INT, >> >| >> >| store_sales DOUBLE, >> >| >> >| store_cost DOUBLE, >> > | >> >| unit_sales DOUBLE >> >| >> >| ) >> >| >> >| COMMENT 'Imported by sqoop on 2017/06/09 20:25:40' >> >| >> >| ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u0001' LINES TERMINATED BY >> >'\n' | >> >| WITH SERDEPROPERTIES ('field.delim'='\u0001', 'line.delim'='\n', >> >'serialization.format'='\u0001') | >> >| STORED AS PARQUET >> >| >> >| LOCATION >> >'hdfs://hz-cluster1/user/nrpt/hive-server/impala_test.db/sales_fact_1997' >> > | >> >| TBLPROPERTIES ('COLUMN_STATS_ACCURATE'='true', 'numFiles'='3', >> >'numRows'='10', 'rawDataSize'='80', 'totalSize'='1619937') | >> >+-- >> -+ >> > >> >2、execute 'alter table impala_test.sales_fact_1997 change column >> product_id >> >pproduct_id int;' in hive -cli, return OK. >> >3、execute 'invalidate metadata impala_test.sales_fact_1997 '. >> >4、execute 'show create table impala_test.sales_fact_1997' again in >> >impala-shell, return : >> > >> >+-- >> -+ >> >| result >> > | >> >+-- >> -+ >> >| CREATE TABLE impala_test.sales_fact_1997 >> > | >> >| COMMENT 'Imported by sqoop on 2017/06/09 20:25:40' >> >| >> >| ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u0001' LINES TERMINATED BY >> >'\n' | >> >| WITH SERDEPROPERTIES ('field.delim'='\u0001', 'line.delim'='\n', >> >'serialization.format'='\u0001') | >> >| STORED AS PARQUET >> >| >> >| LOCATION >> >'hdfs://hz-cluster1/user/nrpt/hive-server/impala_test.db/sales_fact_1997' >> > | >> >| TBLPROPERTIES ('COLUMN_STATS_ACCURATE'='true', 'numFiles'='3', >> >'numRows'='10', 'rawDataSize'='80', 'totalSize'='1619937') | >> >+-- >> -+ >> > >> >all columns disappear, the column change will correct if I restart >> >catalogd, I think it is a BUG caused by hive metastore client, It is any >> >good idea overcome the problem except restart catalogd. >> > >> > I think we can check columns after getTable from HiveMetastoreClient, if >> >it is empty, try to recreate the HiveMetastoreClient(hive do not support >> >0-column table). is it a good way to overcome the problem if modify code >> >like this? >>
Re: show create table return empty after change column name in hive
In our scene, users always do metadata modifications in hive, and do some query in impala. 2017-10-12 16:31 GMT+08:00 sky : > Why is the second step performed in hive, not impala? > > > > > > > > > At 2017-10-12 15:12:38, "yu feng" wrote: > >I open impala-shell and hive-cli. > >1、execute 'show create table impala_test.sales_fact_1997' in impala-shell > , > >return : > > > >+-- > -+ > >| result > > | > >+-- > -+ > >| CREATE TABLE impala_test.sales_fact_1997 ( > > | > >| product_id INT, > >| > >| time_id INT, > > | > >| customer_id INT, > > | > >| promotion_id INT, > >| > >| store_id INT, > >| > >| store_sales DOUBLE, > >| > >| store_cost DOUBLE, > > | > >| unit_sales DOUBLE > >| > >| ) > >| > >| COMMENT 'Imported by sqoop on 2017/06/09 20:25:40' > >| > >| ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u0001' LINES TERMINATED BY > >'\n' | > >| WITH SERDEPROPERTIES ('field.delim'='\u0001', 'line.delim'='\n', > >'serialization.format'='\u0001') | > >| STORED AS PARQUET > >| > >| LOCATION > >'hdfs://hz-cluster1/user/nrpt/hive-server/impala_test.db/sales_fact_1997' > > | > >| TBLPROPERTIES ('COLUMN_STATS_ACCURATE'='true', 'numFiles'='3', > >'numRows'='10', 'rawDataSize'='80', 'totalSize'='1619937') | > >+-- > -+ > > > >2、execute 'alter table impala_test.sales_fact_1997 change column > product_id > >pproduct_id int;' in hive -cli, return OK. > >3、execute 'invalidate metadata impala_test.sales_fact_1997 '. > >4、execute 'show create table impala_test.sales_fact_1997' again in > >impala-shell, return : > > > >+-- > -+ > >| result > > | > >+-- > -+ > >| CREATE TABLE impala_test.sales_fact_1997 > > | > >| COMMENT 'Imported by sqoop on 2017/06/09 20:25:40' > >| > >| ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u0001' LINES TERMINATED BY > >'\n' | > >| WITH SERDEPROPERTIES ('field.delim'='\u0001', 'line.delim'='\n', > >'serialization.format'='\u0001') | > >| STORED AS PARQUET > >| > >| LOCATION > >'hdfs://hz-cluster1/user/nrpt/hive-server/impala_test.db/sales_fact_1997' > > | > >| TBLPROPERTIES ('COLUMN_STATS_ACCURATE'='true', 'numFiles'='3', > >'numRows'='10', 'rawDataSize'='80', 'totalSize'='1619937') | > >+-- > -+ > > > >all columns disappear, the column change will correct if I restart > >catalogd, I think it is a BUG caused by hive metastore client, It is any > >good idea overcome the problem except restart catalogd. > > > > I think we can check columns after getTable from HiveMetastoreClient, if > >it is empty, try to recreate the HiveMetastoreClient(hive do not support > >0-column table). is it a good way to overcome the problem if modify code > >like this? >