Re: show create table return empty after change column name in hive

2017-10-13 Thread yu feng
I am using impala 2.10.0 and hadoop-2.6.0-cdh5.13.0-SNAPSHOT
 hive-1.1.0-cdh5.13.0-SNAPSHOT.
I use apache-hive-1.2.1 to do alter table in hive cli.

Here is catalogd log :

I1013 16:04:51.927434 31504 CatalogServiceCatalog.java:1081] Invalidating
table metadata: impala_test.sales_fact_1997
I1013 16:04:51.927593 31504 PerfLogger.java:127] 
I1013 16:04:51.927728 31504 HiveMetaStore.java:835] 17: get_table :
db=impala_test tbl=sales_fact_1997
I1013 16:04:51.927846 31504 HiveMetaStore.java:404] ugi=impala/
db-87.photo.163@hadoop.hz.netease.com ip=unknown-ip-addr cmd=get_table
: db=impala_test tbl=sales_fact_1997
I1013 16:04:51.935062 31504 PerfLogger.java:168] 
I1013 16:04:51.935269 31113 TableLoadingMgr.java:285] Loading next table
from queue: impala_test.sales_fact_1997
I1013 16:04:51.935329 31113 TableLoadingMgr.java:287] Remaining items in
queue: 0. Loads in progress: 0
I1013 16:04:51.935458 31123 TableLoader.java:58] Loading metadata for:
impala_test.sales_fact_1997
I1013 16:04:51.935539 31123 PerfLogger.java:127] 
I1013 16:04:51.935638 31123 HiveMetaStore.java:835] 1: get_table :
db=impala_test tbl=sales_fact_1997
I1013 16:04:51.935750 31123 HiveMetaStore.java:404] ugi=impala/
db-87.photo.163@hadoop.hz.netease.com ip=unknown-ip-addr cmd=get_table
: db=impala_test tbl=sales_fact_1997
I1013 16:04:51.940784 31123 PerfLogger.java:168] 
I1013 16:04:51.940942 31123 PerfLogger.java:127] 
I1013 16:04:51.941056 31123 HiveMetaStore.java:835] 1: get_config_value:
name=hive.exec.default.partition.name
defaultValue=__HIVE_DEFAULT_PARTITION__
I1013 16:04:51.941190 31123 HiveMetaStore.java:404] ugi=impala/
db-87.photo.163@hadoop.hz.netease.com ip=unknown-ip-addr
cmd=get_config_value:
name=hive.exec.default.partition.name
defaultValue=__HIVE_DEFAULT_PARTITION__
I1013 16:04:51.941308 31123 PerfLogger.java:168] 
I1013 16:04:51.941398 31123 PerfLogger.java:127] 
I1013 16:04:51.941498 31123 HiveMetaStore.java:835] 1:
get_table_statistics_req: db=impala_test table=sales_fact_1997
I1013 16:04:51.941601 31123 HiveMetaStore.java:404] ugi=impala/
db-87.photo.163@hadoop.hz.netease.com ip=unknown-ip-addr
cmd=get_table_statistics_req:
db=impala_test table=sales_fact_1997
I1013 16:04:51.946686 31123 PerfLogger.java:168] 
I1013 16:04:51.946748 31123 HdfsTable.java:984] Fetching partition metadata
from the Metastore: impala_test.sales_fact_1997
I1013 16:04:51.946826 31123 PerfLogger.java:127] 
I1013 16:04:51.946928 31123 HiveMetaStore.java:835] 1: get_partition_names
: db=impala_test tbl=sales_fact_1997
I1013 16:04:51.947038 31123 HiveMetaStore.java:404] ugi=impala/
db-87.photo.163@hadoop.hz.netease.com ip=unknown-ip-addr
cmd=get_partition_names
: db=impala_test tbl=sales_fact_1997
I1013 16:04:51.948602 31123 PerfLogger.java:168] 
I1013 16:04:51.948664 31123 HdfsTable.java:988] Fetched partition metadata
from the Metastore: impala_test.sales_fact_1997
I1013 16:04:51.957386 31123 HdfsTable.java:674] Loading file and block
metadata for 1 paths: impala_test.sales_fact_1997
I1013 16:04:51.958546 31123 HdfsTable.java:679] Loaded file and block
metadata for 1 paths: impala_test.sales_fact_1997
I1013 16:04:51.958627 31123 TableLoader.java:97] Loaded metadata for:
impala_test.sales_fact_1997
I1013 16:04:52.328815 31139 catalog-server.cc:320] Publishing update:
TABLE:impala_test.sales_fact_1997@1364
I1013 16:04:52.328985 31139 catalog-server.cc:320] Publishing update:
CATALOG:b11b2a23ca4a496d:8e99307f267d4d54@1364



I trace the code in TableLoader.load, msTbl =
msClient.getHiveClient().getTable(db.getName(), tblName); return the table,
but msTbl.getSd().getCols() is empty, and I had tested in my code which get
table's metastore twice and between them, I alter table in hive-cli,  the
result of the second operation is incorrect, So, I think it is a bug of
hive MetaStoreClient .



2017-10-13 13:10 GMT+08:00 Jeszy :

> Hey Yu,
>
> I tried to reproduce on a CDH5.13 cluster, but your exact commands
> work as expected for me. Are you using Impala 2.10 on a CDH5.13
> cluster, or something else? Can you share your catalog and Hive
> metastore logs?
>
> Thanks.
>
> On 12 October 2017 at 19:39, yu feng  wrote:
> > I try to use ' invalidate metadata' for the whole catalog, But the
> modified
> > table is still empty.  I am doubt the only way is restart catalogd.
> >
> > BTW, I test with the newest version(2.10.0)
> >
> > 2017-10-13 0:17 GMT+08:00 Jeszy :
> >
> >> This does sound like a bug. What version are you using? Do you see any
> >> errors in the catalog logs?
> >> I think a global invalidate metadata should work, and it's a bit less
> >> intrusive than a catalog restart. In general, it is a good idea to do
> >> all metadata operations from Impala if you are using Impala at all, it
> >> helps a lot in making metadata operations seamless.
> >>
> >> On 12 October 2017 at 02:53, yu feng  wrote:
> >> > In our scene, users always do metadata 

Re: show create table return empty after change column name in hive

2017-10-12 Thread Jeszy
Hey Yu,

I tried to reproduce on a CDH5.13 cluster, but your exact commands
work as expected for me. Are you using Impala 2.10 on a CDH5.13
cluster, or something else? Can you share your catalog and Hive
metastore logs?

Thanks.

On 12 October 2017 at 19:39, yu feng  wrote:
> I try to use ' invalidate metadata' for the whole catalog, But the modified
> table is still empty.  I am doubt the only way is restart catalogd.
>
> BTW, I test with the newest version(2.10.0)
>
> 2017-10-13 0:17 GMT+08:00 Jeszy :
>
>> This does sound like a bug. What version are you using? Do you see any
>> errors in the catalog logs?
>> I think a global invalidate metadata should work, and it's a bit less
>> intrusive than a catalog restart. In general, it is a good idea to do
>> all metadata operations from Impala if you are using Impala at all, it
>> helps a lot in making metadata operations seamless.
>>
>> On 12 October 2017 at 02:53, yu feng  wrote:
>> > In our scene, users always do metadata modifications in hive, and do some
>> > query in impala.
>> >
>> > 2017-10-12 16:31 GMT+08:00 sky :
>> >
>> >> Why is the second step performed in hive, not impala?
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> At 2017-10-12 15:12:38, "yu feng"  wrote:
>> >> >I open impala-shell and hive-cli.
>> >> >1、execute 'show create table impala_test.sales_fact_1997' in
>> impala-shell
>> >> ,
>> >> >return :
>> >> >
>> >> >+--
>> >> -+
>> >> >| result
>> >> > |
>> >> >+--
>> >> -+
>> >> >| CREATE TABLE impala_test.sales_fact_1997 (
>> >> > |
>> >> >|   product_id INT,
>> >> >|
>> >> >|   time_id INT,
>> >> > |
>> >> >|   customer_id INT,
>> >> > |
>> >> >|   promotion_id INT,
>> >> >|
>> >> >|   store_id INT,
>> >> >|
>> >> >|   store_sales DOUBLE,
>> >> >|
>> >> >|   store_cost DOUBLE,
>> >> > |
>> >> >|   unit_sales DOUBLE
>> >> >|
>> >> >| )
>> >> >|
>> >> >|  COMMENT 'Imported by sqoop on 2017/06/09 20:25:40'
>> >> >|
>> >> >| ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u0001' LINES TERMINATED
>> BY
>> >> >'\n'   |
>> >> >| WITH SERDEPROPERTIES ('field.delim'='\u0001', 'line.delim'='\n',
>> >> >'serialization.format'='\u0001') |
>> >> >| STORED AS PARQUET
>> >> >|
>> >> >| LOCATION
>> >> >'hdfs://hz-cluster1/user/nrpt/hive-server/impala_test.
>> db/sales_fact_1997'
>> >> > |
>> >> >| TBLPROPERTIES ('COLUMN_STATS_ACCURATE'='true', 'numFiles'='3',
>> >> >'numRows'='10', 'rawDataSize'='80', 'totalSize'='1619937') |
>> >> >+--
>> >> -+
>> >> >
>> >> >2、execute 'alter table impala_test.sales_fact_1997 change column
>> >> product_id
>> >> >pproduct_id int;'  in hive -cli, return OK.
>> >> >3、execute 'invalidate metadata impala_test.sales_fact_1997 '.
>> >> >4、execute 'show create table impala_test.sales_fact_1997' again in
>> >> >impala-shell, return :
>> >> >
>> >> >+--
>> >> -+
>> >> >| result
>> >> > |
>> >> >+--
>> >> -+
>> >> >| CREATE TABLE impala_test.sales_fact_1997
>> >> > |
>> >> >|  COMMENT 'Imported by sqoop on 2017/06/09 20:25:40'
>> >> >|
>> >> >| ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u0001' LINES TERMINATED
>> BY
>> >> >'\n'   |
>> >> >| WITH SERDEPROPERTIES ('field.delim'='\u0001', 'line.delim'='\n',
>> >> >'serialization.format'='\u0001') |
>> >> >| STORED AS PARQUET
>> >> >|
>> >> >| LOCATION
>> >> >'hdfs://hz-cluster1/user/nrpt/hive-server/impala_test.
>> 

Re: show create table return empty after change column name in hive

2017-10-12 Thread yu feng
I try to use ' invalidate metadata' for the whole catalog, But the modified
table is still empty.  I am doubt the only way is restart catalogd.

BTW, I test with the newest version(2.10.0)

2017-10-13 0:17 GMT+08:00 Jeszy :

> This does sound like a bug. What version are you using? Do you see any
> errors in the catalog logs?
> I think a global invalidate metadata should work, and it's a bit less
> intrusive than a catalog restart. In general, it is a good idea to do
> all metadata operations from Impala if you are using Impala at all, it
> helps a lot in making metadata operations seamless.
>
> On 12 October 2017 at 02:53, yu feng  wrote:
> > In our scene, users always do metadata modifications in hive, and do some
> > query in impala.
> >
> > 2017-10-12 16:31 GMT+08:00 sky :
> >
> >> Why is the second step performed in hive, not impala?
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> At 2017-10-12 15:12:38, "yu feng"  wrote:
> >> >I open impala-shell and hive-cli.
> >> >1、execute 'show create table impala_test.sales_fact_1997' in
> impala-shell
> >> ,
> >> >return :
> >> >
> >> >+--
> >> -+
> >> >| result
> >> > |
> >> >+--
> >> -+
> >> >| CREATE TABLE impala_test.sales_fact_1997 (
> >> > |
> >> >|   product_id INT,
> >> >|
> >> >|   time_id INT,
> >> > |
> >> >|   customer_id INT,
> >> > |
> >> >|   promotion_id INT,
> >> >|
> >> >|   store_id INT,
> >> >|
> >> >|   store_sales DOUBLE,
> >> >|
> >> >|   store_cost DOUBLE,
> >> > |
> >> >|   unit_sales DOUBLE
> >> >|
> >> >| )
> >> >|
> >> >|  COMMENT 'Imported by sqoop on 2017/06/09 20:25:40'
> >> >|
> >> >| ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u0001' LINES TERMINATED
> BY
> >> >'\n'   |
> >> >| WITH SERDEPROPERTIES ('field.delim'='\u0001', 'line.delim'='\n',
> >> >'serialization.format'='\u0001') |
> >> >| STORED AS PARQUET
> >> >|
> >> >| LOCATION
> >> >'hdfs://hz-cluster1/user/nrpt/hive-server/impala_test.
> db/sales_fact_1997'
> >> > |
> >> >| TBLPROPERTIES ('COLUMN_STATS_ACCURATE'='true', 'numFiles'='3',
> >> >'numRows'='10', 'rawDataSize'='80', 'totalSize'='1619937') |
> >> >+--
> >> -+
> >> >
> >> >2、execute 'alter table impala_test.sales_fact_1997 change column
> >> product_id
> >> >pproduct_id int;'  in hive -cli, return OK.
> >> >3、execute 'invalidate metadata impala_test.sales_fact_1997 '.
> >> >4、execute 'show create table impala_test.sales_fact_1997' again in
> >> >impala-shell, return :
> >> >
> >> >+--
> >> -+
> >> >| result
> >> > |
> >> >+--
> >> -+
> >> >| CREATE TABLE impala_test.sales_fact_1997
> >> > |
> >> >|  COMMENT 'Imported by sqoop on 2017/06/09 20:25:40'
> >> >|
> >> >| ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u0001' LINES TERMINATED
> BY
> >> >'\n'   |
> >> >| WITH SERDEPROPERTIES ('field.delim'='\u0001', 'line.delim'='\n',
> >> >'serialization.format'='\u0001') |
> >> >| STORED AS PARQUET
> >> >|
> >> >| LOCATION
> >> >'hdfs://hz-cluster1/user/nrpt/hive-server/impala_test.
> db/sales_fact_1997'
> >> > |
> >> >| TBLPROPERTIES ('COLUMN_STATS_ACCURATE'='true', 'numFiles'='3',
> >> >'numRows'='10', 'rawDataSize'='80', 'totalSize'='1619937') |
> >> >+--
> >> -+
> >> >
> >> >all columns disappear, the column change will correct if I 

Re: show create table return empty after change column name in hive

2017-10-12 Thread Jeszy
This does sound like a bug. What version are you using? Do you see any
errors in the catalog logs?
I think a global invalidate metadata should work, and it's a bit less
intrusive than a catalog restart. In general, it is a good idea to do
all metadata operations from Impala if you are using Impala at all, it
helps a lot in making metadata operations seamless.

On 12 October 2017 at 02:53, yu feng  wrote:
> In our scene, users always do metadata modifications in hive, and do some
> query in impala.
>
> 2017-10-12 16:31 GMT+08:00 sky :
>
>> Why is the second step performed in hive, not impala?
>>
>>
>>
>>
>>
>>
>>
>>
>> At 2017-10-12 15:12:38, "yu feng"  wrote:
>> >I open impala-shell and hive-cli.
>> >1、execute 'show create table impala_test.sales_fact_1997' in impala-shell
>> ,
>> >return :
>> >
>> >+--
>> -+
>> >| result
>> > |
>> >+--
>> -+
>> >| CREATE TABLE impala_test.sales_fact_1997 (
>> > |
>> >|   product_id INT,
>> >|
>> >|   time_id INT,
>> > |
>> >|   customer_id INT,
>> > |
>> >|   promotion_id INT,
>> >|
>> >|   store_id INT,
>> >|
>> >|   store_sales DOUBLE,
>> >|
>> >|   store_cost DOUBLE,
>> > |
>> >|   unit_sales DOUBLE
>> >|
>> >| )
>> >|
>> >|  COMMENT 'Imported by sqoop on 2017/06/09 20:25:40'
>> >|
>> >| ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u0001' LINES TERMINATED BY
>> >'\n'   |
>> >| WITH SERDEPROPERTIES ('field.delim'='\u0001', 'line.delim'='\n',
>> >'serialization.format'='\u0001') |
>> >| STORED AS PARQUET
>> >|
>> >| LOCATION
>> >'hdfs://hz-cluster1/user/nrpt/hive-server/impala_test.db/sales_fact_1997'
>> > |
>> >| TBLPROPERTIES ('COLUMN_STATS_ACCURATE'='true', 'numFiles'='3',
>> >'numRows'='10', 'rawDataSize'='80', 'totalSize'='1619937') |
>> >+--
>> -+
>> >
>> >2、execute 'alter table impala_test.sales_fact_1997 change column
>> product_id
>> >pproduct_id int;'  in hive -cli, return OK.
>> >3、execute 'invalidate metadata impala_test.sales_fact_1997 '.
>> >4、execute 'show create table impala_test.sales_fact_1997' again in
>> >impala-shell, return :
>> >
>> >+--
>> -+
>> >| result
>> > |
>> >+--
>> -+
>> >| CREATE TABLE impala_test.sales_fact_1997
>> > |
>> >|  COMMENT 'Imported by sqoop on 2017/06/09 20:25:40'
>> >|
>> >| ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u0001' LINES TERMINATED BY
>> >'\n'   |
>> >| WITH SERDEPROPERTIES ('field.delim'='\u0001', 'line.delim'='\n',
>> >'serialization.format'='\u0001') |
>> >| STORED AS PARQUET
>> >|
>> >| LOCATION
>> >'hdfs://hz-cluster1/user/nrpt/hive-server/impala_test.db/sales_fact_1997'
>> > |
>> >| TBLPROPERTIES ('COLUMN_STATS_ACCURATE'='true', 'numFiles'='3',
>> >'numRows'='10', 'rawDataSize'='80', 'totalSize'='1619937') |
>> >+--
>> -+
>> >
>> >all columns disappear, the column change will correct if I restart
>> >catalogd, I think it is a BUG caused by hive metastore client, It is any
>> >good idea overcome the problem except restart catalogd.
>> >
>> > I think we can check columns after getTable from HiveMetastoreClient, if
>> >it is empty, try to recreate the HiveMetastoreClient(hive do not support
>> >0-column table). is it a good way to overcome the problem if modify code
>> >like this?
>>


Re: show create table return empty after change column name in hive

2017-10-12 Thread yu feng
In our scene, users always do metadata modifications in hive, and do some
query in impala.

2017-10-12 16:31 GMT+08:00 sky :

> Why is the second step performed in hive, not impala?
>
>
>
>
>
>
>
>
> At 2017-10-12 15:12:38, "yu feng"  wrote:
> >I open impala-shell and hive-cli.
> >1、execute 'show create table impala_test.sales_fact_1997' in impala-shell
> ,
> >return :
> >
> >+--
> -+
> >| result
> > |
> >+--
> -+
> >| CREATE TABLE impala_test.sales_fact_1997 (
> > |
> >|   product_id INT,
> >|
> >|   time_id INT,
> > |
> >|   customer_id INT,
> > |
> >|   promotion_id INT,
> >|
> >|   store_id INT,
> >|
> >|   store_sales DOUBLE,
> >|
> >|   store_cost DOUBLE,
> > |
> >|   unit_sales DOUBLE
> >|
> >| )
> >|
> >|  COMMENT 'Imported by sqoop on 2017/06/09 20:25:40'
> >|
> >| ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u0001' LINES TERMINATED BY
> >'\n'   |
> >| WITH SERDEPROPERTIES ('field.delim'='\u0001', 'line.delim'='\n',
> >'serialization.format'='\u0001') |
> >| STORED AS PARQUET
> >|
> >| LOCATION
> >'hdfs://hz-cluster1/user/nrpt/hive-server/impala_test.db/sales_fact_1997'
> > |
> >| TBLPROPERTIES ('COLUMN_STATS_ACCURATE'='true', 'numFiles'='3',
> >'numRows'='10', 'rawDataSize'='80', 'totalSize'='1619937') |
> >+--
> -+
> >
> >2、execute 'alter table impala_test.sales_fact_1997 change column
> product_id
> >pproduct_id int;'  in hive -cli, return OK.
> >3、execute 'invalidate metadata impala_test.sales_fact_1997 '.
> >4、execute 'show create table impala_test.sales_fact_1997' again in
> >impala-shell, return :
> >
> >+--
> -+
> >| result
> > |
> >+--
> -+
> >| CREATE TABLE impala_test.sales_fact_1997
> > |
> >|  COMMENT 'Imported by sqoop on 2017/06/09 20:25:40'
> >|
> >| ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u0001' LINES TERMINATED BY
> >'\n'   |
> >| WITH SERDEPROPERTIES ('field.delim'='\u0001', 'line.delim'='\n',
> >'serialization.format'='\u0001') |
> >| STORED AS PARQUET
> >|
> >| LOCATION
> >'hdfs://hz-cluster1/user/nrpt/hive-server/impala_test.db/sales_fact_1997'
> > |
> >| TBLPROPERTIES ('COLUMN_STATS_ACCURATE'='true', 'numFiles'='3',
> >'numRows'='10', 'rawDataSize'='80', 'totalSize'='1619937') |
> >+--
> -+
> >
> >all columns disappear, the column change will correct if I restart
> >catalogd, I think it is a BUG caused by hive metastore client, It is any
> >good idea overcome the problem except restart catalogd.
> >
> > I think we can check columns after getTable from HiveMetastoreClient, if
> >it is empty, try to recreate the HiveMetastoreClient(hive do not support
> >0-column table). is it a good way to overcome the problem if modify code
> >like this?
>


show create table return empty after change column name in hive

2017-10-12 Thread yu feng
I open impala-shell and hive-cli.
1、execute 'show create table impala_test.sales_fact_1997' in impala-shell ,
return :

+---+
| result
 |
+---+
| CREATE TABLE impala_test.sales_fact_1997 (
 |
|   product_id INT,
|
|   time_id INT,
 |
|   customer_id INT,
 |
|   promotion_id INT,
|
|   store_id INT,
|
|   store_sales DOUBLE,
|
|   store_cost DOUBLE,
 |
|   unit_sales DOUBLE
|
| )
|
|  COMMENT 'Imported by sqoop on 2017/06/09 20:25:40'
|
| ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u0001' LINES TERMINATED BY
'\n'   |
| WITH SERDEPROPERTIES ('field.delim'='\u0001', 'line.delim'='\n',
'serialization.format'='\u0001') |
| STORED AS PARQUET
|
| LOCATION
'hdfs://hz-cluster1/user/nrpt/hive-server/impala_test.db/sales_fact_1997'
 |
| TBLPROPERTIES ('COLUMN_STATS_ACCURATE'='true', 'numFiles'='3',
'numRows'='10', 'rawDataSize'='80', 'totalSize'='1619937') |
+---+

2、execute 'alter table impala_test.sales_fact_1997 change column product_id
pproduct_id int;'  in hive -cli, return OK.
3、execute 'invalidate metadata impala_test.sales_fact_1997 '.
4、execute 'show create table impala_test.sales_fact_1997' again in
impala-shell, return :

+---+
| result
 |
+---+
| CREATE TABLE impala_test.sales_fact_1997
 |
|  COMMENT 'Imported by sqoop on 2017/06/09 20:25:40'
|
| ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u0001' LINES TERMINATED BY
'\n'   |
| WITH SERDEPROPERTIES ('field.delim'='\u0001', 'line.delim'='\n',
'serialization.format'='\u0001') |
| STORED AS PARQUET
|
| LOCATION
'hdfs://hz-cluster1/user/nrpt/hive-server/impala_test.db/sales_fact_1997'
 |
| TBLPROPERTIES ('COLUMN_STATS_ACCURATE'='true', 'numFiles'='3',
'numRows'='10', 'rawDataSize'='80', 'totalSize'='1619937') |
+---+

all columns disappear, the column change will correct if I restart
catalogd, I think it is a BUG caused by hive metastore client, It is any
good idea overcome the problem except restart catalogd.

 I think we can check columns after getTable from HiveMetastoreClient, if
it is empty, try to recreate the HiveMetastoreClient(hive do not support
0-column table). is it a good way to overcome the problem if modify code
like this?