[jira] [Created] (HIVE-6308) COLUMNS_V2 Metastore table not populated for tables created without an explicit column list.
Alexander Behm created HIVE-6308:

Summary: COLUMNS_V2 Metastore table not populated for tables created without an explicit column list.
Key: HIVE-6308
URL: https://issues.apache.org/jira/browse/HIVE-6308
Project: Hive
Issue Type: Bug
Components: Database/Schema
Affects Versions: 0.10.0
Reporter: Alexander Behm

Consider this example table:

{code}
CREATE TABLE avro_test
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS
INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES ('avro.schema.url'='file:///path/to/the/schema/test_serializer.avsc');
{code}

When I try to run an ANALYZE TABLE for computing column stats on any of the columns, I get:

{code}
org.apache.hadoop.hive.ql.metadata.HiveException: NoSuchObjectException(message:Column o_orderpriority for which stats gathering is requested doesn't exist.)
	at org.apache.hadoop.hive.ql.metadata.Hive.updateTableColumnStatistics(Hive.java:2280)
	at org.apache.hadoop.hive.ql.exec.ColumnStatsTask.persistTableStats(ColumnStatsTask.java:331)
	at org.apache.hadoop.hive.ql.exec.ColumnStatsTask.execute(ColumnStatsTask.java:343)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:66)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1383)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1169)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:982)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
{code}

The root cause appears to be that the COLUMNS_V2 table in the Metastore isn't populated properly during table creation.

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
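A possible workaround sketch (untested; the column list is hypothetical and would have to mirror the Avro schema exactly) is to declare the columns explicitly at CREATE TABLE time so the Metastore records them in COLUMNS_V2:

{code}
-- Hypothetical explicit column list; names/types must match test_serializer.avsc.
CREATE TABLE avro_test_explicit (o_orderpriority STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS
INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES ('avro.schema.url'='file:///path/to/the/schema/test_serializer.avsc');
{code}

With the columns stored in the Metastore, ANALYZE TABLE ... COMPUTE STATISTICS FOR COLUMNS should then be able to resolve them.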
[jira] [Created] (HIVE-5264) SQL generated by MetaStoreDirectSql.java not compliant with Postgres.
Alexander Behm created HIVE-5264:

Summary: SQL generated by MetaStoreDirectSql.java not compliant with Postgres.
Key: HIVE-5264
URL: https://issues.apache.org/jira/browse/HIVE-5264
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.11.0
Environment: Ubuntu 12.04, PostgreSQL 9.1.8
Reporter: Alexander Behm

Some operations against the Hive Metastore seem broken against Postgres. For example, when using HiveMetastoreClient.listPartitions() the Postgres logs show queries such as:

{code}
2013-09-09 19:10:01 PDT STATEMENT: select PARTITIONS.PART_ID from PARTITIONS
  inner join TBLS on PARTITIONS.TBL_ID = TBLS.TBL_ID
  inner join DBS on TBLS.DB_ID = DBS.DB_ID
  where TBLS.TBL_NAME = $1 and DBS.NAME = $2 order by PART_NAME asc
{code}

with a somewhat cryptic (but correct) error:

{code}
ERROR: relation "partitions" does not exist at character 32
{code}

Postgres identifiers are somewhat unusual: unquoted identifiers are folded to lower case (there is no Postgres option to change this). Since the Metastore table schema uses upper-case table names, the correct SQL requires quoted identifiers for those tables, i.e., select "PARTITIONS"."PART_ID" from "PARTITIONS"... Hive sets metastore.try.direct.sql=true by default, so the above SQL is generated by hive/metastore/MetaStoreDirectSql.java, i.e., this is not a DataNucleus problem. When I set metastore.try.direct.sql=false, the Metastore backed by Postgres works.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
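For reference, a fully quoted form of the generated statement (a sketch extrapolated from the unquoted query above; not tested against the Metastore schema) would be:

{code}
select "PARTITIONS"."PART_ID" from "PARTITIONS"
  inner join "TBLS" on "PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID"
  inner join "DBS" on "TBLS"."DB_ID" = "DBS"."DB_ID"
  where "TBLS"."TBL_NAME" = $1 and "DBS"."NAME" = $2 order by "PART_NAME" asc
{code}

Quoting every identifier preserves the upper-case names that the Metastore schema scripts create, so Postgres resolves the relations correctly.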
[jira] [Commented] (HIVE-5264) SQL generated by MetaStoreDirectSql.java not compliant with Postgres.
[ https://issues.apache.org/jira/browse/HIVE-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763767#comment-13763767 ] Alexander Behm commented on HIVE-5264:

Thanks, Sergey. I don't know when the change went in, so you are probably correct about the version. Afaik, Postgres does not have an ANSI-compliance flag or any other means to change that specific behavior with identifiers. Just trying to be a good citizen, so I reported the issue. I understand the difficulties and I'm not suggesting any particular fix. Setting metastore.try.direct.sql=false is an acceptable workaround for me (but others may disagree).

The fallback didn't work for me. I think the reason may be that the original (failed) SQL statement and the fallback are executed under the same transaction (see ObjectStore.java getPartitionsByNamesInternal()), so the fallback is also aborted; see this snippet from the Postgres log immediately after the error message:

{code}
2013-09-09 19:10:01 PDT ERROR: current transaction is aborted, commands ignored until end of transaction block
{code}

I didn't dig too deep into the code, so don't take my word for it.
[jira] [Commented] (HIVE-5264) SQL generated by MetaStoreDirectSql.java not compliant with Postgres.
[ https://issues.apache.org/jira/browse/HIVE-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763791#comment-13763791 ] Alexander Behm commented on HIVE-5264:

The logs seem to suggest so.

> SQL generated by MetaStoreDirectSql.java not compliant with Postgres.
> ---------------------------------------------------------------------
>
> Key: HIVE-5264
> URL: https://issues.apache.org/jira/browse/HIVE-5264
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 0.12.0
> Environment: Ubuntu 12.04
> PostgreSQL 9.1.8
> Reporter: Alexander Behm
> Assignee: Sergey Shelukhin
[jira] [Commented] (HIVE-4022) Structs and struct fields cannot be NULL in INSERT statements
[ https://issues.apache.org/jira/browse/HIVE-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268625#comment-14268625 ] Alexander Behm commented on HIVE-4022:

Easier workaround: IF(false, named_struct("a", 1), NULL)

> Structs and struct fields cannot be NULL in INSERT statements
> -------------------------------------------------------------
>
> Key: HIVE-4022
> URL: https://issues.apache.org/jira/browse/HIVE-4022
> Project: Hive
> Issue Type: Bug
> Components: Serializers/Deserializers
> Reporter: Michael Malak
>
> Originally thought to be Avro-specific, and first noted with respect to
> HIVE-3528 "Avro SerDe doesn't handle serializing Nullable types that require
> access to a Schema", it turns out even native Hive tables cannot store NULL
> in a STRUCT field or for the entire STRUCT itself, at least when the NULL is
> specified directly in the INSERT statement.
> Again, this affects both Avro-backed tables and native Hive tables.
>
> ***For native Hive tables:
> The following:
> echo 1,2 >twovalues.csv
> hive
> CREATE TABLE tc (x INT, y INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
> LOAD DATA LOCAL INPATH 'twovalues.csv' INTO TABLE tc;
> CREATE TABLE oc (z STRUCT);
> INSERT INTO TABLE oc SELECT null FROM tc;
> produces the error
> FAILED: SemanticException [Error 10044]: Line 1:18 Cannot insert into target
> table because column number/types are different 'oc': Cannot convert column 0
> from void to struct.
> The following:
> INSERT INTO TABLE oc SELECT named_struct('a', null, 'b', null) FROM tc;
> produces the error:
> FAILED: SemanticException [Error 10044]: Line 1:18 Cannot insert into target
> table because column number/types are different 'oc': Cannot convert column 0
> from struct to struct.
>
> ***For Avro:
> In HIVE-3528, there is in fact a null-struct test case in line 14 of
> https://github.com/apache/hive/blob/15cc604bf10f4c2502cb88fb8bb3dcd45647cf2c/data/files/csv.txt
> The test script at
> https://github.com/apache/hive/blob/12d6f3e7d21f94e8b8490b7c6d291c9f4cac8a4f/ql/src/test/queries/clientpositive/avro_nullable_fields.q
> does indeed work. But in that test, the query gets all of its data from a
> test table verbatim:
> INSERT OVERWRITE TABLE as_avro SELECT * FROM test_serializer;
> If instead we stick in a hard-coded null for the struct directly into the
> query, it fails:
> INSERT OVERWRITE TABLE as_avro SELECT string1, int1, tinyint1, smallint1,
> bigint1, boolean1, float1, double1, list1, map1, null, enum1, nullableint,
> bytes1, fixed1 FROM test_serializer;
> with the following error:
> FAILED: SemanticException [Error 10044]: Line 1:23 Cannot insert into target
> table because column number/types are different 'as_avro': Cannot convert
> column 10 from void to struct.
> Note, though, that substituting a hard-coded null for string1 (and restoring
> struct1 into the query) does work:
> INSERT OVERWRITE TABLE as_avro SELECT null, int1, tinyint1, smallint1,
> bigint1, boolean1, float1, double1, list1, map1, struct1, enum1, nullableint,
> bytes1, fixed1 FROM test_serializer;
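The IF(false, ...) workaround from the comment above can be applied directly to the failing native-table INSERT; a sketch (the struct field names and types are assumed for illustration, since the column definition is not fully shown in the report):

{code}
-- Hypothetical: assumes oc's struct column has int fields a and b.
INSERT INTO TABLE oc SELECT IF(false, named_struct('a', 1, 'b', 1), NULL) FROM tc;
{code}

The IF expression gives the SELECT item a concrete struct type instead of void, so the semantic check should pass while still inserting NULL.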
[jira] [Created] (HIVE-18743) CREATE TABLE on S3 data can be extremely slow. DO_NOT_UPDATE_STATS workaround is buggy.
Alexander Behm created HIVE-18743:

Summary: CREATE TABLE on S3 data can be extremely slow. DO_NOT_UPDATE_STATS workaround is buggy.
Key: HIVE-18743
URL: https://issues.apache.org/jira/browse/HIVE-18743
Project: Hive
Issue Type: Improvement
Components: Metastore
Affects Versions: 1.1.0, 1.2.0
Reporter: Alexander Behm

When hive.stats.autogather=true, the Metastore lists all files under the table directory to populate basic stats like file counts and sizes. This file-listing operation can be very expensive, particularly on filesystems like S3. One way to address this issue is to reconfigure hive.stats.autogather=false.

*Here's the bug:* It is my understanding that the DO_NOT_UPDATE_STATS table property is intended to selectively prevent this stats collection. Unfortunately, the table property is checked *after* the expensive file-listing operation, so DO_NOT_UPDATE_STATS does not seem to work as intended. See: https://github.com/apache/hive/blob/master/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java#L633

Relevant code snippet:

{code}
public static boolean updateTableStatsFast(Database db, Table tbl, Warehouse wh,
    boolean madeDir, boolean forceRecompute, EnvironmentContext environmentContext)
    throws MetaException {
  if (tbl.getPartitionKeysSize() == 0) {
    // Update stats only when unpartitioned
    FileStatus[] fileStatuses = wh.getFileStatusesForUnpartitionedTable(db, tbl);
    // DO_NOT_UPDATE_STATS is checked in here, after
    // wh.getFileStatusesForUnpartitionedTable() has already been called.
    return updateTableStatsFast(tbl, fileStatuses, madeDir, forceRecompute, environmentContext);
  } else {
    return false;
  }
}
{code}
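A sketch of the two mitigations discussed above (the table name is hypothetical, and neither has been verified here against a particular Hive version):

{code}
-- Global workaround: disable automatic stats gathering entirely.
SET hive.stats.autogather=false;

-- Per-table property; per this report it is checked only after the
-- expensive file listing, so it does not currently avoid the S3 cost.
ALTER TABLE my_s3_table SET TBLPROPERTIES ('DO_NOT_UPDATE_STATS'='true');
{code}

The per-table property would be the preferable knob once the check is moved before wh.getFileStatusesForUnpartitionedTable().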
[jira] [Created] (HIVE-9720) Metastore does not properly migrate column stats when renaming a table across databases.
Alexander Behm created HIVE-9720:

Summary: Metastore does not properly migrate column stats when renaming a table across databases.
Key: HIVE-9720
URL: https://issues.apache.org/jira/browse/HIVE-9720
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.13.1
Reporter: Alexander Behm

It appears that the Hive Metastore does not properly migrate column statistics when renaming a table across databases. While renaming across databases is not supported in HiveQL, it can be done via the Metastore Thrift API. The problem is that such a newly renamed table cannot be dropped (unless renamed back to its original database/name).

Here are steps for reproducing the issue.

1. From the Hive shell/beeline:

{code}
create database db1;
create database db2;
create table db1.mv (i int);
use db1;
analyze table mv compute statistics for columns i;
{code}

2. From a Java program:

{code}
public static void main(String[] args) throws Exception {
  HiveConf conf = new HiveConf(MetaStoreClientPool.class);
  HiveMetaStoreClient hiveClient = new HiveMetaStoreClient(conf);
  Table t = hiveClient.getTable("db1", "mv");
  t.setDbName("db2");
  t.setTableName("mv2");
  hiveClient.alter_table("db1", "mv", t);
}
{code}

3. From the Hive shell/beeline:

{code}
drop table db2.mv2;
{code}

Stack shown when running 3:

{code}
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
MetaException(message:javax.jdo.JDODataStoreException: Exception thrown flushing changes to datastore
	at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451)
	at org.datanucleus.api.jdo.JDOTransaction.commit(JDOTransaction.java:165)
	at org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:411)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:108)
	at com.sun.proxy.$Proxy0.commitTransaction(Unknown Source)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:1389)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_with_environment_context(HiveMetaStore.java:1525)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:106)
	at com.sun.proxy.$Proxy1.drop_table_with_environment_context(Unknown Source)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_table_with_environment_context.getResult(ThriftHiveMetastore.java:8072)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_table_with_environment_context.getResult(ThriftHiveMetastore.java:8056)
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
	at org.apache.hadoop.hive.metastore.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:48)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:244)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:724)
NestedThrowablesStackTrace:
java.sql.BatchUpdateException: Batch entry 0 DELETE FROM "TBLS" WHERE "TBL_ID"='1621' was aborted. Call getNextException to see the cause.
	at org.postgresql.jdbc2.AbstractJdbc2Statement$BatchResultHandler.handleError(AbstractJdbc2Statement.java:2598)
	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1836)
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:407)
	at org.postgresql.jdbc2.AbstractJdbc2Statement.executeBatch(AbstractJdbc2Statement.java:2737)
	at com.jolbox.bonecp.StatementHandle.executeBatch(StatementHandle.java:424)
	at org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeBatch(ParamLoggingPreparedStatement.java:372)
	at org.datanucleus.store.rdbms.SQLController.processConnectionStatement(SQLController.java:628)
{code}
[jira] [Created] (HIVE-10427) collect_list() and collect_set() should accept struct types as argument
Alexander Behm created HIVE-10427:

Summary: collect_list() and collect_set() should accept struct types as argument
Key: HIVE-10427
URL: https://issues.apache.org/jira/browse/HIVE-10427
Project: Hive
Issue Type: Wish
Components: UDF
Reporter: Alexander Behm

The collect_list() and collect_set() functions currently only accept scalar argument types. It would be very useful if these functions could also accept struct argument types for creating nested data from flat data. For example, suppose I wanted to create a nested customers/orders table from two flat tables, customers and orders. Then it'd be very convenient to write something like this:

{code}
insert into table nested_customers_orders
select c.*, collect_list(named_struct("oid", o.oid, "order_date", o.date, ...))
from customers c
inner join orders o on (c.cid = o.cid)
group by c.cid
{code}

Thank you for your consideration.
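Until struct arguments are supported, one workaround sketch (using the same hypothetical customers/orders schema as above, with a differently shaped target table; untested) is to collect each struct field into parallel arrays:

{code}
-- Parallel-array workaround: one collect_list() per scalar field; the i-th
-- entries across the arrays together describe the i-th order of a customer.
insert into table nested_customers_orders_arrays
select c.*, collect_list(o.oid), collect_list(o.date)
from customers c
inner join orders o on (c.cid = o.cid)
group by c.cid
{code}

This loses the grouping of fields into a single struct, which is exactly what the requested feature would provide.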
[jira] [Created] (HIVE-11622) Creating an Avro table with a complex map-typed column leads to incorrect column type.
Alexander Behm created HIVE-11622:

Summary: Creating an Avro table with a complex map-typed column leads to incorrect column type.
Key: HIVE-11622
URL: https://issues.apache.org/jira/browse/HIVE-11622
Project: Hive
Issue Type: Bug
Components: Database/Schema
Affects Versions: 1.1.0
Reporter: Alexander Behm

In the CREATE TABLE below, the map-typed column ends up with the wrong type. I suspect some problem with inferring the Avro schema from the column definitions, but I am not sure.

Reproduction:

{code}
hive> create table t (c map>) stored as avro;
OK
Time taken: 0.101 seconds
hive> desc t;
OK
c	array>	from deserializer
Time taken: 0.135 seconds, Fetched: 1 row(s)
{code}

Note how the type shown in DESCRIBE is not the type originally passed in the CREATE TABLE.
[jira] [Created] (HIVE-15653) Some ALTER TABLE commands drop table stats
Alexander Behm created HIVE-15653:

Summary: Some ALTER TABLE commands drop table stats
Key: HIVE-15653
URL: https://issues.apache.org/jira/browse/HIVE-15653
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 1.1.0
Reporter: Alexander Behm
Priority: Critical

Some ALTER TABLE commands drop the table stats. That may make sense for some ALTER TABLE operations, but certainly not for others. Personally, I think ALTER TABLE should only change what was requested by the user, without any side effects that may be unclear to users. In particular, collecting stats can be an expensive operation, so it's rather inconvenient for users if they get wiped accidentally.

Repro:

{code}
create table t (i int);
insert into t values(1);
analyze table t compute statistics;
alter table t set tblproperties('test'='test');

hive> describe formatted t;
OK
# col_name	data_type	comment
i	int

# Detailed Table Information
Database:	default
Owner:	abehm
CreateTime:	Tue Jan 17 18:13:34 PST 2017
LastAccessTime:	UNKNOWN
Protect Mode:	None
Retention:	0
Location:	hdfs://localhost:20500/test-warehouse/t
Table Type:	MANAGED_TABLE
Table Parameters:
	COLUMN_STATS_ACCURATE	false
	last_modified_by	abehm
	last_modified_time	1484705748
	numFiles	1
	numRows	-1
	rawDataSize	-1
	test	test
	totalSize	2
	transient_lastDdlTime	1484705748

# Storage Information
SerDe Library:	org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat:	org.apache.hadoop.mapred.TextInputFormat
OutputFormat:	org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed:	No
Num Buckets:	-1
Bucket Columns:	[]
Sort Columns:	[]
Storage Desc Params:
	serialization.format	1
Time taken: 0.169 seconds, Fetched: 34 row(s)
{code}

The same behavior can be observed with several other ALTER TABLE commands.
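Until this is fixed, a minimal mitigation sketch (assumed, not verified; it merely recomputes what the ALTER TABLE wiped rather than preventing the wipe):

{code}
-- Recompute the basic stats dropped by the ALTER TABLE in the repro above,
-- then inspect Table Parameters to check numRows/totalSize are populated again.
analyze table t compute statistics;
describe formatted t;
{code}

For large tables this recompute is itself expensive, which is why losing the stats as a side effect is so costly.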