David Wayne Birdsall created TRAFODION-3316:
-----------------------------------------------

             Summary: Fix some issues with incremental UPDATE STATISTICS
                 Key: TRAFODION-3316
                 URL: https://issues.apache.org/jira/browse/TRAFODION-3316
             Project: Apache Trafodion
          Issue Type: Improvement
            Reporter: David Wayne Birdsall
            Assignee: David Wayne Birdsall


Fix some issues with incremental UPDATE STATISTICS.
 # The Hive TIMESTAMP datatype has precision 9. When processing a Hive 
TIMESTAMP column, UPDATE STATISTICS abends.
 # After fixing that, when testing a small example, the sampling ratio for the 
sample table is made incorrectly small, resulting in 9207 errors (sample is 
empty).
 # Sometimes, when populating the sample table, the UPSERT fails because 
garbage is used for the table name. This is because the table name is taken 
from an NAString that has gone out of scope.

The following test script can be used to test these conditions. In Hive, do the 
following:
{quote}create database if not exists h_increUpdatestats;
use h_increUpdatestats;
drop table if exists hive_sequencefile ;
create table hive_sequencefile(C_CUSTKEY int,C_NAME char(20),C_ADDRESS 
varchar(20), C_NATIONKEY int, C_TIMESTAMP timestamp,C_PHONE char(20), C_ACCTBAL 
string, 
C_MKTSEGMENT char(20), C_DATE date) stored as sequencefile;

insert into hive_sequencefile values 
(1,'x','xx',1,'2018-08-22 12:00:00.123','xxx','xxxx','xxxxx','2018-08-22'),
(2,'x','xx',2,'2018-08-23 12:00:00.123','xxx','xxxx','xxxxx','2018-08-23'),
(3,'x','xx',3,'2018-08-24 12:00:00.123','xxx','xxxx','xxxxx','2018-08-24'),
(4,'x','xx',4,'2018-08-25 12:00:00.123','xxx','xxxx','xxxxx','2018-08-25'),
(5,'x','xx',5,'2018-08-26 12:00:00.123','xxx','xxxx','xxxxx','2018-08-26');
{quote}
Then in Trafodion do the following:
{quote}update statistics for table hive.h_increUpdatestats.hive_sequencefile 
remove sample;
update statistics for table hive.h_increUpdatestats.hive_sequencefile create 
sample random 100 percent;
update statistics for table hive.h_increUpdatestats.hive_sequencefile on every 
column;
update statistics for table hive.h_increUpdatestats.hive_sequencefile on 
existing columns incremental where C_CUSTKEY>=0;
{quote}
Without any fixes, the third UPDATE STATISTICS command abends.

With a fix for the first problem, the last UPDATE STATISTICS command sometimes 
fails with error 2109, and always fails with error 9207.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to