Andre Araujo created IMPALA-7876:
------------------------------------

             Summary: COMPUTE STATS TABLESAMPLE is not updating number of estimated rows
                 Key: IMPALA-7876
                 URL: https://issues.apache.org/jira/browse/IMPALA-7876
             Project: IMPALA
          Issue Type: Bug
    Affects Versions: Impala 3.0
            Reporter: Andre Araujo


Running the command below has no apparent effect on the #Rows stats: afterwards, SHOW TABLE STATS reports #Rows = 0 and Extrap #Rows = -1.

{code}
[host:21000] default> COMPUTE STATS wide TABLESAMPLE SYSTEM(5);
Query: COMPUTE STATS wide TABLESAMPLE SYSTEM(100)
+-------------------------------------------+
| summary                                   |
+-------------------------------------------+
| Updated 1 partition(s) and 103 column(s). |
+-------------------------------------------+
WARNINGS: Ignoring TABLESAMPLE because the effective sampling rate is 100%.
The minimum sample size is COMPUTE_STATS_MIN_SAMPLE_SIZE=1.00GB and the table size 20.35GB
Fetched 1 row(s) in 43.67s

[host:21000] default> show table stats wide;
Query: show table stats wide
+-------+--------------+--------+---------+--------------+-------------------+---------+-------------------+-------------------------------------+
| #Rows | Extrap #Rows | #Files | Size    | Bytes Cached | Cache Replication | Format  | Incremental stats | Location                            |
+-------+--------------+--------+---------+--------------+-------------------+---------+-------------------+-------------------------------------+
| 0     | -1           | 84     | 20.35GB | NOT CACHED   | NOT CACHED        | PARQUET | false             | hdfs://ns1/user/hive/warehouse/wide |
+-------+--------------+--------+---------+--------------+-------------------+---------+-------------------+-------------------------------------+
Fetched 1 row(s) in 0.01s
{code}
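
For context on the warning in the output above, here is a rough sketch of the sampling-rate arithmetic. It is a simplified model and is not taken from the Impala source: it assumes the target sample size is the larger of the requested percentage of the table and COMPUTE_STATS_MIN_SAMPLE_SIZE, and that the effective rate is capped at 100%. The function name and parameters are hypothetical. Impala samples whole files, so real rates may differ.

```python
GB = 1024 ** 3


def effective_sampling_rate(table_bytes, sample_percent, min_sample_bytes):
    """Hypothetical model of the effective sampling rate (NOT Impala's code).

    Assumption: the sample must cover at least min_sample_bytes, and the
    rate can never exceed 1.0 (the whole table).
    """
    if table_bytes <= 0:
        # Nothing to sample from; treat as a full scan.
        return 1.0
    requested_bytes = table_bytes * sample_percent / 100.0
    target_bytes = max(requested_bytes, min_sample_bytes)
    return min(1.0, target_bytes / table_bytes)


# Values from the report: 20.35GB table, 1.00GB minimum sample size.
table_bytes = 20.35 * GB
rate_100 = effective_sampling_rate(table_bytes, 100, 1 * GB)
rate_5 = effective_sampling_rate(table_bytes, 5, 1 * GB)
print(rate_100, round(rate_5, 4))
```

Under this model, SYSTEM(100) gives an effective rate of 1.0, which matches the "Ignoring TABLESAMPLE" warning, while SYSTEM(5) on a 20.35GB table would give a rate of roughly 0.05.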


