This might be a bit far-fetched, but is there any plan for background
statistics collection (ANALYZE TABLE ... COMPUTE STATISTICS) to be performed
on ORC tables, for example when compaction runs?
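
For reference, the manual step that such a background job would have to automate looks roughly like the sketch below. The database and table names are taken from the example further down. Whether compaction could or should trigger this is exactly the open question, so nothing here implies an existing hook.

```sql
-- Manual statistics collection, run by hand today.
USE accounts;

-- Basic table-level stats (numFiles, numRows, rawDataSize, totalSize).
ANALYZE TABLE nw_10124772 COMPUTE STATISTICS;

-- Column-level stats, used by the cost-based optimizer.
ANALYZE TABLE nw_10124772 COMPUTE STATISTICS FOR COLUMNS;
```

Basic stats can also be gathered at insert time when hive.stats.autogather is enabled, which is separate from the background job asked about here.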

Also, I notice that "desc formatted <table>" does not show when statistics
were last gathered. Could that be added in a future release? I think it
would be useful, because a frequent question when a query is running slowly
is whether the stats are up to date on the underlying table(s).
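
Until something like that exists, one partial check is to look at the statistics themselves. A minimal sketch, assuming column stats have already been computed for the table shown below; the column picked is just one example from that table:

```sql
USE accounts;

-- Shows column-level statistics (min, max, num_nulls, distinct_count, ...)
-- for a single column, if they have been computed.
DESCRIBE FORMATTED nw_10124772 balance;
```

This shows what the stats are, but still not when they were gathered.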

hive> desc formatted nw_10124772;
OK
# col_name              data_type               comment
transactiondate         date
transactiontype         string
description             string
value                   double
balance                 double
accountname             string
accountnumber           int
# Detailed Table Information
Database:               accounts
Owner:                  hduser
CreateTime:             Sun Mar 27 17:29:53 BST 2016
LastAccessTime:         UNKNOWN
Retention:              0
Location:               hdfs://rhes564:9000/user/hive/warehouse/accounts.db/nw_10124772
Table Type:             MANAGED_TABLE
Table Parameters:
        COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
        comment                 from csv file from excel sheet
        numFiles                6
        numRows                 1447
        orc.compress            ZLIB
        rawDataSize             0
        totalSize               36537
        transient_lastDdlTime   1459121295
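
The only timestamp among the table parameters above is transient_lastDdlTime, which is in epoch seconds. A small sketch for turning it into something readable follows. Whether this value is actually refreshed by ANALYZE in a given Hive version is an assumption that would need checking, so treat it as a rough hint at best:

```sql
-- Convert the epoch value from the Table Parameters section to a timestamp.
-- The result is rendered in the session's time zone.
SELECT from_unixtime(1459121295);
```

In UTC that value corresponds to 2016-03-27 23:28:15.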


Thanks


Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com
