This might be a bit far-fetched, but is there any plan for ANALYZE TABLE
statistics collection to be performed in the background on ORC tables,
for example when compaction runs?

Also, I notice that "desc formatted <table>" does not show when
statistics were last gathered. Could that be added in a future release?
I think it would be useful, because a frequent question when a query is
running slowly is whether the stats are up to date on the underlying
table(s).

hive> desc formatted nw_10124772;
OK
# col_name              data_type               comment

transactiondate         date
transactiontype         string
description             string
value                   double
balance                 double
accountname             string
accountnumber           int

# Detailed Table Information
Database:               accounts
Owner:                  hduser
CreateTime:             Sun Mar 27 17:29:53 BST 2016
LastAccessTime:         UNKNOWN
Retention:              0
Location:               hdfs://rhes564:9000/user/hive/warehouse/accounts.db/nw_10124772
Table Type:             MANAGED_TABLE
Table Parameters:
        COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
        comment                 from csv file from excel sheet
        numFiles                6
        numRows                 1447
        orc.compress            ZLIB
        rawDataSize             0
        totalSize               36537
        transient_lastDdlTime   1459121295

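For context, a minimal sketch of the manual equivalent, assuming the
table shown above. The first two statements refresh basic and column
statistics by hand, and the last one re-checks the table parameters:

```sql
-- Switch to the database that holds the example table.
USE accounts;

-- Refresh basic table statistics (numFiles, numRows, rawDataSize, totalSize).
ANALYZE TABLE nw_10124772 COMPUTE STATISTICS;

-- Refresh column-level statistics as well.
ANALYZE TABLE nw_10124772 COMPUTE STATISTICS FOR COLUMNS;

-- Re-check the table parameters afterwards.
DESCRIBE FORMATTED nw_10124772;
```

As noted above, the output of the last statement carries no explicit
"statistics last gathered" timestamp.
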
Thanks
Dr Mich Talebzadeh
LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com