[ 
https://issues.apache.org/jira/browse/HIVE-19416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510366#comment-16510366
 ] 

Steve Yeom commented on HIVE-19416:
-----------------------------------

Thank you, Sergey, for your comments. I really appreciate them.

1. I will add an "API spec" shortly to make the reader/updater behavior
   explicit. In short:
  1.1 An updater, through an alter... API method, checks and removes
      COLUMN_STATS_ACCURATE (CSA) for the table or partition from the
      metastore. This is a global operation, i.e. a persistent,
      metastore-wide change. One case where this applies is concurrent
      inserts.
  1.2 A reader checks whether the currently existing stats are
      snapshot-isolation-compliant with the calling query and sets CSA
      on the returned object. Note that the returned object is specific
      to each query, because the reader-side check is per query.
2. I will clear TODOs.
3. Regarding taking snapshot info:
  3.1 I think the Hive code is in the process of being changed to
      acquire locks at the beginning of query optimization, before
      taking a database snapshot. So if we depend on locks for the
      logical consistency of Metadata objects, we may have an issue.
  3.2 Just FYI: the places in StatsOptimizer that get the table write id
      list are only used by a reader (not a writer of a
      table/partition), so we don't have the case where the current
      query increments the write id of its target tables. I.e., a
      global database snapshot is already taken before optimization
      starts, and the code changes for this jira in StatsOptimizer get
      the info from the Metastore, which is supposed to guarantee
      Read Committed isolation for its DBMS access.
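
To make the reader/updater split in item 1 concrete, here is a minimal
sketch, not the actual patch. All names (StatsSketch, StoredStats,
Snapshot, StatsView, alterStats, read) are hypothetical and are not Hive
or Metastore APIs. The snapshot is simplified to a high watermark plus a
list of invalid write ids, and the updater-side "check" is assumed to be
"the last stats writer is not visible in the updater's snapshot", which
is only one possible way to detect a concurrent insert.

```java
// Hypothetical sketch only: none of these types or methods exist in Hive.
final class StatsSketch {

    /** Stats as persisted in the metastore for one table or partition. */
    static final class StoredStats {
        long rowCount;
        long writeId;   // write id of the writer that last produced these stats
        boolean csa;    // stands in for COLUMN_STATS_ACCURATE
        StoredStats(long rowCount, long writeId, boolean csa) {
            this.rowCount = rowCount;
            this.writeId = writeId;
            this.csa = csa;
        }
    }

    /** Simplified snapshot: ids <= highWatermark are visible unless listed as invalid. */
    static final class Snapshot {
        final long highWatermark;
        final long[] invalidWriteIds;   // open or aborted writes (assumed representation)
        Snapshot(long highWatermark, long[] invalidWriteIds) {
            this.highWatermark = highWatermark;
            this.invalidWriteIds = invalidWriteIds;
        }
        boolean isVisible(long writeId) {
            if (writeId > highWatermark) {
                return false;
            }
            for (long invalid : invalidWriteIds) {
                if (invalid == writeId) {
                    return false;
                }
            }
            return true;
        }
    }

    /** Per-query object handed back to a reader; never persisted. */
    static final class StatsView {
        final long rowCount;
        final boolean csa;
        StatsView(long rowCount, boolean csa) {
            this.rowCount = rowCount;
            this.csa = csa;
        }
    }

    /** 1.1 Updater: a persistent change. CSA is removed when a concurrent write is detected. */
    static void alterStats(StoredStats stored, Snapshot writerSnapshot,
                           long writerWriteId, long newRowCount) {
        // Assumed check: the previous stats writer is invisible to this writer.
        boolean concurrent = !writerSnapshot.isVisible(stored.writeId);
        stored.rowCount = newRowCount;
        stored.writeId = writerWriteId;
        if (concurrent) {
            stored.csa = false;   // visible to every later query
        }
    }

    /** 1.2 Reader: sets CSA on the returned object only; stored state is untouched. */
    static StatsView read(StoredStats stored, Snapshot querySnapshot) {
        boolean compliant = stored.csa && querySnapshot.isVisible(stored.writeId);
        return new StatsView(stored.rowCount, compliant);
    }

    public static void main(String[] args) {
        StoredStats stored = new StoredStats(100L, 5L, true);
        Snapshot olderQuery = new Snapshot(4L, new long[0]);
        Snapshot newerQuery = new Snapshot(5L, new long[0]);
        System.out.println("older query may use stats: " + read(stored, olderQuery).csa);
        System.out.println("newer query may use stats: " + read(stored, newerQuery).csa);
    }
}
```

In this sketch two queries with different snapshots can get different
CSA answers for the same stored stats, while a single updater that
detects a concurrent write turns CSA off for everyone.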


> Create single version transactional table metastore statistics for 
> aggregation queries
> --------------------------------------------------------------------------------------
>
>                 Key: HIVE-19416
>                 URL: https://issues.apache.org/jira/browse/HIVE-19416
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>            Reporter: Steve Yeom
>            Assignee: Steve Yeom
>            Priority: Major
>
> The system should use only statistics for aggregation queries like count on 
> transactional tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
