[ 
https://issues.apache.org/jira/browse/IMPALA-8883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949401#comment-16949401
 ] 

Attila Jeges commented on IMPALA-8883:
--------------------------------------

INSERT:
Updating the table/partition numRows stats after an INSERT with the number of 
newly added rows is currently not possible. To implement this feature properly 
we would need to retrieve the table/partition numRows stats that correspond to 
the current valid write id list. I couldn't find anything in the HMS API to 
support this.
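
To illustrate the problem, here is a toy sketch in Java. It is not Impala or 
HMS code: the class, the method, and the idea of tagging stored stats with a 
single write id are all hypothetical, and a single number stands in for the 
full valid write id list. It only shows why "stored numRows + inserted rows" 
is safe only when the stored value is known to match the writer's snapshot.

```java
import java.util.OptionalLong;

// Hypothetical toy model; none of these names exist in Impala or HMS.
final class InsertStatsSketch {
    /**
     * Returns the new numRows if the stored stats correspond to the writer's
     * snapshot, or empty if they may be stale and must not be incremented.
     *
     * statsWriteId:       write id the stored numRows was computed at (assumed)
     * currentHighWriteId: stand-in for the current valid write id list (assumed)
     */
    static OptionalLong updatedNumRows(long storedNumRows, long statsWriteId,
                                       long currentHighWriteId, long insertedRows) {
        if (statsWriteId != currentHighWriteId) {
            // Stored stats describe a different snapshot; adding a delta to
            // them would produce an incorrect row count.
            return OptionalLong.empty();
        }
        return OptionalLong.of(storedNumRows + insertedRows);
    }
}
```

In this model, updatedNumRows(100, 5, 7, 10) yields an empty result because 
the stats were computed at write id 5 while the writer sees write id 7, 
whereas updatedNumRows(100, 7, 7, 10) yields 110.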

TRUNCATE:
Updating the table/partition numRows stats after a TRUNCATE is probably 
possible:
- Currently TRUNCATE acquires an exclusive lock on the table.
- Table/partition numRows stats have to be reset to 0. No need to retrieve the 
"previous" stats.

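The TRUNCATE case can be sketched as follows. This is a minimal illustration 
on a plain Java map standing in for the table/partition parameters, not the 
actual Impala change: the class and method names are hypothetical, and no HMS 
call is made. "numRows" is the parameter key that holds the row count.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; a plain map stands in for HMS table/partition parameters.
final class TruncateStatsSketch {
    // Parameter key under which the row count is stored.
    static final String NUM_ROWS = "numRows";

    /**
     * Returns a copy of the given parameters with numRows reset to 0.
     * The previous value is never read, so no "previous" stats are needed.
     */
    static Map<String, String> resetNumRows(Map<String, String> params) {
        Map<String, String> updated = new HashMap<>(params);
        updated.put(NUM_ROWS, "0");
        return updated;
    }
}
```

The same reset would be applied to the table and to each of its partitions, 
and all other parameters are left unchanged.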


> Update statistics of ACID tables during writes
> ----------------------------------------------
>
>                 Key: IMPALA-8883
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8883
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Attila Jeges
>            Priority: Major
>              Labels: impala-acid
>
> When Impala INSERTs or TRUNCATEs an ACID table, it simply removes the 
> COLUMN_STATS_ACCURATE property to invalidate the statistics in order to 
> prevent Hive from using them.
> Instead, Impala should properly update the statistics. It should be 
> relatively simple for TRUNCATE since it erases all the data, but a bit more 
> complicated for INSERT, e.g.:
>  * Properly update _number of distinct values_
>  * INSERT OVERWRITE partition should properly update table level _number of 
> rows_.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
