[ https://issues.apache.org/jira/browse/IMPALA-8883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949401#comment-16949401 ]
Attila Jeges commented on IMPALA-8883:
--------------------------------------

INSERT:
Updating the table/partition numRows stats after an INSERT with the number of newly added rows is currently not possible. To implement this feature properly, we would need to retrieve the table/partition numRows stats that correspond to the current valid write ID list. I couldn't find anything in the HMS API to support this.

TRUNCATE:
Updating the table/partition numRows stats after a TRUNCATE is probably possible:
- Currently TRUNCATE acquires an exclusive lock on the table.
- Table/partition numRows stats have to be reset to 0. No need to retrieve the "previous" stats.

> Update statistics of ACID tables during writes
> ----------------------------------------------
>
>                 Key: IMPALA-8883
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8883
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Attila Jeges
>            Priority: Major
>              Labels: impala-acid
>
> When Impala INSERTs into or TRUNCATEs an ACID table, it simply removes the
> COLUMN_STATS_ACCURATE property to invalidate the statistics and
> prevent Hive from using them.
> Instead, Impala should properly update the statistics. This should be
> relatively simple for TRUNCATE, since it erases all the data, but a bit more
> complicated for INSERT, e.g.:
> * Properly update _number of distinct values_
> * INSERT OVERWRITE partition should properly update table level _number of
> rows_.
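
For the TRUNCATE case described in the comment, the stats reset could look roughly like the sketch below. It is only an illustration under stated assumptions, not Impala's implementation: it works on a plain dictionary standing in for the table or partition parameter map, the helper name reset_stats_after_truncate is hypothetical, and marking BASIC_STATS as accurate in COLUMN_STATS_ACCURATE is an assumption about how the result would be flagged for Hive. No HMS call is shown, and the caller is assumed to already hold the exclusive lock.

```python
import json

# Parameter keys as they appear in table/partition properties.
ROW_COUNT = "numRows"
COLUMN_STATS_ACCURATE = "COLUMN_STATS_ACCURATE"


def reset_stats_after_truncate(params):
    """Return a copy of the parameter map with the row count reset to 0.

    Hypothetical helper: TRUNCATE erases all data, so the new row count
    is known without reading the previous stats.
    """
    updated = dict(params)  # leave the caller's map untouched
    updated[ROW_COUNT] = "0"
    # Assumption: flag the basic stats as accurate instead of removing
    # the property; the exact value Impala should write is not settled here.
    updated[COLUMN_STATS_ACCURATE] = json.dumps(
        {"BASIC_STATS": "true"}, separators=(",", ":")
    )
    return updated
```

The same helper would be applied to the table-level parameters and to each partition's parameters. The INSERT case cannot be handled this way, because the new count depends on the previous numRows value for the current valid write ID list.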