[ 
https://issues.apache.org/jira/browse/IMPALA-10976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17616720#comment-17616720
 ] 

ASF subversion and git services commented on IMPALA-10976:
----------------------------------------------------------

Commit b92a8ad745e2499103a55aeb5e58aa090aee05e6 in impala's branch 
refs/heads/branch-4.1.1 from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b92a8ad74 ]

IMPALA-11160: Ignore stale ALTER_PARTITION events on transactional tables

When applying ALTER_PARTITION events on transactional tables, we refresh
the partition using the metadata in the event if
hms_event_incremental_refresh_transactional_table is enabled (which is
the default). This is wrong if the ALTER_PARTITION event is stale: the
partition metadata is rolled back to a stale state.

This patch compares the eventId with the createEventId of the table and
ignores those ALTER_PARTITION events that have older (smaller) event
ids. Note that we already do this for many other event types;
ALTER_PARTITION was missing the check.
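
The check described above can be sketched roughly as follows. This is an
illustrative Python model, not the actual Impala (Java) code: TableState,
AlterPartitionEvent, is_stale and apply_events are hypothetical names, and
treating an event whose id equals createEventId as not stale is an
assumption based on the "older (smaller)" wording.

```python
from dataclasses import dataclass


@dataclass
class TableState:
    """Hypothetical stand-in for a table cached in the catalog."""
    name: str
    create_event_id: int  # HMS event id at which this cached table was created


@dataclass
class AlterPartitionEvent:
    """Hypothetical stand-in for an HMS ALTER_PARTITION notification."""
    event_id: int
    table_name: str


def is_stale(event: AlterPartitionEvent, table: TableState) -> bool:
    # An event with a smaller id than the table's create event id was
    # generated before the cached table existed, so its metadata is older.
    return event.event_id < table.create_event_id


def apply_events(events, table):
    """Return the ids of the events that would be applied, skipping stale ones."""
    applied = []
    for ev in events:
        if is_stale(ev, table):
            continue  # ignore instead of rolling the partition back
        applied.append(ev.event_id)
    return applied
```

With a table whose create_event_id is 100, an event with id 98 would be
skipped while events with ids 100 and 105 would be applied.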

Eventually we should depend on the lastSyncedEventId and replace
createEventId with it. The self-event detection can also be replaced
since self-events are also stale events. These will be addressed in
IMPALA-10976.

Tests
- Verified locally with local-catalog mode and the event processor
  enabled by running test_acid_compute_stats 1400 times. Without the
  fix, the test would fail within tens of runs.

Change-Id: I5bb8cfc213093f3bbd0359c7084b277a3bd5264a
Reviewed-on: http://gerrit.cloudera.org:8080/19020
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Reviewed-on: http://gerrit.cloudera.org:8080/19126
Reviewed-by: Csaba Ringhofer <csringho...@cloudera.com>
Tested-by: Quanlong Huang <huangquanl...@gmail.com>


> Sync db/table in catalogd to latest HMS event id for all DDLs from Impala 
> shell
> -------------------------------------------------------------------------------
>
>                 Key: IMPALA-10976
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10976
>             Project: IMPALA
>          Issue Type: Task
>          Components: Catalog, Frontend
>            Reporter: Sourabh Goyal
>            Assignee: Sourabh Goyal
>            Priority: Major
>
> This is a follow-up to IMPALA-10926. The idea is that when any DDL 
> operation is performed from Impala shell, catalogd also syncs the db/table to 
> its latest event ID as per HMS. This way, updates to a db/table are applied in 
> the same order as they appear in the notification log in HMS, which ensures 
> consistency. Currently catalogd applies any updates received from Impala 
> shell in place. Instead, it should perform the HMS operation first and then 
> replay all the HMS events since the last synced event.
> However, there are subtle differences between how Impala processes DDLs via 
> the shell and how it processes HMS events. These are:
>  * When processing an alter table event, catalogd currently does a full table 
> reload. This has a performance impact because a table reload is time consuming, 
> whereas the in-place alter table DDL operation in CatalogOpExecutor (via Impala 
> shell) is faster since it detects whether to reload the table schema, the file 
> metadata, or both. The alter table event processing logic needs to be improved 
> to detect whether the file metadata must be reloaded.
>  * A similar improvement is required in processing alter partition events. As 
> of now, when processing an AlterPartition HMS event, catalogd always reloads 
> file metadata, but when doing the same from the shell, it reloads metadata only 
> when required.
>  * Impala shell already caches Hive functions in the catalog db object, but 
> catalogd does *not* process CREATE/DROP function HMS events.
>  * When creating a db/table from Impala shell, if the operation fails because 
> the db/table already exists, there is no reliable way in catalogd to 
> determine the create event ID for that db/table. The create event ID is 
> required so that, for any subsequent DDL operations, catalogd can process HMS 
> events starting from the create event ID.
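
The "replay all the HMS events since the last synced event" idea in the
description above can be modeled roughly as follows. This is a minimal
Python sketch under stated assumptions, not catalogd code: Event,
CachedTable and sync_to_latest_event are hypothetical names, the
notification log is modeled as a plain list, and "applying" an event is
reduced to recording its payload.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """Hypothetical stand-in for an entry in the HMS notification log."""
    event_id: int
    table_name: str
    payload: str


@dataclass
class CachedTable:
    """Hypothetical stand-in for a table cached in the catalog."""
    name: str
    last_synced_event_id: int = 0
    history: list = field(default_factory=list)


def sync_to_latest_event(table, notification_log):
    """Replay, in event-id order, every event for `table` newer than its last synced id."""
    pending = sorted(
        (e for e in notification_log
         if e.table_name == table.name
         and e.event_id > table.last_synced_event_id),
        key=lambda e: e.event_id,
    )
    for ev in pending:
        table.history.append(ev.payload)       # placeholder for the real update
        table.last_synced_event_id = ev.event_id
    return len(pending)
```

In this model, events at or below last_synced_event_id are skipped, so a
second call with the same log applies nothing; updates are applied in the
same order as they appear in the log.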


