> We tried invalidating metadata ( followed by 'describe table' to fix
> metadata) but it didn't help. Even re-running refresh doesn't help all the
> time. We need some help/points to figure out the issue.

Did you do this on the node with the stale metadata, or on another one?

It is normal for some nodes to have stale metadata for some time after REFRESH was run on another node, unless the query option SYNC_DDL=1 is set. SYNC_DDL was introduced to force DDL/DML operations to wait until the metadata changes are broadcast to all nodes (by statestored). If SYNC_DDL=0 (the default), then REFRESH can return before all nodes are updated.

Can you try SET SYNC_DDL=1; before running REFRESH on the table?

On Wed, Jun 12, 2019 at 7:37 AM Sunil Parmar <[email protected]> wrote:

> Impala version : impalad version 2.12.0
> OS: Centos 6.10
> Table size : 88TB
> Partitions : 7K
> Type : Parquet, file size compacted 256MB
>
> We ingest data every minute to the table partition and run the refresh
> table to load the data. There is a separate compaction process that runs
> every hour and merges smaller files into big. The set up was working fine
> for months until recently we are running into a strange issue of
> inconsistent behavior between few nodes. Randomly some nodes appear to have
> inconsistent metadata i.e. even though refresh table command ran
> successfully some nodes still didn't have correct files so they referred
> older files for those partitions.
>
> We tried invalidating metadata ( followed by 'describe table' to fix
> metadata) but it didn't help. Even re-running refresh doesn't help all the
> time. We need some help/points to figure out the issue.
>
> * Is there a way to check if all Impala nodes have stale metadata?
> * How to fix metadata for individual node? Is there a command?
> * Anyone has faced similar issue? Can you share your experience and fix?
>
> Sunil Parmar
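
For reference, a minimal sketch of the SYNC_DDL suggestion as it might be run from impala-shell. The table name my_db.events is a placeholder and not taken from the original report.

```sql
-- Ask Impala to wait until the metadata change has reached all nodes
-- before the statement returns (the default is SYNC_DDL=0).
SET SYNC_DDL=1;

-- Placeholder table name; substitute the table being ingested into.
REFRESH my_db.events;
```

SYNC_DDL is a query option that applies to the current session, so it needs to be set in the same session that issues the REFRESH.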
