Impala version: impalad version 2.12.0
OS: CentOS 6.10
Table size: 88 TB
Partitions: 7K
Type: Parquet, files compacted to 256 MB


We ingest data into the table's partitions every minute and run REFRESH on
the table to load the new data. A separate compaction process runs every
hour and merges the smaller files into larger ones. This setup worked fine
for months, but recently we have been running into a strange issue:
metadata is inconsistent across a few nodes. Seemingly at random, some nodes
end up with stale metadata, i.e. even though the REFRESH command completed
successfully, those nodes did not pick up the correct files and still
referred to older files for the affected partitions.
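
To make the symptom concrete, here is a minimal sketch of the comparison we
have in mind. It assumes the file list for one partition has already been
collected from each impalad separately (for example from the output of
SHOW FILES run against that specific node). The host names, file paths, and
the function name find_stale_nodes are hypothetical, and treating the most
common file list as the correct one is an assumption; the authoritative list
would be what is actually in HDFS.

```python
from collections import Counter


def find_stale_nodes(files_by_node):
    """Return the nodes whose file set differs from the most common file set.

    files_by_node maps a node name to the list of data files that node
    reports for a single partition.
    """
    if not files_by_node:
        return []
    # Order of files does not matter, so compare them as sets.
    snapshots = {node: frozenset(files) for node, files in files_by_node.items()}
    # Assumption: the file set reported by most nodes is the up-to-date one.
    majority, _ = Counter(snapshots.values()).most_common(1)[0]
    return sorted(node for node, snap in snapshots.items() if snap != majority)


# Hypothetical observations: one node still sees the pre-compaction files.
observed = {
    "impalad-01": ["part=1/compacted_0.parq"],
    "impalad-02": ["part=1/compacted_0.parq"],
    "impalad-03": ["part=1/small_0.parq", "part=1/small_1.parq"],
}
stale = find_stale_nodes(observed)
print(stale)
```

In this made-up example only impalad-03 would be flagged, which matches what
we see: most nodes agree, and a few keep pointing at the older files.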

We tried INVALIDATE METADATA (followed by 'describe table' to reload the
metadata), but it didn't help. Even re-running REFRESH doesn't help every
time. We need some help or pointers to figure out the issue.

* Is there a way to check all Impala nodes for stale metadata?
* How can we fix the metadata on an individual node? Is there a command?
* Has anyone faced a similar issue? Can you share your experience and fix?

Sunil Parmar
