This was cross-posted to https://community.cloudera.com/t5/Interactive-Short-cycle-SQL/inconsistent-metadata-between-impala-daemons-in-5-15/m-p/91573#M5712%3Feid=1&aid=1
On Wed, Jun 12, 2019 at 3:48 PM Sunil Parmar <[email protected]> wrote:

> Thanks, SYNC_DDL should help! Is there a way to check if one node has
> out-of-sync metadata?
>
> Sunil Parmar
>
>
> On Wed, Jun 12, 2019 at 7:49 AM Csaba Ringhofer <[email protected]>
> wrote:
>
>> > We tried invalidating metadata (followed by 'describe table' to fix
>> > metadata) but it didn't help. Even re-running refresh doesn't help all
>> > the time. We need some help/pointers to figure out the issue.
>>
>> Did you do this on the node with the stale metadata, or on another one?
>>
>> It is normal that some nodes have stale metadata for some time after
>> REFRESH was run on another node, unless query option SYNC_DDL=1.
>> SYNC_DDL was introduced to force DDL/DML operations to wait until the
>> metadata changes are broadcast to all nodes (by statestored). If SYNC_DDL=0
>> (default), then REFRESH TABLE can return before all nodes are updated.
>>
>> Can you try SET SYNC_DDL=1; before REFRESH TABLE?
>>
>>
>> On Wed, Jun 12, 2019 at 7:37 AM Sunil Parmar <[email protected]>
>> wrote:
>>
>>> Impala version : impalad version 2.12.0
>>> OS: Centos 6.10
>>> Table size : 88TB
>>> Partitions : 7K
>>> Type : Parquet, file size compacted 256MB
>>>
>>>
>>> We ingest data every minute to the table partition and run the refresh
>>> table to load the data. There is a separate compaction process that runs
>>> every hour and merges smaller files into bigger ones. The setup was
>>> working fine for months, but recently we have been running into a strange
>>> issue of inconsistent behavior between a few nodes. Randomly, some nodes
>>> appear to have inconsistent metadata, i.e. even though the refresh table
>>> command ran successfully, some nodes still didn't have the correct files,
>>> so they referred to older files for those partitions.
>>>
>>> We tried invalidating metadata (followed by 'describe table' to fix
>>> metadata) but it didn't help. Even re-running refresh doesn't help all
>>> the time. We need some help/pointers to figure out the issue.
>>>
>>> * Is there a way to check if all Impala nodes have stale metadata?
>>> * How to fix metadata for an individual node? Is there a command?
>>> * Has anyone faced a similar issue? Can you share your experience and fix?
>>>
>>> Sunil Parmar
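
For reference, the SYNC_DDL suggestion quoted above could look roughly like the following in impala-shell. This is only a sketch: the database, table, and partition column names (my_db, my_table, dt) are placeholders that do not come from the thread, and the SHOW FILES comparison is just one possible manual check, not a confirmed way to detect stale metadata on a node.

```sql
-- Placeholder names: my_db.my_table and the partition column dt are
-- assumptions for illustration, not taken from the thread.

-- Make the following DDL/metadata statements wait until the catalog
-- change has been broadcast to all impalad nodes before returning.
SET SYNC_DDL=1;

-- Reload file metadata for the table (Impala's statement is REFRESH,
-- optionally limited to one partition).
REFRESH my_db.my_table;
-- REFRESH my_db.my_table PARTITION (dt='2019-06-12');

-- Possible manual check: run this while connected to different impalad
-- nodes and compare the file lists each one reports for the partition.
SHOW FILES IN my_db.my_table PARTITION (dt='2019-06-12');
```

SET applies only to the current session, so it would have to be issued in the same connection that runs the REFRESH. SYNC_DDL makes each such statement slower, which may matter when a refresh runs every minute.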
