[jira] [Updated] (IMPALA-12856) IllegalStateException in processing RELOAD events due to malformed HMS Partition objects

2024-03-06 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-12856:

Epic Link: IMPALA-11533

> IllegalStateException in processing RELOAD events due to malformed HMS 
> Partition objects
> 
>
> Key: IMPALA-12856
> URL: https://issues.apache.org/jira/browse/IMPALA-12856
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Reporter: Quanlong Huang
> Assignee: Sai Hemanth Gantasala
> Priority: Critical
>
> When processing RELOAD events on partitions, catalogd fetches the Partition 
> objects from HMS. The returned Partition objects can be malformed, which 
> causes an IllegalStateException and stops the event-processor. This was 
> observed when a partition was added and dropped in a loop.
> {noformat}
> E0229 15:19:27.945312 12668 MetastoreEventsProcessor.java:990] Unexpected exception received while processing event
> Java exception follows:
> java.lang.IllegalStateException
>     at com.google.common.base.Preconditions.checkState(Preconditions.java:496)
>     at org.apache.impala.catalog.HdfsTable.getTypeCompatiblePartValues(HdfsTable.java:2598)
>     at org.apache.impala.catalog.HdfsTable.reloadPartitionsFromNames(HdfsTable.java:2856)
>     at org.apache.impala.service.CatalogOpExecutor.reloadPartitionsFromNamesIfExists(CatalogOpExecutor.java:4805)
>     at org.apache.impala.service.CatalogOpExecutor.reloadPartitionsIfExist(CatalogOpExecutor.java:4742)
>     at org.apache.impala.catalog.events.MetastoreEvents$MetastoreTableEvent.reloadPartitions(MetastoreEvents.java:1050)
>     at org.apache.impala.catalog.events.MetastoreEvents$ReloadEvent.processPartitionReload(MetastoreEvents.java:2941)
>     at org.apache.impala.catalog.events.MetastoreEvents$ReloadEvent.processTableEvent(MetastoreEvents.java:2906)
>     at org.apache.impala.catalog.events.MetastoreEvents$MetastoreTableEvent.process(MetastoreEvents.java:1248)
>     at org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:672)
>     at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:1164)
>     at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:972)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:750)
> E0229 15:19:27.963455 12668 MetastoreEventsProcessor.java:1251] Event id: 8697728
> Event Type: RELOAD
> Event time: 1709191166
> Database name: default
> Table name: part_tbl
> Event message: H4s{noformat}
> The failed check asserts that the number of partition columns cached in 
> catalogd matches the number of partition values in the HMS object:
> {code:java}
>   public List getTypeCompatiblePartValues(List values) {
>     List result = new ArrayList<>();
>     List partitionColumns = getClusteringColumns();
>     Preconditions.checkState(partitionColumns.size() == values.size()); // This failed
> {code}
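
To make the failure mode concrete, here is a minimal standalone sketch of the same size check. It is not Impala code: `PartValuesCheckDemo`, `checkPartValues` and `isRejected` are hypothetical names, a plain `IllegalStateException` stands in for Guava's `Preconditions.checkState`, and the single partition column `p` is assumed from the `part_tbl/p=1` location in the issue.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the size check in getTypeCompatiblePartValues().
// It uses no Impala or Guava classes.
final class PartValuesCheckDemo {
  // Throws IllegalStateException when the counts differ, mirroring what
  // Preconditions.checkState(boolean) does for a false condition.
  static void checkPartValues(List<String> partitionColumns, List<String> values) {
    if (partitionColumns.size() != values.size()) {
      throw new IllegalStateException();
    }
  }

  // Returns true if the check rejects the given partition values.
  static boolean isRejected(List<String> partitionColumns, List<String> values) {
    try {
      checkPartValues(partitionColumns, values);
      return false;
    } catch (IllegalStateException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    // Assumed layout: one partition column named "p".
    List<String> partitionColumns = Collections.singletonList("p");
    // Well-formed partition p=1: one value for one partition column.
    System.out.println(isRejected(partitionColumns, Arrays.asList("1")));
    // Malformed partition with an empty values list.
    System.out.println(isRejected(partitionColumns, Collections.<String>emptyList()));
  }
}
```

With one partition column, a single value passes the check, while an empty values list makes the counts differ (1 vs. 0) and the check throws.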
> After adding some debug logs, I found that the Partition object returned by 
> HMS had an empty values list:
> {noformat}
> I0229 16:04:04.679625 25867 HdfsTable.java:2829] HMS Partition: 
> Partition(values:[], dbName:default, tableName:part_tbl, 
> createTime:1709193844, lastAccessTime:0, 
> sd:StorageDescriptor(cols:[FieldSchema(name:i, type:int, comment:null)], 
> location:hdfs://localhost:20500/test-warehouse/part_tbl/p=1, 
> inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
> outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
> compressed:false, numBuckets:0, serdeInfo:SerDeInfo(name:null, 
> serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{}), 
> bucketCols:[], sortCols:[], parameters:{}, 
> skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], 
> skewedColValueLocationMaps:{}), storedAsSubDirectories:false), 
> parameters:{}, catName:hive, writeId:0)
> I0229 16:04:04.680133 25867 MetastoreEventsProcessor.java:1189] Time elapsed in processing event batch: 17.145ms
> E0229 16:04:04.680475 25867 MetastoreEven
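
One possible mitigation, sketched here only as an illustration, is to drop Partition objects whose value count does not match the table's partition column count before they reach the strict check, so a single malformed object does not stop event processing. The issue does not say how it was resolved, so this is an assumption and not the actual fix. `MalformedPartitionFilter`, `PartitionStub` and `keepWellFormed` are hypothetical names, and `PartitionStub` is a minimal stand-in for the HMS Partition class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch, not the actual Impala fix: skip partitions whose value
// count does not match the number of partition columns.
final class MalformedPartitionFilter {
  // Minimal stand-in for an HMS Partition; only the values list matters here.
  static final class PartitionStub {
    final List<String> values;
    PartitionStub(List<String> values) { this.values = values; }
  }

  // Returns only the partitions that carry exactly one value per partition
  // column, logging the ones that are skipped.
  static List<PartitionStub> keepWellFormed(
      List<PartitionStub> partitions, int numPartitionColumns) {
    List<PartitionStub> result = new ArrayList<>();
    for (PartitionStub p : partitions) {
      if (p.values != null && p.values.size() == numPartitionColumns) {
        result.add(p);
      } else {
        System.err.println("Skipping malformed partition with values: " + p.values);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<PartitionStub> fetched = Arrays.asList(
        new PartitionStub(Arrays.asList("1")),                 // well-formed p=1
        new PartitionStub(Collections.<String>emptyList()));   // values:[]
    List<PartitionStub> usable = keepWellFormed(fetched, 1);
    System.out.println("Usable partitions: " + usable.size());
  }
}
```

Whether skipping, retrying the HMS fetch, or invalidating the table is the right reaction depends on why HMS returns the empty values list, which the report above does not establish.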

[jira] [Updated] (IMPALA-12856) IllegalStateException in processing RELOAD events due to malformed HMS Partition objects

2024-04-22 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-12856:

Labels: catalog-2024  (was: )

> IllegalStateException in processing RELOAD events due to malformed HMS 
> Partition objects
> 
>
> Key: IMPALA-12856
> URL: https://issues.apache.org/jira/browse/IMPALA-12856
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Reporter: Quanlong Huang
> Assignee: Sai Hemanth Gantasala
> Priority: Critical
> Labels: catalog-2024
>