[ https://issues.apache.org/jira/browse/AMBARI-25576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dmytro Grinenko updated AMBARI-25576:
-------------------------------------
    Assignee: Dmytro Grinenko
      Status: Patch Available  (was: Open)

> Primary key duplication error during flushing alerts from alerts cache
> ----------------------------------------------------------------------
>
>                 Key: AMBARI-25576
>                 URL: https://issues.apache.org/jira/browse/AMBARI-25576
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.7.5
>            Reporter: Dmytro Vitiuk
>            Assignee: Dmytro Grinenko
>            Priority: Major
>             Fix For: 2.7.6
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Sometimes there are commit errors for clusters with many hosts and alert caching enabled:
> {code:java}
> 2020-10-09 19:53:14,444 ERROR [alert-event-bus-4] AmbariJpaLocalTxnInterceptor:180 - [DETAILED ERROR] Rollback reason:
> Local Exception Stack:
> Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.6.2.v20151217-774c696): org.eclipse.persistence.exceptions.DatabaseException
> Internal Exception: java.sql.BatchUpdateException: Batch entry 1 INSERT INTO alert_history (alert_id, alert_instance, alert_label, alert_state, alert_text, alert_timestamp, cluster_id, component_name, host_name, service_name, alert_definition_id) VALUES (15363461, NULL, 'DataNode Web UI', 'OK', 'HTTP 200 response in 0.000s', 1602286496756, 2, 'DATANODE', 'host1', 'HDFS', 53) was aborted: ERROR: duplicate key value violates unique constraint "pk_alert_history"
>   Detail: Key (alert_id)=(15363461) already exists. Call getNextException to see other errors in the batch.
> Error Code: 0
> Call: INSERT INTO alert_history (alert_id, alert_instance, alert_label, alert_state, alert_text, alert_timestamp, cluster_id, component_name, host_name, service_name, alert_definition_id) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
>     bind => [11 parameters bound]
> {code}
> This issue is not frequent, but it produces extensive logging.
> Also, this issue can cause other rare problems, so it should be fixed.
> The root cause is a shared cache that can be updated with a just-merged entity before that entity is actually committed to the DB. Another thread (from CachedAlertFlushService or AlertEventPublisher) can then try to merge the already-merged entity as well.
> For example, suppose we create a new AlertHistoryEntity and attach it to an existing AlertCurrentEntity. The first thread starts a transaction, merges the current entity into the persistence context, saves the merged value to the cache, and pauses. A second thread then tries to merge the whole content of the cache and so also merges the just-updated current entity. Now there are two transactions, and both think they should update the current entity and create the new history entity. As a result, one of them fails with the duplicate-key error.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
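[Editor's note] The double-merge race described above can be sketched with a minimal, self-contained model. This is not Ambari code: the `AlertCacheRaceDemo` class, the Map-backed stand-in for the `alert_history` table, and the `insertHistory` helper are hypothetical simplifications that only illustrate why two transactions inserting the same cached history id violate the primary key.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the race: both "transactions" pick up the same cached
// AlertCurrentEntity that already references a new history row, so both try
// to INSERT the same alert_history primary key.
public class AlertCacheRaceDemo {
    // Stand-in for the alert_history table, keyed by alert_id (the PK).
    static final Map<Long, String> alertHistoryTable = new HashMap<>();

    // Emulates the INSERT: returns false on a primary-key violation,
    // mirroring the "duplicate key value violates unique constraint" error.
    static boolean insertHistory(long alertId, String alertText) {
        return alertHistoryTable.putIfAbsent(alertId, alertText) == null;
    }

    public static void main(String[] args) {
        long newHistoryId = 15363461L; // id taken from the log above

        // Thread 1: merges the current entity and commits its new history row.
        boolean firstCommit = insertHistory(newHistoryId, "HTTP 200 response in 0.000s");

        // Thread 2: flushes the shared cache, which still holds the merged
        // entity pointing at the same history row, and inserts it again.
        boolean secondCommit = insertHistory(newHistoryId, "HTTP 200 response in 0.000s");

        System.out.println("first commit ok:  " + firstCommit);  // true
        System.out.println("second commit ok: " + secondCommit); // false: duplicate key
    }
}
```

The point of the sketch is that the conflict is created before either transaction commits: once the shared cache exposes a merged-but-uncommitted entity, any second flusher becomes a second writer of the same primary key.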