[ https://issues.apache.org/jira/browse/RANGER-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512155#comment-17512155 ]
Xuze Yang commented on RANGER-3681: ----------------------------------- [~abhayk] Yes, READ_COMMITTED is usually the default isolation level for most databases, REPEATABLE-READ will make concurrency less efficient and more prone to deadlocks. But REPEATABLE-READ is the default choice of mysql, and in our production environment, considering other modules' operation, we should keep REPEATABLE-READ level. So I think a better way is to modify the sql so that no deadlock will occur under these two isolation levels > Ranger Database deadlock when createPolicy is running parallel > -------------------------------------------------------------- > > Key: RANGER-3681 > URL: https://issues.apache.org/jira/browse/RANGER-3681 > Project: Ranger > Issue Type: Bug > Components: admin > Affects Versions: 2.1.0 > Environment: ranger-version: 2.1.0 > mysql-verison: 5.7.34-log > mysql-engine: InnoDB > mysql-isolation-level: REPEATABLE-READ > Reporter: Xuze Yang > Assignee: Xuze Yang > Priority: Major > > h1. Error > ranger-log: > {noformat} > 2022-03-21 20:14:29,685 [http-bio-6080-exec-13] ERROR > org.apache.ranger.rest.ServiceREST (ServiceREST.java:1709) - > createPolicy(RangerPolicy={id={null} guid={null} isEnabled={true} > createdBy={null} updatedBy={null} createTime={null} updateTime={null} > version={1} service={default-Hive} > name={dcp-desensitize_a162c40cdc0140b1848b98415575be6c-1647864869626} > policyType={0} policyPriority={0} description={} > resourceSignature={4f15e3de95c81650ad869cb93a8c47a132bbec54bdf5de8c01f5075c19754cd7} > isAuditEnabled={true} serviceType={null} > resources={database={RangerPolicyResource={values={dcp } isExcludes={false} > isRecursive={false} }} column={RangerPolicyResource={values={* } > isExcludes={false} isRecursive={false} }} > table={RangerPolicyResource={values={desensitize_a162c40cdc0140b1848b98415575be6c > } isExcludes={false} isRecursive={false} }} } policyLabels={Consoler } > policyConditions={} > policyItems={RangerPolicyItem={accessTypes={RangerPolicyItemAccess={type={all} > isAllowed={true} }} users={tangbiao2 } groups={} roles={} conditions={} > delegateAdmin={false} }} denyPolicyItems={} allowExceptions={} > denyExceptions={} dataMaskPolicyItems={} rowFilterPolicyItems={} options={} > validitySchedules={, zoneName=null, isDenyAllElse={false} }}) failed > javax.persistence.PersistenceException: Exception [EclipseLink-4002] (Eclipse > Persistence Services - 2.5.2.v20140319-9ad6abd): > org.eclipse.persistence.exceptions.DatabaseException > Internal Exception: > com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Deadlock > found when trying to get lock; try restarting transaction > Error Code: 1213 > Call: INSERT INTO x_policy_ref_resource (ADDED_BY_ID, CREATE_TIME, policy_id, > resource_def_id, resource_name, UPDATE_TIME, UPD_BY_ID) VALUES (?, ?, ?, ?, > ?, ?, ?) > bind => [7 parameters bound] > Query: ValueReadQuery(name="x_policy_ref_resource_SEQ" sql="SELECT > LAST_INSERT_ID()") > at > org.eclipse.persistence.internal.jpa.EntityManagerImpl.flush(EntityManagerImpl.java:868) > at sun.reflect.GeneratedMethodAccessor98.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.springframework.orm.jpa.SharedEntityManagerCreator$SharedEntityManagerInvocationHandler.invoke(SharedEntityManagerCreator.java:301) > at com.sun.proxy.$Proxy30.flush(Unknown Source) > at org.apache.ranger.common.db.BaseDao.batchCreate(BaseDao.java:102) > ...{noformat} > mysql-deadlock: > {noformat} > ------------------------ > LATEST DETECTED DEADLOCK > ------------------------ > 2022-03-21 09:47:22 0x7ff3a4859700 > *** (1) TRANSACTION: > TRANSACTION 7036760, ACTIVE 0 sec inserting > mysql tables in use 1, locked 1 > LOCK WAIT 23 lock struct(s), heap size 1136, 12 row lock(s), undo log entries > 2 > MySQL thread id 27293, OS thread handle 140684415063808, query id 383930 > 192.168.0.76 DHCloudBG update > INSERT INTO x_policy_ref_resource (ADDED_BY_ID, CREATE_TIME, policy_id, > resource_def_id, resource_name, UPDATE_TIME, UPD_BY_ID) VALUES (1, > '2022-03-20 12:47:22.666', 13921, 5, 'database', '2022-03-20 12:47:22.681', 1) > *** (1) WAITING FOR THIS LOCK TO BE GRANTED: > RECORD LOCKS space id 531 page no 4 n bits 376 index > x_policy_ref_res_UK_polId_resDefId of table `ranger`.`x_policy_ref_resource` > trx id 7036760 lock_mode X insert intention waiting > Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits > 0 > 0: len 8; hex 73757072656d756d; asc supremum;; > *** (2) TRANSACTION: > TRANSACTION 7036759, ACTIVE 0 sec inserting > mysql tables in use 1, locked 1 > 23 lock struct(s), heap size 1136, 12 row lock(s), undo log entries 2 > MySQL thread id 27295, OS thread handle 140684413998848, query id 383932 > 192.168.0.76 DHCloudBG update > INSERT INTO x_policy_ref_resource (ADDED_BY_ID, CREATE_TIME, policy_id, > resource_def_id, resource_name, UPDATE_TIME, UPD_BY_ID) VALUES (1, > '2022-03-20 12:47:22.666', 13920, 5, 'database', '2022-03-20 12:47:22.682', 1) > *** (2) HOLDS THE LOCK(S): > RECORD LOCKS space id 531 page no 4 n bits 376 index > x_policy_ref_res_UK_polId_resDefId of table `ranger`.`x_policy_ref_resource` > trx id 7036759 lock_mode X > Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits > 0 > 0: len 8; hex 73757072656d756d; asc supremum;; > *** (2) WAITING FOR THIS LOCK TO BE GRANTED: > RECORD LOCKS space id 531 page no 4 n bits 376 index > x_policy_ref_res_UK_polId_resDefId of table `ranger`.`x_policy_ref_resource` > trx id 7036759 lock_mode X insert intention waiting > Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits > 0 > 0: len 8; hex 73757072656d756d; asc supremum;;*** WE ROLL BACK TRANSACTION > (2){noformat} > h1. Description > In our production environment, we may create ranger hive policy in parallel. > In this case it will occasionally report deadlock errors. I turned on the > general_log in mysql, and captured the sql when the deadlock exception > occurred, as shown below (some sql was simplified, and only the sql that > caused the deadlock was kept): > {noformat} > 2022-03-21T20:16:26.776033+08:00 239903 Query DELETE FROM > x_policy_ref_resource WHERE (policy_id = 16032) > 2022-03-21T20:16:26.776463+08:00 239902 Query DELETE FROM > x_policy_ref_resource WHERE (policy_id = 16033) > 2022-03-21T20:16:26.784333+08:00 239903 Query INSERT INTO > x_policy_ref_resource (ADDED_BY_ID, CREATE_TIME, policy_id, resource_def_id, > resource_name, UPDATE_TIME, UPD_BY_ID) VALUES (1, '2022-03-20 23:16:26.766', > 16032, 5, 'database', '2022-03-20 23:16:26.783', 1) > 2022-03-21T20:16:26.785484+08:00 239902 Query INSERT INTO > x_policy_ref_resource (ADDED_BY_ID, CREATE_TIME, policy_id, resource_def_id, > resource_name, UPDATE_TIME, UPD_BY_ID) VALUES (1, '2022-03-20 23:16:26.767', > 16033, 5, 'database', '2022-03-20 23:16:26.784', 1) > 2022-03-21T20:16:26.787844+08:00 239903 Query INSERT INTO > x_policy_ref_resource (ADDED_BY_ID, CREATE_TIME, policy_id, resource_def_id, > resource_name, UPDATE_TIME, UPD_BY_ID) VALUES (1, '2022-03-20 23:16:26.766', > 16032, 8, 'column', '2022-03-20 23:16:26.786', 1) > 2022-03-21T20:16:26.788728+08:00 239903 Query INSERT INTO > x_policy_ref_resource (ADDED_BY_ID, CREATE_TIME, policy_id, resource_def_id, > resource_name, UPDATE_TIME, UPD_BY_ID) VALUES (1, '2022-03-20 23:16:26.766', > 16032, 6, 'table', '2022-03-20 23:16:26.787', 1) > 2022-03-21T20:16:26.810405+08:00 239902 Query rollback > 2022-03-21T20:16:26.810781+08:00 239902 Query SET autocommit=1 > 2022-03-21T20:16:26.831309+08:00 239903 Query commit{noformat} > I exported the data in the x_policy_ref_resource table using mysqldump and > simulated it in the local environment and found that there is a deadlock when > the sql of the two transactions is executed as following order: > ||transation1||transation2|| > |begin|begin| > |DELETE FROM x_policy_ref_resource WHERE (policy_id = 16032);| | > | |DELETE FROM x_policy_ref_resource WHERE (policy_id = 16033);| > |INSERT INTO x_policy_ref_resource (ADDED_BY_ID, CREATE_TIME, policy_id, > resource_def_id, resource_name, UPDATE_TIME, UPD_BY_ID) VALUES (1, > '2022-03-20 23:16:26.766', 16032, 5, 'database', '2022-03-20 23:16:26.783', > 1);| | > | |INSERT INTO x_policy_ref_resource (ADDED_BY_ID, CREATE_TIME, policy_id, > resource_def_id, resource_name, UPDATE_TIME, UPD_BY_ID) VALUES (1, > '2022-03-20 23:16:26.767', 16033, 5, 'database', '2022-03-20 23:16:26.784', > 1);| > # When transaction1 and transaction2 execute delete sql, they will both > obtain gap lock, because gap locks are compatible. > # Then transaction 1 executes insert sql, it will try to acquire the insert > intention lock, which conflicts with the gap lock held by transaction 2, and > enters the lock waiting state. > # Finally, transaction2 executes insert sql, and also tries to acquire the > insert intention lock, which conflicts with the gap lock held by > transaction1. At this point, it is caught in a cycle where transaction1 waits > for transaction 2 to release the gap lock, and transaction2 waits for > transaction1 to release the gap lock. Two transactions waiting for each other > to release locks eventually lead to a deadlock. > -- This message was sent by Atlassian Jira (v8.20.1#820001)