[ https://issues.apache.org/jira/browse/HBASE-21854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Duo Zhang updated HBASE-21854: ------------------------------ Status: Patch Available (was: Open) > Race condition in TestProcedureSkipPersistence > ----------------------------------------------- > > Key: HBASE-21854 > URL: https://issues.apache.org/jira/browse/HBASE-21854 > Project: HBase > Issue Type: Bug > Components: proc-v2 > Affects Versions: 2.1.3 > Reporter: Peter Somogyi > Assignee: Peter Somogyi > Priority: Major > Attachments: HBASE-21854.patch > > > There is a race condition inĀ TestProcedureSkipPersistence. After the > procedure is added, the test stops ProcedureExecutor. In some cases the > procedure is not added to the queue in time. > Failing execution: > {noformat} > 2019-02-06 14:18:11,133 INFO [Time-limited test] > procedure2.TimeoutExecutorThread(82): ADDED pid=-1, state=WAITING_TIMEOUT; > org.apache.hadoop.hbase.procedure2.CompletedProcedureCleaner; timeout=30000, > timestamp=1549491521133 > 2019-02-06 14:18:11,135 INFO [PEWorker-1] > procedure2.TimeoutExecutorThread(82): ADDED pid=1, state=WAITING_TIMEOUT, > locked=true; > org.apache.hadoop.hbase.procedure2.TestProcedureSkipPersistence$TestProcedure; > timeout=2000, timestamp=1549491493135 > 2019-02-06 14:18:11,137 INFO [Time-limited test] hbase.Waiter(189): Waiting > up to [30,000] milli-secs(wait.for.ratio=[1]) > 2019-02-06 14:18:11,139 INFO [Time-limited test] > procedure2.ProcedureTestingUtility(125): RESTART - Stop > 2019-02-06 14:18:11,139 INFO [Time-limited test] > procedure2.ProcedureExecutor(702): Stopping > 2019-02-06 14:18:11,139 INFO [Time-limited test] wal.WALProcedureStore(331): > Stopping the WAL Procedure Store, isAbort=false > 2019-02-06 14:18:11,140 DEBUG [PEWorker-1] > procedure2.RootProcedureState(153): Add procedure pid=1, > state=WAITING_TIMEOUT, locked=true; > org.apache.hadoop.hbase.procedure2.TestProcedureSkipPersistence$TestProcedure > as the 0th rollback step > 2019-02-06 14:18:11,141 WARN [PEWorker-1] > procedure2.ProcedureExecutor$WorkerThread(2074): Worker terminating > UNNATURALLY null > java.lang.RuntimeException: the store must be running before inserting data > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:710) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.update(WALProcedureStore.java:603) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.updateStoreOnExec(ProcedureExecutor.java:1943) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1809) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1481) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1200(ProcedureExecutor.java:78) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2058) > 2019-02-06 14:18:11,145 INFO [Time-limited test] > procedure2.ProcedureTestingUtility(137): RESTART - Start{noformat} > In a successful run the ProcExecutor is stopped AFTER the procedure is > actually in the queue. > Successful: > {noformat} > 2019-02-07 15:48:08,731 INFO [Time-limited test] > procedure2.TimeoutExecutorThread(82): ADDED pid=-1, state=WAITING_TIMEOUT; > org.apache.hadoop.hbase.procedure2.CompletedProcedureCleaner; timeout=30000, > timestamp=1549550918731 > 2019-02-07 15:48:08,731 INFO [PEWorker-1] > procedure2.TimeoutExecutorThread(82): ADDED pid=1, state=WAITING_TIMEOUT, > locked=true; > org.apache.hadoop.hbase.procedure2.TestProcedureSkipPersistence$TestProcedure; > timeout=2000, timestamp=1549550890731 > 2019-02-07 15:48:08,732 DEBUG [PEWorker-1] > procedure2.RootProcedureState(153): Add procedure pid=1, > state=WAITING_TIMEOUT, locked=true; > org.apache.hadoop.hbase.procedure2.TestProcedureSkipPersistence$TestProcedure > as the 0th rollback step > 2019-02-07 15:48:08,732 INFO [WALProcedureStoreSyncThread] > wal.WALProcedureStore(1217): Remove all state logs with ID less than 1, since > all the active procedures are in the latest log > 2019-02-07 15:48:08,733 DEBUG [WALProcedureStoreSyncThread] > wal.WALProcedureStore(1239): Removed > log=file:/Users/peter.somogyi/Cloudera/hbase/hbase-procedure/target/test-data/b9a1969a-85a4-15e8-7da5-6198f5acf2de/proc-logs/pv2-00000000000000000001.log, > > activeLogs=[/Users/peter.somogyi/Cloudera/hbase/hbase-procedure/target/test-data/b9a1969a-85a4-15e8-7da5-6198f5acf2de/proc-logs/pv2-00000000000000000002.log] > 2019-02-07 15:48:08,734 INFO [Time-limited test] hbase.Waiter(189): Waiting > up to [30,000] milli-secs(wait.for.ratio=[1]) > 2019-02-07 15:48:08,734 INFO [Time-limited test] > procedure2.ProcedureTestingUtility(125): RESTART - Stop > 2019-02-07 15:48:08,734 INFO [Time-limited test] > procedure2.ProcedureExecutor(702): Stopping > 2019-02-07 15:48:08,735 INFO [Time-limited test] wal.WALProcedureStore(331): > Stopping the WAL Procedure Store, isAbort=false > 2019-02-07 15:48:08,735 INFO [Time-limited test] > procedure2.ProcedureTestingUtility(137): RESTART - Start{noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)