[ 
https://issues.apache.org/jira/browse/ASTERIXDB-1450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15287741#comment-15287741
 ] 

Young-Seok Kim commented on ASTERIXDB-1450:
-------------------------------------------

This issue could be caused by the checkpoint thread. 
If the checkpoint thread removes a log file that contains the log record at 
the first LSN of the job to be aborted, that file may no longer exist when the 
recovery manager aborts the job. 
If this is the cause of the issue, I think the fix should be as follows: 
The checkpoint thread (or some other component) should provide the LSN of the 
first valid log record in the log files. 
The abort should then use the LSN of the first valid log record as the 
starting LSN from which to read.
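
The proposal above could be sketched roughly as follows. This is only an 
illustration, not AsterixDB code: the class AbortScanStart and its methods 
advanceFirstValidLsn and abortScanStart are hypothetical names, and the sketch 
assumes that log records below the first valid LSN are no longer needed for 
the rollback.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: none of these names exist in AsterixDB.
final class AbortScanStart {
    // Smallest LSN that is still backed by an existing log file.
    private final AtomicLong firstValidLsn = new AtomicLong(0L);

    // Assumed to be called by the checkpoint thread before it deletes
    // the log files that hold records below newFirstValidLsn.
    void advanceFirstValidLsn(long newFirstValidLsn) {
        // Keep the larger of the two values so the LSN never moves backwards.
        firstValidLsn.accumulateAndGet(newFirstValidLsn, Math::max);
    }

    // Returns the LSN from which an abort scan would start reading:
    // the job's first LSN, unless that LSN lies in a removed file.
    long abortScanStart(long jobFirstLsn) {
        return Math.max(jobFirstLsn, firstValidLsn.get());
    }
}
```

Note that clamping the starting LSN alone would not prevent the checkpoint 
thread from deleting a file after the starting LSN has been chosen, so some 
coordination between the two would probably still be needed.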

> Transaction log file not found on recovery intermittent hang on integration 
> test
> --------------------------------------------------------------------------------
>
>                 Key: ASTERIXDB-1450
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1450
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Michael Blow
>            Assignee: Murtadha Hubail
>
> See 
> https://asterix-jenkins.ics.uci.edu/job/asterix-coverage/99/artifact/asterixdb/asterix-installer/target/asterix-installer-0.8.9-SNAPSHOT-binary-assembly/clusters/local/working_dir/logs/asterix_nc2.log
> INFO: { lock : 1, instantLock : 0, tryLock : 13133, instantTryLock : 46159, 
> unlock : 13134, releaseLocks : 2511 }
> Exception in thread "Thread-1" java.lang.Error: 
> org.apache.asterix.common.exceptions.ACIDException: Could not complete 
> rollback! System is in an inconsistent state
>       at 
> org.apache.asterix.runtime.job.listener.JobEventListenerFactory$1.jobletFinish(JobEventListenerFactory.java:61)
>       at org.apache.hyracks.control.nc.Joblet.performCleanup(Joblet.java:317)
>       at org.apache.hyracks.control.nc.Joblet.removeTask(Joblet.java:153)
>       at 
> org.apache.hyracks.control.nc.work.NotifyTaskFailureWork.run(NotifyTaskFailureWork.java:54)
>       at 
> org.apache.hyracks.control.common.work.WorkQueue$WorkerThread.run(WorkQueue.java:132)
> Caused by: org.apache.asterix.common.exceptions.ACIDException: Could not 
> complete rollback! System is in an inconsistent state
>       at 
> org.apache.asterix.transaction.management.service.transaction.TransactionManager.abortTransaction(TransactionManager.java:72)
>       at 
> org.apache.asterix.transaction.management.service.transaction.TransactionManager.completedTransaction(TransactionManager.java:130)
>       at 
> org.apache.asterix.runtime.job.listener.JobEventListenerFactory$1.jobletFinish(JobEventListenerFactory.java:58)
>       ... 4 more
> Caused by: java.lang.IllegalStateException
>       at 
> org.apache.asterix.transaction.management.service.logging.LogManager.getFileChannel(LogManager.java:449)
>       at 
> org.apache.asterix.transaction.management.service.logging.LogReader.getFileChannel(LogReader.java:276)
>       at 
> org.apache.asterix.transaction.management.service.logging.LogReader.initializeScan(LogReader.java:74)
>       at 
> org.apache.asterix.transaction.management.service.recovery.RecoveryManager.rollbackTransaction(RecoveryManager.java:717)
>       at 
> org.apache.asterix.transaction.management.service.transaction.TransactionManager.abortTransaction(TransactionManager.java:64)
>       ... 6 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
