[
https://issues.apache.org/jira/browse/ASTERIXDB-1450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15287808#comment-15287808
]
Murtadha Hubail commented on ASTERIXDB-1450:
--------------------------------------------
I had already thought about this and started working on it, but I hit another
issue. When a log file partition is closed because it cannot fit the current
log record, the append LSN in the log manager jumps to the LSN at the beginning
of the next log file partition, but the flush LSN (the last flushed log record
on disk) is not updated properly. This causes a deadlock during a rollback
operation, since the flush LSN will always be smaller than the LSN of the
aborted job's last log record. Yingyi faced another issue where an invalid LSN
was provided to the log reader (ASTERIXDB-1425).
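
To make the LSN mismatch described above concrete, here is a minimal,
self-contained sketch. It is not the AsterixDB LogManager code: the class
LsnRolloverSketch, the 100-byte partition size, the syncFlushLsnOnRollover
flag, and the assumption that the flusher advances the flush LSN by the number
of bytes written are all hypothetical and chosen only for illustration.

```java
/**
 * Hypothetical model of an append LSN and a flush LSN across a log partition
 * rollover. All names and sizes are illustrative assumptions, not AsterixDB code.
 */
public class LsnRolloverSketch {
    // Assumed partition size in bytes; records are assumed to be no larger than this.
    static final long PARTITION_SIZE = 100;

    private long appendLsn = 0; // where the next log record will be written
    private long flushLsn = 0;  // how far the log is known to be on disk
    private final boolean syncFlushLsnOnRollover;

    public LsnRolloverSketch(boolean syncFlushLsnOnRollover) {
        this.syncFlushLsnOnRollover = syncFlushLsnOnRollover;
    }

    /** Appends a record, flushes it immediately, and returns the record's LSN. */
    public long appendAndFlush(long recordSize) {
        long offsetInPartition = appendLsn % PARTITION_SIZE;
        if (offsetInPartition + recordSize > PARTITION_SIZE) {
            // The record does not fit: close the partition and jump the
            // append LSN to the beginning of the next partition.
            appendLsn = (appendLsn / PARTITION_SIZE + 1) * PARTITION_SIZE;
            if (syncFlushLsnOnRollover) {
                // Possible remedy (assumption): move the flush LSN across the
                // skipped gap as well, since everything before it is on disk.
                flushLsn = appendLsn;
            }
        }
        long recordLsn = appendLsn;
        appendLsn += recordSize;
        // Assumed flusher behavior: advance by the number of bytes written.
        flushLsn += recordSize;
        return recordLsn;
    }

    /** A rollback waits until the flush LSN has reached the job's last record. */
    public boolean rollbackCanProceed(long lastRecordLsn) {
        return flushLsn >= lastRecordLsn;
    }

    public static void main(String[] args) {
        for (boolean sync : new boolean[] {false, true}) {
            LsnRolloverSketch log = new LsnRolloverSketch(sync);
            log.appendAndFlush(30);             // fits in the first partition
            log.appendAndFlush(80);             // does not fit, forces a rollover
            long last = log.appendAndFlush(10); // last record of the aborted job
            System.out.println("syncFlushLsnOnRollover=" + sync
                    + " rollbackCanProceed=" + log.rollbackCanProceed(last));
        }
    }
}
```

In this model, without the adjustment the flush LSN stays behind by the size of
the skipped gap even though every record has been written, so the rollback
condition is never satisfied. With the adjustment the condition holds as soon
as the last record is flushed.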
> Transaction log file not found on recovery intermittent hang on integration
> test
> --------------------------------------------------------------------------------
>
> Key: ASTERIXDB-1450
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-1450
> Project: Apache AsterixDB
> Issue Type: Bug
> Reporter: Michael Blow
> Assignee: Murtadha Hubail
>
> See
> https://asterix-jenkins.ics.uci.edu/job/asterix-coverage/99/artifact/asterixdb/asterix-installer/target/asterix-installer-0.8.9-SNAPSHOT-binary-assembly/clusters/local/working_dir/logs/asterix_nc2.log
> INFO: { lock : 1, instantLock : 0, tryLock : 13133, instantTryLock : 46159,
> unlock : 13134, releaseLocks : 2511 }
> Exception in thread "Thread-1" java.lang.Error:
> org.apache.asterix.common.exceptions.ACIDException: Could not complete
> rollback! System is in an inconsistent state
> at
> org.apache.asterix.runtime.job.listener.JobEventListenerFactory$1.jobletFinish(JobEventListenerFactory.java:61)
> at org.apache.hyracks.control.nc.Joblet.performCleanup(Joblet.java:317)
> at org.apache.hyracks.control.nc.Joblet.removeTask(Joblet.java:153)
> at
> org.apache.hyracks.control.nc.work.NotifyTaskFailureWork.run(NotifyTaskFailureWork.java:54)
> at
> org.apache.hyracks.control.common.work.WorkQueue$WorkerThread.run(WorkQueue.java:132)
> Caused by: org.apache.asterix.common.exceptions.ACIDException: Could not
> complete rollback! System is in an inconsistent state
> at
> org.apache.asterix.transaction.management.service.transaction.TransactionManager.abortTransaction(TransactionManager.java:72)
> at
> org.apache.asterix.transaction.management.service.transaction.TransactionManager.completedTransaction(TransactionManager.java:130)
> at
> org.apache.asterix.runtime.job.listener.JobEventListenerFactory$1.jobletFinish(JobEventListenerFactory.java:58)
> ... 4 more
> Caused by: java.lang.IllegalStateException
> at
> org.apache.asterix.transaction.management.service.logging.LogManager.getFileChannel(LogManager.java:449)
> at
> org.apache.asterix.transaction.management.service.logging.LogReader.getFileChannel(LogReader.java:276)
> at
> org.apache.asterix.transaction.management.service.logging.LogReader.initializeScan(LogReader.java:74)
> at
> org.apache.asterix.transaction.management.service.recovery.RecoveryManager.rollbackTransaction(RecoveryManager.java:717)
> at
> org.apache.asterix.transaction.management.service.transaction.TransactionManager.abortTransaction(TransactionManager.java:64)
> ... 6 more
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)