[
https://issues.apache.org/jira/browse/PHOENIX-7672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Himanshu Gwalani updated PHOENIX-7672:
--------------------------------------
Summary: Replication Log Replay should acquire lease on unclosed file
before processing (was: ReplicationLogReplay should acquire lease on unclosed
file before processing)
> Replication Log Replay should acquire lease on unclosed file before processing
> ------------------------------------------------------------------------------
>
> Key: PHOENIX-7672
> URL: https://issues.apache.org/jira/browse/PHOENIX-7672
> Project: Phoenix
> Issue Type: Sub-task
> Reporter: Himanshu Gwalani
> Assignee: Himanshu Gwalani
> Priority: Major
> Fix For: PHOENIX-7562-feature
>
>
> As of now, if a replication log file is not closed gracefully and reader
> start reading it before HDFS lease timeout on the file, it can lead to
> partial data being read by the reader. Hence the replication replay must
> ensure either file is closed or if not, acquire the lease and accordingly
> validate header and trailer in the file before processing it.
> Pseudo code (by Andrew Purtell)
> {code:java}
> isClosed = ((LeaseRecoverable) fs).isFileClosed(filePath); // May throw
> ClassCastException - bad!
> if (isClosed) {
> // This will assert that the file has a valid header and trailer.
> reader = createLogReader(filePath); // may throw IOE
> } else {
> // Recover lease via custom method using LeaseRecoverable#recoverLease.
> // Wait until we get the lease with retries.
> recoverLease(filePath); // may throw IOE
> if (fs.getFileStatus(filePath).getLen() > 0) { // may throw IOE
> try {
> // This will assert that the file has a valid header and trailer.
> reader = createLogReader(filePath); // may throw IOE
> } catch (IOException e) { // how about MissingTrailerException to
> give clarity?
> // check some exception details to confirm it was a trailer issue.
> // if not a trailer issue, just rethrow the exception. otherwise,
> // should we continue even though the file is truncated? we are
> never going
> // to get that truncated data back, whatever it was. ignoring the
> whole
> // file converts potential data loss into certain data loss.
> LOG.warn("Replication log file {} is missing its trailer,
> continuing", filePath);
> reader = createLogReader(filePath, false); // may throw another
> IOE
> }
> } else {
> // Ignore the file.
> LOG.info("Ignoring zero length replication log file {}", filePath);
> }
> }
> // Clean up. Remove the replication log file at filePath. {code}
> After PHOENIX-7669{-}{-}, the low level reader would throw appropriate
> exceptions and those needs to be handled in replication log replay as part of
> this Jira (along with acquire lease logic and mentioned in above pseudo code)
--
This message was sent by Atlassian Jira
(v8.20.10#820010)