Importing a long history from Perforce into git with git-p4 can be especially challenging. `git p4 clone` is all-or-nothing: if the operation fails partway through, none of the progress made so far is kept. Under real-world conditions such as an unreliable network or a busy Perforce server, `git p4 clone` and `git p4 sync` can easily fail, forcing the user to restart the import from the beginning. The longer the history being imported, the more likely it is that a fault occurs along the way, so a sufficiently long import becomes statistically unlikely to ever succeed.
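As a rough illustration of that last point, here is a minimal sketch that assumes every imported change fails independently with the same small probability. That independence model, the function name, and the parameter names are assumptions made for illustration; nothing here is measured from a real Perforce server.

```python
def success_probability(num_changes, per_change_failure):
    """Chance that an all-or-nothing import of num_changes completes.

    Assumes (hypothetically) that each change fails independently with
    probability per_change_failure, so every change must succeed.
    """
    return (1.0 - per_change_failure) ** num_changes
```

Under this assumed model the chance of success decays exponentially with the number of changes, which is why restarting from scratch gets less and less effective as the history grows.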
The underlying git fast-import protocol supports an explicit checkpoint command. This patch optionally lets the user force an explicit checkpoint every <x> seconds. If the sync/clone operation fails, branches are left updated at the commit available as of the latest checkpoint. The user can then resume importing Perforce history, repeating at most approximately <x> seconds' worth of import activity.

Signed-off-by: Ori Rawlings <orirawli...@gmail.com>
---
 git-p4.py | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/git-p4.py b/git-p4.py
index fd5ca52..40cb64f 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -2244,6 +2244,7 @@ class P4Sync(Command, P4UserMap):
                 optparse.make_option("-/", dest="cloneExclude",
                                      action="append", type="string",
                                      help="exclude depot path"),
+                optparse.make_option("--checkpoint-period", dest="checkpointPeriod", type="int", help="Period in seconds between explicit git fast-import checkpoints (by default, no explicit checkpoints are performed)"),
         ]
         self.description = """Imports from Perforce into a git repository.\n
     example:
@@ -2276,6 +2277,7 @@ class P4Sync(Command, P4UserMap):
         self.tempBranches = []
         self.tempBranchLocation = "refs/git-p4-tmp"
         self.largeFileSystem = None
+        self.checkpointPeriod = -1
 
         if gitConfig('git-p4.largeFileSystem'):
             largeFileSystemConstructor = globals()[gitConfig('git-p4.largeFileSystem')]
@@ -3031,6 +3033,8 @@ class P4Sync(Command, P4UserMap):
 
     def importChanges(self, changes):
         cnt = 1
+        if self.checkpointPeriod > -1:
+            self.lastCheckpointTime = time.time()
         for change in changes:
             description = p4_describe(change)
             self.updateOptionDict(description)
@@ -3107,6 +3111,10 @@ class P4Sync(Command, P4UserMap):
                                 self.initialParent)
                     # only needed once, to connect to the previous commit
                     self.initialParent = ""
+
+                if self.checkpointPeriod > -1 and time.time() - self.lastCheckpointTime > self.checkpointPeriod:
+                    self.checkpoint()
+                    self.lastCheckpointTime = time.time()
             except IOError:
                 print self.gitError.read()
                 sys.exit(1)
-- 
2.7.4 (Apple Git-66)
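
The timing logic the patch adds to importChanges can be sketched in isolation as follows. This is a minimal standalone illustration, not code from git-p4.py: the class name `PeriodicCheckpointer`, its `tick` method, and the injectable `clock` argument are all hypothetical, and the sketch is written for Python 3 rather than the Python 2 style used in the patch. It keeps the patch's convention that a negative period disables explicit checkpoints.

```python
import time


class PeriodicCheckpointer:
    """Hypothetical helper: runs `checkpoint` once more than `period`
    seconds have elapsed since the previous checkpoint."""

    def __init__(self, period, checkpoint, clock=time.time):
        self.period = period          # seconds; negative disables checkpoints
        self.checkpoint = checkpoint  # callable that performs the checkpoint
        self.clock = clock            # injectable so tests can fake time
        self.last = clock()           # start timing when the import begins

    def tick(self):
        """Call once per imported change; returns True if a checkpoint ran."""
        if self.period < 0:
            return False
        now = self.clock()
        if now - self.last > self.period:
            self.checkpoint()
            self.last = now
            return True
        return False
```

Because the check runs only between changes, a single slow change can delay a checkpoint well past the period, which matches the "at most approximately <x> seconds" wording in the description. With the patch applied, the option would be passed as, for example, `git p4 clone --checkpoint-period 300 //depot/project@all`, where the depot path is a placeholder.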