Importing a long history from Perforce into git with the git-p4 tool
can be especially challenging. A `git p4 clone` is effectively
all-or-nothing: a failure partway through discards the progress made so
far. Under real-world conditions such as an unreliable network or a
busy Perforce server, `git p4 clone` and `git p4 sync` operations can
easily fail, forcing the user to restart the import from the beginning.
The longer the history being imported, the more likely a fault occurs
during the process. Sufficiently long imports thus become statistically
unlikely to ever succeed.

The underlying git fast-import protocol supports an explicit checkpoint
command. This patch adds an optional --checkpoint-period option that
lets the user force an explicit checkpoint every <x> seconds. If the
sync/clone operation fails, branches are left pointing at the commits
that were available at the latest checkpoint. The user can then resume
importing Perforce history and has to repeat at most approximately <x>
seconds' worth of import activity.

Signed-off-by: Ori Rawlings <orirawli...@gmail.com>
---
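The time-based check added in importChanges() can be illustrated in
isolation. The following is a minimal sketch, not code from git-p4:
PeriodicCheckpointer, import_changes, and the injectable clock are
hypothetical names introduced only for this note. It assumes, as the
patch does, that a negative period disables checkpoints and that the
check runs once after each imported change.

```python
import time


class PeriodicCheckpointer(object):
    """Hypothetical helper (not in git-p4): issue a checkpoint at most once per period."""

    def __init__(self, period, do_checkpoint, clock=time.time):
        self.period = period                # seconds; a negative value disables checkpoints
        self.do_checkpoint = do_checkpoint  # callback that performs the checkpoint
        self.clock = clock                  # injectable so the logic can run without sleeping
        self.last = clock()                 # time of the most recent checkpoint (or start)

    def maybe_checkpoint(self):
        """Call after each imported change; returns True if a checkpoint was issued."""
        if self.period < 0:
            return False
        if self.clock() - self.last > self.period:
            self.do_checkpoint()
            self.last = self.clock()
            return True
        return False


def import_changes(changes, checkpointer, commit):
    """Stand-in for the import loop: commit each change, then checkpoint if one is due."""
    for change in changes:
        commit(change)
        checkpointer.maybe_checkpoint()
```

Because the check only runs between changes, a single very large
change can delay a checkpoint beyond the requested period, which is
why the commit message says "approximately". With the patch applied,
an invocation along the lines of `git p4 sync --checkpoint-period 300`
would be expected to checkpoint roughly every five minutes.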
 git-p4.py | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/git-p4.py b/git-p4.py
index fd5ca52..40cb64f 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -2244,6 +2244,7 @@ class P4Sync(Command, P4UserMap):
                 optparse.make_option("-/", dest="cloneExclude",
                                      action="append", type="string",
                                      help="exclude depot path"),
+                optparse.make_option("--checkpoint-period", dest="checkpointPeriod", type="int", help="Period in seconds between explicit git fast-import checkpoints (by default, no explicit checkpoints are performed)"),
         ]
         self.description = """Imports from Perforce into a git repository.\n
     example:
@@ -2276,6 +2277,7 @@ class P4Sync(Command, P4UserMap):
         self.tempBranches = []
         self.tempBranchLocation = "refs/git-p4-tmp"
         self.largeFileSystem = None
+        self.checkpointPeriod = -1
 
         if gitConfig('git-p4.largeFileSystem'):
             largeFileSystemConstructor = globals()[gitConfig('git-p4.largeFileSystem')]
@@ -3031,6 +3033,8 @@ class P4Sync(Command, P4UserMap):
 
     def importChanges(self, changes):
         cnt = 1
+        if self.checkpointPeriod > -1:
+            self.lastCheckpointTime = time.time()
         for change in changes:
             description = p4_describe(change)
             self.updateOptionDict(description)
@@ -3107,6 +3111,10 @@ class P4Sync(Command, P4UserMap):
                                 self.initialParent)
                     # only needed once, to connect to the previous commit
                     self.initialParent = ""
+
+                    if self.checkpointPeriod > -1 and time.time() - self.lastCheckpointTime > self.checkpointPeriod:
+                        self.checkpoint()
+                        self.lastCheckpointTime = time.time()
             except IOError:
                 print self.gitError.read()
                 sys.exit(1)
-- 
2.7.4 (Apple Git-66)
