[PATCH v4] git-p4: Use -m when running p4 changes
Simply running p4 changes on a large branch can result in a too many rows scanned error from the Perforce server. It is better to use a sequence of smaller calls to p4 changes, using the -m option to limit the size of each call. Signed-off-by: Lex Spoon l...@lexspoon.org Reviewed-by: Junio C Hamano gits...@pobox.com Reviewed-by: Luke Diamand l...@diamand.org --- Updated to avoid the crash Luke pointed out. All t98* tests pass now except for t9814, which is already failing on master for some reason. Documentation/git-p4.txt | 17 ++--- git-p4.py| 52 ++- t/t9818-git-p4-block.sh | 64 3 files changed, 119 insertions(+), 14 deletions(-) create mode 100755 t/t9818-git-p4-block.sh diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt index a1664b9..82aa5d6 100644 --- a/Documentation/git-p4.txt +++ b/Documentation/git-p4.txt @@ -225,9 +225,20 @@ Git repository: they can find the p4 branches in refs/heads. --max-changes n:: - Limit the number of imported changes to 'n'. Useful to - limit the amount of history when using the '@all' p4 revision - specifier. + Import at most 'n' changes, rather than the entire range of + changes included in the given revision specifier. A typical + usage would be use '@all' as the revision specifier, but then + to use '--max-changes 1000' to import only the last 1000 + revisions rather than the entire revision history. + +--changes-block-size n:: + The internal block size to use when converting a revision + specifier such as '@all' into a list of specific change + numbers. Instead of using a single call to 'p4 changes' to + find the full list of changes for the conversion, there are a + sequence of calls to 'p4 changes -m', each of which requests + one block of changes of the given size. The default block size + is 500, which should usually be suitable. --keep-path:: The mapping of file names from the p4 depot path to Git, by diff --git a/git-p4.py b/git-p4.py index 549022e..e28033f 100755 --- a/git-p4.py +++ b/git-p4.py @@ -740,17 +740,43 @@ def createOrUpdateBranchesFromOrigin(localRefPrefix = refs/remotes/p4/, silent def originP4BranchesExist(): return gitBranchExists(origin) or gitBranchExists(origin/p4) or gitBranchExists(origin/p4/master) -def p4ChangesForPaths(depotPaths, changeRange): +def p4ChangesForPaths(depotPaths, changeRange, block_size): assert depotPaths -cmd = ['changes'] -for p in depotPaths: -cmd += [%s...%s % (p, changeRange)] -output = p4_read_pipe_lines(cmd) +assert block_size + +# Parse the change range into start and end +if changeRange is None or changeRange == '': +changeStart = '@1' +changeEnd = '#head' +else: +parts = changeRange.split(',') +assert len(parts) == 2 +changeStart = parts[0] +changeEnd = parts[1] +# Accumulate change numbers in a dictionary to avoid duplicates changes = {} -for line in output: -changeNum = int(line.split( )[1]) -changes[changeNum] = True + +for p in depotPaths: +# Retrieve changes a block at a time, to prevent running +# into a MaxScanRows error from the server. +start = changeStart +end = changeEnd +get_another_block = True +while get_another_block: +new_changes = [] +cmd = ['changes'] +cmd += ['-m', str(block_size)] +cmd += [%s...%s,%s % (p, start, end)] +for line in p4_read_pipe_lines(cmd): +changeNum = int(line.split( )[1]) +new_changes.append(changeNum) +changes[changeNum] = True +if len(new_changes) == block_size: +get_another_block = True +end = '@' + str(min(new_changes)) +else: +get_another_block = False changelist = changes.keys() changelist.sort() @@ -1911,7 +1937,10 @@ class P4Sync(Command, P4UserMap): optparse.make_option(--import-labels, dest=importLabels, action=store_true), optparse.make_option(--import-local, dest=importIntoRemotes, action=store_false, help=Import into refs/heads/ , not refs/remotes), -optparse.make_option(--max-changes, dest=maxChanges), +optparse.make_option(--max-changes, dest=maxChanges, + help=Maximum number of changes to import), +optparse.make_option(--changes-block-size, dest=changes_block_size, type=int, + help=Internal block size to use when iteratively calling p4 changes), optparse.make_option(--keep-path, dest=keepRepoPath, action='store_true',
Re: [PATCH v4] git-p4: Use -m when running p4 changes
Sorry - could you resubmit your patch (PATCHv4 it will be) with this change squashed in please? It will make life much easier, especially for Junio! Thanks! Luke On 20 April 2015 at 16:00, Lex Spoon l...@lexspoon.org wrote: Simply running p4 changes on a large branch can result in a too many rows scanned error from the Perforce server. It is better to use a sequence of smaller calls to p4 changes, using the -m option to limit the size of each call. Signed-off-by: Lex Spoon l...@lexspoon.org Reviewed-by: Junio C Hamano gits...@pobox.com Reviewed-by: Luke Diamand l...@diamand.org --- Updated to avoid the crash Luke pointed out. All t98* tests pass now except for t9814, which is already failing on master for some reason. Documentation/git-p4.txt | 17 ++--- git-p4.py| 52 ++- t/t9818-git-p4-block.sh | 64 3 files changed, 119 insertions(+), 14 deletions(-) create mode 100755 t/t9818-git-p4-block.sh diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt index a1664b9..82aa5d6 100644 --- a/Documentation/git-p4.txt +++ b/Documentation/git-p4.txt @@ -225,9 +225,20 @@ Git repository: they can find the p4 branches in refs/heads. --max-changes n:: - Limit the number of imported changes to 'n'. Useful to - limit the amount of history when using the '@all' p4 revision - specifier. + Import at most 'n' changes, rather than the entire range of + changes included in the given revision specifier. A typical + usage would be use '@all' as the revision specifier, but then + to use '--max-changes 1000' to import only the last 1000 + revisions rather than the entire revision history. + +--changes-block-size n:: + The internal block size to use when converting a revision + specifier such as '@all' into a list of specific change + numbers. Instead of using a single call to 'p4 changes' to + find the full list of changes for the conversion, there are a + sequence of calls to 'p4 changes -m', each of which requests + one block of changes of the given size. The default block size + is 500, which should usually be suitable. --keep-path:: The mapping of file names from the p4 depot path to Git, by diff --git a/git-p4.py b/git-p4.py index 549022e..e28033f 100755 --- a/git-p4.py +++ b/git-p4.py @@ -740,17 +740,43 @@ def createOrUpdateBranchesFromOrigin(localRefPrefix = refs/remotes/p4/, silent def originP4BranchesExist(): return gitBranchExists(origin) or gitBranchExists(origin/p4) or gitBranchExists(origin/p4/master) -def p4ChangesForPaths(depotPaths, changeRange): +def p4ChangesForPaths(depotPaths, changeRange, block_size): assert depotPaths -cmd = ['changes'] -for p in depotPaths: -cmd += [%s...%s % (p, changeRange)] -output = p4_read_pipe_lines(cmd) +assert block_size + +# Parse the change range into start and end +if changeRange is None or changeRange == '': +changeStart = '@1' +changeEnd = '#head' +else: +parts = changeRange.split(',') +assert len(parts) == 2 +changeStart = parts[0] +changeEnd = parts[1] +# Accumulate change numbers in a dictionary to avoid duplicates changes = {} -for line in output: -changeNum = int(line.split( )[1]) -changes[changeNum] = True + +for p in depotPaths: +# Retrieve changes a block at a time, to prevent running +# into a MaxScanRows error from the server. +start = changeStart +end = changeEnd +get_another_block = True +while get_another_block: +new_changes = [] +cmd = ['changes'] +cmd += ['-m', str(block_size)] +cmd += [%s...%s,%s % (p, start, end)] +for line in p4_read_pipe_lines(cmd): +changeNum = int(line.split( )[1]) +new_changes.append(changeNum) +changes[changeNum] = True +if len(new_changes) == block_size: +get_another_block = True +end = '@' + str(min(new_changes)) +else: +get_another_block = False changelist = changes.keys() changelist.sort() @@ -1911,7 +1937,10 @@ class P4Sync(Command, P4UserMap): optparse.make_option(--import-labels, dest=importLabels, action=store_true), optparse.make_option(--import-local, dest=importIntoRemotes, action=store_false, help=Import into refs/heads/ , not refs/remotes), -optparse.make_option(--max-changes, dest=maxChanges), +optparse.make_option(--max-changes, dest=maxChanges, + help=Maximum number of
Re: [PATCH v4] git-p4: Use -m when running p4 changes
Luke Diamand l...@diamand.org writes: Sorry - could you resubmit your patch (PATCHv4 it will be) with this change squashed in please? It will make life much easier, especially for Junio! Thanks for caring, but this seems to be a full patch to replace v3. It was sent with your Reviewed-by already in, but I'd tentatively remove that line while queuing it to 'pu' and ask you to double check if the patch makes sense (and after your yes, it does, I'd add the Reviewed-by back). Thanks. Thanks! Luke On 20 April 2015 at 16:00, Lex Spoon l...@lexspoon.org wrote: Simply running p4 changes on a large branch can result in a too many rows scanned error from the Perforce server. It is better to use a sequence of smaller calls to p4 changes, using the -m option to limit the size of each call. Signed-off-by: Lex Spoon l...@lexspoon.org Reviewed-by: Junio C Hamano gits...@pobox.com Reviewed-by: Luke Diamand l...@diamand.org --- Updated to avoid the crash Luke pointed out. All t98* tests pass now except for t9814, which is already failing on master for some reason. Documentation/git-p4.txt | 17 ++--- git-p4.py| 52 ++- t/t9818-git-p4-block.sh | 64 3 files changed, 119 insertions(+), 14 deletions(-) create mode 100755 t/t9818-git-p4-block.sh diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt index a1664b9..82aa5d6 100644 --- a/Documentation/git-p4.txt +++ b/Documentation/git-p4.txt @@ -225,9 +225,20 @@ Git repository: they can find the p4 branches in refs/heads. --max-changes n:: - Limit the number of imported changes to 'n'. Useful to - limit the amount of history when using the '@all' p4 revision - specifier. + Import at most 'n' changes, rather than the entire range of + changes included in the given revision specifier. A typical + usage would be use '@all' as the revision specifier, but then + to use '--max-changes 1000' to import only the last 1000 + revisions rather than the entire revision history. + +--changes-block-size n:: + The internal block size to use when converting a revision + specifier such as '@all' into a list of specific change + numbers. Instead of using a single call to 'p4 changes' to + find the full list of changes for the conversion, there are a + sequence of calls to 'p4 changes -m', each of which requests + one block of changes of the given size. The default block size + is 500, which should usually be suitable. --keep-path:: The mapping of file names from the p4 depot path to Git, by diff --git a/git-p4.py b/git-p4.py index 549022e..e28033f 100755 --- a/git-p4.py +++ b/git-p4.py @@ -740,17 +740,43 @@ def createOrUpdateBranchesFromOrigin(localRefPrefix = refs/remotes/p4/, silent def originP4BranchesExist(): return gitBranchExists(origin) or gitBranchExists(origin/p4) or gitBranchExists(origin/p4/master) -def p4ChangesForPaths(depotPaths, changeRange): +def p4ChangesForPaths(depotPaths, changeRange, block_size): assert depotPaths -cmd = ['changes'] -for p in depotPaths: -cmd += [%s...%s % (p, changeRange)] -output = p4_read_pipe_lines(cmd) +assert block_size + +# Parse the change range into start and end +if changeRange is None or changeRange == '': +changeStart = '@1' +changeEnd = '#head' +else: +parts = changeRange.split(',') +assert len(parts) == 2 +changeStart = parts[0] +changeEnd = parts[1] +# Accumulate change numbers in a dictionary to avoid duplicates changes = {} -for line in output: -changeNum = int(line.split( )[1]) -changes[changeNum] = True + +for p in depotPaths: +# Retrieve changes a block at a time, to prevent running +# into a MaxScanRows error from the server. +start = changeStart +end = changeEnd +get_another_block = True +while get_another_block: +new_changes = [] +cmd = ['changes'] +cmd += ['-m', str(block_size)] +cmd += [%s...%s,%s % (p, start, end)] +for line in p4_read_pipe_lines(cmd): +changeNum = int(line.split( )[1]) +new_changes.append(changeNum) +changes[changeNum] = True +if len(new_changes) == block_size: +get_another_block = True +end = '@' + str(min(new_changes)) +else: +get_another_block = False changelist = changes.keys() changelist.sort() @@ -1911,7 +1937,10 @@ class P4Sync(Command, P4UserMap): optparse.make_option(--import-labels, dest=importLabels, action=store_true), optparse.make_option(--import-local,
Re: [PATCH v4] git-p4: Use -m when running p4 changes
Junio C Hamano gits...@pobox.com writes: Luke Diamand l...@diamand.org writes: Sorry - could you resubmit your patch (PATCHv4 it will be) with this change squashed in please? It will make life much easier, especially for Junio! Thanks for caring, but this seems to be a full patch to replace v3. It was sent with your Reviewed-by already in, but I'd tentatively remove that line while queuing it to 'pu' and ask you to double check if the patch makes sense (and after your yes, it does, I'd add the Reviewed-by back). Thanks. Just to make it easier to see, the interdiff between v3 and v4 looks like this: git-p4.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/git-p4.py b/git-p4.py index 1fba3aa..e28033f 100755 --- a/git-p4.py +++ b/git-p4.py @@ -2608,7 +2608,7 @@ class P4Sync(Command, P4UserMap): return -def importNewBranch(self, branch, maxChange, changes_block_size): +def importNewBranch(self, branch, maxChange): # make fast-import flush all changes to disk and update the refs using the checkpoint # command so that we can try to find the branch parent in the git history self.gitStream.write(checkpoint\n\n); @@ -2616,7 +2616,7 @@ class P4Sync(Command, P4UserMap): branchPrefix = self.depotPaths[0] + branch + / range = @1,%s % maxChange #print prefix + branchPrefix -changes = p4ChangesForPaths([branchPrefix], range, changes_block_size) +changes = p4ChangesForPaths([branchPrefix], range, self.changes_block_size) if len(changes) = 0: return False firstChange = changes[0] -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] git-p4: Use -m when running p4 changes
On Mon, Apr 20, 2015 at 11:15 AM, Luke Diamand l...@diamand.org wrote: Sorry - could you resubmit your patch (PATCHv4 it will be) with this change squashed in please? It will make life much easier, especially for Junio! The message you just responded is already the squashed version. It's a single patch that includes all changes so far discussed. The subject line says PATCH v4, although since it's in the same thread, not all email clients will show the subject change. Let me know if I can do more to make the process go smoothly. Lex Spoon -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] git-p4: Use -m when running p4 changes
On 20/04/15 16:25, Lex Spoon wrote: On Mon, Apr 20, 2015 at 11:15 AM, Luke Diamand l...@diamand.org wrote: Sorry - could you resubmit your patch (PATCHv4 it will be) with this change squashed in please? It will make life much easier, especially for Junio! The message you just responded is already the squashed version. It's a single patch that includes all changes so far discussed. The subject line says PATCH v4, although since it's in the same thread, not all email clients will show the subject change. Not sure how I missed that! It looks good, now, Ack! Thanks! Luke -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html