[PATCH 1 of 2 stable v2] tests: set Git author and committer name and email settings explicitly
# HG changeset patch
# User Manuel Jacob
# Date 1680121377 -7200
#      Wed Mar 29 22:22:57 2023 +0200
# Branch stable
# Node ID 30082bb9719eb00f3be0081b7221d7c3061d4345
# Parent  0a9ddb8cd8c117671ecaf2b4126c3eef09e80ce8
# EXP-Topic tests-git
tests: set Git author and committer name and email settings explicitly

Passing at least GIT_COMMITTER_NAME and GIT_COMMITTER_EMAIL as environment
variables is necessary on my machine, which has the user.useconfigonly config
set. The author could be passed via command-line options, but it seems best to
pass everything uniformly.

diff --git a/kallithea/tests/other/test_vcs_operations.py b/kallithea/tests/other/test_vcs_operations.py
--- a/kallithea/tests/other/test_vcs_operations.py
+++ b/kallithea/tests/other/test_vcs_operations.py
@@ -167,6 +167,30 @@
     return tempfile.mkdtemp(dir=base.TESTS_TMP_PATH, prefix=prefix, suffix=suffix)
 
 
+def _commit(vcs, dest_dir, message, *extra_args):
+    email = 'm...@example.com'
+    if os.name == 'nt':
+        name = 'User'
+    else:
+        name = 'User ǝɯɐᴎ'
+
+    return Command(dest_dir).execute(
+        vcs,
+        'commit',
+        '-m',
+        '"%s"' % message,
+        *extra_args,
+        HGUSER='%s <%s>' % (name, email),
+        # If the user.useconfigonly config is set, Git won't try to auto-detect
+        # the name and email. For this case, we need to pass them as
+        # environment variables.
+        GIT_AUTHOR_NAME=name,
+        GIT_AUTHOR_EMAIL=email,
+        GIT_COMMITTER_NAME=name,
+        GIT_COMMITTER_EMAIL=email,
+    )
+
+
 def _add_files(vcs, dest_dir, files_no=3):
     """
     Generate some files, add it to dest_dir repo and push back
@@ -179,24 +203,10 @@
     open(os.path.join(dest_dir, added_file), 'a').close()
     Command(dest_dir).execute(vcs, 'add', added_file)
 
-    email = 'm...@example.com'
-    if os.name == 'nt':
-        author_str = 'User <%s>' % email
-    else:
-        author_str = 'User ǝɯɐᴎ <%s>' % email
     for i in range(files_no):
         cmd = """echo "added_line%s" >> %s""" % (i, added_file)
         Command(dest_dir).execute(cmd)
-        if vcs == 'hg':
-            cmd = """hg commit -m "committed new %s" -u "%s" "%s" """ % (
-                i, author_str, added_file
-            )
-        elif vcs == 'git':
-            cmd = """git commit -m "committed new %s" --author "%s" "%s" """ % (
-                i, author_str, added_file
-            )
-        # git commit needs EMAIL on some machines
-        Command(dest_dir).execute(cmd, EMAIL=email)
+        _commit(vcs, dest_dir, "committed new %s" % i, added_file)
 
 def _add_files_and_push(webserver, vt, dest_dir, clone_url, ignoreReturnCode=False, files_no=3):
     _add_files(vt.repo_type, dest_dir, files_no=files_no)
@@ -618,7 +628,7 @@
         # add submodule
         stdout, stderr = Command(base.TESTS_TMP_PATH).execute('git clone', fork_url, dest_dir)
         stdout, stderr = Command(dest_dir).execute('git submodule add', clone_url, 'testsubmodule')
-        stdout, stderr = Command(dest_dir).execute('git commit -am "added testsubmodule pointing to', clone_url, '"', EMAIL=base.TEST_USER_ADMIN_EMAIL)
+        stdout, stderr = _commit('git', dest_dir, "added testsubmodule pointing to %s" % clone_url, "-a")
         stdout, stderr = Command(dest_dir).execute('git push', fork_url, 'master')
 
         # check for testsubmodule link in files page
___
kallithea-general mailing list
kallithea-general@sfconservancy.org
https://lists.sfconservancy.org/mailman/listinfo/kallithea-general
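For readers who want to try the idea from this patch outside the Kallithea test suite, here is a minimal standalone sketch. It only builds the environment; the helper name `commit_identity_env` and the sample address are made up for illustration and are not part of the patch. The variable names (HGUSER, GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL, GIT_COMMITTER_NAME, GIT_COMMITTER_EMAIL) are the ones the patch sets.

```python
import os


def commit_identity_env(name, email, base_env=None):
    """Return a copy of base_env with an explicit commit identity.

    Hypothetical helper, not Kallithea code. It sets the same variables
    as the patch so that neither Mercurial nor Git has to guess the
    user name and email from the machine's configuration.
    """
    env = dict(os.environ if base_env is None else base_env)
    env.update({
        # Mercurial reads the committer from HGUSER.
        'HGUSER': '%s <%s>' % (name, email),
        # Git takes author and committer identity from these four.
        'GIT_AUTHOR_NAME': name,
        'GIT_AUTHOR_EMAIL': email,
        'GIT_COMMITTER_NAME': name,
        'GIT_COMMITTER_EMAIL': email,
    })
    return env


if __name__ == '__main__':
    # Start from an empty base so only the identity variables are shown.
    env = commit_identity_env('User', 'user@example.com', base_env={})
    for key in sorted(env):
        print('%s=%s' % (key, env[key]))
```

The resulting dict could then be passed as `env=` to `subprocess.run(['git', 'commit', '-m', message], cwd=repo_dir, env=env)`, where `message` and `repo_dir` are placeholders for your own values.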
[PATCH 2 of 2 stable v2] tests: prevent Git system and global configuration from loading
# HG changeset patch
# User Manuel Jacob
# Date 1680139355 -7200
#      Thu Mar 30 03:22:35 2023 +0200
# Branch stable
# Node ID e5251abd0a3c677d7bb0828f3a744789bd6fe4cb
# Parent  30082bb9719eb00f3be0081b7221d7c3061d4345
# EXP-Topic tests-git
tests: prevent Git system and global configuration from loading

This reduces differences between different testing environments. Something
similar is already done for Mercurial (in the lines directly above this
change).

The parent changeset was originally added to support user.useconfigonly. With
this changeset, the original motivation for it becomes obsolete. However, it
is still necessary to set the committer name via an environment variable, at
least on my machine.

diff --git a/kallithea/tests/other/test_vcs_operations.py b/kallithea/tests/other/test_vcs_operations.py
--- a/kallithea/tests/other/test_vcs_operations.py
+++ b/kallithea/tests/other/test_vcs_operations.py
@@ -150,6 +150,8 @@
         testenv['LANGUAGE'] = 'en_US:en'
         testenv['HGPLAIN'] = ''
         testenv['HGRCPATH'] = ''
+        testenv['GIT_CONFIG_SYSTEM'] = ''
+        testenv['GIT_CONFIG_GLOBAL'] = ''
         testenv.update(environ)
         p = Popen(command, shell=True, stdout=PIPE, stderr=PIPE, cwd=self.cwd, env=testenv)
         stdout, stderr = p.communicate()
@@ -181,9 +183,6 @@
         '"%s"' % message,
         *extra_args,
         HGUSER='%s <%s>' % (name, email),
-        # If the user.useconfigonly config is set, Git won't try to auto-detect
-        # the name and email. For this case, we need to pass them as
-        # environment variables.
         GIT_AUTHOR_NAME=name,
         GIT_AUTHOR_EMAIL=email,
         GIT_COMMITTER_NAME=name,
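The isolation described in this patch can be sketched on its own, assuming a plain dict is used as the process environment. The function name `isolated_vcs_env` is hypothetical. The four variables and the order (defaults first, caller-supplied values last) mirror the `testenv` lines in the diff. As far as I know, GIT_CONFIG_SYSTEM and GIT_CONFIG_GLOBAL are only honoured by reasonably recent Git versions (I believe 2.32 and later), so treat that as an assumption to verify on your setup.

```python
import os


def isolated_vcs_env(base_env=None, overrides=None):
    """Build an environment that hides the user's hg and git configuration.

    Hypothetical helper mirroring the testenv setup in the patch:
    defaults are applied first, then caller overrides win.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Mercurial: plain output and no hgrc files.
    env['HGPLAIN'] = ''
    env['HGRCPATH'] = ''
    # Git: point the system and global config at nothing, as the patch does.
    env['GIT_CONFIG_SYSTEM'] = ''
    env['GIT_CONFIG_GLOBAL'] = ''
    # Caller-supplied values are applied last, like testenv.update(environ).
    env.update(overrides or {})
    return env


if __name__ == '__main__':
    env = isolated_vcs_env(base_env={}, overrides={'GIT_COMMITTER_NAME': 'User'})
    for key in sorted(env):
        print('%s=%r' % (key, env[key]))
```

Because per-command overrides are applied last, identity variables such as GIT_COMMITTER_NAME can still be passed for individual commands, which matches the remark that the committer name still has to be set explicitly.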
Re: Timeout with git clone
Hi,

Thanks for the answers and comments.

> Yes, I agree that it probably would be much better to go back to using
> dulwich both for protocol serving and for providing data for the web
> frontend, instead of forking out to git. Disclaimer: I don't know how
> fast dulwich is these days. It could perhaps also be relevant to
> research what other Python git hosting solutions do.

Are there other Python git hosting solutions? The very reason I'm here is that I didn't really find anything else...

> If interested in contributing in this area, a first step could be to
> create a proof of concept of switching back to Dulwich and doing some
> benchmarks - both for local cloning with infinite network bandwidth
> (where I doubt dulwich can match pure git) and for more realistic remote
> internet bandwidth (where I guess it doesn't matter much).

Sounds like a good plan. I don't know if I'll find the time, but I'll try.

> But also note that subprocessio is no longer used only by pygrack. It is
> also used for run_git_command in
> kallithea/lib/vcs/backends/git/repository.py (introduced in
> 1f4d4b8d72f5), mainly for cloning and listing changesets. A full
> solution would require somehow replacing run_git_command with dulwich.
> But that can be done one at a time.

Yes, I'm aware of that.

Kind regards,
Quentin
Re: Timeout with git clone
Hi

The code in this area has worked surprisingly well since Kallithea inherited it, even though it has regularly needed tricky maintenance.

I agree it would be nice to refactor / reimplement this area. It is just that nobody invested time or sponsorship in doing it. I guess it hasn't caused enough pain for anybody to justify it ;-)

Yes, I agree that it probably would be much better to go back to using dulwich both for protocol serving and for providing data for the web frontend, instead of forking out to git. Disclaimer: I don't know how fast dulwich is these days. It could perhaps also be relevant to research what other Python git hosting solutions do.

If interested in contributing in this area, a first step could be to create a proof of concept of switching back to Dulwich and doing some benchmarks - both for local cloning with infinite network bandwidth (where I doubt dulwich can match pure git) and for more realistic remote internet bandwidth (where I guess it doesn't matter much).

But also note that subprocessio is no longer used only by pygrack. It is also used for run_git_command in kallithea/lib/vcs/backends/git/repository.py (introduced in 1f4d4b8d72f5), mainly for cloning and listing changesets. A full solution would require somehow replacing run_git_command with dulwich. But that can be done one at a time.

/Mads

On 18/04/2023 16:55, Quentin Wenger wrote:
> Digging a bit deeper:
>
> - The changeset that you linked
>   (https://kallithea-scm.org/repos/kallithea/changeset/034e4fe1ebb2#rhodecodelibsubprocessiopy_n127)
>   actually shows that historically it went the other way round: at first
>   dulwich's server was used, but it was then considered "buggy", so the
>   implementation was replaced by some custom code.
> - That custom code appears to come from
>   https://github.com/dvdotsenko/git_http_backend.py. That repo hasn't been
>   updated since 2012, nor do its forks show any sign of recent activity.
> - In contrast, dulwich, while officially still in beta, is actively
>   developed.
>
> IMHO the proper move would be to go back to dulwich. Chances are that
> those buggy things have been fixed in the last ten years. And if they
> haven't, better report them upstream than reinvent the wheel.
>
> By the way, do we have any more precise idea of what was considered buggy
> at the time?
>
> What do you think?
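The benchmark idea above could start from a small timing harness like the following. This is only a sketch under stated assumptions: the names `benchmark` and `compare` are made up here, and the two operations being compared (for example a native `git clone` run through subprocess versus a dulwich-based clone) are left to the caller as plain callables, since nothing in this thread fixes how they would be invoked.

```python
import statistics
import time


def benchmark(fn, repeats=5):
    """Call fn() `repeats` times and summarise the wall-clock timings."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return {
        'runs': repeats,
        'min': min(timings),
        'median': statistics.median(timings),
    }


def compare(candidates, repeats=5):
    """Benchmark each labelled callable and return {label: summary}."""
    return {label: benchmark(fn, repeats) for label, fn in candidates.items()}


if __name__ == '__main__':
    # Placeholder workloads; replace with real clone operations that each
    # write into a fresh temporary directory.
    results = compare({
        'native git (placeholder)': lambda: sum(range(100000)),
        'dulwich (placeholder)': lambda: sum(range(200000)),
    })
    for label, summary in results.items():
        print('%s: min %.6fs, median %.6fs over %d runs'
              % (label, summary['min'], summary['median'], summary['runs']))
```

Reporting the minimum and the median rather than a single run reduces the influence of disk cache and other noise, which matters for the local "infinite bandwidth" case mentioned above. The remote-bandwidth case would additionally need a throttled or genuinely remote server, which this sketch does not cover.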
Re: Timeout with git clone
Digging a bit deeper:

- The changeset that you linked (https://kallithea-scm.org/repos/kallithea/changeset/034e4fe1ebb2#rhodecodelibsubprocessiopy_n127) actually shows that historically it went the other way round: at first dulwich's server was used, but it was then considered "buggy", so the implementation was replaced by some custom code.
- That custom code appears to come from https://github.com/dvdotsenko/git_http_backend.py. That repo hasn't been updated since 2012, nor do its forks show any sign of recent activity.
- In contrast, dulwich, while officially still in beta, is actively developed.

IMHO the proper move would be to go back to dulwich. Chances are that those buggy things have been fixed in the last ten years. And if they haven't, better report them upstream than reinvent the wheel.

By the way, do we have any more precise idea of what was considered buggy at the time?

What do you think?
Re: Timeout with git clone
Hi Mads,

I can try that, but I'm a bit worried that it is monkey-patching and half-solving at best. And the fact that this area is considered "obscure code" is even worse.

Trying to get a broader picture: There are comments like `TODO: This function now uses os underlying 'git' command which is generally not good.` all over the place. Maybe there should be a larger refactoring of the git backend taking place, where all uses of native Git are replaced by dulwich? That way the cryptic code in lib/vcs/subprocessio.py will also go away.

Is there any specific reason that those TODOs haven't been handled so far, apart from limited dev resources?

Thanks,
Quentin