[PATCH 1 of 2 stable v2] tests: set Git author and committer name and email settings explicitly

2023-04-18 Thread Manuel Jacob
# HG changeset patch
# User Manuel Jacob 
# Date 1680121377 -7200
#  Wed Mar 29 22:22:57 2023 +0200
# Branch stable
# Node ID 30082bb9719eb00f3be0081b7221d7c3061d4345
# Parent  0a9ddb8cd8c117671ecaf2b4126c3eef09e80ce8
# EXP-Topic tests-git
tests: set Git author and committer name and email settings explicitly

Passing at least GIT_COMMITTER_NAME and GIT_COMMITTER_EMAIL as environment
variables is necessary on my machine, which has the user.useconfigonly config
set. The author could be passed via command-line options, but it seems best to
pass everything uniformly.

diff --git a/kallithea/tests/other/test_vcs_operations.py b/kallithea/tests/other/test_vcs_operations.py
--- a/kallithea/tests/other/test_vcs_operations.py
+++ b/kallithea/tests/other/test_vcs_operations.py
@@ -167,6 +167,30 @@
     return tempfile.mkdtemp(dir=base.TESTS_TMP_PATH, prefix=prefix, suffix=suffix)
 
 
+def _commit(vcs, dest_dir, message, *extra_args):
+    email = 'm...@example.com'
+    if os.name == 'nt':
+        name = 'User'
+    else:
+        name = 'User ǝɯɐᴎ'
+
+    return Command(dest_dir).execute(
+        vcs,
+        'commit',
+        '-m',
+        '"%s"' % message,
+        *extra_args,
+        HGUSER='%s <%s>' % (name, email),
+        # If the user.useconfigonly config is set, Git won't try to auto-detect
+        # the name and email. For this case, we need to pass them as
+        # environment variables.
+        GIT_AUTHOR_NAME=name,
+        GIT_AUTHOR_EMAIL=email,
+        GIT_COMMITTER_NAME=name,
+        GIT_COMMITTER_EMAIL=email,
+    )
+
+
 def _add_files(vcs, dest_dir, files_no=3):
     """
     Generate some files, add it to dest_dir repo and push back
@@ -179,24 +203,10 @@
     open(os.path.join(dest_dir, added_file), 'a').close()
     Command(dest_dir).execute(vcs, 'add', added_file)
 
-    email = 'm...@example.com'
-    if os.name == 'nt':
-        author_str = 'User <%s>' % email
-    else:
-        author_str = 'User ǝɯɐᴎ <%s>' % email
     for i in range(files_no):
         cmd = """echo "added_line%s" >> %s""" % (i, added_file)
         Command(dest_dir).execute(cmd)
-        if vcs == 'hg':
-            cmd = """hg commit -m "committed new %s" -u "%s" "%s" """ % (
-                i, author_str, added_file
-            )
-        elif vcs == 'git':
-            cmd = """git commit -m "committed new %s" --author "%s" "%s" """ % (
-                i, author_str, added_file
-            )
-        # git commit needs EMAIL on some machines
-        Command(dest_dir).execute(cmd, EMAIL=email)
+        _commit(vcs, dest_dir, "committed new %s" % i, added_file)
 
 def _add_files_and_push(webserver, vt, dest_dir, clone_url, ignoreReturnCode=False, files_no=3):
     _add_files(vt.repo_type, dest_dir, files_no=files_no)
@@ -618,7 +628,7 @@
         # add submodule
         stdout, stderr = Command(base.TESTS_TMP_PATH).execute('git clone', fork_url, dest_dir)
         stdout, stderr = Command(dest_dir).execute('git submodule add', clone_url, 'testsubmodule')
-        stdout, stderr = Command(dest_dir).execute('git commit -am "added testsubmodule pointing to', clone_url, '"', EMAIL=base.TEST_USER_ADMIN_EMAIL)
+        stdout, stderr = _commit('git', dest_dir, "added testsubmodule pointing to %s" % clone_url, "-a")
         stdout, stderr = Command(dest_dir).execute('git push', fork_url, 'master')
 
         # check for testsubmodule link in files page
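
For readers who want to try the idea from the commit message outside the
test suite, here is a minimal sketch. It is not the Kallithea helper:
`identity_env` and `run_commit` are hypothetical names, and the name and
email passed in are placeholders chosen by the caller. It relies only on
the GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL, GIT_COMMITTER_NAME and
GIT_COMMITTER_EMAIL variables read by Git and on HGUSER read by Mercurial.

```python
import os
import subprocess


def identity_env(name, email, base_env=None):
    """Return a copy of base_env with an explicit commit identity (hypothetical helper)."""
    env = dict(os.environ if base_env is None else base_env)
    # Git reads author and committer identity from separate variables.
    env['GIT_AUTHOR_NAME'] = name
    env['GIT_AUTHOR_EMAIL'] = email
    env['GIT_COMMITTER_NAME'] = name
    env['GIT_COMMITTER_EMAIL'] = email
    # Mercurial takes a single "Name <email>" string.
    env['HGUSER'] = '%s <%s>' % (name, email)
    return env


def run_commit(vcs, repo_dir, message, name, email, *extra_args):
    """Run '<vcs> commit -m <message> <extra_args>' in repo_dir (hypothetical helper)."""
    return subprocess.run(
        [vcs, 'commit', '-m', message, *extra_args],
        cwd=repo_dir,
        env=identity_env(name, email),
        check=True,
        capture_output=True,
    )
```

Unlike the test helper, this sketch passes the arguments as a list rather
than a shell string, so the message needs no manual quoting.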
___
kallithea-general mailing list
kallithea-general@sfconservancy.org
https://lists.sfconservancy.org/mailman/listinfo/kallithea-general


[PATCH 2 of 2 stable v2] tests: prevent Git system and global configuration from loading

2023-04-18 Thread Manuel Jacob
# HG changeset patch
# User Manuel Jacob 
# Date 1680139355 -7200
#  Thu Mar 30 03:22:35 2023 +0200
# Branch stable
# Node ID e5251abd0a3c677d7bb0828f3a744789bd6fe4cb
# Parent  30082bb9719eb00f3be0081b7221d7c3061d4345
# EXP-Topic tests-git
tests: prevent Git system and global configuration from loading

This reduces differences between testing environments. Something similar is
already done for Mercurial (in the lines directly above this change).

The parent changeset was originally added to support user.useconfigonly.
With this changeset, that original motivation becomes obsolete. However,
it is still necessary to set the committer name via an environment variable, at
least on my machine.

diff --git a/kallithea/tests/other/test_vcs_operations.py b/kallithea/tests/other/test_vcs_operations.py
--- a/kallithea/tests/other/test_vcs_operations.py
+++ b/kallithea/tests/other/test_vcs_operations.py
@@ -150,6 +150,8 @@
         testenv['LANGUAGE'] = 'en_US:en'
         testenv['HGPLAIN'] = ''
         testenv['HGRCPATH'] = ''
+        testenv['GIT_CONFIG_SYSTEM'] = ''
+        testenv['GIT_CONFIG_GLOBAL'] = ''
         testenv.update(environ)
         p = Popen(command, shell=True, stdout=PIPE, stderr=PIPE, cwd=self.cwd, env=testenv)
         stdout, stderr = p.communicate()
@@ -181,9 +183,6 @@
         '"%s"' % message,
         *extra_args,
         HGUSER='%s <%s>' % (name, email),
-        # If the user.useconfigonly config is set, Git won't try to auto-detect
-        # the name and email. For this case, we need to pass them as
-        # environment variables.
         GIT_AUTHOR_NAME=name,
         GIT_AUTHOR_EMAIL=email,
         GIT_COMMITTER_NAME=name,
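
The isolation described in the commit message can be sketched as a small
standalone function. This is only an illustration under assumptions:
`isolated_vcs_env` is a hypothetical name, and as far as I know the
GIT_CONFIG_SYSTEM and GIT_CONFIG_GLOBAL variables are only honored by
Git 2.32 and later, so older Git versions would still read the system
and per-user configuration files.

```python
import os


def isolated_vcs_env(base_env=None, **overrides):
    """Build an environment that skips per-user and system VCS configuration.

    Hypothetical helper; it mirrors the order used in the patch: defaults
    first, then caller-supplied overrides.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Mercurial: plain output, no hgrc files.
    env['HGPLAIN'] = ''
    env['HGRCPATH'] = ''
    # Git: point the system and global config at nothing.
    env['GIT_CONFIG_SYSTEM'] = ''
    env['GIT_CONFIG_GLOBAL'] = ''
    # Per-call settings (for example GIT_COMMITTER_NAME) win over the defaults.
    env.update(overrides)
    return env
```

The returned dictionary can be passed as the `env` argument of
`subprocess.Popen` or `subprocess.run`.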



Re: Timeout with git clone

2023-04-18 Thread Quentin Wenger
Hi,

Thanks for the answers and comments.

> Yes, I agree that it probably would be much better to go back to using
> dulwich both for protocol serving and for providing data for the web
> frontend, instead of forking out to git. Disclaimer: I don't know how
> fast dulwich is these days. It could perhaps also be relevant to
> research what other Python git hosting solutions do.

Are there other Python git hosting solutions? The very reason I'm here is that 
I didn't really find anything else...

> If interested in contributing in this area, a first step could be to
> create a proof of concept of switching back to Dulwich and doing some
> benchmarks - both for local cloning with infinite network bandwidth
> (where I doubt dulwich can match pure git) and for more realistic remote
> internet bandwidth (where I guess it doesn't matter much).

Sounds like a good plan. I don't know if I'll find the time, but I'll try.

> But also note that subprocessio is no longer used only by pygrack. It is
> also used for run_git_command in
> kallithea/lib/vcs/backends/git/repository.py (introduced in
> 1f4d4b8d72f5), mainly for cloning and listing changesets. A full
> solution would require somehow replacing run_git_command with dulwich.
> But that can be done one use at a time.

Yes I'm aware of that.

Kind regards,
Quentin


Re: Timeout with git clone

2023-04-18 Thread Mads Kiilerich

Hi

The code in this area has worked surprisingly well since Kallithea 
inherited it, even though it has popped up regularly needing tricky 
maintenance. I agree it would be nice to refactor / reimplement this 
area. It is just that nobody invested time or sponsorship in doing it. I 
guess it hasn't caused enough pain for anybody to justify it ;-)


Yes, I agree that it probably would be much better to go back to using 
dulwich both for protocol serving and for providing data for the web 
frontend, instead of forking out to git. Disclaimer: I don't know how 
fast dulwich is these days. It could perhaps also be relevant to 
research what other Python git hosting solutions do.


If interested in contributing in this area, a first step could be to 
create a proof of concept of switching back to Dulwich and doing some 
benchmarks - both for local cloning with infinite network bandwidth 
(where I doubt dulwich can match pure git) and for more realistic remote 
internet bandwidth (where I guess it doesn't matter much).
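
A timing harness for such a benchmark might look like the sketch below.
It is only an assumption about how one could measure it, not existing
Kallithea code: `clone_with_git` and `time_clone` are hypothetical
names, and the idea is to run it against the same repository served
once by the current implementation and once by the proof of concept.

```python
import shutil
import subprocess
import tempfile
import time


def clone_with_git(source, target):
    """Clone source into target by running the git executable."""
    subprocess.run(['git', 'clone', '--quiet', source, target], check=True)


def time_clone(source, repeats=3, clone_func=clone_with_git):
    """Return the best wall-clock time in seconds over `repeats` fresh clones."""
    timings = []
    for _ in range(repeats):
        # Each run clones into a new, empty temporary directory.
        target = tempfile.mkdtemp(prefix='clone-bench-')
        try:
            start = time.perf_counter()
            clone_func(source, target)
            timings.append(time.perf_counter() - start)
        finally:
            shutil.rmtree(target, ignore_errors=True)
    return min(timings)
```

Taking the minimum over several runs reduces noise from caches and
other processes; for the remote case the network will probably dominate
anyway.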


But also note that subprocessio is no longer used only by pygrack. It is 
also used for run_git_command in 
kallithea/lib/vcs/backends/git/repository.py (introduced in 
1f4d4b8d72f5), mainly for cloning and listing changesets. A full 
solution would require somehow replacing run_git_command with dulwich. 
But that can be done one use at a time.


/Mads


On 18/04/2023 16:55, Quentin Wenger wrote:

> Digging a bit deeper:
>
> - The changeset that you linked
> (https://kallithea-scm.org/repos/kallithea/changeset/034e4fe1ebb2#rhodecodelibsubprocessiopy_n127)
> actually shows that historically it went the other way round, that is, at first
> dulwich's server was used but then considered "buggy", therefore the
> implementation was replaced by some custom code.
>
> - That custom code appears to come from
> https://github.com/dvdotsenko/git_http_backend.py. That repo hasn't been
> updated since 2012, and its forks show no sign of recent activity either.
>
> - In contrast, dulwich, while officially still in beta, is actively developed.
>
> IMHO the proper move would be to go back to dulwich. Chances are that those
> buggy things have been fixed in the last ten years. And if they haven't, better
> report them upstream than reinvent the wheel. By the way, do we have any more
> precise idea of what was considered buggy at the time?
>
> What do you think?





Re: Timeout with git clone

2023-04-18 Thread Quentin Wenger
Digging a bit deeper:

- The changeset that you linked 
(https://kallithea-scm.org/repos/kallithea/changeset/034e4fe1ebb2#rhodecodelibsubprocessiopy_n127)
 actually shows that historically it went the other way round, that is, at first 
dulwich's server was used but then considered "buggy", therefore the 
implementation was replaced by some custom code.

- That custom code appears to come from 
https://github.com/dvdotsenko/git_http_backend.py. That repo hasn't been 
updated since 2012, and its forks show no sign of recent activity either.

- In contrast, dulwich, while officially still in beta, is actively developed.

IMHO the proper move would be to go back to dulwich. Chances are that those 
buggy things have been fixed in the last ten years. And if they haven't, better 
report them upstream than reinvent the wheel. By the way, do we have any more 
precise idea of what was considered buggy at the time?

What do you think?


Re: Timeout with git clone

2023-04-18 Thread Quentin Wenger
Hi Mads,

I can try that, but I'm a bit worried that it is monkey-patching and 
half-solving at best. And the fact that this area is considered "obscure code" 
is even worse.

Trying to get a broader picture: There are comments like `TODO: This function 
now uses os underlying 'git' command which is generally not good.` all over the 
place. Maybe there should be a larger refactoring of the git backend taking 
place, where all uses of native Git are replaced by dulwich? That way the 
cryptic code in lib/vcs/subprocessio.py will also go away.

Is there any specific reason that those TODOs haven't been handled so far, 
apart from limited dev resources?

Thanks,
Quentin