Re: Python 3 compatibility issue in tools/hook-scripts/mailer/mailer.py
On 12/11/23 4:54 AM, Yasuhito FUTATSUKI wrote: > On 2023/12/10 4:22, Yasuhito FUTATSUKI wrote: >> Hi, >> >> On 2023/12/09 0:04, Daniel Sahlberg wrote: >> >>> Den fre 8 dec. 2023 kl 05:40 skrev Yasuhito FUTATSUKI < >>> futat...@yf.bsdclub.org>: >> Index: tools/hook-scripts/mailer/mailer.py === --- tools/hook-scripts/mailer/mailer.py (revision 1913728) +++ tools/hook-scripts/mailer/mailer.py (working copy) @@ -488,7 +488,7 @@ # collect the set of groups and the unique sets of params for the options self.groups = { } for path, change in self.changelist: - for (group, params) in self.cfg.which_groups(path, log): + for (group, params) in self.cfg.which_groups(to_str(path), log): # turn the params into a hashable object and stash it away param_list = sorted(params.items()) # collect the set of paths belonging to this group @@ -1486,9 +1486,9 @@ "Return the path's associated groups." groups = [] for group, pattern, exclude_pattern, repos_params, search_logmsg_re in self._group_re: - match = pattern.match(to_str(path)) + match = pattern.match(path) if match: -if exclude_pattern and exclude_pattern.match(to_str(path)): +if exclude_pattern and exclude_pattern.match(path): continue params = repos_params.copy() params.update(match.groupdict()) ]]] Cheers, -- Yasuhito FUTATSUKI >>> >>> This looks good to me! Thanks for the detailed explaination! >> >> Thank you for the review. However, it turned out that even with this >> patch, mailer.py did not work for post-revprop-change hook. >> It caused exception like >> >> [[[ >> svn: E165001: post-revprop-change hook failed (exit code 1) with output: >> Traceback (most recent call last): >> File >> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", >> line 1593, in >> ret = svn.core.run_app(main, cmd, config_fname, repos_dir, >> File "/usr/local/lib/python3.9/site-packages/svn/core.py", line 324, in >> run_app >> return func(application_pool, *args, **kw) >> File >> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", >> line 148, in main >> return messenger.generate(output, pool) >> File >> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", >> line 601, in generate >> output.run(self.cfg.get_diff_cmd(group, { >> File >> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", >> line 224, in run >> self.write_binary(buf) >> AttributeError: 'SMTPOutput' object has no attribute 'write_binary' >> ]]] >> >> On 1.14.x branch, the patch can be applied clearly, and it worked >> for post-revprop-change hook. > > On 1.14.x without this patch, it mailer.py does not work for revprop-change, > lock, and unlock if at least one group definithion with "for_paths" in > mailer.conf, and the patch fixes it. > > (My non-generalized test script did not checked the case that mailer.py > contains for_paths definitions. That is why I missed it.) > > So I'll commit it and nominate for backport, although it is too late > 1.14.3. Thanks this patch did the trick. Regards Rüdiger
Re: Python 3 compatibility issue in tools/hook-scripts/mailer/mailer.py
Oh, shoot. And I was trying to be incremental/careful. Thank you for catching that! I also like how you fixed this. The .run() method did need to be removed because of that singular usage. Using the new generate_diff() function is a great solution! (whereas we used to have an object, which made it unclear on how to use it for this situation) I've applied your patch in r1914729. Thanks! -g On Sat, Dec 16, 2023 at 6:08 AM Yasuhito FUTATSUKI wrote: > On 2023/12/11 12:54, Yasuhito FUTATSUKI wrote: > > On 2023/12/10 4:22, Yasuhito FUTATSUKI wrote: > > >> Thank you for the review. However, it turned out that even with this > >> patch, mailer.py did not work for post-revprop-change hook. > >> It caused exception like > >> > >> [[[ > >> svn: E165001: post-revprop-change hook failed (exit code 1) with output: > >> Traceback (most recent call last): > >> File > "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", > line 1593, in > >> ret = svn.core.run_app(main, cmd, config_fname, repos_dir, > >> File "/usr/local/lib/python3.9/site-packages/svn/core.py", line 324, > in run_app > >> return func(application_pool, *args, **kw) > >> File > "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", > line 148, in main > >> return messenger.generate(output, pool) > >> File > "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", > line 601, in generate > >> output.run(self.cfg.get_diff_cmd(group, { > >> File > "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", > line 224, in run > >> self.write_binary(buf) > >> AttributeError: 'SMTPOutput' object has no attribute 'write_binary' > >> ]]] > > The cause was that Output.run() was broken by removal of > Output.write_binary() on r1912978. > > Here is an ad hoc patch: > [[[ > Fix PropChange.generate > > * tools/hook-scripts/mailer/mailer.py > (OutputBase.run): remove, because below was the only usage. > (PropChange.generate): > use generate_diff() to render propchange diff instead of > OutputBase.run() > > Index: tools/hook-scripts/mailer/mailer.py > === > --- tools/hook-scripts/mailer/mailer.py (revision 1914700) > +++ tools/hook-scripts/mailer/mailer.py (working copy) > @@ -211,23 +211,7 @@ > representation.""" > raise NotImplementedError > > - def run(self, cmd): > -"""Override this method, if the default implementation is not > sufficient. > -Execute CMD, writing the stdout produced to the output > representation.""" > -# By default we choose to incorporate child stderr into the output > -pipe_ob = subprocess.Popen(cmd, stdout=subprocess.PIPE, > - stderr=subprocess.STDOUT, > - close_fds=sys.platform != "win32") > > -buf = pipe_ob.stdout.read(self._CHUNKSIZE) > -while buf: > - self.write_binary(buf) > - buf = pipe_ob.stdout.read(self._CHUNKSIZE) > - > -# wait on the child so we don't end up with a billion zombies > -pipe_ob.wait() > - > - > class MailedOutput(OutputBase): > >def start(self, subject_line, group, params): > @@ -598,12 +582,13 @@ >tempfile2 = tempfile.NamedTemporaryFile() >tempfile2.write(self.repos.get_rev_prop(self.propname, > scratch_pool)) >tempfile2.flush() > - output.run(self.cfg.get_diff_cmd(group, { > -'label_from' : 'old property value', > -'label_to' : 'new property value', > -'from' : tempfile1.name, > -'to' : tempfile2.name, > -})) > + for diffs in generate_diff(self.cfg.get_diff_cmd(group, { > + 'label_from' : 'old property value', > + 'label_to' : 'new property value', > + 'from' : tempfile1.name, > + 'to' : tempfile2.name, > + })): > + writer.write(to_str(diffs.raw)) > output.finish() >except MessageSendFailure: > ret = 1 > ]]] > > Cheers, > -- > Yasuhito FUTATSUKI / >
Re: Python 3 compatibility issue in tools/hook-scripts/mailer/mailer.py
On 2023/12/11 12:54, Yasuhito FUTATSUKI wrote: > On 2023/12/10 4:22, Yasuhito FUTATSUKI wrote: >> Thank you for the review. However, it turned out that even with this >> patch, mailer.py did not work for post-revprop-change hook. >> It caused exception like >> >> [[[ >> svn: E165001: post-revprop-change hook failed (exit code 1) with output: >> Traceback (most recent call last): >> File >> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", >> line 1593, in >> ret = svn.core.run_app(main, cmd, config_fname, repos_dir, >> File "/usr/local/lib/python3.9/site-packages/svn/core.py", line 324, in >> run_app >> return func(application_pool, *args, **kw) >> File >> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", >> line 148, in main >> return messenger.generate(output, pool) >> File >> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", >> line 601, in generate >> output.run(self.cfg.get_diff_cmd(group, { >> File >> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", >> line 224, in run >> self.write_binary(buf) >> AttributeError: 'SMTPOutput' object has no attribute 'write_binary' >> ]]] The cause was that Output.run() was broken by removal of Output.write_binary() on r1912978. Here is an ad hoc patch: [[[ Fix PropChange.generate * tools/hook-scripts/mailer/mailer.py (OutputBase.run): remove, because below was the only usage. (PropChange.generate): use generate_diff() to render propchange diff instead of OutputBase.run() Index: tools/hook-scripts/mailer/mailer.py === --- tools/hook-scripts/mailer/mailer.py (revision 1914700) +++ tools/hook-scripts/mailer/mailer.py (working copy) @@ -211,23 +211,7 @@ representation.""" raise NotImplementedError - def run(self, cmd): -"""Override this method, if the default implementation is not sufficient. -Execute CMD, writing the stdout produced to the output representation.""" -# By default we choose to incorporate child stderr into the output -pipe_ob = subprocess.Popen(cmd, stdout=subprocess.PIPE, - stderr=subprocess.STDOUT, - close_fds=sys.platform != "win32") -buf = pipe_ob.stdout.read(self._CHUNKSIZE) -while buf: - self.write_binary(buf) - buf = pipe_ob.stdout.read(self._CHUNKSIZE) - -# wait on the child so we don't end up with a billion zombies -pipe_ob.wait() - - class MailedOutput(OutputBase): def start(self, subject_line, group, params): @@ -598,12 +582,13 @@ tempfile2 = tempfile.NamedTemporaryFile() tempfile2.write(self.repos.get_rev_prop(self.propname, scratch_pool)) tempfile2.flush() - output.run(self.cfg.get_diff_cmd(group, { -'label_from' : 'old property value', -'label_to' : 'new property value', -'from' : tempfile1.name, -'to' : tempfile2.name, -})) + for diffs in generate_diff(self.cfg.get_diff_cmd(group, { + 'label_from' : 'old property value', + 'label_to' : 'new property value', + 'from' : tempfile1.name, + 'to' : tempfile2.name, + })): + writer.write(to_str(diffs.raw)) output.finish() except MessageSendFailure: ret = 1 ]]] Cheers, -- Yasuhito FUTATSUKI /
Re: Python 3 compatibility issue in tools/hook-scripts/mailer/mailer.py
On 2023/12/10 4:22, Yasuhito FUTATSUKI wrote: > Hi, > > On 2023/12/09 0:04, Daniel Sahlberg wrote: > >> Den fre 8 dec. 2023 kl 05:40 skrev Yasuhito FUTATSUKI < >> futat...@yf.bsdclub.org>: > > > >>> Then, how about this patch? It at least mailer-t1.sh passed both >>> on python=python2.7 and on python=python3.9. >>> >>> [[[ >>> Fix inconsistency in path argment on Config.which_group() >>> >>> Previously, we call Config.which_group() with path as bytes in >>> Commit.__init__() and with path as str in Lock.__init__() and >>> in PropChange.__init__(), but Config.which_group handled path >>> argment as bytes. To fix it we only use str as path argment on >>> Config.wich_group(). >>> >>> * tools/hook-scripts/mailer/mailer.py >>> (Config.which_groups): Treat path as str. >>> (Commit.__init__): convert bytes path into str before calling above. >>> >>> Found by: Ruediger Pluem (rpluem {_AT_} apache.org) >>> >>> Index: tools/hook-scripts/mailer/mailer.py >>> === >>> --- tools/hook-scripts/mailer/mailer.py (revision 1913728) >>> +++ tools/hook-scripts/mailer/mailer.py (working copy) >>> @@ -488,7 +488,7 @@ >>> # collect the set of groups and the unique sets of params for the >>> options >>> self.groups = { } >>> for path, change in self.changelist: >>> - for (group, params) in self.cfg.which_groups(path, log): >>> + for (group, params) in self.cfg.which_groups(to_str(path), log): >>> # turn the params into a hashable object and stash it away >>> param_list = sorted(params.items()) >>> # collect the set of paths belonging to this group >>> @@ -1486,9 +1486,9 @@ >>> "Return the path's associated groups." >>> groups = [] >>> for group, pattern, exclude_pattern, repos_params, search_logmsg_re >>> in self._group_re: >>> - match = pattern.match(to_str(path)) >>> + match = pattern.match(path) >>>if match: >>> -if exclude_pattern and exclude_pattern.match(to_str(path)): >>> +if exclude_pattern and exclude_pattern.match(path): >>>continue >>> params = repos_params.copy() >>> params.update(match.groupdict()) >>> ]]] >>> >>> Cheers, >>> -- >>> Yasuhito FUTATSUKI >>> >> >> This looks good to me! Thanks for the detailed explaination! > > Thank you for the review. However, it turned out that even with this > patch, mailer.py did not work for post-revprop-change hook. > It caused exception like > > [[[ > svn: E165001: post-revprop-change hook failed (exit code 1) with output: > Traceback (most recent call last): > File > "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", > line 1593, in > ret = svn.core.run_app(main, cmd, config_fname, repos_dir, > File "/usr/local/lib/python3.9/site-packages/svn/core.py", line 324, in > run_app > return func(application_pool, *args, **kw) > File > "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", > line 148, in main > return messenger.generate(output, pool) > File > "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", > line 601, in generate > output.run(self.cfg.get_diff_cmd(group, { > File > "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", > line 224, in run > self.write_binary(buf) > AttributeError: 'SMTPOutput' object has no attribute 'write_binary' > ]]] > > On 1.14.x branch, the patch can be applied clearly, and it worked > for post-revprop-change hook. On 1.14.x without this patch, it mailer.py does not work for revprop-change, lock, and unlock if at least one group definithion with "for_paths" in mailer.conf, and the patch fixes it. (My non-generalized test script did not checked the case that mailer.py contains for_paths definitions. That is why I missed it.) So I'll commit it and nominate for backport, although it is too late 1.14.3. Cheers, -- Yasuhito FUTATSUKI /
Re: Python 3 compatibility issue in tools/hook-scripts/mailer/mailer.py
Hi, On 2023/12/09 0:04, Daniel Sahlberg wrote: > Den fre 8 dec. 2023 kl 05:40 skrev Yasuhito FUTATSUKI < > futat...@yf.bsdclub.org>: >> Then, how about this patch? It at least mailer-t1.sh passed both >> on python=python2.7 and on python=python3.9. >> >> [[[ >> Fix inconsistency in path argment on Config.which_group() >> >> Previously, we call Config.which_group() with path as bytes in >> Commit.__init__() and with path as str in Lock.__init__() and >> in PropChange.__init__(), but Config.which_group handled path >> argment as bytes. To fix it we only use str as path argment on >> Config.wich_group(). >> >> * tools/hook-scripts/mailer/mailer.py >> (Config.which_groups): Treat path as str. >> (Commit.__init__): convert bytes path into str before calling above. >> >> Found by: Ruediger Pluem (rpluem {_AT_} apache.org) >> >> Index: tools/hook-scripts/mailer/mailer.py >> === >> --- tools/hook-scripts/mailer/mailer.py (revision 1913728) >> +++ tools/hook-scripts/mailer/mailer.py (working copy) >> @@ -488,7 +488,7 @@ >> # collect the set of groups and the unique sets of params for the >> options >> self.groups = { } >> for path, change in self.changelist: >> - for (group, params) in self.cfg.which_groups(path, log): >> + for (group, params) in self.cfg.which_groups(to_str(path), log): >> # turn the params into a hashable object and stash it away >> param_list = sorted(params.items()) >> # collect the set of paths belonging to this group >> @@ -1486,9 +1486,9 @@ >> "Return the path's associated groups." >> groups = [] >> for group, pattern, exclude_pattern, repos_params, search_logmsg_re >> in self._group_re: >> - match = pattern.match(to_str(path)) >> + match = pattern.match(path) >>if match: >> -if exclude_pattern and exclude_pattern.match(to_str(path)): >> +if exclude_pattern and exclude_pattern.match(path): >>continue >> params = repos_params.copy() >> params.update(match.groupdict()) >> ]]] >> >> Cheers, >> -- >> Yasuhito FUTATSUKI >> > > This looks good to me! Thanks for the detailed explaination! Thank you for the review. However, it turned out that even with this patch, mailer.py did not work for post-revprop-change hook. It caused exception like [[[ svn: E165001: post-revprop-change hook failed (exit code 1) with output: Traceback (most recent call last): File "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", line 1593, in ret = svn.core.run_app(main, cmd, config_fname, repos_dir, File "/usr/local/lib/python3.9/site-packages/svn/core.py", line 324, in run_app return func(application_pool, *args, **kw) File "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", line 148, in main return messenger.generate(output, pool) File "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", line 601, in generate output.run(self.cfg.get_diff_cmd(group, { File "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", line 224, in run self.write_binary(buf) AttributeError: 'SMTPOutput' object has no attribute 'write_binary' ]]] On 1.14.x branch, the patch can be applied clearly, and it worked for post-revprop-change hook. Cheers, -- Yasuhito FUTATSUKI /
Re: Python 3 compatibility issue in tools/hook-scripts/mailer/mailer.py
Hi, Den fre 8 dec. 2023 kl 05:40 skrev Yasuhito FUTATSUKI < futat...@yf.bsdclub.org>: > > > On 2023/12/07 19:33, Daniel Sahlberg wrote: > > Den tors 7 dec. 2023 kl 09:51 skrev Ruediger Pluem : > > > >> I stumbled accross a Python 3 compatibility issue in > >> tools/hook-scripts/mailer/mailer.py. > >> > >> The call of self.cfg.which_groups in line 565 passes an empty string as > >> first parameter. > >> In which_groups this empty string is passed to to_str in line 1489. > >> In line 88 to_str does x.decode('utf-8') > >> In Python 2 str objects have a decode method at least in later Python 3 > >> versions they have not. Hence I propose the following patch which fixes > the > >> issue for me: > > > > There was some work done on mailer.py by Greg Stein, started here: > > https://lists.apache.org/thread/v5bh1j8qz8kk0jv1tlctxqt1k454tz0h > > However I don't think that touched these parts of the code. > > Yes, I also think the issue has existed before it. > > > I think there is more to this than that simple fix, but I'm no expert in > > Python or in the Python bindings. There are two more calls to > > which_groups(): > > > > Line 491, called with the return from > > svn.repos.ChangeCollector(...).get_changes().items() (first argument in a > > tuple). I don't know which encoding this uses. > > We decided that for all C APIs which returns char * values, their Python 3 > wrapper functions return them as bytes object, and also we don't convert > those values into str within helper functions in svn modules. Thus their > path elements should be bytes object, and as they are internal paths, > their encoding is UTF-8, regardless of locale. > > > Line 663, called with the return from sys.stdin.buffer.readlines(), > already > > processed by to_str() once. > > > > In particular the call on 663 should be double decoded unless I'm missing > > something. > > Yes, there is inconsistency here. > > > So depending on what get_changes() return, maybe we should remove the > > to_str() from which_groups() instead? > > As Config.which_group() call from Commit.__init__() has worked correctly, > the pathes as matching pattern in which_group() are expected as str. > So if remove to_str() from which_groups(), path argment for > Config.which_group() should be a str object. > > > Note that we are still somewhat supporting Python 2 so we need to make > sure > > any changes does work on Python 2 as well. > > Then, how about this patch? It at least mailer-t1.sh passed both > on python=python2.7 and on python=python3.9. > > [[[ > Fix inconsistency in path argment on Config.which_group() > > Previously, we call Config.which_group() with path as bytes in > Commit.__init__() and with path as str in Lock.__init__() and > in PropChange.__init__(), but Config.which_group handled path > argment as bytes. To fix it we only use str as path argment on > Config.wich_group(). > > * tools/hook-scripts/mailer/mailer.py > (Config.which_groups): Treat path as str. > (Commit.__init__): convert bytes path into str before calling above. > > Found by: Ruediger Pluem (rpluem {_AT_} apache.org) > > Index: tools/hook-scripts/mailer/mailer.py > === > --- tools/hook-scripts/mailer/mailer.py (revision 1913728) > +++ tools/hook-scripts/mailer/mailer.py (working copy) > @@ -488,7 +488,7 @@ > # collect the set of groups and the unique sets of params for the > options > self.groups = { } > for path, change in self.changelist: > - for (group, params) in self.cfg.which_groups(path, log): > + for (group, params) in self.cfg.which_groups(to_str(path), log): > # turn the params into a hashable object and stash it away > param_list = sorted(params.items()) > # collect the set of paths belonging to this group > @@ -1486,9 +1486,9 @@ > "Return the path's associated groups." > groups = [] > for group, pattern, exclude_pattern, repos_params, search_logmsg_re > in self._group_re: > - match = pattern.match(to_str(path)) > + match = pattern.match(path) >if match: > -if exclude_pattern and exclude_pattern.match(to_str(path)): > +if exclude_pattern and exclude_pattern.match(path): >continue > params = repos_params.copy() > params.update(match.groupdict()) > ]]] > > Cheers, > -- > Yasuhito FUTATSUKI > This looks good to me! Thanks for the detailed explaination! Kind regards, Daniel Sahlberg
Re: Python 3 compatibility issue in tools/hook-scripts/mailer/mailer.py
On 2023/12/07 19:33, Daniel Sahlberg wrote: > Den tors 7 dec. 2023 kl 09:51 skrev Ruediger Pluem : > >> I stumbled accross a Python 3 compatibility issue in >> tools/hook-scripts/mailer/mailer.py. >> >> The call of self.cfg.which_groups in line 565 passes an empty string as >> first parameter. >> In which_groups this empty string is passed to to_str in line 1489. >> In line 88 to_str does x.decode('utf-8') >> In Python 2 str objects have a decode method at least in later Python 3 >> versions they have not. Hence I propose the following patch which fixes the >> issue for me: > There was some work done on mailer.py by Greg Stein, started here: > https://lists.apache.org/thread/v5bh1j8qz8kk0jv1tlctxqt1k454tz0h > However I don't think that touched these parts of the code. Yes, I also think the issue has existed before it. > I think there is more to this than that simple fix, but I'm no expert in > Python or in the Python bindings. There are two more calls to > which_groups(): > > Line 491, called with the return from > svn.repos.ChangeCollector(...).get_changes().items() (first argument in a > tuple). I don't know which encoding this uses. We decided that for all C APIs which returns char * values, their Python 3 wrapper functions return them as bytes object, and also we don't convert those values into str within helper functions in svn modules. Thus their path elements should be bytes object, and as they are internal paths, their encoding is UTF-8, regardless of locale. > Line 663, called with the return from sys.stdin.buffer.readlines(), already > processed by to_str() once. > > In particular the call on 663 should be double decoded unless I'm missing > something. Yes, there is inconsistency here. > So depending on what get_changes() return, maybe we should remove the > to_str() from which_groups() instead? As Config.which_group() call from Commit.__init__() has worked correctly, the pathes as matching pattern in which_group() are expected as str. So if remove to_str() from which_groups(), path argment for Config.which_group() should be a str object. > Note that we are still somewhat supporting Python 2 so we need to make sure > any changes does work on Python 2 as well. Then, how about this patch? It at least mailer-t1.sh passed both on python=python2.7 and on python=python3.9. [[[ Fix inconsistency in path argment on Config.which_group() Previously, we call Config.which_group() with path as bytes in Commit.__init__() and with path as str in Lock.__init__() and in PropChange.__init__(), but Config.which_group handled path argment as bytes. To fix it we only use str as path argment on Config.wich_group(). * tools/hook-scripts/mailer/mailer.py (Config.which_groups): Treat path as str. (Commit.__init__): convert bytes path into str before calling above. Found by: Ruediger Pluem (rpluem {_AT_} apache.org) Index: tools/hook-scripts/mailer/mailer.py === --- tools/hook-scripts/mailer/mailer.py (revision 1913728) +++ tools/hook-scripts/mailer/mailer.py (working copy) @@ -488,7 +488,7 @@ # collect the set of groups and the unique sets of params for the options self.groups = { } for path, change in self.changelist: - for (group, params) in self.cfg.which_groups(path, log): + for (group, params) in self.cfg.which_groups(to_str(path), log): # turn the params into a hashable object and stash it away param_list = sorted(params.items()) # collect the set of paths belonging to this group @@ -1486,9 +1486,9 @@ "Return the path's associated groups." groups = [] for group, pattern, exclude_pattern, repos_params, search_logmsg_re in self._group_re: - match = pattern.match(to_str(path)) + match = pattern.match(path) if match: -if exclude_pattern and exclude_pattern.match(to_str(path)): +if exclude_pattern and exclude_pattern.match(path): continue params = repos_params.copy() params.update(match.groupdict()) ]]] Cheers, -- Yasuhito FUTATSUKI
Re: Python 3 compatibility issue in tools/hook-scripts/mailer/mailer.py
Hello, On 2023/12/07 17:51, Ruediger Pluem wrote: I stumbled accross a Python 3 compatibility issue in tools/hook-scripts/mailer/mailer.py. The call of self.cfg.which_groups in line 565 passes an empty string as first parameter. In which_groups this empty string is passed to to_str in line 1489. In line 88 to_str does x.decode('utf-8') In Python 2 str objects have a decode method at least in later Python 3 versions they have not. Hence I propose the following patch which fixes the issue for me: Thank you for the report and proposing a patch. This was caused by my overlook. I've checked the usage of Config.which_groups(), then it turned out that only the call from Commits.__init__ passes a path as a bytes, and calls from Look.__init__ and PropChange.__init__ pass a path as str. More over, in Config.which_groups(), the path argment is always converted to str when it is used. So I prefer changing Config.which_groups to accept str path, and also changing Commits.__init__ to pass a str path. Index: mailer.py === --- mailer.py (revision 1914422) +++ mailer.py (working copy) @@ -562,7 +562,7 @@ # collect the set of groups and the unique sets of params for the options self.groups = { } -for (group, params) in self.cfg.which_groups('', None): +for (group, params) in self.cfg.which_groups(b'', None): # turn the params into a hashable object and stash it away param_list = sorted(params.items()) self.groups[group, tuple(param_list)] = params If I would get a go ahead here on list I would commit. Regards Rüdiger Thanks, -- Yasuhito FUTATSUKI
Re: Python 3 compatibility issue in tools/hook-scripts/mailer/mailer.py
Den tors 7 dec. 2023 kl 09:51 skrev Ruediger Pluem : > I stumbled accross a Python 3 compatibility issue in > tools/hook-scripts/mailer/mailer.py. > > The call of self.cfg.which_groups in line 565 passes an empty string as > first parameter. > In which_groups this empty string is passed to to_str in line 1489. > In line 88 to_str does x.decode('utf-8') > In Python 2 str objects have a decode method at least in later Python 3 > versions they have not. Hence I propose the following patch which fixes the > issue for me: > > Index: mailer.py > === > --- mailer.py (revision 1914422) > +++ mailer.py (working copy) > @@ -562,7 +562,7 @@ > > # collect the set of groups and the unique sets of params for the > options > self.groups = { } > -for (group, params) in self.cfg.which_groups('', None): > +for (group, params) in self.cfg.which_groups(b'', None): ># turn the params into a hashable object and stash it away >param_list = sorted(params.items()) >self.groups[group, tuple(param_list)] = params > > > If I would get a go ahead here on list I would commit. > > Regards > > Rüdiger > There was some work done on mailer.py by Greg Stein, started here: https://lists.apache.org/thread/v5bh1j8qz8kk0jv1tlctxqt1k454tz0h However I don't think that touched these parts of the code. I think there is more to this than that simple fix, but I'm no expert in Python or in the Python bindings. There are two more calls to which_groups(): Line 491, called with the return from svn.repos.ChangeCollector(...).get_changes().items() (first argument in a tuple). I don't know which encoding this uses. Line 663, called with the return from sys.stdin.buffer.readlines(), already processed by to_str() once. In particular the call on 663 should be double decoded unless I'm missing something. So depending on what get_changes() return, maybe we should remove the to_str() from which_groups() instead? Note that we are still somewhat supporting Python 2 so we need to make sure any changes does work on Python 2 as well. Kind regards, Daniel