Re: Python 3 compatibility issue in tools/hook-scripts/mailer/mailer.py

2023-12-22 Thread Ruediger Pluem



On 12/11/23 4:54 AM, Yasuhito FUTATSUKI wrote:
> On 2023/12/10 4:22, Yasuhito FUTATSUKI wrote:
>> Hi,
>>
>> On 2023/12/09 0:04, Daniel Sahlberg wrote:
>>
>>> Den fre 8 dec. 2023 kl 05:40 skrev Yasuhito FUTATSUKI <
>>> futat...@yf.bsdclub.org>:
>>

 Index: tools/hook-scripts/mailer/mailer.py
 ===
 --- tools/hook-scripts/mailer/mailer.py (revision 1913728)
 +++ tools/hook-scripts/mailer/mailer.py (working copy)
 @@ -488,7 +488,7 @@
  # collect the set of groups and the unique sets of params for the
 options
  self.groups = { }
  for path, change in self.changelist:
 -  for (group, params) in self.cfg.which_groups(path, log):
 +  for (group, params) in self.cfg.which_groups(to_str(path), log):
  # turn the params into a hashable object and stash it away
  param_list = sorted(params.items())
  # collect the set of paths belonging to this group
 @@ -1486,9 +1486,9 @@
  "Return the path's associated groups."
  groups = []
  for group, pattern, exclude_pattern, repos_params, search_logmsg_re
 in self._group_re:
 -  match = pattern.match(to_str(path))
 +  match = pattern.match(path)
if match:
 -if exclude_pattern and exclude_pattern.match(to_str(path)):
 +if exclude_pattern and exclude_pattern.match(path):
continue
  params = repos_params.copy()
  params.update(match.groupdict())
 ]]]

 Cheers,
 --
 Yasuhito FUTATSUKI 

>>>
>>> This looks good to me! Thanks for the detailed explaination!
>>
>> Thank you for the review. However, it turned out that even with this
>> patch, mailer.py did not work for post-revprop-change hook.
>> It caused exception like
>>
>> [[[
>> svn: E165001: post-revprop-change hook failed (exit code 1) with output:
>> Traceback (most recent call last):
>>   File 
>> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", 
>> line 1593, in 
>> ret = svn.core.run_app(main, cmd, config_fname, repos_dir,
>>   File "/usr/local/lib/python3.9/site-packages/svn/core.py", line 324, in 
>> run_app
>> return func(application_pool, *args, **kw)
>>   File 
>> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", 
>> line 148, in main
>> return messenger.generate(output, pool)
>>   File 
>> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", 
>> line 601, in generate
>> output.run(self.cfg.get_diff_cmd(group, {
>>   File 
>> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", 
>> line 224, in run
>> self.write_binary(buf)
>> AttributeError: 'SMTPOutput' object has no attribute 'write_binary'
>> ]]]
>>
>> On 1.14.x branch, the patch can be applied clearly, and it worked
>> for post-revprop-change hook.
> 
> On 1.14.x without this patch, it mailer.py does not work for revprop-change,
> lock, and unlock if at least one group definithion with "for_paths" in
> mailer.conf, and the patch fixes it.
> 
> (My non-generalized test script did not checked the case that mailer.py
> contains for_paths definitions. That is why I missed it.)
> 
> So I'll commit it and nominate for backport, although it is too late
> 1.14.3.

Thanks this patch did the trick.

Regards

Rüdiger




Re: Python 3 compatibility issue in tools/hook-scripts/mailer/mailer.py

2023-12-16 Thread Greg Stein
Oh, shoot. And I was trying to be incremental/careful. Thank you for
catching that!

I also like how you fixed this. The .run() method did need to be removed
because of that singular usage. Using the new generate_diff() function is a
great solution! (whereas we used to have an object, which made it unclear
on how to use it for this situation)

I've applied your patch in r1914729.

Thanks!
-g


On Sat, Dec 16, 2023 at 6:08 AM Yasuhito FUTATSUKI 
wrote:

> On 2023/12/11 12:54, Yasuhito FUTATSUKI wrote:
> > On 2023/12/10 4:22, Yasuhito FUTATSUKI wrote:
>
> >> Thank you for the review. However, it turned out that even with this
> >> patch, mailer.py did not work for post-revprop-change hook.
> >> It caused exception like
> >>
> >> [[[
> >> svn: E165001: post-revprop-change hook failed (exit code 1) with output:
> >> Traceback (most recent call last):
> >>   File
> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py",
> line 1593, in 
> >> ret = svn.core.run_app(main, cmd, config_fname, repos_dir,
> >>   File "/usr/local/lib/python3.9/site-packages/svn/core.py", line 324,
> in run_app
> >> return func(application_pool, *args, **kw)
> >>   File
> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py",
> line 148, in main
> >> return messenger.generate(output, pool)
> >>   File
> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py",
> line 601, in generate
> >> output.run(self.cfg.get_diff_cmd(group, {
> >>   File
> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py",
> line 224, in run
> >> self.write_binary(buf)
> >> AttributeError: 'SMTPOutput' object has no attribute 'write_binary'
> >> ]]]
>
> The cause was that Output.run() was broken by removal of
> Output.write_binary() on r1912978.
>
> Here is an ad hoc patch:
> [[[
> Fix PropChange.generate
>
> * tools/hook-scripts/mailer/mailer.py
>   (OutputBase.run): remove, because below was the only usage.
>   (PropChange.generate):
> use generate_diff() to render propchange diff instead of
> OutputBase.run()
>
> Index: tools/hook-scripts/mailer/mailer.py
> ===
> --- tools/hook-scripts/mailer/mailer.py (revision 1914700)
> +++ tools/hook-scripts/mailer/mailer.py (working copy)
> @@ -211,23 +211,7 @@
>  representation."""
>  raise NotImplementedError
>
> -  def run(self, cmd):
> -"""Override this method, if the default implementation is not
> sufficient.
> -Execute CMD, writing the stdout produced to the output
> representation."""
> -# By default we choose to incorporate child stderr into the output
> -pipe_ob = subprocess.Popen(cmd, stdout=subprocess.PIPE,
> -   stderr=subprocess.STDOUT,
> -   close_fds=sys.platform != "win32")
>
> -buf = pipe_ob.stdout.read(self._CHUNKSIZE)
> -while buf:
> -  self.write_binary(buf)
> -  buf = pipe_ob.stdout.read(self._CHUNKSIZE)
> -
> -# wait on the child so we don't end up with a billion zombies
> -pipe_ob.wait()
> -
> -
>  class MailedOutput(OutputBase):
>
>def start(self, subject_line, group, params):
> @@ -598,12 +582,13 @@
>tempfile2 = tempfile.NamedTemporaryFile()
>tempfile2.write(self.repos.get_rev_prop(self.propname,
> scratch_pool))
>tempfile2.flush()
> -  output.run(self.cfg.get_diff_cmd(group, {
> -'label_from' : 'old property value',
> -'label_to' : 'new property value',
> -'from' : tempfile1.name,
> -'to' : tempfile2.name,
> -}))
> +  for diffs in generate_diff(self.cfg.get_diff_cmd(group, {
> +  'label_from' : 'old property value',
> +  'label_to' : 'new property value',
> +  'from' : tempfile1.name,
> +  'to' : tempfile2.name,
> +  })):
> +  writer.write(to_str(diffs.raw))
>  output.finish()
>except MessageSendFailure:
>  ret = 1
> ]]]
>
> Cheers,
> --
> Yasuhito FUTATSUKI /
>


Re: Python 3 compatibility issue in tools/hook-scripts/mailer/mailer.py

2023-12-16 Thread Yasuhito FUTATSUKI
On 2023/12/11 12:54, Yasuhito FUTATSUKI wrote:
> On 2023/12/10 4:22, Yasuhito FUTATSUKI wrote:

>> Thank you for the review. However, it turned out that even with this
>> patch, mailer.py did not work for post-revprop-change hook.
>> It caused exception like
>>
>> [[[
>> svn: E165001: post-revprop-change hook failed (exit code 1) with output:
>> Traceback (most recent call last):
>>   File 
>> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", 
>> line 1593, in 
>> ret = svn.core.run_app(main, cmd, config_fname, repos_dir,
>>   File "/usr/local/lib/python3.9/site-packages/svn/core.py", line 324, in 
>> run_app
>> return func(application_pool, *args, **kw)
>>   File 
>> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", 
>> line 148, in main
>> return messenger.generate(output, pool)
>>   File 
>> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", 
>> line 601, in generate
>> output.run(self.cfg.get_diff_cmd(group, {
>>   File 
>> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", 
>> line 224, in run
>> self.write_binary(buf)
>> AttributeError: 'SMTPOutput' object has no attribute 'write_binary'
>> ]]]

The cause was that Output.run() was broken by removal of
Output.write_binary() on r1912978.

Here is an ad hoc patch:
[[[
Fix PropChange.generate

* tools/hook-scripts/mailer/mailer.py
  (OutputBase.run): remove, because below was the only usage.
  (PropChange.generate):
use generate_diff() to render propchange diff instead of OutputBase.run()

Index: tools/hook-scripts/mailer/mailer.py
===
--- tools/hook-scripts/mailer/mailer.py (revision 1914700)
+++ tools/hook-scripts/mailer/mailer.py (working copy)
@@ -211,23 +211,7 @@
 representation."""
 raise NotImplementedError

-  def run(self, cmd):
-"""Override this method, if the default implementation is not sufficient.
-Execute CMD, writing the stdout produced to the output representation."""
-# By default we choose to incorporate child stderr into the output
-pipe_ob = subprocess.Popen(cmd, stdout=subprocess.PIPE,
-   stderr=subprocess.STDOUT,
-   close_fds=sys.platform != "win32")

-buf = pipe_ob.stdout.read(self._CHUNKSIZE)
-while buf:
-  self.write_binary(buf)
-  buf = pipe_ob.stdout.read(self._CHUNKSIZE)
-
-# wait on the child so we don't end up with a billion zombies
-pipe_ob.wait()
-
-
 class MailedOutput(OutputBase):

   def start(self, subject_line, group, params):
@@ -598,12 +582,13 @@
   tempfile2 = tempfile.NamedTemporaryFile()
   tempfile2.write(self.repos.get_rev_prop(self.propname, scratch_pool))
   tempfile2.flush()
-  output.run(self.cfg.get_diff_cmd(group, {
-'label_from' : 'old property value',
-'label_to' : 'new property value',
-'from' : tempfile1.name,
-'to' : tempfile2.name,
-}))
+  for diffs in generate_diff(self.cfg.get_diff_cmd(group, {
+  'label_from' : 'old property value',
+  'label_to' : 'new property value',
+  'from' : tempfile1.name,
+  'to' : tempfile2.name,
+  })):
+  writer.write(to_str(diffs.raw))
 output.finish()
   except MessageSendFailure:
 ret = 1
]]]

Cheers,
-- 
Yasuhito FUTATSUKI /


Re: Python 3 compatibility issue in tools/hook-scripts/mailer/mailer.py

2023-12-10 Thread Yasuhito FUTATSUKI
On 2023/12/10 4:22, Yasuhito FUTATSUKI wrote:
> Hi,
> 
> On 2023/12/09 0:04, Daniel Sahlberg wrote:
> 
>> Den fre 8 dec. 2023 kl 05:40 skrev Yasuhito FUTATSUKI <
>> futat...@yf.bsdclub.org>:
> 
> 
> 
>>> Then, how about this patch? It at least mailer-t1.sh passed both
>>> on python=python2.7 and on python=python3.9.
>>>
>>> [[[
>>> Fix inconsistency in path argment on Config.which_group()
>>>
>>> Previously, we call Config.which_group() with path as bytes in
>>> Commit.__init__() and with path as str in Lock.__init__() and
>>> in PropChange.__init__(), but Config.which_group handled path
>>> argment as bytes. To fix it we only use str as path argment on
>>> Config.wich_group().
>>>
>>> * tools/hook-scripts/mailer/mailer.py
>>>   (Config.which_groups): Treat path as str.
>>>   (Commit.__init__): convert bytes path into str before calling above.
>>>
>>> Found by: Ruediger Pluem (rpluem {_AT_} apache.org)
>>>
>>> Index: tools/hook-scripts/mailer/mailer.py
>>> ===
>>> --- tools/hook-scripts/mailer/mailer.py (revision 1913728)
>>> +++ tools/hook-scripts/mailer/mailer.py (working copy)
>>> @@ -488,7 +488,7 @@
>>>  # collect the set of groups and the unique sets of params for the
>>> options
>>>  self.groups = { }
>>>  for path, change in self.changelist:
>>> -  for (group, params) in self.cfg.which_groups(path, log):
>>> +  for (group, params) in self.cfg.which_groups(to_str(path), log):
>>>  # turn the params into a hashable object and stash it away
>>>  param_list = sorted(params.items())
>>>  # collect the set of paths belonging to this group
>>> @@ -1486,9 +1486,9 @@
>>>  "Return the path's associated groups."
>>>  groups = []
>>>  for group, pattern, exclude_pattern, repos_params, search_logmsg_re
>>> in self._group_re:
>>> -  match = pattern.match(to_str(path))
>>> +  match = pattern.match(path)
>>>if match:
>>> -if exclude_pattern and exclude_pattern.match(to_str(path)):
>>> +if exclude_pattern and exclude_pattern.match(path):
>>>continue
>>>  params = repos_params.copy()
>>>  params.update(match.groupdict())
>>> ]]]
>>>
>>> Cheers,
>>> --
>>> Yasuhito FUTATSUKI 
>>>
>>
>> This looks good to me! Thanks for the detailed explaination!
> 
> Thank you for the review. However, it turned out that even with this
> patch, mailer.py did not work for post-revprop-change hook.
> It caused exception like
> 
> [[[
> svn: E165001: post-revprop-change hook failed (exit code 1) with output:
> Traceback (most recent call last):
>   File 
> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", 
> line 1593, in 
> ret = svn.core.run_app(main, cmd, config_fname, repos_dir,
>   File "/usr/local/lib/python3.9/site-packages/svn/core.py", line 324, in 
> run_app
> return func(application_pool, *args, **kw)
>   File 
> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", 
> line 148, in main
> return messenger.generate(output, pool)
>   File 
> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", 
> line 601, in generate
> output.run(self.cfg.get_diff_cmd(group, {
>   File 
> "/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", 
> line 224, in run
> self.write_binary(buf)
> AttributeError: 'SMTPOutput' object has no attribute 'write_binary'
> ]]]
> 
> On 1.14.x branch, the patch can be applied clearly, and it worked
> for post-revprop-change hook.

On 1.14.x without this patch, it mailer.py does not work for revprop-change,
lock, and unlock if at least one group definithion with "for_paths" in
mailer.conf, and the patch fixes it.

(My non-generalized test script did not checked the case that mailer.py
contains for_paths definitions. That is why I missed it.)

So I'll commit it and nominate for backport, although it is too late
1.14.3.

Cheers,
-- 
Yasuhito FUTATSUKI /


Re: Python 3 compatibility issue in tools/hook-scripts/mailer/mailer.py

2023-12-09 Thread Yasuhito FUTATSUKI
Hi,

On 2023/12/09 0:04, Daniel Sahlberg wrote:

> Den fre 8 dec. 2023 kl 05:40 skrev Yasuhito FUTATSUKI <
> futat...@yf.bsdclub.org>:



>> Then, how about this patch? It at least mailer-t1.sh passed both
>> on python=python2.7 and on python=python3.9.
>>
>> [[[
>> Fix inconsistency in path argment on Config.which_group()
>>
>> Previously, we call Config.which_group() with path as bytes in
>> Commit.__init__() and with path as str in Lock.__init__() and
>> in PropChange.__init__(), but Config.which_group handled path
>> argment as bytes. To fix it we only use str as path argment on
>> Config.wich_group().
>>
>> * tools/hook-scripts/mailer/mailer.py
>>   (Config.which_groups): Treat path as str.
>>   (Commit.__init__): convert bytes path into str before calling above.
>>
>> Found by: Ruediger Pluem (rpluem {_AT_} apache.org)
>>
>> Index: tools/hook-scripts/mailer/mailer.py
>> ===
>> --- tools/hook-scripts/mailer/mailer.py (revision 1913728)
>> +++ tools/hook-scripts/mailer/mailer.py (working copy)
>> @@ -488,7 +488,7 @@
>>  # collect the set of groups and the unique sets of params for the
>> options
>>  self.groups = { }
>>  for path, change in self.changelist:
>> -  for (group, params) in self.cfg.which_groups(path, log):
>> +  for (group, params) in self.cfg.which_groups(to_str(path), log):
>>  # turn the params into a hashable object and stash it away
>>  param_list = sorted(params.items())
>>  # collect the set of paths belonging to this group
>> @@ -1486,9 +1486,9 @@
>>  "Return the path's associated groups."
>>  groups = []
>>  for group, pattern, exclude_pattern, repos_params, search_logmsg_re
>> in self._group_re:
>> -  match = pattern.match(to_str(path))
>> +  match = pattern.match(path)
>>if match:
>> -if exclude_pattern and exclude_pattern.match(to_str(path)):
>> +if exclude_pattern and exclude_pattern.match(path):
>>continue
>>  params = repos_params.copy()
>>  params.update(match.groupdict())
>> ]]]
>>
>> Cheers,
>> --
>> Yasuhito FUTATSUKI 
>>
> 
> This looks good to me! Thanks for the detailed explaination!

Thank you for the review. However, it turned out that even with this
patch, mailer.py did not work for post-revprop-change hook.
It caused exception like

[[[
svn: E165001: post-revprop-change hook failed (exit code 1) with output:
Traceback (most recent call last):
  File 
"/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", line 
1593, in 
ret = svn.core.run_app(main, cmd, config_fname, repos_dir,
  File "/usr/local/lib/python3.9/site-packages/svn/core.py", line 324, in 
run_app
return func(application_pool, *args, **kw)
  File 
"/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", line 
148, in main
return messenger.generate(output, pool)
  File 
"/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", line 
601, in generate
output.run(self.cfg.get_diff_cmd(group, {
  File 
"/home/futatuki/tmp/svn-test/mailer_test/repo-smtpoutput/hooks/mailer.py", line 
224, in run
self.write_binary(buf)
AttributeError: 'SMTPOutput' object has no attribute 'write_binary'
]]]

On 1.14.x branch, the patch can be applied clearly, and it worked
for post-revprop-change hook.

Cheers,
-- 
Yasuhito FUTATSUKI /


Re: Python 3 compatibility issue in tools/hook-scripts/mailer/mailer.py

2023-12-08 Thread Daniel Sahlberg
Hi,

Den fre 8 dec. 2023 kl 05:40 skrev Yasuhito FUTATSUKI <
futat...@yf.bsdclub.org>:

>
>
> On 2023/12/07 19:33, Daniel Sahlberg wrote:
> > Den tors 7 dec. 2023 kl 09:51 skrev Ruediger Pluem :
> >
> >> I stumbled accross a Python 3 compatibility issue in
> >> tools/hook-scripts/mailer/mailer.py.
> >>
> >> The call of self.cfg.which_groups in line 565 passes an empty string as
> >> first parameter.
> >> In which_groups this empty string is passed to to_str in line 1489.
> >> In line 88 to_str does x.decode('utf-8')
> >> In Python 2 str objects have a decode method at least in later Python 3
> >> versions they have not. Hence I propose the following patch which fixes
> the
> >> issue for me:
> 
>
> > There was some work done on mailer.py by Greg Stein, started here:
> > https://lists.apache.org/thread/v5bh1j8qz8kk0jv1tlctxqt1k454tz0h
> > However I don't think that touched these parts of the code.
>
> Yes, I also think the issue has existed before it.
>
> > I think there is more to this than that simple fix, but I'm no expert in
> > Python or in the Python bindings. There are two more calls to
> > which_groups():
> >
> > Line 491, called with the return from
> > svn.repos.ChangeCollector(...).get_changes().items() (first argument in a
> > tuple). I don't know which encoding this uses.
>
> We decided that for all C APIs which returns char * values, their Python 3
> wrapper functions return them as bytes object, and also we don't convert
> those values into str within helper functions in svn modules. Thus their
> path elements should be bytes object, and as they are internal paths,
> their encoding is UTF-8, regardless of locale.
>
> > Line 663, called with the return from sys.stdin.buffer.readlines(),
> already
> > processed by to_str() once.
> >
> > In particular the call on 663 should be double decoded unless I'm missing
> > something.
>
> Yes, there is inconsistency here.
>
> > So depending on what get_changes() return, maybe we should remove the
> > to_str() from which_groups() instead?
>
> As Config.which_group() call from Commit.__init__() has worked correctly,
> the pathes as matching pattern in which_group() are expected as str.
> So if remove to_str() from which_groups(), path argment for
> Config.which_group() should be a str object.
>
> > Note that we are still somewhat supporting Python 2 so we need to make
> sure
> > any changes does work on Python 2 as well.
>
> Then, how about this patch? It at least mailer-t1.sh passed both
> on python=python2.7 and on python=python3.9.
>
> [[[
> Fix inconsistency in path argment on Config.which_group()
>
> Previously, we call Config.which_group() with path as bytes in
> Commit.__init__() and with path as str in Lock.__init__() and
> in PropChange.__init__(), but Config.which_group handled path
> argment as bytes. To fix it we only use str as path argment on
> Config.wich_group().
>
> * tools/hook-scripts/mailer/mailer.py
>   (Config.which_groups): Treat path as str.
>   (Commit.__init__): convert bytes path into str before calling above.
>
> Found by: Ruediger Pluem (rpluem {_AT_} apache.org)
>
> Index: tools/hook-scripts/mailer/mailer.py
> ===
> --- tools/hook-scripts/mailer/mailer.py (revision 1913728)
> +++ tools/hook-scripts/mailer/mailer.py (working copy)
> @@ -488,7 +488,7 @@
>  # collect the set of groups and the unique sets of params for the
> options
>  self.groups = { }
>  for path, change in self.changelist:
> -  for (group, params) in self.cfg.which_groups(path, log):
> +  for (group, params) in self.cfg.which_groups(to_str(path), log):
>  # turn the params into a hashable object and stash it away
>  param_list = sorted(params.items())
>  # collect the set of paths belonging to this group
> @@ -1486,9 +1486,9 @@
>  "Return the path's associated groups."
>  groups = []
>  for group, pattern, exclude_pattern, repos_params, search_logmsg_re
> in self._group_re:
> -  match = pattern.match(to_str(path))
> +  match = pattern.match(path)
>if match:
> -if exclude_pattern and exclude_pattern.match(to_str(path)):
> +if exclude_pattern and exclude_pattern.match(path):
>continue
>  params = repos_params.copy()
>  params.update(match.groupdict())
> ]]]
>
> Cheers,
> --
> Yasuhito FUTATSUKI 
>

This looks good to me! Thanks for the detailed explaination!

Kind regards,
Daniel Sahlberg


Re: Python 3 compatibility issue in tools/hook-scripts/mailer/mailer.py

2023-12-07 Thread Yasuhito FUTATSUKI



On 2023/12/07 19:33, Daniel Sahlberg wrote:
> Den tors 7 dec. 2023 kl 09:51 skrev Ruediger Pluem :
> 
>> I stumbled accross a Python 3 compatibility issue in
>> tools/hook-scripts/mailer/mailer.py.
>>
>> The call of self.cfg.which_groups in line 565 passes an empty string as
>> first parameter.
>> In which_groups this empty string is passed to to_str in line 1489.
>> In line 88 to_str does x.decode('utf-8')
>> In Python 2 str objects have a decode method at least in later Python 3
>> versions they have not. Hence I propose the following patch which fixes the
>> issue for me:


> There was some work done on mailer.py by Greg Stein, started here:
> https://lists.apache.org/thread/v5bh1j8qz8kk0jv1tlctxqt1k454tz0h
> However I don't think that touched these parts of the code.

Yes, I also think the issue has existed before it.

> I think there is more to this than that simple fix, but I'm no expert in
> Python or in the Python bindings. There are two more calls to
> which_groups():
> 
> Line 491, called with the return from
> svn.repos.ChangeCollector(...).get_changes().items() (first argument in a
> tuple). I don't know which encoding this uses.

We decided that for all C APIs which returns char * values, their Python 3
wrapper functions return them as bytes object, and also we don't convert
those values into str within helper functions in svn modules. Thus their
path elements should be bytes object, and as they are internal paths,
their encoding is UTF-8, regardless of locale.

> Line 663, called with the return from sys.stdin.buffer.readlines(), already
> processed by to_str() once.
> 
> In particular the call on 663 should be double decoded unless I'm missing
> something.

Yes, there is inconsistency here.

> So depending on what get_changes() return, maybe we should remove the
> to_str() from which_groups() instead?

As Config.which_group() call from Commit.__init__() has worked correctly,
the pathes as matching pattern in which_group() are expected as str.
So if remove to_str() from which_groups(), path argment for
Config.which_group() should be a str object.

> Note that we are still somewhat supporting Python 2 so we need to make sure
> any changes does work on Python 2 as well.

Then, how about this patch? It at least mailer-t1.sh passed both
on python=python2.7 and on python=python3.9.

[[[
Fix inconsistency in path argment on Config.which_group()

Previously, we call Config.which_group() with path as bytes in
Commit.__init__() and with path as str in Lock.__init__() and
in PropChange.__init__(), but Config.which_group handled path
argment as bytes. To fix it we only use str as path argment on
Config.wich_group().

* tools/hook-scripts/mailer/mailer.py
  (Config.which_groups): Treat path as str.
  (Commit.__init__): convert bytes path into str before calling above.

Found by: Ruediger Pluem (rpluem {_AT_} apache.org)

Index: tools/hook-scripts/mailer/mailer.py
===
--- tools/hook-scripts/mailer/mailer.py (revision 1913728)
+++ tools/hook-scripts/mailer/mailer.py (working copy)
@@ -488,7 +488,7 @@
 # collect the set of groups and the unique sets of params for the options
 self.groups = { }
 for path, change in self.changelist:
-  for (group, params) in self.cfg.which_groups(path, log):
+  for (group, params) in self.cfg.which_groups(to_str(path), log):
 # turn the params into a hashable object and stash it away
 param_list = sorted(params.items())
 # collect the set of paths belonging to this group
@@ -1486,9 +1486,9 @@
 "Return the path's associated groups."
 groups = []
 for group, pattern, exclude_pattern, repos_params, search_logmsg_re in 
self._group_re:
-  match = pattern.match(to_str(path))
+  match = pattern.match(path)
   if match:
-if exclude_pattern and exclude_pattern.match(to_str(path)):
+if exclude_pattern and exclude_pattern.match(path):
   continue
 params = repos_params.copy()
 params.update(match.groupdict())
]]]

Cheers,
-- 
Yasuhito FUTATSUKI 


Re: Python 3 compatibility issue in tools/hook-scripts/mailer/mailer.py

2023-12-07 Thread Yasuhito FUTATSUKI

Hello,

On 2023/12/07 17:51, Ruediger Pluem wrote:

I stumbled accross a Python 3 compatibility issue in 
tools/hook-scripts/mailer/mailer.py.

The call of self.cfg.which_groups in line 565 passes an empty string as first 
parameter.
In which_groups this empty string is passed to to_str in line 1489.
In line 88 to_str does x.decode('utf-8')
In Python 2 str objects have a decode method at least in later Python 3 
versions they have not. Hence I propose the following patch which fixes the 
issue for me:


Thank you for the report and proposing a patch.

This was caused by my overlook.

I've checked the usage of Config.which_groups(), then it turned out
that only the call from Commits.__init__  passes a path as a bytes,
and calls from Look.__init__ and PropChange.__init__ pass a path
as str. More over, in Config.which_groups(), the path argment is
always converted to str when it is used.

So I prefer changing Config.which_groups to accept str path, and
also changing Commits.__init__ to pass a str path.



Index: mailer.py
===
--- mailer.py   (revision 1914422)
+++ mailer.py   (working copy)
@@ -562,7 +562,7 @@
  
  # collect the set of groups and the unique sets of params for the options

  self.groups = { }
-for (group, params) in self.cfg.which_groups('', None):
+for (group, params) in self.cfg.which_groups(b'', None):
# turn the params into a hashable object and stash it away
param_list = sorted(params.items())
self.groups[group, tuple(param_list)] = params


If I would get a go ahead here on list I would commit.

Regards

Rüdiger


Thanks,
--
Yasuhito FUTATSUKI 


Re: Python 3 compatibility issue in tools/hook-scripts/mailer/mailer.py

2023-12-07 Thread Daniel Sahlberg
Den tors 7 dec. 2023 kl 09:51 skrev Ruediger Pluem :

> I stumbled accross a Python 3 compatibility issue in
> tools/hook-scripts/mailer/mailer.py.
>
> The call of self.cfg.which_groups in line 565 passes an empty string as
> first parameter.
> In which_groups this empty string is passed to to_str in line 1489.
> In line 88 to_str does x.decode('utf-8')
> In Python 2 str objects have a decode method at least in later Python 3
> versions they have not. Hence I propose the following patch which fixes the
> issue for me:
>
> Index: mailer.py
> ===
> --- mailer.py   (revision 1914422)
> +++ mailer.py   (working copy)
> @@ -562,7 +562,7 @@
>
>  # collect the set of groups and the unique sets of params for the
> options
>  self.groups = { }
> -for (group, params) in self.cfg.which_groups('', None):
> +for (group, params) in self.cfg.which_groups(b'', None):
># turn the params into a hashable object and stash it away
>param_list = sorted(params.items())
>self.groups[group, tuple(param_list)] = params
>
>
> If I would get a go ahead here on list I would commit.
>
> Regards
>
> Rüdiger
>

There was some work done on mailer.py by Greg Stein, started here:
https://lists.apache.org/thread/v5bh1j8qz8kk0jv1tlctxqt1k454tz0h
However I don't think that touched these parts of the code.

I think there is more to this than that simple fix, but I'm no expert in
Python or in the Python bindings. There are two more calls to
which_groups():

Line 491, called with the return from
svn.repos.ChangeCollector(...).get_changes().items() (first argument in a
tuple). I don't know which encoding this uses.

Line 663, called with the return from sys.stdin.buffer.readlines(), already
processed by to_str() once.

In particular the call on 663 should be double decoded unless I'm missing
something.

So depending on what get_changes() return, maybe we should remove the
to_str() from which_groups() instead?

Note that we are still somewhat supporting Python 2 so we need to make sure
any changes does work on Python 2 as well.

Kind regards,
Daniel


Python 3 compatibility issue in tools/hook-scripts/mailer/mailer.py

2023-12-07 Thread Ruediger Pluem
I stumbled accross a Python 3 compatibility issue in 
tools/hook-scripts/mailer/mailer.py.

The call of self.cfg.which_groups in line 565 passes an empty string as first 
parameter.
In which_groups this empty string is passed to to_str in line 1489.
In line 88 to_str does x.decode('utf-8')
In Python 2 str objects have a decode method at least in later Python 3 
versions they have not. Hence I propose the following patch which fixes the 
issue for me:

Index: mailer.py
===
--- mailer.py   (revision 1914422)
+++ mailer.py   (working copy)
@@ -562,7 +562,7 @@
 
 # collect the set of groups and the unique sets of params for the options
 self.groups = { }
-for (group, params) in self.cfg.which_groups('', None):
+for (group, params) in self.cfg.which_groups(b'', None):
   # turn the params into a hashable object and stash it away
   param_list = sorted(params.items())
   self.groups[group, tuple(param_list)] = params


If I would get a go ahead here on list I would commit.

Regards

Rüdiger