Re: The failing test in utf8-cmdline

Timofei Zhakov Sat, 16 May 2026 06:08:13 -0700

On Fri, May 15, 2026 at 4:02 PM Jun Omae wrote:
>
> On 2026/05/15 0:28, Timofei Zhakov wrote:
> > There is a test called basic_tests.py:argv_with_best_fit_chars. It
> > checks that svn rejects Unicode symbols. Functionality which was
> > illegal before changes introduced in that branch.
>
> In the branch, svn command receives the arguments as utf-8 bytes, but the
> output of the pipe is applied best-fit encoding conversion.
>
> [[[
> diff --git a/subversion/tests/cmdline/basic_tests.py b/subversion/tests/cmdline/basic_tests.py
> index 88f43bfae7..edb697b795 100755
> --- a/subversion/tests/cmdline/basic_tests.py
> +++ b/subversion/tests/cmdline/basic_tests.py
> @@ -3357,20 +3357,22 @@ def argv_with_best_fit_chars(sbox):
>        yield chr(c), mbcs
>
>    count = 0
> -  # E721113: Conversion from UTF-16 failed: No mapping for the Unicode
> -  # character exists in the target multi-byte code page.
> -  expected_stderr = 'svn: E721113: '
> +  # The argument is received as utf-8 bytes, but the output to the pipe
> +  # is applied best-fit encoding conversion.
>    for wc, mbcs in iter_bestfit_chars():
>      count += 1
>      logger.info('Code page %r - U+%04x -> 0x%s', codepage, ord(wc), mbcs.hex())
>      if mbcs == b'"':
> -      svntest.actions.run_and_verify_svn2(None, expected_stderr, 1, 'help',
> +      expected_stderr = r'^"foo" "bar": unknown command'
> +      svntest.actions.run_and_verify_svn2(None, expected_stderr, 0, 'help',
>                                            'foo{0} {0}bar'.format(wc))
>      elif mbcs == b'\\':
> -      svntest.actions.run_and_verify_svn2(None, expected_stderr, 1, 'help',
> +      expected_stderr = r'^"foo\\" \\"bar": unknown command'
> +      svntest.actions.run_and_verify_svn2(None, expected_stderr, 0, 'help',
>                                            'foo{0}" {0}"bar'.format(wc))
>      elif mbcs == b' ':
> -      svntest.actions.run_and_verify_svn2(None, expected_stderr, 1, 'help',
> +      expected_stderr = r'^"foo bar": unknown command'
> +      svntest.actions.run_and_verify_svn2(None, expected_stderr, 0, 'help',
>                                            'foo{0}bar'.format(wc))
>    if count == 0:
>      raise svntest.Skip('No best fit characters in code page %r' % codepage)
> ]]]


I tested this patch and can confirm that it works. I'm not sure why my
own attempt failed: as far as I remember, I tried exactly the same
change, but it didn't work for me.
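
For anyone following along, here is a rough Python sketch of the
behaviour the updated test expects: the argument survives as UTF-8 on
the way in, and only the text written to the pipe is folded to its
best-fit character. Note that approx_best_fit() is a hypothetical
helper, and Unicode NFKC normalization is only my stand-in for the
real Windows best-fit tables, which differ per code page.

```python
import unicodedata


def approx_best_fit(text):
    # Hypothetical stand-in for Windows best-fit conversion.
    # Assumption: NFKC folding happens to agree with best-fit for
    # fullwidth forms; the real mapping depends on the code page.
    return unicodedata.normalize('NFKC', text)


wc = '\uff02'                          # FULLWIDTH QUOTATION MARK
arg = 'foo{0} {0}bar'.format(wc)       # same shape as in the test

# Input side: the argument arrives as UTF-8 bytes, so nothing is lost.
utf8_bytes = arg.encode('utf-8')
received = utf8_bytes.decode('utf-8')

# Output side: the message is folded only when written to the pipe.
printed = approx_best_fit(received)

print(repr(received))
print(repr(printed))
```

The point is that the fullwidth quote is still there in what svn
receives, and it only turns into a plain '"' in the captured output,
which is why the expected stderr changes from an error to the
"unknown command" message.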

I remember I once heard that "everything looks like physics if you
don't know magic". That's exactly the case. Sometimes we just need a
pair of fresh eyes. :-)

+1 for the changes

> Recently, I'm trying 1.14.x with utf-8 code page using activeCodePage
> manifest [1]. It almost works fine (e.g. add emoji filenames and checkout,
> ...) however output to stderr is garbled and not fixed yet.
>
> [1] https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page

That sounds interesting. If, as you say, the output is converted to
the local encoding, that introduces a lot of inconsistency, and yes,
emojis get lost in the output.
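
I don't know the actual cause of the garbling you see, so take this
only as an illustration of one general way it can happen: the program
writes UTF-8 bytes, but whatever displays them decodes with a legacy
code page. The choice of cp437 and the sample message are assumptions
made up for the sketch, not a diagnosis.

```python
# Assumption: the writer emits UTF-8 while the reader decodes with a
# legacy code page (cp437 is picked here only as an example).
original = 'svn: warning: \U0001F600.txt'   # made-up message with an emoji

written = original.encode('utf-8')          # bytes the program writes
shown = written.decode('cp437')             # what a mismatched reader displays

# The bytes themselves are intact, so reversing the wrong decoding
# recovers the original text.
recovered = shown.encode('cp437').decode('utf-8')

print(shown)
print(recovered)
```

If that is what is going on, the data is fine and only the decoding
on the reading side is wrong.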

Since the encoding is UTF-8 on the vast majority of Unix systems, I
think it makes a lot of sense to take the same approach on Windows.

-- 
Timofei Zhakov
