On Fri, May 15, 2026 at 4:02 PM Jun Omae <[email protected]> wrote:
>
> On 2026/05/15 0:28, Timofei Zhakov wrote:
> > There is a test called basic_tests.py:argv_with_best_fit_chars. It
> > checks that svn rejects Unicode symbols. Functionality which was
> > illegal before changes introduced in that branch.
>
> In the branch, svn command receives the arguments as utf-8 bytes, but the
> output of the pipe is applied best-fit encoding conversion.
>
> [[[
> diff --git a/subversion/tests/cmdline/basic_tests.py b/subversion/tests/cmdline/basic_tests.py
> index 88f43bfae7..edb697b795 100755
> --- a/subversion/tests/cmdline/basic_tests.py
> +++ b/subversion/tests/cmdline/basic_tests.py
> @@ -3357,20 +3357,22 @@ def argv_with_best_fit_chars(sbox):
>        yield chr(c), mbcs
>
>    count = 0
> -  # E721113: Conversion from UTF-16 failed: No mapping for the Unicode
> -  # character exists in the target multi-byte code page.
> -  expected_stderr = 'svn: E721113: '
> +  # The argument is received as utf-8 bytes, but the output to the pipe
> +  # is applied best-fit encoding conversion.
>    for wc, mbcs in iter_bestfit_chars():
>      count += 1
>      logger.info('Code page %r - U+%04x -> 0x%s', codepage, ord(wc), mbcs.hex())
>      if mbcs == b'"':
> -      svntest.actions.run_and_verify_svn2(None, expected_stderr, 1, 'help',
> +      expected_stderr = r'^"foo" "bar": unknown command'
> +      svntest.actions.run_and_verify_svn2(None, expected_stderr, 0, 'help',
>                                            'foo{0} {0}bar'.format(wc))
>      elif mbcs == b'\\':
> -      svntest.actions.run_and_verify_svn2(None, expected_stderr, 1, 'help',
> +      expected_stderr = r'^"foo\\" \\"bar": unknown command'
> +      svntest.actions.run_and_verify_svn2(None, expected_stderr, 0, 'help',
>                                            'foo{0}" {0}"bar'.format(wc))
>      elif mbcs == b' ':
> -      svntest.actions.run_and_verify_svn2(None, expected_stderr, 1, 'help',
> +      expected_stderr = r'^"foo bar": unknown command'
> +      svntest.actions.run_and_verify_svn2(None, expected_stderr, 0, 'help',
>                                            'foo{0}bar'.format(wc))
>    if count == 0:
>      raise svntest.Skip('No best fit characters in code page %r' % codepage)
> ]]]

I tested this patch and can confirm that it works. I don't know why, but
as far as I remember I tried exactly the same thing and it didn't work
for me. I once heard that "everything looks like physics if you don't
know magic". That's exactly the case here. Sometimes we just need a pair
of fresh eyes. :-)

+1 for the changes.

> Recently, I'm trying 1.14.x with utf-8 code page using activeCodePage
> manifest [1]. It almost works fine (e.g. add emoji filenames and checkout,
> ...) however output to stderr is garbled and not fixed yet.
>
> [1]
> https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page

That sounds interesting. If, as you say, output is converted to the local
encoding, that introduces a lot of inconsistency, and yeah, we get no
emojis. Since the encoding is UTF-8 on the vast majority of Unix systems,
I think it makes a lot of sense to take the same approach on Windows.

-- 
Timofei Zhakov
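
For illustration only, here is a minimal Python sketch of why converting
output to a legacy code page loses characters such as emoji, while UTF-8
round-trips them. The helper name to_code_page and the choice of cp1252 are
assumptions made for this example, not anything taken from Subversion. Note
also that Python's "replace" error handler substitutes '?', which is not the
same thing as the Windows best-fit mapping discussed above; it only shows
the lossy direction of the conversion.

```python
def to_code_page(text, codepage="cp1252"):
    """Encode text for a legacy code page (hypothetical helper).

    Characters that have no mapping in the code page are replaced
    with '?', so the conversion is lossy.
    """
    return text.encode(codepage, errors="replace")


name = "emoji-\U0001F600.txt"

# Legacy code page: the emoji has no mapping and is replaced.
legacy = to_code_page(name)
print(legacy)

# UTF-8: every Unicode character is representable, so it round-trips.
utf8 = name.encode("utf-8")
print(utf8.decode("utf-8") == name)
```

With UTF-8 as the output encoding on every platform, what svn prints would
not depend on which code page happens to be active.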

