On 2021-03-01 08:06, Brian Inglis wrote:
On 2021-03-01 04:17, John Vincent via Cygwin wrote:
I'm running cygwin on Windows 10, using UTF8 in English. I run cygwin bash inside a cygwin mintty terminal. I've noticed a minor problem when using cygstart with wildcard parameters.
I type:
    $ cygstart *.??p
If there is a matching file then everything works as I expect. However if
there is no matching file I get an error message as follows:
Unable to start '.p': The specified file was not found.
When I look at this using the "od" command I see the following:
$ cygstart *.??p 2>&1 | od -tx1 -c
0000000  55  6e  61  62  6c  65  20  74  6f  20  73  74  61  72  74  20
          U   n   a   b   l   e       t   o       s   t   a   r   t
0000020  27  ef  80  aa  2e  ef  80  bf  ef  80  bf  70  27  3a  20  54
          ' 357 200 252   . 357 200 277 357 200 277   p   '   :       T
0000040  68  65  20  73  70  65  63  69  66  69  65  64  20  66  69  6c
          h   e       s   p   e   c   i   f   i   e   d       f   i   l
0000060  65  20  77  61  73  20  6e  6f  74  20  66  6f  75  6e  64  2e
          e       w   a   s       n   o   t       f   o   u   n   d   .
0000100  0a
         \n
It looks to me like cygstart is not outputting the correct UTF-8 for either
the * character or the ? character. I think this is a bug.
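To support POSIX path names, Cygwin allows any character other than \0 and / in a file name, so it maps the characters that Windows reserves into the Unicode BMP Private Use Area (PUA), at U+F000 plus the character code. Those code points then appear UTF-8 encoded in the output: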

https://cygwin.com/cygwin-ug-net/using-specialnames.html#pathnames-specialchars

http://www.unicode.org/faq/private_use.html

https://en.wikipedia.org/wiki/Private_Use_Areas

Cygwin may also prefix characters that the current code page cannot represent with CAN (0x18).

The bug is that the error message displays the remapped string, with its undisplayable PUA characters, rather than either the reverse-mapped string or the original input path name.
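
A minimal sketch of the reverse mapping such a message could apply, written in Python for illustration only and not taken from cygstart's source. The helper name unmap_special and the set of remapped characters are assumptions; the set is my reading of the user guide section linked above. The input bytes are the ones between the quotes in the od dump:

```python
# Hypothetical helper, not cygstart code: undo the U+F0xx remapping.
PUA_BASE = 0xF000

# Assumed set of characters Cygwin remaps (from the user guide section
# linked above); adjust if the real list differs.
SPECIAL = set('"*:<>?|')


def unmap_special(text: str) -> str:
    """Turn U+F000+c back into c for each remapped special character."""
    out = []
    for ch in text:
        code = ord(ch) - PUA_BASE
        if 0 <= code < 0x80 and chr(code) in SPECIAL:
            out.append(chr(code))
        else:
            out.append(ch)
    return "".join(out)


# Bytes between the quotes in the od dump of the error message.
raw = bytes.fromhex("ef80aa2eef80bfef80bf70")
print(unmap_special(raw.decode("utf-8")))  # prints *.??p
```

Characters outside the assumed set pass through unchanged, so ordinary file names are not affected.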

As above and:

$ cygstart ?*?.log
Unable to start '.log': The specified file was not found.
$ cygstart ?*?.log |& xxd -g1
00000000: 55 6e 61 62 6c 65 20 74 6f 20 73 74 61 72 74 20  Unable to start
00000010: 27 ef 80 bf ef 80 aa ef 80 bf 2e 6c 6f 67 27 3a  '..........log':
00000020: 20 54 68 65 20 73 70 65 63 69 66 69 65 64 20 66   The specified f
00000030: 69 6c 65 20 77 61 73 20 6e 6f 74 20 66 6f 75 6e  ile was not foun
00000040: 64 2e 0a                                         d..

?*? 0x3f 0x2a 0x3f --> U+F03F U+F02A U+F03F
--> UTF-8 0xef 0x80 0xbf 0xef 0x80 0xaa 0xef 0x80 0xbf
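
The arithmetic above can be checked with a short Python sketch. It is an illustration under assumptions, not Cygwin's implementation: the name map_special is hypothetical, and the set of remapped characters is my reading of the user guide section linked earlier.

```python
# Hypothetical illustration of the forward mapping; not Cygwin code.
PUA_BASE = 0xF000

# Assumed set of characters that get remapped into the PUA.
SPECIAL = set('"*:<>?|')


def map_special(name: str) -> str:
    """Move each special character c to code point U+F000 + c."""
    return "".join(
        chr(PUA_BASE + ord(c)) if c in SPECIAL else c for c in name
    )


encoded = map_special("?*?.log").encode("utf-8")
print(encoded.hex(" "))
# prints: ef 80 bf ef 80 aa ef 80 bf 2e 6c 6f 67
```

The printed bytes match the ones between the quotes in the xxd dump.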

--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
[Data in binary units and prefixes, physical quantities in SI.]
