---------- Forwarded Message ---------
From: Eugene Roshal <ros...@rarlab.com>
Subject: Re: Fwd: Bug#948108: unrar corrupts filenames given as arguments
Date: Jan 4 2020, at 8:35 am
To: Martin Meredith <mar...@sourceguru.net>

Hello,
RAR expects source parameters in the local encoding, but converts
them to wchar_t with the CharToWide function and uses wchar_t almost
everywhere internally.

RAR has a feature that allows it to archive and extract names that do
not belong to the current locale, such as extended ASCII names where
UTF-8 is expected.

When the RAR CharToWide function encounters a name that mbsrtowcs
cannot convert correctly, it calls CharToWideMap to perform a per-byte
conversion and sets a special flag (the 0xFFFE noncharacter) to tell
WideToChar to apply the per-byte decoding in WideToCharMap to that name.
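
The fallback described above can be modelled roughly as follows. This is only an illustrative Python sketch, not RAR's actual C++ code: the function names `char_to_wide` and `wide_to_char`, the 0xE000 private-use offset, and the placement of the flag at the start of the string are all assumptions made for the example. Only the 0xFFFE flag and the per-byte mapping idea come from the message.

```python
MARK = "\ufffe"    # noncharacter used as the "mapped name" flag (from the message)
MAP_BASE = 0xE000  # assumption: private-use offset chosen for this sketch


def char_to_wide(raw: bytes) -> str:
    """Decode as UTF-8; if that fails, fall back to a flagged per-byte mapping."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Per-byte fallback: ASCII bytes stay as they are, high bytes are
        # shifted into the private-use area so they can be recovered later.
        mapped = "".join(chr(b) if b < 0x80 else chr(MAP_BASE + b) for b in raw)
        return MARK + mapped  # assumption: flag placed at the start


def wide_to_char(text: str) -> bytes:
    """Reverse char_to_wide, honouring the flag if it is present."""
    if not text.startswith(MARK):
        return text.encode("utf-8")
    out = bytearray()
    for ch in text[1:]:
        cp = ord(ch)
        # Undo the private-use shift for mapped bytes; ASCII passes through.
        out.append(cp - MAP_BASE if cp >= MAP_BASE else cp)
    return bytes(out)
```

In this model, a name that is valid in the assumed UTF-8 locale is decoded normally, while a name such as b"x\x92.rar" takes the flagged path and still round-trips back to the original bytes.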

While this is intended for names read from and saved to an archive,
here it is applied to a command-line parameter, so the 0xFFFE flag
and the per-byte conversion become visible on the screen and produce
the mangled name.

Since the source "x\x92.rar" is 7 bytes long, RAR allocates a 7-byte
output buffer for the converted wchar_t string. The CharToWideMap output
is longer than that because it includes the special flag,
so RAR successfully truncates the output to the buffer size.
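
The length arithmetic can be illustrated with a small Python model. It is a sketch under stated assumptions rather than RAR's real code: it assumes the "7" counts the six name bytes plus a terminating NUL, that the buffer holds that many wide characters, and that the mapped form is the flag followed by one character per source byte (with 0xE092 as a hypothetical mapped value for byte 0x92). The helper `store_in_buffer` is invented for the example.

```python
MARK = "\ufffe"  # flag noncharacter named in the message


def store_in_buffer(wide: str, capacity: int) -> str:
    """Model a fixed buffer of `capacity` wide characters, one reserved for NUL."""
    return wide[: max(capacity - 1, 0)]


source = b"x\x92.rar"
capacity = len(source) + 1  # assumption: 6 name bytes + NUL = 7

# Hypothetical mapped form: the flag, then one character per source byte.
mapped = MARK + "x\ue092.rar"

# The flag makes the mapped name one character too long for the buffer,
# so the final character is dropped instead of overflowing.
stored = store_in_buffer(mapped, capacity)
```

Under these assumptions the stored name loses its last character, which matches the kind of truncated, mangled output described here.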

While such a conversion of a source parameter is useless, it is harmless
as well. The truncation is a good sign: it indicates that RAR cares about
buffer size and prevents buffer overflow. The mangled name in the output
is the result of garbage in the input instead of the expected local encoding.

So there is no reason to worry, in my opinion.

> Obviously, unrar should not mangle filenames, as filenames are
> octet-strings, not locale-encoded.

Normally RAR expects locale-encoded names here.

Eugene
