On 23 December 2015 at 18:19, Eli Zaretskii <[email protected]> wrote:
>> If you had any test files where characters are disappearing, it would
>> be interesting if I could see them.
>
> I have shown one case here:
>
> http://lists.gnu.org/archive/html/bug-wget/2015-12/msg00110.html
>
> Convert the %NN URL-encoding into 8-bit bytes, and convert the result
> from CP1255 to UTF-8 -- the last character, ה, will disappear if you
> don't call 'iconv' with NULL arguments.

Here's a test file that demonstrates the problem. In the second line, an extra character is displayed as part of the cross-reference, likely because of this problem. (The test file is in CP1255, and I viewed it in a UTF-8 terminal. The text direction is also wrong, but that's beside the point.)
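
For reference, here is a minimal sketch of the flush step Eli describes, assuming an iconv implementation that supports the "CP1255" encoding name. The helper name cp1255_to_utf8 is hypothetical and is not taken from any existing code; the sketch also skips the %NN decoding and only shows the conversion of raw bytes.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper (not from any real code base): convert INLEN bytes
   of CP1255 text at IN to UTF-8, writing at most OUTSIZE bytes to OUT.
   Returns the number of bytes written, or (size_t) -1 on error. */
size_t
cp1255_to_utf8 (const char *in, size_t inlen, char *out, size_t outsize)
{
  iconv_t cd = iconv_open ("UTF-8", "CP1255");
  if (cd == (iconv_t) -1)
    return (size_t) -1;

  /* iconv takes a char ** for the input pointer; the cast drops const
     only to match that prototype. */
  char *inptr = (char *) in;
  char *outptr = out;
  size_t inleft = inlen;
  size_t outleft = outsize;
  size_t result = (size_t) -1;

  if (iconv (cd, &inptr, &inleft, &outptr, &outleft) != (size_t) -1
      /* Flush: a converter may hold back the last character in case a
         combining character follows.  Calling iconv with a NULL input
         buffer asks it to write out whatever it is still holding. */
      && iconv (cd, NULL, NULL, &outptr, &outleft) != (size_t) -1)
    result = outsize - outleft;

  iconv_close (cd);
  return result;
}
```

Calling this on the single CP1255 byte 0xE4 (ה) should give the two UTF-8 bytes 0xD7 0x94. Without the second iconv call, an implementation that buffers the final character may report zero bytes written instead.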
Attachment: cp1255.info (binary data)
