Package: perl
Version: 5.24.1-3
Severity: normal
Tags: upstream

Dear Maintainer,

In some cases, valid UTF-8 Chinese (or Japanese Kanji) characters in a Perl string make perl die with "Malformed UTF-8" while matching a regexp.

Here is the smallest program (all in ASCII, for safety) that reproduces the problem.

#!/usr/bin/perl
use strict;
use warnings;

my $text = "[quant,_1,\x{55b6}\x{696d}\x{65e5},\x{55b6}\x{696d}\x{65e5}]\x{6bce}";
eval {
    $text =~ s{((?<!~)(?:~~)*)\[([A-Za-z#*]\w*)(?:,([^\]]+))?\]}{"$1%$2($3)"}eg;
};
if ( $@ ) {
    die "Failed $@";
} else {
    print "Works, for now\n";
}

The very same text, with the very same regexp, did not cause problems on the previous versions of perl (5.20.*, 5.22.*). We use that text and that regexp in a production environment on Debian stable, and everything runs fine.

Beware: it *often* crashes, not always. When you add entropy (use-ing more modules, printing, etc.), the rate can drop to about one crash in every two runs of the program. This precise test script seems to crash every time I run it in my development environment.

Regards,
Benjamin.

-- System Information:
Debian Release: 9.0
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'testing'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.12-1-amd64 (SMP w/3 CPU cores)
Locale: LANG=en_US.ISO-8859-15, LC_CTYPE=en_US.ISO-8859-15 (charmap=ISO-8859-15), LANGUAGE=en_US.ISO-8859-15 (charmap=ISO-8859-15)
Shell: /bin/sh linked to /bin/bash
Init: systemd (via /run/systemd/system)

Versions of packages perl depends on:
ii  dpkg               1.18.24
ii  libperl5.24        5.24.1-3
ii  perl-base          5.24.1-3
ii  perl-modules-5.24  5.24.1-3

Versions of packages perl recommends:
ii  netbase  5.4
ii  rename   0.20-4

Versions of packages perl suggests:
ii  libterm-readline-gnu-perl   1.35-1
ii  libterm-readline-perl-perl  1.0303-1
ii  make                        4.1-9.1
ii  perl-doc                    5.24.1-3

-- no debconf information