When trying to convert org.texi to PO I get the following error:

Complex regular subexpression recursion limit (32766) exceeded at 
/opt/local/lib/perl5/5.26/Locale/Po4a/TeX.pm line 697.

This happens somewhere between lines 11180 and 11200 of the file, and nothing 
unusual in that part of the file seems to trigger it.

Checking the error on the web, I found that it is possibly related to using * 
in a regex:

https://metacpan.org/pod/XML::Easy::Syntax

> BUGS
> 
> Many of these regular expressions are liable to tickle a serious bug in 
> perl's regexp engine. The bug is that the * and + repeat operators don't 
> always match an unlimited number of repeats: in some cases they are limited 
> to 32767 iterations. Whether this bogus limit applies depends on the 
> complexity of the expression being repeated, whether the string being 
> examined is internally encoded in UTF-8, and the version of perl. In some 
> cases, but not all, a false match failure is preceded by a warning "Complex 
> regular subexpression recursion limit (32766) exceeded".
> 
> This bug is present, in various forms, in all perl versions up to at least 
> 5.8.9 and 5.10.0. Pre-5.10 perls may also overflow their stack space, in 
> similar circumstances, if a resource limit is imposed.
> 
> There is no known feasible workaround for this perl bug. The regular 
> expressions supplied by this module will therefore, unavoidably, fail to 
> accept some lengthy valid inputs. Where this occurs, though, it is likely 
> that other regular expressions being applied to the same or related input 
> will also suffer the same problem. It is pervasive. Do not rely on this 
> module (or perl) to process long inputs on affected perl versions.

Line 697 of TeX.pm is:

    # detect \begin and \end (if they are not commented)
    if ($buffer =~ /^((?:.*?\n)?                # $1 is
                      (?:[^%]                   # either not a %
                        |                       # or
                        (?<!\\)(?:\\\\)*\\%)*?  # a % preceded by an odd nb of \
                     )                          # $2 is a \begin{ with the end of the line
                      (${RE_ESCAPE}(?:begin|end)\{.*)$/sx


If this regex could be simplified, maybe the issue would go away...
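
One possible direction, as a rough sketch only: instead of one regex whose lazy 
(...)*? group has to step over every character before the match, search for 
the \begin{ or \end{ candidates directly and then check only the current line 
for an unescaped %. The helper name split_at_env and the idea of passing the 
escape pattern as an argument (standing in for ${RE_ESCAPE}) are my own 
assumptions, not po4a code, and this has not been run against po4a's test 
suite.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of po4a): split $buffer at the first
# uncommented <escape>begin{ or <escape>end{.
# $re_escape is a regex fragment for the escape character; it stands in
# for po4a's ${RE_ESCAPE} and is an assumption of this sketch.
# Returns (text_before, text_from_match_on), or an empty list if none found.
sub split_at_env {
    my ($buffer, $re_escape) = @_;

    # No quantifier over the whole buffer here, only a plain search.
    while ($buffer =~ /${re_escape}(?:begin|end)\{/g) {
        my $start = $-[0];    # offset where this candidate begins

        # Start of the line that contains the candidate.
        my $line_start = $start ? rindex($buffer, "\n", $start - 1) + 1 : 0;
        my $prefix = substr($buffer, $line_start, $start - $line_start);

        # A % preceded by an even number of backslashes (zero included)
        # is unescaped, so the candidate sits inside a comment: skip it.
        next if $prefix =~ /(?<!\\)(?:\\\\)*%/;

        return (substr($buffer, 0, $start), substr($buffer, $start));
    }
    return;
}

# Usage: the first \begin is commented out, the second one is not.
my $sample = "text % \\begin{a}\nmore\n\\begin{b} tail\n";
my ($before, $rest) = split_at_env($sample, '\\\\');
print "before=[$before]\nrest=[$rest]\n" if defined $rest;
```

The only repeated group left is applied to the part of a single line that 
precedes the candidate, so its repeat count is bounded by that line's length 
rather than by the size of the buffer. That should keep it well away from the 
32766 limit for a file like org.texi, although I have not verified that it 
matches the original regex in every corner case.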

org.texi is the only file in the emacs distribution that chokes on this regex.

-- 
Jean-Christophe Helary @brandelune
https://mac4translators.blogspot.com
https://sr.ht/~brandelune/omegat-as-a-book/