Dan Kogai <[EMAIL PROTECTED]> writes:
> Since Gisle's patch makes use of utf8n_to_uvuni(), it seems to be a
> problem of perl core. So I have checked utf8.c which defines that.
> Seems like it does not make use of PERL_UNICODE_MAX.
>
> The patch against utf8.c fixes that.
It also seems like a good idea to have a workaround in Encode for this,
so that perls without the utf8.c fix are covered as well.
Index: users/gisle/hacks/Encode/Encode.xs
--- Encode/Encode.xs.~1~ Mon Dec 6 10:44:31 2004
+++ Encode/Encode.xs Mon Dec 6 10:44:31 2004
@@ -300,6 +300,10 @@
UTF8_CHECK_ONLY | (strict ? UTF8_ALLOW_STRICT :
UTF8_ALLOW_NONSTRICT)
);
+#if 1 /* perl-5.8.6 and older do not check UTF8_ALLOW_LONG */
+ if (strict && uv > PERL_UNICODE_MAX)
+ ulen = -1;
+#endif
if (ulen == -1) {
if (strict) {
uv = utf8n_to_uvuni(s, e - s, &ulen,
End of Patch.
> --- perl-5.8.x/utf8.c Wed Nov 17 23:11:04 2004
> +++ perl-5.8.x.dan/utf8.c Sun Dec 5 11:38:52 2004
> @@ -429,6 +429,13 @@
> }
> else
> uv = UTF8_ACCUMULATE(uv, *s);
> + /* Checks if ord() > 0x10FFFF -- dankogai */
> + if (uv > PERL_UNICODE_MAX){
> + if (!(flags & UTF8_ALLOW_LONG)) {
> + warning = UTF8_WARN_LONG;
> + goto malformed;
> + }
> + }
> if (!(uv > ouv)) {
> /* These cannot be allowed. */
> if (uv == ouv) {
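
For readers without the perl source at hand, the range check that both patches add can be sketched in isolation. This is only an illustration under stated assumptions, not the code in utf8.c or Encode.xs: the names SKETCH_UNICODE_MAX and sketch_decode_strict() are hypothetical, the decoder handles only 1- to 4-byte sequences, and it does not reject surrogates. The point is the last test, which rejects any accumulated value above 0x10FFFF, the same comparison the patches make against PERL_UNICODE_MAX.

```c
#include <stddef.h>

/* Highest Unicode code point. Stands in for perl's PERL_UNICODE_MAX;
 * the name here is hypothetical. */
#define SKETCH_UNICODE_MAX 0x10FFFFUL

/* Hypothetical helper, not part of perl or Encode.
 * Decodes one UTF-8 sequence from s (at most len bytes) into *out.
 * Returns the number of bytes consumed, or -1 if the sequence is
 * malformed, overlong, or decodes to a value above SKETCH_UNICODE_MAX.
 * Surrogates are not checked in this sketch. */
static int sketch_decode_strict(const unsigned char *s, size_t len,
                                unsigned long *out)
{
    unsigned long uv, min;
    int need, i;

    if (len == 0)
        return -1;

    if (s[0] < 0x80) {                      /* plain ASCII */
        *out = s[0];
        return 1;
    }
    else if ((s[0] & 0xE0) == 0xC0) { need = 2; uv = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { need = 3; uv = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { need = 4; uv = s[0] & 0x07; min = 0x10000; }
    else
        return -1;                          /* stray continuation or 5+ byte lead */

    if (len < (size_t)need)
        return -1;                          /* truncated sequence */

    for (i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return -1;                      /* not a continuation byte */
        uv = (uv << 6) | (unsigned long)(s[i] & 0x3F);
    }

    if (uv < min)
        return -1;                          /* overlong encoding */

    if (uv > SKETCH_UNICODE_MAX)
        return -1;                          /* the check the patches add */

    *out = uv;
    return need;
}
```

With this sketch, the bytes F4 8F BF BF (U+10FFFF) are accepted, while F4 90 80 80, which would decode to 0x110000, are rejected even though they are structurally well-formed UTF-8.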