Re: encoding neutral unpack

Ton Hospel Sat, 29 Jan 2005 05:07:48 -0800

If you use U0 or C0 mode switches in an unpack format like s(nU0v)2S,
the n is in a rather weird state: it is in C0 mode on the first round
and in U0 mode on later ones.


And in a format like C/(U0)stuff it depends on the value of the count
whether U0 mode will apply to stuff or not (with a count of 0 the group
is never executed, so the switch never happens).

Both of these convinced me that U0/C0 mode should really be scoped
to the enclosing () (making it feel more like a compile-time pragma than
a runtime one). The following patch makes it so.

Before:
perl -wle 'print for unpack("a(U0)U", "b\341\277\274")'
b
8188

After:
./perl -Ilib -wle 'print for unpack("a(U0)U", "b\341\277\274")'
b
225
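
For reference, the two numbers above are what you get from reading the
bytes \341\277\274 as one 3-byte UTF-8 sequence (8188, i.e. U+1FFC)
versus taking the first byte on its own (225, i.e. 0xE1). A minimal
sketch of that arithmetic in C follows; decode_utf8_3 is a hypothetical
helper written for this illustration, not something from pp_pack.c, and
it does no validation of the input.

```c
/* Hypothetical helper, for illustration only: decode one 3-byte UTF-8
 * sequence (lead byte 1110xxxx, two continuation bytes 10xxxxxx) into
 * a code point.  No validation is performed. */
static unsigned long decode_utf8_3(const unsigned char *s)
{
    return ((unsigned long)(s[0] & 0x0F) << 12)   /* 4 payload bits */
         | ((unsigned long)(s[1] & 0x3F) << 6)    /* 6 payload bits */
         |  (unsigned long)(s[2] & 0x3F);         /* 6 payload bits */
}
```

Called on the bytes { 0341, 0277, 0274 } this gives 0x1000 + 0xFC0 + 0x3C
= 0x1FFC = 8188, while the first byte alone is 0341 = 225.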

Patch is relative to 5.8.6:

--- pp_pack.c.old       Sat Jan 29 13:26:27 2005
+++ pp_pack.c   Sat Jan 29 14:05:27 2005
@@ -618,6 +618,10 @@
            while (len--) {
                symptr->patptr = savsym.grpbeg;
                unpack_rec(symptr, ss, strbeg, strend, &ss );
+                if (savsym.flags & FLAG_UNPACK_DO_UTF8)
+                    symptr->flags |=  FLAG_UNPACK_DO_UTF8;
+                else
+                    symptr->flags &= ~FLAG_UNPACK_DO_UTF8;
                 if (ss == strend && savsym.howlen == e_star)
                    break; /* No way to continue */
            }
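
The idea of the patch, stripped of the perl internals, can be sketched
as below. This is only an illustration under assumed names: state_t,
MODE_UTF8, run_group_body and run_group are all made up for this sketch
and stand in for the real tempsym_t, FLAG_UNPACK_DO_UTF8, unpack_rec and
the group loop. The point it shows is that the mode bit is put back to
its saved value after every pass through the group, so a switch inside
the () can leak neither into the next round nor into the rest of the
format.

```c
#define MODE_UTF8 0x08u            /* hypothetical mode bit */

typedef struct {
    unsigned flags;                /* hypothetical parser state */
} state_t;

/* Stand-in for the group body: it may switch the mode on or off,
 * the way a U0 or C0 inside the group would. */
static void run_group_body(state_t *st, int switch_on)
{
    if (switch_on)
        st->flags |= MODE_UTF8;
    else
        st->flags &= ~MODE_UTF8;
}

/* Run the group `count` times, restoring the mode bit after each pass. */
static void run_group(state_t *st, int count, int switch_on)
{
    const unsigned saved = st->flags;   /* state on entry to the group */

    while (count-- > 0) {
        run_group_body(st, switch_on);
        /* Restore only the mode bit; leave any other flags alone. */
        if (saved & MODE_UTF8)
            st->flags |= MODE_UTF8;
        else
            st->flags &= ~MODE_UTF8;
    }
}
```

With this shape every round starts in the same mode, and a count of 0
leaves the state exactly as it would be after any other count.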
