On Wed, Apr 22, 2009 at 02:10:03AM +0100, Ben Morrow wrote:
> Quoth [email protected] (Peter Pentchev):
> > On Tue, Apr 21, 2009 at 10:24:10AM +0100, Paul LeoNerd Evans wrote:
> > > I find lately I've been doing a lot of binary protocol work, taking
> > > messages that live in TCP streams or files or similar, and doing lots of
> > > pack()/unpack() on them.
> > [snip]
> > > 
> > > Is there some neater way to do this? Can I either:
> > > 
> > >  a: Get unpack() to consume bytes from the LVALUE
> > > 
> > >  b: Ask unpack() how much it ate in a generic fashion?
> > 
> > Brief answer:
> > - it's possible by patching the Perl source, see the last paragraph
> >   for an explanation about a possible patch;
> > - it could be done as an external module that must be kept in sync
> >   with Perl's pp_pack.c.
> 
> What am I missing here? It appears to me unpacking "." gives the current
> byte position in the string, which is what is needed.
> 
>     ~% perl -E'my $bin = pack "NN", 10, 12; say for unpack ".N.N.", $bin'
>     0
>     10
>     4
>     12
>     8

Oh... thanks!  Yes, it seems to be already implemented :)
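
On a perl where "." is accepted (5.10.0 here), Paul's option (b) could
be wrapped up roughly as below.  This is only a minimal sketch:
unpack_consume is a made-up helper name, not an existing function, and
it assumes the buffer is a plain byte string (not UTF-8 upgraded), so
that the position reported by "." is a byte count.

```perl
use strict;
use warnings;

# Hypothetical helper: unpack $template from the front of the buffer
# referenced by $bufref and remove the bytes that unpack() consumed.
# Assumes a perl whose unpack() understands "." (5.10.0 does).
sub unpack_consume {
    my ($template, $bufref) = @_;

    # A trailing "." makes unpack() return the offset it has reached.
    my @fields = unpack("$template .", $$bufref);
    my $used   = pop @fields;

    # Four-argument substr() replaces the consumed prefix with ''.
    substr($$bufref, 0, $used, '');

    return @fields;
}

my $buf  = pack("NN", 10, 12) . "rest";
my @vals = unpack_consume("NN", \$buf);

print "@vals\n";    # 10 12
print "$buf\n";     # rest
```

The helper takes a reference so that the caller's buffer shrinks in
place, which is about as close to "consume from the LVALUE" as I can
get without touching unpack() itself.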

Well, actually, not quite...  What you're missing is, first, documentation
(I couldn't find any mention of "." in 5.8.9's perldoc -f pack or
perldoc -f unpack), and, second, support in earlier versions :(

[r...@straylight ~]$ perl -v | fgrep This; perl -le 'print for unpack(".N.N.", pack("NN", 10, 12))'
This is perl, v5.8.9 built for i386-freebsd-thread-multi-64int
Invalid type '.' in unpack at -e line 1.
[r...@straylight ~]$

[r...@bastonne ~]$ perl -v | fgrep This; perl -le 'print for unpack(".N.N.", pack("NN", 10, 12))'
This is perl, v5.8.8 built for x86_64-linux-gnu-thread-multi
Invalid type '.' in unpack at -e line 1.
[r...@bastonne ~]$

The first one is FreeBSD 7.2-PRERELEASE; the second one is a Debian
Etch-and-a-half box, regularly updated and still in use.

It turns out that the "." code is indeed present in 5.8.9's pp_pack.c,
yet hidden behind a PERL_PACK_CAN_DOT define that is only defined
if CHAR_BIT is *not* defined - and on both Debian Linux and FreeBSD
CHAR_BIT is defined in the system headers.  In fact, the "fix" was
as simple as adding an additional condition to the CHAR_BIT test -
if it is not defined *or* it is equal to 8 - and then my Perl suddenly
learned how to deal with unpack(".") :)  The problem?  This simple
patch makes the test suite fail :(  There might be other issues;
I'll have to take a closer look at the differences between the 5.8
and 5.10 source.

So... unfortunately, the "." in unpack() may NOT be used in stock
Perl 5.8.8 and 5.8.9 on at least two somewhat-popular platforms :(
5.10.0 does not seem to have those ifdef's; I wonder if this could
be fixed in another 5.8 point release?  Is another 5.8 point release
even planned? :)
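
In the meantime, code that has to run on both 5.8 and 5.10 could probe
for "." at run time and fall back to something cruder.  A rough sketch,
not a tested recipe: consumed_length is a made-up name, and the
fallback (re-packing the unpacked values and measuring the result) is
my assumption of a workaround that only holds for fixed-width templates
such as "NN"; it will give wrong answers for things like "A*".

```perl
use strict;
use warnings;

# Probe once whether this perl's unpack() accepts ".".  On a stock
# 5.8.8/5.8.9 it dies with "Invalid type '.' in unpack", which the
# eval block catches.
my $CAN_DOT = eval { my @probe = unpack(".", ""); 1 } ? 1 : 0;

# Hypothetical helper: how many bytes of $data does $template consume?
sub consumed_length {
    my ($template, $data) = @_;

    if ($CAN_DOT) {
        # A trailing "." reports the offset reached in the string.
        my @fields = unpack("$template .", $data);
        return $fields[-1];
    }

    # Fallback, fixed-width templates ONLY: re-pack what was unpacked
    # and measure the result.
    my @fields = unpack($template, $data);
    return length pack($template, @fields);
}

my $data = pack("NN", 10, 12) . "rest";
print consumed_length("NN", $data), "\n";    # 8
```

Both branches agree for "NN" here, since two 32-bit values always take
eight bytes, but only the "." branch is generic.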

Just for the record, here's the trivial, NOT WORKING (test suite fails),
PERL_PACK_CAN_DOT patch for 5.8.9.  I decided not to do
#if !defined(..) ||  .. == 8, since there might be C compilers
Out There(tm) that won't like it.  I'll still take a look at
the differences in the source to see how to make it work better,
but it'll be a bit later today.

--- pp_pack.c~
+++ pp_pack.c
@@ -79,11 +79,16 @@
 #define PERL_PACK_CAN_W
 #define PERL_PACK_CAN_DOT
 #else
+#if CHAR_BIT == 8
+#define PERL_PACK_CAN_W
+#define PERL_PACK_CAN_DOT
+#else
 #define PERL_PACK_CHAR_TEMPLATE_BUG_COMPATIBILITY
 #define PERL_PACK_REVERSE_UTF8_MODE_COMPATIBILITY
 #define PERL_PACK_NEVER_UPGRADE_COMPATIBILITY
 #define PERL_PACK_POSITIONS_ARE_BYTES_COMPATIBILITY
 #endif
+#endif
 /* Maximum number of bytes to which a byte can grow due to upgrade */
 #define UTF8_EXPAND    2
 
G'luck,
Peter

-- 
Peter Pentchev  [email protected]    [email protected]    [email protected]
PGP key:        http://people.FreeBSD.org/~roam/roam.key.asc
Key fingerprint FDBA FD79 C26F 3C51 C95E  DF9E ED18 B68D 1619 4553
"yields falsehood, when appended to its quotation." yields falsehood, when 
appended to its quotation.
