Okay, here's my first attempt at an RFC, contributing to the community,
and dredging back up my design experience after being forced to hack for
8+ years.  I'm not completely familiar with POD format, so I've probably
made mistakes.  I'm also not a full-blown Perl or C guru, so I've
probably made mistakes there too.

Be kind.  If you can't be kind, then at least be polite?

Here Goes!

=head1  TITLE

        Enhanced Pack/Unpack

=head1  VERSION

        Maintainer: Edwin Wiles <[EMAIL PROTECTED]>
        Date: 1 Aug 2000
        Version: DRAFT - Not for the library yet!
        Mailing List: perl6-language
        Number: Unassigned

=head1  ABSTRACT

        Pack and unpack are perceived as difficult to use, and as
        possibly missing desirable features.

=head1  DESCRIPTION

        The existing pack and unpack functions depend upon a 'format'
        string that is simple in concept yet complex in practice.  It
        is often difficult to get right, and it carries no
        information regarding the associated variable names.
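
        As a small illustration of the point above, the following
        sketch packs and unpacks three native ints with the existing
        functions.  The variable names and values are made up for
        the example; only pack and unpack themselves are real.

```perl
use strict;
use warnings;

# Pack three native ints, as a C program might write them.
my $buf = pack('i i i', 10, 20, 3);

# The template says nothing about which value is which.  The names
# exist only in the caller's variable list and must be kept in sync
# with the template by hand.
my ($bar, $baz, $count) = unpack('i i i', $buf);

print "bar=$bar baz=$baz count=$count\n";
```

        Reordering the template without reordering the variable list
        (or the reverse) silently assigns values to the wrong names.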

        A more descriptive data description format, which includes
        variable name associations, would make pack and unpack easier
        to use.

        Partial unpacking upon user demand for a named variable may
        also be of use.
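
        One way partial unpacking might work is sketched below.  The
        helper name unpack_field and the flat name/format list are
        assumptions made for illustration, and the sketch only
        handles fixed-size format letters such as 'i'.

```perl
use strict;
use warnings;

# Hypothetical helper: unpack only the field called $want from $buf.
# $def is a flat list of name/format pairs, e.g. [ 'bar', 'i', ... ].
# Assumes every format describes a single fixed-size field.
sub unpack_field {
    my ($def, $buf, $want) = @_;
    my $offset = 0;
    for (my $i = 0; $i < @$def; $i += 2) {
        my ($name, $fmt) = @$def[$i, $i + 1];
        # '@N' in an unpack template moves to absolute byte offset N.
        return unpack('@' . $offset . ' ' . $fmt, $buf) if $name eq $want;
        $offset += length(pack($fmt, 0));   # size of the skipped field
    }
    die "no such field: $want";
}

my $def = [ 'bar', 'i', 'baz', 'i', 'count', 'i' ];
my $buf = pack('i i i', 10, 20, 3);

my $baz = unpack_field($def, $buf, 'baz');
print "baz=$baz\n";
```

        Only the requested field is decoded; the fields before it are
        skipped by size rather than unpacked.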

=head1  IMPLEMENTATION

        Given the expressed desire to shrink the overall size of the
        perl executable, this should be implemented as a separate
        module.

        There are currently two possible methods for implementation:

=head2  Native

        A whole new body of code, implementing an enhanced data
        description capability, possibly with the following method
        design.  [Note: Design is currently expressed in Perl5
        syntax.  At least I think it is.  PROOFREADERS! ATTACK!]

        $foo = Structure->new(...definition...);
                        # Definition must tie data format to known
                        # names for the variables in the structure.
                        # This is to allow hash reference to internal
                        # data of the object.  Packed format of
                        # object, with any and all changes, is
                        # available via 'get'.

        $foo->read(...source...);
                        # sysread binary data from given IO reference.

        $foo->set(...variable...);
                        # accept binary data from normal perl
                        # variable.

        $v = $foo->get();
                        # output binary data to normal perl variable.

        $foo->write(...sink...);
                        # syswrite binary data to given IO reference.

        $foo->{'name'} = $val;
        $val = $foo->{'name'};

        Alternatively, the variable access could be designed along the
        lines of Class::Class.  (i.e. If Class::Class can do it, then
        we should be able to too!)

        $mm->name (42); # set "name" to 42
        $mm->name ( );  # get value of "name"
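
        To make the method design above a little more concrete, here
        is a minimal sketch of what such a class might look like.
        Everything in it is an assumption for illustration: the flat
        name/format definition, the internal hash layout, and the
        single 'field' accessor standing in for generated per-name
        methods.  The read and write methods are left out; they would
        presumably wrap sysread and syswrite around set and get.

```perl
use strict;
use warnings;

package Structure;   # hypothetical class, named after the design above

# new() takes a flat list of name/format pairs in data order.
sub new {
    my ($class, $def) = @_;
    my (@names, @fmts);
    for (my $i = 0; $i < @$def; $i += 2) {
        push @names, $def->[$i];
        push @fmts,  $def->[$i + 1];
    }
    return bless {
        names    => \@names,
        template => join(' ', @fmts),
        values   => {},
    }, $class;
}

# set(): accept packed binary data from an ordinary scalar.
sub set {
    my ($self, $bytes) = @_;
    my %v;
    @v{ @{ $self->{names} } } = unpack($self->{template}, $bytes);
    $self->{values} = \%v;
    return $self;
}

# get(): return the packed form, including any changed fields.
sub get {
    my ($self) = @_;
    return pack($self->{template},
                @{ $self->{values} }{ @{ $self->{names} } });
}

# One accessor for all fields: with a value it sets, without it gets.
sub field {
    my ($self, $name, @new) = @_;
    $self->{values}{$name} = $new[0] if @new;
    return $self->{values}{$name};
}

package main;

my $foo = Structure->new([ 'bar', 'i', 'baz', 'i', 'count', 'i' ]);
$foo->set(pack('i i i', 10, 20, 3));
$foo->field('baz', 42);                  # change one field by name
my @out = unpack('i i i', $foo->get());
print "@out\n";
```

        Direct hash access such as $foo->{'name'} would need either a
        tied hash or the values stored at the top level of the
        object; the sketch keeps them in a sub-hash to avoid clashing
        with its own bookkeeping keys.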

=head3  Data Definition

        While we could use a C-ish 'struct' syntax, that would imply a
        whole new parser capability.  Something built up out of
        existing Perl syntax would probably be easier to implement.

        For example, assume a set of C structs as follows:

        struct foo {
               int bar;
               int baz;
               int count;
               };

        Followed by 'count' copies of:

        struct stroff {
               int length;
               int offset;
               };

        Followed by 'count' variable-length, not necessarily
        null-terminated collections of bytes.  Possibly strings,
        possibly not, but we'll consider them strings for now.

        [ 'bar', 'i', 'baz', 'i', 'count', 'i' ]

        This would do for the first structure.  Arrays are used rather
        than hashes to guarantee data order.

        [ 'length', 'i', 'offset', 'i' ]

        This will do for the second structure.  Now how do we join
        these two?

        [ 'bar', 'i', 'baz', 'i', 'count', 'i',
          repeat( 'count', [ 'length', 'i', 'offset', 'i' ] ) ]

        Okay, that looks like it might work.  Now add in the strings
        referenced by length and offset.  [Ideas anyone?]
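
        As one possible starting point, the sketch below walks a
        definition containing a repeat() group and then pulls the
        strings out with substr.  It is not a proposal for the final
        syntax.  The repeat() and unpack_def() helpers, the 'records'
        key, and the assumption that 'offset' counts bytes from the
        start of the buffer are all invented for the example, and
        only fixed-size formats such as 'i' are handled.

```perl
use strict;
use warnings;

# Hypothetical marker for a repeated group: the group in $def occurs
# as many times as the already-unpacked field $count_name says.
sub repeat {
    my ($count_name, $def) = @_;
    return { count => $count_name, def => $def };
}

# Walk a definition, consuming bytes from $buf starting at $$pos.
sub unpack_def {
    my ($def, $buf, $pos) = @_;
    my %rec;
    my @items = @$def;
    while (@items) {
        my $item = shift @items;
        if (ref $item) {                     # a repeat() group
            my $n = $rec{ $item->{count} };
            $rec{records} =
                [ map { unpack_def($item->{def}, $buf, $pos) } 1 .. $n ];
        }
        else {                               # a name/format pair
            my $fmt = shift @items;
            $rec{$item} = unpack('@' . $$pos . ' ' . $fmt, $buf);
            $$pos += length(pack($fmt, 0));
        }
    }
    return \%rec;
}

my $def = [ 'bar', 'i', 'baz', 'i', 'count', 'i',
            repeat('count', [ 'length', 'i', 'offset', 'i' ]) ];

# Build a sample buffer: header, one length/offset pair per string,
# then the string bytes themselves.
my @strings = ('hello', 'pack');
my $int     = length(pack('i', 0));
my $off     = $int * 3 + $int * 2 * @strings;   # first byte after the tables
my $buf     = pack('i i i', 10, 20, scalar @strings);
for my $s (@strings) {
    $buf .= pack('i i', length($s), $off);
    $off += length($s);
}
$buf .= join('', @strings);

my $pos = 0;
my $rec = unpack_def($def, $buf, \$pos);

# Assumption: 'offset' is measured from the start of the buffer.
my @got = map { substr($buf, $_->{offset}, $_->{length}) }
          @{ $rec->{records} };
print "@got\n";
```

        Keeping the strings out of the definition and resolving them
        afterwards sidesteps the question of how to express
        "length bytes at offset" in the definition itself, which
        remains open.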

=head2  Built Up

        There are a number of packages that, in combination, come
        close to doing what we want.  They may be more appropriate
        than native code/scripting.  However, we would likely take a
        performance hit.

        See the references.

=head1  REFERENCES

        Class::Class - useful for automatic creation of get/set
        methods for variables on the basis of their names.

        http://search.cpan.org/doc/BINKLEY/Class-Class-0.18/lib/Class/Class.pm
        http://search.cpan.org/search?dist=Class-Class

        File::Binary - possibly useful for read/write of binary
        information.

        http://search.cpan.org/doc/SIMONW/File-Binary-0.3/blib/lib/File/Binary.pm
        http://search.cpan.org/search?dist=File-Binary

        PDL::IO::FlexRaw - is almost exactly what we're looking for.
        While it is described as being specifically for Fortran77
        binary files, we should be able to adapt it to anything.

        Combine this with Class::Class and an extended FlexRaw/pack
        data description format, and we've got a powerful tool for
        binary data manipulation.

        http://search.cpan.org/doc/KGB/PDL-2.005/IO/FlexRaw/FlexRaw.pm
        http://search.cpan.org/search?dist=PDL
