RFC 61 (v2) Interfaces for linking C objects into pe

Perl6 RFC Librarian Mon, 07 Aug 2000 17:59:55 -0700
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Interfaces for linking C objects into perlsubs

=head1 VERSION

  Maintainer: David Nicol <[EMAIL PROTECTED]>
  Date: 7 Aug 2000
  Version: 2
  Mailing List: [EMAIL PROTECTED]
  Number: 61

=head1 ABSTRACT

C<XS>, the set of C macros and structures, is a considerable
barrier to writing extensions of Perl in the C language.  While
not replacing XS, this document specifies easier ways to
integrate a C object file (.o) into a perl program, from the
point of view of a perl user wishing to integrate a well defined
and well behaved C program into a perl program in a more direct
way than a C<system> call.

C<XS> is an excellent medium for sharing the glory of the internals
of Perl with the C programming community.  It is hoped that the
interface deescribed herein will become an excellent medium for
sharing the glory of the internals of well-written C libraries
with the Perl programming community.

This document is not precisely concerned with the details of the
implementation
of the interfaces it specifies, beyond a general attempt to restrict
itself to the possible.

The C<use redirect> pragma is introduced to allow the external code
to only inherit the running programs file handles that the programmer
wants it to inherit.

=head1 SUMMARY

        use externalh @h_files or die;

        sub callmain{

           my use redirect STDIN < $InputData;

           my use redirect STDOUT > PREVIOUSLY_OPENED_HANDLE;

           $value = use externalc @object_files;

        };

        use externalspace $name_of_the_space;

        tie %cstruct, externalc, $StructDefinition;
        
        %cstruct = unpack( $StructDefinition, $Structure );

        $Structure = pack( $StructDefinition, %cstruct );


=head1 DESCRIPTION

This document proposes a set of standard interfaces for linking
compiled (or assembled) object code directly into one's Perl program.

Currently, XS is a large enough additional layer, another ocean of
knowledge, that it is easier to define one's C code into
well behaved utilities and invoke them in backticks than to
convert a stand-alone utility into an XS module.  Rewriting XS
is part of the perl6 mandate.  This RFC is an attempt to suggest
what one alternative would look like.

=head2 default to the C<main(int argc, char* argv[])> interface in C

C programs accept the knowledge of their run-time command line arguments
via two arguments that are passed to the program's C<main> function.

In the absence of other data provided via the C<externalh> method, 
the C<@_> array will be converted for use with this linkage.

=head2 example of using C<use externalc> 

Through use of the C<externalc> use pragma, a set of compiled .o files
are linked into the program at compile time, and offered
the current scope's C<@_>, converted into scalars, as a I<count,values> pair,
if and when the execution path hits that point.

A mechanism must be selected to support gathering the output and
input of a C program as well, something like shell redirects.  The
C<use redirect> is a placeholder for a system to be announced by the
group working on I<file handle object> issues.

        sub DatabaseAccess{

                # place to hold the output
                my $tooloutput; 

                # my: redirection will only last until end of scope
                        # are not all "use" inside things thusly scoped?
                # no, what if you want a subroutine that redirects?
                        # Oh.  Right.  Must require the scoping operator.
                # use redirect: very similar to shell redirection
                my use redirect STDOUT > $tooloutput ;   

                my $retval= use externalc qw(
                        /opt/database/accesstool/src/authen.o
                        /opt/database/accesstool/src/author.o
                        /opt/database/accesstool/src/base64.o
                        /opt/database/accesstool/src/choose_authen.o
                        /opt/database/accesstool/src/daves.o
                        /opt/database/accesstool/src/do_acct.o
                        /opt/database/accesstool/src/encrypt.o
                        /opt/database/accesstool/src/hash.o
                        /opt/database/accesstool/src/hyperauth.o
                        /opt/database/accesstool/src/programs.o
                        /opt/database/accesstool/src/pw.o
                        /opt/database/accesstool/src/pwlib.o
                        /opt/database/accesstool/src/smbencrypt.o
                        /opt/database/accesstool/src/smbutil.o
                );

                # with "main" linkage we have "0 is success" semantics
                $retval and die "database access tool failed. Erno: $retval";

                $tooloutput;

        };

=head2 why not use backticks?

Current practice in this kind of situation is to fully compile
the C program, then write something
like this: 

        sub DatabaseAccess{
                `/opt/database/accesstool @_`;
        };

There is nothing wrong with this, and it will continue to
work reliably.

=head2 Finer control over importing C functions is provided using the C<use external 
function> pragma

Instead of importing whole programs, we would like to
be able to access arbitrary functions inside arbitrary
pieces of external code, and let perl keep track of a lot
of the housekeeping involved in doing that with a C program.



=head2 the C<use external mainroutine> pragma

C<use external*> can offer more flexibility than I<merely> 
extending perl virtual forking to programs which ordinarily would
run in an operating system defined process space. (that is, instead
of calling C<system>).

The name of the
main routine could be changed to something other than C<main>
through manipulation of the C<use external mainroutine> pragma,


        use externalh qw(
                /opt/database/accesstool/src/accesstool.h
        );
        use external mainroutine pwverify;
        use externalc;

Since we have not changed the external name space, C<externalc> will be linking
with the same pile of .o files from last time.  

=head2 the C<use externalh> pragma

The C<use externalh> pragma reads in C header files, with the function
definitions used here, so that C<externalc> can perform reasonable
transformations between the perl data in C<@_> and the arguments specified
in them. 

=head2 C<externalc> tie classnames

Data structures in C, and other languages that use offsets into
fixed-size records to differentiate size-limited fields (which are,
of course, vulnerable to being overrun) are defined with language-specific
data structure definition notations.

Using the perl5 C<tie> operator it would not be difficult to create
a class which would allow access into a particular structure, using
C<substr> and C<unpack> to extract the fields, and returning undef or
even dying (tie-time option) when attempt is made to access a field
which is not named in the structure definition.

This process could even be automated, and a 
C-Header-File-to-perl-tiable-class-definition package would be right at home on the 
CPAN scripts listing. 

        tie %cstruct, externalc, $StructDefinition;

would then tie %cstruct to the structure definition described
in C<$StructDefinition>.  This could be, in the same way of
regular expressions, either a text string or a pre-parsed structure
definition in later releases of perl6.

After being so tied, %cstruct can accept value assignment from
a scalar holding a defined structure and will have its fields instantly 
filled out:


        # New in perl 6 -- by popular demand, fixed length records!
        $FixedInputRecordLength = length($StructDefinition);

        while(%cstruct = <>){   # new in perl 6: "tie" traps assignment to hash
                $Totalpoints += $cstruct{points}
        };



Perl6 variables will have type and range checking built into them, and 
a tied external variable will inherit the limitations of the context it
is being imported from, something you don't get with a Perl5 C<unpack>.

=head2 more fun with structure definitions

If we accept structure definitions into our language, as something
a little stronger than C<pack> templates, these two snippets of
code make sense:


        %cstruct = unpack( $StructDefinition, $Structure );

        $Structure = pack( $StructDefinition, %cstruct );


The information carried in a structure definition, regardless of what
language it is imported from, will include:

=over

=item *
        field names
=item *
        field ranges
=item *
        field types (may include range here)
=item *
        offsets into structure

=back

Like other aspects of perl, as much as possible will be determined
and optimised away at compile time, while general constructs that
can handle arbitrary inputs will remain for run time binding.

=head2 type checking

So, a compile-time compatability table could be maintained, minimizing
the amount of run-type type checking that must be done.  A paranoid
mode would have to be maintained for those who are worried about the
ultrasneaky violating type check rules.



=head2 Instead of package space, external spaces C<use externalnamespace>

It may be possible to extend the lines around a perl5 package
to work for insulating different external packages being imported into
the same perl program, so that each package can only have one
importable package.

Or, external name spaces can be created with C<use externalspace>
which operates analogously to C<package> in perl5.

Since the wrapping of external functions into perlsubs
 will be done in Perl, keeping one external name
space per package forces all access functions for a set of
object files to share the same module.  

=head2 other languages

Adding additional external* pragmata to support other sorts
of objects, such  as system libraries, will happen.



=head1 IMPLEMENTATION

=head2 getting this to work

Getting this to work will require knowledge of how linkers
link.  I believe that all platforms support compilation in parts,
and the perl program is itself compiled in parts, so this 
extension, although challenging to describe, may be trivial to
implement, or may require perl carrying around its own portable
C library, which it pretty much does already.

The automatic translation between Perl scalar data and C data
structures will be necessarily partial.  Advanced conversions may
be done on the perl side with some to-be-determined packing functions,
for instance.


=head2 Compatibility with legacy code

There is enough room in the perl5 syntaces
for including outside code that no new words will need to be
reserved, not counting the C<use external*> pragmata as reserved words.

The stuff about structure definitions and C<%TiedHash = $BinaryStructure>
would not be possible to emulate in perl5 as I understand it.

=head2 Backwards Compatibility

It is very possible that most of the extensions can
be implemented with simple wrappers in XS, for
integrating the new use pragmata into perl5.  And
we keep XS, too, for legacy modules and use by the courageous.


=head2 New Directions

With threads and the new internal fork (whatever it's called)
perl seems to be subsuming more and more of the operating system.
By bringing the linker inside perl, we may be getting closer to
radical compatibility modes in which platform independence could
be offered by setting up portable emulation environments, so that
instead of BSD having a working Linux emulation mode, perl could
have a working linux emulation mode which would run anywhere (as long
as the program running in the environment did not try to do something
outside the realm of what perl can interact with) or perl might
aid projects such as WINE in new ways.


=head1 REFERENCES

/usr/lib/perl5/5.00503/pod/perlxs.pod was looked at
to reverify that XS is frighteningly complex.
RFC 61 (v2) Interfaces for linking C objects into pe

Reply via email to