First, I'd like to address people's concern over the format of the META 
file.  Module users and 99% of module authors have nothing to be concerned
about.   Most folks shouldn't even know the thing exists.

Module::Build has been generating and using META.yml since nearly the 
beginning.  MakeMaker has been generating META.yml automatically for its 
authors since last July.  CPAN is now full of META.yml files.  5.8.2 comes
with a META.yml file.  If you're just now noticing META.yml then we've done 
our job.  It could have been written in Esperanto with FoxPro for all it 
matters to the end user.

If we do our job right, most people should never have to directly read or
write a META.yml file.  It's generated automatically by MakeMaker and 
Module::Build, and module tools (PAUSE, search.cpan.org, CPANPLUS, etc...)
use it without your intervention.

The only people who should be concerned are authors of these tools and folks
involved in gathering CPAN statistics.  Even module authors need not know
about it, as it's automatically generated when they run "make dist" or
"Build dist".

I say this because every few months someone notices META.yml and asks "Why 
did we use YAML and not X?" which starts the same debate all over again with 
the same answers.  We need a FAQ.


To address the "I don't want to learn another data format, why can't we
just use Perl?" issue: with YAML, you don't have to learn the data format.
Let's look at how you'd generate a META file with Data::Dumper.

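        use Data::Dumper;
        $Data::Dumper::Terse = 1;    # dump the bare structure, no "$VAR1 ="
        my $meta = { name => "Foo::Bar", version => 1.23 };
        open(my $fh, ">", "META.perl") or die "Can't write META.perl: $!";
        print $fh Dumper($meta);
        close $fh or die "Can't close META.perl: $!";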

And the equivalent with YAML.pm.

        use YAML qw(DumpFile);
        my $meta = { name => "Foo::Bar", version => 1.23 };
        DumpFile("META.yml", $meta);

And now reading back in the Perl version.

        # An explicit ./ is needed on perls where "." is no longer in @INC.
        my $meta = do './META.perl';
        print $meta->{version};

And the YAML version.

        use YAML qw(LoadFile);
        my $meta = LoadFile("META.yml");
        print $meta->{version};

YAML's data model is so similar to Perl's (hashes, scalars, lists) that the
data you put in and take out is almost indistinguishable from Data::Dumper.
All that's different is the transient storage medium.  So even if you wanted
to roll your own META.yml file you never have to learn YAML!  The only
prereq is to have YAML.pm installed.

And, finally, because I know someone's going to ask about the YAML.pm
prereq, YAML.pm is not required by MakeMaker.  It generates META.yml by
hand and doesn't use it for anything.  For Module::Build it's an optional 
prereq.

I hope that covers the bases.


Now to address Phil's specific concerns.


On Wed, Nov 12, 2003 at 10:25:00AM -0500, [EMAIL PROTECTED] wrote:
> OK, maybe I'm missing a LOT of context here, 'cause I haven't been
> agressively keeping up with this mailing list, but the security hole
> argument seems a bit odd.
> 
> These META.yml files we're refering to -- these are meta data for
> managing the build process, files that will be distributed along with
> the tarballs we upload to CPAN, right?  

Module::Build might use META.yml a little in its build process, but it's not
a requirement.  MakeMaker can't even read META.yml.  It's not META.yml's 
primary purpose to be used to build a module.  It's just turning out
to be really damned useful. :)

The primary purpose of META.yml is to supply module meta data that we'd
normally have to go crawling around in the code to get.  Like $VERSION
(to get this you have to eval a line of the module) and its prerequisites
(to get that you have to run the Makefile.PL and parse the resulting 
Makefile).  
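
As a rough sketch of what that buys a tool author: the version and
prerequisites can be read as plain data, with nothing from the distribution
being executed.  The META.yml contents below are made up, the "requires"
key is an assumption for illustration, and CPAN::Meta::YAML (bundled with
recent perls) stands in for YAML.pm only so the example has no non-core
dependency.

```perl
use strict;
use warnings;
use CPAN::Meta::YAML;    # assumption: available as a core module

# Hypothetical META.yml contents; the "requires" key is assumed here.
my $yaml = <<'END';
---
name: Foo-Bar
version: 1.23
requires:
  File::Spec: 0.8
END

# Parsing is pure data handling: no code from the distribution is run.
my $meta = CPAN::Meta::YAML->read_string($yaml)->[0];

my $version = $meta->{version};
my @prereqs = sort keys %{ $meta->{requires} };
print "$meta->{name} $version needs: @prereqs\n";
```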

It also contains intangibles like what license the code is distributed
under that you'd normally have to go groping around in the POD to try and
figure out.

File lists in META.yml are currently used as hints to the PAUSE indexer so
it can better determine what to index.  For example, 5.8.2's META.yml
contains a listing "private" (which I think is changing to noindex) to tell
PAUSE not to index these files/directories in the module list.  In this case
it's because these are dual-life modules that also have CPAN versions.


> So, if I understand this correctly, you're worried about the build
> process eval'ing the contents of a file I sent you.  Hmm.

No, the case where security enters is when someone is grabbing CPAN meta 
information.  For example, PAUSE, CPAN search sites, anyone trying to do 
CPAN statistics, anyone trying to determine the prerequisites of a module, 
etc...  

CPAN statistics gathering currently involves running large amounts of 
untrusted code.  MakeMaker's parse_version() literally pulls a line of code 
out of the module and evals it.  To determine the prerequisites you have to 
run the Makefile.PL which, as Mark-Jason Dominus once demonstrated with
Memoize (I think it was), could contain rm -rf /.
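
To make the $VERSION problem concrete, here is a minimal sketch of the
idea.  It is NOT MakeMaker's actual parse_version() code, just a simplified
stand-in that finds a $VERSION line and evals it; the module line and the
Scratch package name are invented for the example, and the "payload" only
sets a variable.

```perl
use strict;
use warnings;

# A hypothetical module line: it sets $VERSION, but carries a side effect.
my $line = q{our $VERSION = do { $main::side_effect = 'arbitrary code ran'; '1.23' };};

# Simplified stand-in for version scraping: spot the $VERSION line, eval it.
our $side_effect;
my $version;
if ( $line =~ /\$VERSION\s*=/ ) {
    $version = eval "package Scratch; no strict; $line; \$Scratch::VERSION";
}

print "version: $version\n";
print "side effect: $side_effect\n";    # the line did more than set a version
```

Whatever else sits on that line gets run along with the assignment, which
is the whole problem when the line comes from a stranger's upload.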

Of course, when you build and install a module you are trusting that it
won't do anything naughty.  However, a user typically builds and installs
*a* module.  Or a set of modules, picked either by hand or because they're
a prerequisite of a module they picked by hand (thus, the author picked
by hand).  This selectiveness prevents some random hacker from uploading a 
module with an rm -rf in it.  Nobody will bother to install it because 
nobody will know about it.  Security through module obscurity. :)  For
better or worse, this is CPAN's ad-hoc "web of trust" model of security.

In contrast, CPAN statistics gathering usually involves *all* the modules 
on CPAN.  In this case, it will pick up RHACKER's module and blissfully run 
its Makefile.PL and $VERSION line causing who knows what havoc.  We shouldn't
have to set up a chroot jail (as PAUSE does) just to gather some
stats.

Since META.yml is automatically generated by both Module::Build and MakeMaker,
CPAN is rapidly populating with META.yml files and code scraping for meta-data
will mostly disappear, closing one of the big backdoors in CPAN.


-- 
Michael G Schwern        [EMAIL PROTECTED]  http://www.pobox.com/~schwern/
IIRC someone observed that they couldn't name identifiers in Ethiopian, 
because there was an Ethiopian character similar in function to _ which 
wasn't in \w
        -- Nicholas Clark demonstrates that the Internet works
           in <[EMAIL PROTECTED]>
