This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Angle brackets should not be used for file globbing

=head1 VERSION

  Maintainer: Jon Ericson <[EMAIL PROTECTED]>
  Date: 4 August 2000
  Last-Modified: 30 August 2000
  Mailing List: [EMAIL PROTECTED]
  Version: 3
  Number: 34
  Status: Frozen

=head1 ABSTRACT

Angle brackets have two separate meanings in Perl - file globbing and
line input from a filehandle.  This means that perl attempts to Do What
I Mean and often misinterprets me.  Since file globbing can be
accomplished with the glob function and since file input is the more
common use of angle brackets, they should not be used for file globbing.

=head1 DESCRIPTION

=head2 Background

If what's within the angle brackets is a simple scalar variable (e.g., <$foo>),
then that variable contains the name of the filehandle to input from, or
its typeglob, or a reference to the same.  For example:

           $fh = \*STDIN;
           $line = <$fh>;

If what's within the angle brackets is neither a filehandle nor a simple
scalar variable containing a filehandle name, typeglob, or typeglob
reference, it is interpreted as a filename pattern to be globbed, and
either a list of filenames or the next filename in the list is returned,
depending on context.  This distinction is determined on syntactic
grounds alone.  That means `<$x>' is always a readline() from an
indirect handle, but `<$hash{key}>' is always a glob().  That's because
$x is a simple scalar variable, but `$hash{key}' is not--it's a hash
element.
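As a minimal sketch of working within this syntactic rule, the following reads from a filehandle stored in a hash element without triggering the glob interpretation.  The in-memory file and the names %handle_for and $text are assumptions made for illustration; they are not part of the perlop text.

```perl
use strict;
use warnings;

# Assumption for the demo: an in-memory file stands in for a real one.
my $text = "first line\nsecond line\n";
my %handle_for;
open $handle_for{demo}, '<', \$text
    or die "can't open in-memory file: $!";

# Option 1: copy the handle into a simple scalar, so <...> means readline.
my $fh    = $handle_for{demo};
my $first = <$fh>;

# Option 2: call readline() directly on the hash element.
my $second = readline($handle_for{demo});

print $first, $second;
```

Either form sidesteps the parser's decision, because neither places a hash element inside angle brackets.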

One level of double-quote interpretation is done first, but you can't
say `<$foo>' because that's an indirect filehandle as explained in the
previous paragraph.  (In older versions of Perl, programmers would
insert curly brackets to force interpretation as a filename glob:
`<${foo}>'.  These days, it's considered cleaner to call the internal
function directly as `glob($foo)', which is probably the right way to
have done it in the first place.)  For example:

           while (<*.c>) {
               chmod 0644, $_;
           }

is roughly equivalent to:

           open(FOO, "echo *.c | tr -s ' \t\r\f' '\\012\\012\\012\\012'|");
           while (<FOO>) {
               chop;
               chmod 0644, $_;
           }

except that the globbing is actually done internally using the standard
`File::Glob' extension.  Of course, the shortest way to do the above is:

           chmod 0644, <*.c>;

A (file)glob evaluates its (embedded) argument only when it is starting
a new list.  All values must be read before it will start over.  In list
context, this isn't important because you automatically get them all
anyway.  However, in scalar context the operator returns the next value
each time it's called, or `undef' when the list has run out.  As with
filehandle reads, an automatic `defined' is generated when the glob
occurs in the test part of a `while', because legal glob returns (e.g. a
file called 0) would otherwise terminate the loop.  Again, `undef' is
returned only once.  So
if you're expecting a single value from a glob, it is much better to say

           ($file) = <blurch*>;

than

           $file = <blurch*>;

because the latter will alternate between returning a filename and
returning false.
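The alternation described above can be observed with a small sketch.  It assumes a scratch directory created with File::Temp that holds exactly one file matching the pattern, and that the directory path contains no whitespace (glob() splits its argument on whitespace); the variable names are illustrative only.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumption: a scratch directory containing a single matching file.
my $dir = tempdir(CLEANUP => 1);
open my $out, '>', "$dir/blurch1" or die "can't create $dir/blurch1: $!";
close $out or die "can't close $dir/blurch1: $!";

my (@scalar_hits, @list_hits);
for my $attempt (1 .. 4) {
    # Scalar context: this call site acts as an iterator over the matches.
    my $next = glob("$dir/blurch*");
    push @scalar_hits, defined $next ? 1 : 0;

    # List context: every call returns the full list of matches.
    my ($first) = glob("$dir/blurch*");
    push @list_hits, defined $first ? 1 : 0;
}

print "scalar: @scalar_hits\n";
print "list:   @list_hits\n";
```

With one matching file, the scalar-context call yields a name on the first and third passes and `undef' on the second and fourth, while the list assignment yields the name every time.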

If you're trying to do variable interpolation, it's definitely better
to use the glob() function, because the older notation can cause people
to become confused with the indirect filehandle notation.

           @files = glob("$dir/*.[ch]");
           @files = glob($files[$i]);

-- perlop/"I/O Operators"

=head2 Areas of confusion

=over

=item Complex filehandle references

    my %filesystem;
    my $filename = '/etc/shells';
    open $filesystem{$filename}, $filename 
        or die "can't open $filename: $!";
    print <$filesystem{$filename}>;
    __END__

    GLOB{0xa042284}

=item Simple file glob variables

    my $fileglob = '/etc/sh*';
    print <$fileglob>;
    __END__

    [nothing printed]

=item Why don't these have the same output?

    $ perl -e 'print </etc/sh*>'
    /etc/shells

    $ perl -e 'print <>' /etc/sh*
    /bin/sh
    /bin/bash

=back
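One way to make the intent explicit in cases like these is to name the operation: glob() when file names are wanted, readline() when file contents are wanted.  The sketch below assumes a scratch directory standing in for /etc, with a hypothetical "shells" file created for the demonstration, and a directory path free of whitespace.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumption: a scratch directory stands in for /etc.
my $dir = tempdir(CLEANUP => 1);
open my $out, '>', "$dir/shells" or die "can't create $dir/shells: $!";
print {$out} "/bin/sh\n/bin/bash\n";
close $out or die "can't close $dir/shells: $!";

my $fileglob = "$dir/sh*";

# File names: glob() works even though the pattern is in a simple scalar.
my @names = glob($fileglob);

# File contents: open each matching name and read it with readline().
my @lines;
for my $name (@names) {
    open my $in, '<', $name or die "can't open $name: $!";
    push @lines, readline($in);    # list context: all remaining lines
    close $in or die "can't close $name: $!";
}

print "names: @names\n";
print "contents:\n", @lines;
```

Written this way, the reader does not need to know the parsing rule for angle brackets to tell which of the two results the code produces.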

=head2 The glob function exists

Now that we have a function interface to file globbing, why do we need
the obscure operator interface?

=head2 The readline meaning is more common

Angle brackets are a shortcut for either glob() or readline().  If we
are going to eliminate one meaning, it should be the less common one,
which is file globbing.

=head2 Syntax niche could be filled with a simple, flexible input
operator

RFC 51 suggests that:

  print <'/etc/shells'>;

should print the content of the /etc/shells file, not:

  '/etc/shells'

=head1 IMPLEMENTATION

Remove the file-globbing behavior of angle brackets.

=head2 Migration from Perl 5

Scripts which use angle brackets for globbing (e.g., <*.c>) will need to
be modified to call the glob function instead (e.g., glob('*.c')).
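As a rough sketch of what such a migration might look like, the two angle-bracket idioms quoted from perlop are rewritten below with glob().  The scratch directory, the file names, and the variables @visited and $changed are assumptions for the demonstration, and the directory path is assumed to contain no whitespace.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumption: a scratch directory with two .c files and one other file.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.c b.c notes.txt)) {
    open my $out, '>', "$dir/$name" or die "can't create $dir/$name: $!";
    close $out or die "can't close $dir/$name: $!";
}

# Before:  while (<*.c>) { chmod 0644, $_; }
# After:   iterate with glob() in scalar context and an explicit defined().
my @visited;
while (defined(my $file = glob("$dir/*.c"))) {
    chmod 0644, $file or warn "can't chmod $file: $!";
    push @visited, $file;
}

# Before:  chmod 0644, <*.c>;
# After:   pass the list returned by glob() straight to chmod.
my $changed = chmod(0644, glob("$dir/*.c"));

print "visited ", scalar(@visited), " files, changed $changed\n";
```

The explicit defined() is written out here because the automatic one described in perlop applies to the test part of a `while'; spelling it out keeps the loop correct for a file named 0 regardless of how the condition is later edited.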

=head1 CHANGES

=over

=item v3

  Changed status to 'Frozen' because there was no discussion 
  on the previous version.  I assume that this is an 
  uncontroversial RFC.

=item v2

  Changed mailing list to language-io
  Modified the format of the perlop quote
  Added the Migration implementation section
  Added RFC 51 references

=back

=head1 REFERENCES

perlop/"I/O Operators"

perlfunc/glob

perlfunc/readline

RFC 51: Angle brackets should accept filenames and lists 
