Re: RFC 175 (v1) Add C<list> keyword to force list context (like C<scalar>)

Tom Christiansen Thu, 31 Aug 2000 20:39:26 -0700
>   > This is the kind of thing that keeps Perl instructors in business...

>And Perl out of businesses :-(

>More than anything I think the inability to write C<sub list> DWIMishly
>argues that we need it built-in. But we also need a *very* careful design
>of the semantics.

I'd like to see from this measure's proponents copious examples of
places where this sort of thing is truly desired.  I want to see
what they're doing, or rather, what they think they're doing.

I believe that a listification operation is not only unneeded, but
that it is in fact a dangerously counterintuitive and reality warping
notion.  I believe that we should therefore *not* do this crazy
thing.  In fact, I would be astonished if most of you did not, after
a careful reading of this brief document which now follows, agree
with me in this matter.

As you, Damian, seem to be saying, I also find this list() notion
to be poorly explored [yes, I know the idea was not yours], and I
further believe that the passage I cited in an earlier note of mine
in this thread will, in the final analysis, prove the reasonable
approach--perhaps the only one, in fact.  To bolster this position,
let us explore several varying cases in which the expression is
sensitive to its context, and observe what difference it would have
upon our values if a listification operator were interposed to warp
the context.

1.  grep case, which returns the matches in list context and the
    count of the same in scalar context

        @lines = grep {EXPR} @data;
        $hits  = grep {EXPR} @data;

    If you write

        $burp = LISTIFY( grep {EXPR} @data );

    What's in the burp?  You now have the return list, which must
    be forced to be stored into a single place.  A general list-looking
    thing like (A,B,C) that this is done to evaluates to merely the
    last component.  So what are you left with? You did *not* get
    back a list of hits here; at least, not one that you are able
    to store.  You only got the last item from the list return value.
    You do not get the normal scalar return!

2.  getpwuid case, which returns the whole pwent as a list when
    called in list context, and just the pw_name field in scalar
    context.

        @pwent = getpwuid(SUMNUM);
        $login = getpwuid(SUMNUM);

    If you write

        $burp = LISTIFY( getpwuid(SUMNUM) );

    Now what kind of burp do you get?  That turns into the list
    that would have gone into @pwent, but still as a list.  So now
    we have to pluck the last one from it, which here ends up being
    the shell.  Again, you do not get the normal scalar return
    value.

3.  readline case, which returns a list of lines/records left in
    the file when called in list context, but just the next line/record
    when called in scalar context.

        @whole_file = <FH>;
        $next_line  = <FH>;

    If you write

        $burp = LISTIFY( <FH> );

    Then you have a potentially biggissimo temporary list in memory,
    but now you've got to stuff that down into one little burp.  So
    you end up getting the last line/record from the file.  Note
    how different this is from the scalar context return, which is
    the *next* line, not the last one.  And for this convenience,
    you pay an infinite (meaning: arbitrarily large) memory cost,
    possibly twice given the way things often seem to work out.

4.  progressive match case, which returns all the (sub)matches in
    list context, and just the next match (as a boolean) in scalar context,
    keeping magical internal state.

        $next_match  = $string =~ /pattern/g;
        @all_matches = $string =~ /pattern/g;

    If you write

        $burp = LISTIFY( $string =~ /pattern/g );

    Now what is that?  It's the last match (or submatch, if the
    pattern happened to have had parens inside it) of all that were
    located.

As you can all readily see, forcing something that would otherwise
be in scalar context into list context is a fleetingly ephemeral
transformation: though you may evaluate the expression in list
context, it still has to wedge back into that scalar slot.  But
what you end up with there is in many, many common cases something
completely different from what the scalar sense would otherwise be.
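
A minimal sketch of the grep and progressive-match cases follows. It
assumes LISTIFY behaves like a slice over its arguments, which is one of
the candidate definitions discussed further on; the sample data and the
variable names are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the proposed operator: a slice over all of
# its arguments, so the arguments themselves are evaluated in list context.
sub LISTIFY { @_[ 0 .. $#_ ] }

my @data = qw(apple avocado banana apricot);

my $hits = grep { /^a/ } @data;               # scalar sense: count of matches
my $burp = LISTIFY( grep { /^a/ } @data );    # last matching element instead

my $string  = "a1 b2 c3";
my $matched = LISTIFY( $string =~ /(\d)/g );  # last submatch, not a boolean

print "hits=$hits burp=$burp matched=$matched\n";
```

The count and the "listified" value come from the same expression, yet
they have nothing in common beyond that.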

As Damian has pointed out, there's some confusion about just what
this C<sub list> thing (which I, above, have written LISTIFY) would
look like.  One suggestion was that it just be essentially: 

    alpha) sub LISTIFY { @_ }

which was then modified to be the list expansion 

    beta) sub LISTIFY { @_[ 0 .. $#_] }

These are, of course, both quite different.  Doubtless at least
some of you out there who were Not Pleased with the patent nastiness
shown by my example cases one through four thought in the back of
your minds that I was somehow playing dirty pool by always taking
the return list of the function called in list context, expanding
it, and then saving only the final element in that list.

Yet this is *precisely* what LISTIFY-beta does--if it's used in
scalar context, as that is the general behaviour of an actual list
whose scalar sense is taken.  If you play with the slice above,
you'll see that this is how slices work also.  (Tangentially: if
you backslash a slice and assign it to a scalar, you get a reference
to the last element.)  It seems more obvious, though, to write that
as 

    beta) sub LISTIFY { $_[$#_] } 

or even, somewhat appealingly:

    beta) sub LISTIFY { pop } 

As those are now invariant in behaviour with respect to calling
context.  Why does that matter?  Well, if LISTIFY-beta is used in
list context, then it doesn't do this silliness.  It just returns
the original list, as that's what you get when you use a list in
list context.  But if that were the case, you wouldn't coerce it
into list context, for of course it would already be there!
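
The difference between the slice form and the pop form can be checked
directly. This is only a sketch; the sub names listify_slice and
listify_pop are made up here so that both variants can coexist in one
file.

```perl
use strict;
use warnings;

sub listify_slice { @_[ 0 .. $#_ ] }   # slice form: result depends on context
sub listify_pop   { pop }              # pop form: always the final argument

my @in = qw(x y z);

my $slice_scalar = listify_slice(@in);   # scalar context: last element
my @slice_list   = listify_slice(@in);   # list context: the whole list
my $pop_scalar   = listify_pop(@in);     # last element
my @pop_list     = listify_pop(@in);     # still just the last element

print "slice: $slice_scalar / @slice_list\n";
print "pop:   $pop_scalar / @pop_list\n";
```

Note that pop here shortens only the sub's own @_, so @in is left
untouched.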

So, those of you who didn't care for LISTIFY-beta are probably all
saying, "No, no, no.  I just wanted the *count* of the items that
would be returned if the function were called in list context."

Very well.  That's what LISTIFY-alpha gives you.  Well, assuming that
it were used in scalar context, which would then become an array
used for its scalar sense, which is its item count.  If it weren't
used in scalar context, but in list context, then we're back to the
same inane no-op as we saw in a similar situation with LISTIFY-beta.
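
For comparison, here is the same kind of check for the alpha form. The
sub name listify_alpha and the sample list are assumptions made for the
sake of the illustration.

```perl
use strict;
use warnings;

sub listify_alpha { @_ }   # returns the argument array itself

my @fields = qw(name passwd uid gid shell);

my $count = listify_alpha(@fields);   # scalar context: number of elements
my @same  = listify_alpha(@fields);   # list context: the list, unchanged

print "count=$count same=@same\n";
```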

Even if one were to adopt either the alpha or beta forms of LISTIFY,
this is *not* the best way to go about it.  Let's look at our
four test cases and see why, and find better ways.

CASE 1, revisited:

    Rewriting this:

        $burp = LISTIFY( grep {EXPR} @data );

    To get the alpha sense, you should simply use

        $burp = grep {EXPR} @data;

    To get the beta sense, you might as well just use

        $burp = ( grep {EXPR} @data )[-1];

CASE 2, revisited:

    Rewriting this:

        $burp = LISTIFY( getpwuid(SUMNUM) );

    To get the alpha sense is largely a waste of time, since the
    length of the return value list is, theoretically, invariant.
    Actually, that's been being diddled a bit down the years, which
    is at the very least a tad annoying, but so it goes--what are
    you going to do?  For simple counting, if that's well and truly
    all one is languishing for here, it would be incredibly more
    obvious, consistent, and reasonable to use a function whose
    name reflected its function--as list, listify, or whatnot clearly
    do *not*.   "list" doesn't mean "count 'em all up".  So if that's
    what you want, let it be so:

        $burp = ELT_COUNT( getpwuid(SUMNUM) );  # how many in pwent?

    where ELT_COUNT (however it would be named) would be merely
    this:

        sub ELT_COUNT { scalar @_ }

    Notice how unlike the earlier definitions given for the misdesired
    "list" function, this one is invariant in return value across
    contexts.

    As for the beta sense, that is much more clearly written as

        $burp = ( getpwuid(SUMNUM) )[-1];   # fetch shell
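
Both replacements can be tried without touching the password database.
In this sketch fake_pwent is a hypothetical stand-in for getpwuid called
in list context, and its field values are invented.

```perl
use strict;
use warnings;

sub ELT_COUNT { scalar @_ }

# Hypothetical stand-in for getpwuid() in list context; values are made up.
sub fake_pwent { return qw(someuser x 1000 1000 /bin/sh) }

my $n_scalar = ELT_COUNT( fake_pwent() );   # called in scalar context
my ($n_list) = ELT_COUNT( fake_pwent() );   # called in list context
my $shell    = ( fake_pwent() )[-1];        # list slice: final field

print "scalar=$n_scalar list=$n_list shell=$shell\n";
```

The count comes out the same whichever context ELT_COUNT is called in,
and the slice says plainly that the last field is what is wanted.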


CASE 3, revisited:

    Rewriting this:

        $burp = LISTIFY( <FH> );

    for the alpha case (just count them) is critical due to 
    memory concerns.  In fact, as a grader, I'd be sorely tempted
    to deduct points for a student with such flagrant disregard
    for reality here.  That should be written more like:

        1 while <FH>;  $burp = $.;

    or even:

        for ($burp = 0; my $line = <FH>; $burp++) {}

    One should eschew the temptation to write

        () = <FH>; $burp = $.;

    for the memory concerns given above.

    For the beta case, you again desire an iterative and consequently
    scalable approach.  Write it this way:

        $burp = <FH> until eof(FH);
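
Both iterative forms can be exercised against an in-memory filehandle.
This is a sketch only: the sample text, the lexical handles, and the
variable names are assumptions, not part of the cases above.

```perl
use strict;
use warnings;

my $text = "one\ntwo\nthree\n";

# Alpha sense: count the records one at a time, never holding them all.
open(my $in, '<', \$text) or die "cannot open in-memory file: $!";
1 while <$in>;
my $count = $.;          # read $. before close, which resets it
close($in);

# Beta sense: keep only the most recent record until end of file.
open(my $again, '<', \$text) or die "cannot open in-memory file: $!";
my $last;
$last = <$again> until eof($again);
close($again);

print "count=$count last=$last";
```

Either loop holds at most one record in memory at a time, however large
the input is.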

CASE 4, revisited:

    Rewriting this:

        $burp = LISTIFY( $string =~ /pattern/g );

    For the alpha case to get the count is feasible in the blatantly
    obvious way:

        for ($burp = 0; $string =~ /pattern/g; $burp++) {}

    or in the way that I have suggested before:

        $burp = () = $string =~ /pattern/g;

    This is, presumably, not so costly as the readline case, but
    if you wished to become particularly parsimonious, you would
    elect the former solution just now given.  And no whingeing 
    on assignment to an empty list!  Learn Perl.  If you can't,
    then go ahead and waste your life by writing that as

        $burp = ELT_COUNT( $string =~ /pattern/g );

    per the previous definition, but recognize that you're 
    wasting your life.
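
The two counting idioms can be compared side by side. The sample string
and variable names in this sketch are invented.

```perl
use strict;
use warnings;

my $string = "pattern, pattern, and more pattern";

# Iterative count: one scalar-context match per step, no temporary list.
my $by_loop = 0;
$by_loop++ while $string =~ /pattern/g;

# Empty-list assignment: the match runs in list context, and the
# assignment itself, taken as a scalar, yields the number of items.
my $by_empty_list = () = $string =~ /pattern/g;

print "loop=$by_loop empty_list=$by_empty_list\n";
```

The loop leaves the string's match position reset once it fails, so the
second count starts from the beginning again.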

    For the beta case, in which the final (sub)match is returned,
    you could, if you wanted, write that as we have seen before:

        $burp = ( $string =~ /pattern/g ) [-1]; 

    If you really wanted not the last match but the last boolean
    produced, the solution is slightly simpler:

        $burp = 1;

    This really just brings home something that Mark once remarked
    upon: the mutually exclusive goals of using a /g 
    to step through a progressive match, well, progressively, 
    and of using list context to extract the list of parenthesized
    submatches.  You end up doing things like

        while ( $string =~ /(pa)(tt)(ern)/g ) {
            my($fee, $fie, $foe) = ($1, $2, $3);
            ...
        } 

    or, heavens, 


        @all = $string =~ /(pa)(tt)(ern)/g;
        while ( my($fee, $fie, $foe) = splice(@all, 0, 3)) {
            ...
        } 

    The very scenario that gave rise to the call for this desired
    listification doodad, this "count the matches", is almost
    certainly an inappropriate diagnosis of, and consequent response
    to, a rather different conundrum.
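
The two workarounds for stepping through parenthesized submatches can be
checked against each other. In this sketch the sample string and the
@via_loop and @via_splice arrays are assumptions added for illustration.

```perl
use strict;
use warnings;

my $string = "pattern then pattern";

# Progressive match in scalar context, copying the captures each time.
my @via_loop;
while ( $string =~ /(pa)(tt)(ern)/g ) {
    push @via_loop, [ $1, $2, $3 ];
}

# All captures at once in list context, then peeled off three at a time.
my @all = $string =~ /(pa)(tt)(ern)/g;
my @via_splice;
while ( my ($fee, $fie, $foe) = splice(@all, 0, 3) ) {
    push @via_splice, [ $fee, $fie, $foe ];
}

print scalar(@via_loop), " triples by loop, ",
      scalar(@via_splice), " by splice\n";
```

Both collect the same triples; the second simply pays for the whole
list up front.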

In summary, this passage remains reasonable and prudent:

    There's no C<list> function corresponding to C<scalar> since,
    in practice, one never needs to force evaluation in a list
    context.  That's because any operation that wants I<LIST> already
    provides a list context to its list arguments for free.

Wherever you think you need one of these, try to think again.  Either
it's already in list context, in which case it's silly to put in
the list thing, or else there's always a better way to accomplish
whatever you're trying to do--which, as I have shown, can vary
greatly.

This proposal should be dropped.

--tom