Michael G Schwern wrote:

> On Wed, Sep 13, 2000 at 11:34:20PM -0700, Glenn Linderman wrote:
> > The rest is handled adequately and consistently today, and Tom's
> > dequote is adequate to eliminate leading white space... especially
> > among people who cannot agree that a tab in a file means "mod 8"
> > (which it does).
>
> Damnit, I'm going to continue beating this horse until it stops twitching.

That's fine, but it could have been done politely.

I'm all for solving problems, but the three problems this message names
need more specification.  It is not clear exactly what they are, because
the words used to describe them do not pin them down unambiguously.  Let
me attempt to describe them more completely; where I diverge onto the
wrong problem, you can correct me, and then maybe we'll be communicating.
I think you've also omitted some problems.  Maybe they shouldn't be
classified as major, but they are related and get in the way of some of
the possible solutions, so I think we should mention them all.  I've
continued your numbering.

> We have three major problems and three proposed solutions:
>
>     Problems:
>     1 Allowing here-docs to be indented without affecting the output.

This is the problem that here-doc content is currently taken relative to
the left margin, so it doesn't look nice next to nearby indented code.

>     2 Preserving sub-indentation.

This is not _currently_ a problem.  Perl _currently_ preserves
indentation in here-docs.  It only becomes a problem when some other
"solution" gets in the way.  If problem 1 were solved by independently
eliminating all leading white space from each line of the here-doc,
then this problem would suddenly appear.  So what this "problem" is
trying to state is that problem #1 cannot be solved (using your
"current stumper" example below) by

      die <<POEM =~ s/^\s*//m;

because that changes the relative horizontal relationships between
characters on different lines.  So this is something to avoid when
solving the other problems, rather than a problem today.
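
To make that concrete, here is a minimal perl5 sketch.  The variable
names are mine, and the text is built from plain strings rather than a
here-doc so that the white space is visible.  It strips all leading
white space from every line, which is what a naive fix for problem 1
would do, and the three lines lose their relative offsets:

```perl
use strict;
use warnings;

# The poem as it would sit inside an indented here-doc:
# 12, 10 and 14 leading spaces, i.e. relative offsets of 2, 0 and 4.
my $indented = join '',
    (' ' x 12) . "The old lie\n",
    (' ' x 10) . "Dulce et decorum est\n",
    (' ' x 14) . "Pro patria mori.\n";

# Naive fix: remove all leading white space on every line.
(my $flattened = $indented) =~ s/^[ \t]*//gm;

# All three lines now start in the same column.
print $flattened;
```

The sub-indentation is gone, which is exactly the complaint in problem 2.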

>     3 Preserving the output of the here-doc regardless of how its
>       overall indentation is changed (ie. shifted left and right)

This problem appears to be attempting to address what happens when
indenting large blocks of code, with something equivalent to

     $code =~ s/^/   /gm;  # N.B. the replacement is 3 spaces

The effect of the indentation is desirable, but the current semantics of
here-docs result in two problems: your number 3 (which actually subsumes
your number 1), that the text produced by the here-doc differs from what
it was before the indentation took place; and also the first additional
problem below....

Additional problems:

4 An indented here-doc terminator is not recognized, because perl<6
requires the here-doc terminator to be at the left boundary.

5 Because white space is not visible, white space after the here-doc
terminator (which perl<6 requires to be followed immediately by
end-of-line) can cause apparent here-doc terminators to not be
recognized.

6 Because indenting a tab character with non-tab characters changes its
starting point, its apparent size also changes, thus affecting the
horizontal relationship between characters on different lines of a
here-doc.

7 Because people don't all subscribe to the universal definition of the
ASCII tab character as meaning "proceed to the next (mod 8) horizontal
boundary", the appearance of here-docs containing tabs differs between
environments in the horizontal relationship between characters on
different lines.  This can be particularly significant if there are
different numbers of leading tabs on a line, a mixture of tabs and
spaces at the front of some lines, or tabs found after non-white space
characters.
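
Problems 4 and 5 can be observed in perl5 today.  The following is only
a sketch: the hash keys and variable names are my own, and each here-doc
is compiled with a string eval so that a failure to find the terminator
can be caught rather than killing the script.

```perl
use strict;
use warnings;

# Each value is a tiny perl program consisting of one here-doc.
my %source = (
    flush_left     => "<<EOT;\nhello\nEOT\n",       # terminator at the left margin
    indented       => "<<EOT;\nhello\n    EOT\n",   # problem 4: leading white space
    trailing_space => "<<EOT;\nhello\nEOT \n",      # problem 5: trailing white space
);

my %recognized;
for my $name (sort keys %source) {
    # eval returns undef when the snippet fails to compile.
    my $value = eval $source{$name};
    $recognized{$name} = defined $value ? 1 : 0;
    print "$name: terminator ",
        ($recognized{$name} ? "found" : "NOT found"), "\n";
}
```

Only the flush_left variant compiles; the other two fail because perl
never sees a line consisting of exactly the terminator.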

>     Solutions
>     1 <<POD =~ s/some_regex//
>     2 dequote(<<POD)
>     3 indentation of the end-tag
>
> Each solution has their strengths and weaknesses.  Regexes can handle
> problem #1 but only #2 xor #3.  However, they cover a wide variety of
> more general problems.  dequote has the same problem.  #1 is fine, but
> it can only do #2 xor #3.  Not both.

Agreed that there is unlikely to be a single solution that solves all
the problems.  So can we look at solutions to each of the problems, and
then attempt to pick a set of solutions to make available in perl6 that
covers the problem space?  Before I do that, let's analyze the current
stumper in terms of the problems above, to make sure we are talking
about the same problems.

> The current stumper, which involves problems 1, 2 and 3 is this:
>
>    if( $is_fitting && $is_just ) {
>         die <<POEM;
>             The old lie
>           Dulce et decorum est
>               Pro patria mori.
>         POEM
>    }
>
> I propose that this work out to
>
>     "    The old lie\n  Dulce et decorum est\n      Pro patria mori.\n"
>
> and always work out to that, no matter how far left or right the
> expression be indented.
>
>    { { { { {
>              if( $is_fitting && $is_just ) {
>                 die <<POEM;
>                     The old lie
>                   Dulce et decorum est
>                       Pro patria mori.
>                 POEM
>    } } } } }
>
> Four spaces, two spaces, six spaces.  Makes sense, everything lines
> up.  So far I have yet to see a regex or dequote() style proposal
> which can accommodate this.

OK, so let's see: I can't tell whether there is any white space after
"POEM" on the here-doc terminator line, so let's assume there is, and
that problem #5 should be solved.  It is very apparent that you are
using white space before the here-doc terminator which is not part of
the terminator, so you want problem #4 to be solved.  And you want to
vary the indentation without varying the output, so you want problem #1
or, better, the problem #3 that subsumes it, to be solved.

OK: here's a dequote_like solution that solves it.  The only additional
change needed is for perl6 to recognize here-doc terminators with
leading/trailing white space, which is what I suggested that RFC 111 be
reduced to.  dequote_like is usable in perl5 today, but there it
requires shoving "POEM" to the left margin and ensuring there is no
trailing white space.  dequote_like is based on the sub dequote Tom
posted, with minor changes, and is given below.  Maybe Tom could comment
on whether he thinks "dequote_like" is an improvement over "dequote".
Clearly they could coexist; people could use their favorite, and modify
at will.


sub dequote_like {
  local $_ = shift;
  my ($leader);  # common white space and common leading string
  if (/^\s*(?:([^\w\s]+).*\n)(?:\s*\1.*\n)+$/) {
    $leader = quotemeta($1);
  } else {
    $leader = '';
  }
  s/^\s*$leader//gm;
  return $_;
}

   { { { { {
             if( $is_fitting && $is_just ) {
                die dequote_like(<<POEM);  # the '!' leader is detected by the sub
                !    The old lie
                !  Dulce et decorum est
                !      Pro patria mori.
                POEM
             } # this } had been omitted
   } } } } }
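
For anyone who wants to try this under perl5 as it stands, here is a
self-contained sketch.  The wrapper sub poem_text is a hypothetical name
of mine, used only to get some nesting; the terminator is at the left
margin with nothing after it, as perl5 requires; and dequote_like takes
the here-doc text as its only argument and finds the '!' leader itself.

```perl
use strict;
use warnings;

sub dequote_like {
    local $_ = shift;
    my $leader = '';    # common leading string, if every line has one
    if (/^\s*(?:([^\w\s]+).*\n)(?:\s*\1.*\n)+$/) {
        $leader = quotemeta($1);
    }
    # Strip the indentation and the leader; white space after the
    # leader (the sub-indentation) is left alone.
    s/^\s*$leader//gm;
    return $_;
}

sub poem_text {
    if (1) {
        return dequote_like(<<POEM);
            !    The old lie
            !  Dulce et decorum est
            !      Pro patria mori.
POEM
    }
}

print poem_text();
```

The output has four, two and six leading spaces, and it stays that way
however far the body of the here-doc is shifted left or right, as long
as each line keeps its '!'.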


> So solution #1 is powerful, solution #2 is simple, solution #3 solves
> a set of common problems which the others do not (but doesn't provide
> the other's flexibility).  All are orthogonal.  All are fairly simple
> and fairly obvious.  Allow all three.

OK, so now it seems time to look at the possible solutions for all the
problems, by number:

1, 2, & 3.  Use dequote_like.  It solves the variable indent problem,
if you assume consistent usage of spaces & tabs in the indentation
sequence, at least after the $leader sequence.  Nor does it introduce
problem #2, because the white space after the $leader is preserved.

4 & 5.  This is what I hope RFC 111 turns into.  Allow perl6 to
recognize as the terminator the here-doc terminator sequence, even if
there is leading or trailing white space on the same line.

6 & 7.  Unresolvable by perl.  User can avoid via consistent use of white space.
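
To illustrate why 6 is out of perl's hands, here is a small sketch using
the core Text::Tabs module, whose expand() uses a tab stop of 8 by
default.  The sample lines and variable names are mine.  Both lines
start with the 'X' in the same column; after indenting each by three
spaces, the tab absorbs the indent on one line but not on the other:

```perl
use strict;
use warnings;
use Text::Tabs;    # exports expand(); $tabstop defaults to 8

# One line reaches column 8 with a tab, the other with plain characters.
my @lines = ("ab\tX", "abcdefghX");

# Column (0-based) of the X once tabs are expanded.
my @before = map { index(expand($_), 'X') } @lines;

# Indent both lines by three spaces and measure again.
my @after = map { index(expand("   $_"), 'X') } @lines;

print "before indenting: @before\n";
print "after indenting:  @after\n";
```

The two X's line up before the indent and are three columns apart after
it, so the only cure is consistent use of white space by the author.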

> My most common case for needing indented here-docs is this:
>
>     {   {   {   {  # I'm nested
>                     if($error) {
>                         warn "So there's this problem with the starboard warp coupling and oh shit I just ran off the right margin.";
>                     }
>     }   }   }   }
>
> Usually I wind up doing this:
>
>     {   {   {   {  # I'm nested
>                     if($error) {
>                         warn "So there's this problem with the starboard ".
>                              "warp coupling and oh shit I just ran off the ".
>                              "right margin.";
>                     }
>     }   }   }   }
>
> I'd love it if I could do this instead:
>
>     {   {   {   {  # I'm nested
>                     if($error) {
>                         warn <<ERROR =~ s/\n/ /;
>                         So there's this problem with the starboard warp
>                         coupling and hey, now I have lots of room to
>                         pummel you with technobabble!
>                         ERROR
>                     }
>     }   }   }   }
>
> By combining two of the solutions, my problem is solved.  I can indent
> my here-docs and yet keep the output a single line.

sub one_line {
  local $_ = shift;
  s/\s+/ /g;
  s/^\s//;
  return $_;
}

    {   {   {   {  # I'm nested
                    if(1) {
                        warn one_line(<<ERROR);
                        So there's this problem with the starboard warp
                        coupling and hey, now I have lots of room to
                        pummel you with technobabble!
ERROR
                    }
    }   }   }   }

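Your <<ERROR =~ s/\n/ / version prints "1", I believe: the result of
doing the successful substitution, not the substituted text.  The
one_line version above works in perl5 today, with ERROR at the left
margin.  Given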

sub one_line {
  local $_ = shift;
  s/\s+/ /g;
  s/^\s//;
  s/\s$//;
  return $_;
}

You could, with the leading & trailing white space allowances for the
here-doc terminator, use

    {   {   {   {  # I'm nested
                    if(1) {
                        warn one_line(<<ERROR);
                        So there's this problem with the starboard warp
                        coupling and hey, now I have lots of room to
                        pummel you with technobabble!
                        ERROR
                    }
    }   }   }   }
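
Until the terminator rules change, the same thing can be run under
perl5 by leaving ERROR at the left margin.  A minimal sketch, with
$message being a name I made up for the demonstration:

```perl
use strict;
use warnings;

sub one_line {
    local $_ = shift;
    s/\s+/ /g;    # collapse each run of white space, newlines included
    s/^\s//;      # drop the single space left over from the indentation
    s/\s$//;      # drop the single space left over from the final newline
    return $_;
}

my $message = one_line(<<ERROR);
            So there's this problem with the starboard warp
            coupling and hey, now I have lots of room to
            pummel you with technobabble!
ERROR

print "$message\n";
```

The body of the here-doc can be indented as far as you like; only the
terminator line is pinned to the margin.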


> Show me where this fails and I'll shut up about it.

The syntax

    <<POEM =~ s/regex/subst/;

generally returns 1, and introducing a special case to make it return
the string if the left hand side is a here-doc seems to be a pointless
inconsistency.  With an intervening string processing sub for here-doc
results (be inventive, have several in your bag of tricks for different
purposes), you can achieve pretty much any goal you want, and certainly
all the ones you've stated.
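
A short perl5 sketch of what the binding operator hands back (the
variable names are mine): s/// modifies the variable it is bound to,
and the value of the expression is the success indicator, not the
modified text.

```perl
use strict;
use warnings;

my $text = "line one\nline two\n";

# The value of the expression reports success; the change is made
# in place to $text.  Without /g only the first newline is replaced.
my $result = ($text =~ s/\n/ /);

print "returned: $result\n";
print "variable: $text";
```

So anything that wants the substituted text has to go through a
variable, which is what the subs in this message do with a local $_.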

All you really need is that the terminator be recognized in the presence
of leading white space, and it would be nice to allow trailing white
space too.

Hopefully, we can get a new version of RFC 111 which recognizes that,
and perhaps mentions the above subs as ways to achieve these other
goals.  These subs work in perl 5 today, so they don't really need to be
part of the RFC, other than in a commentary section about why just
recognizing the terminator embedded within white space cures all these
problems.

> --
>
> Michael G Schwern      http://www.pobox.com/~schwern/      [EMAIL PROTECTED]
> Just Another Stupid Consultant                      Perl6 Kwalitee Ashuranse
> Sometimes these hairstyles are exaggerated beyond the laws of physics
>           - Unknown narrator speaking about Anime

--
Glenn
=====
There  are two kinds of people, those
who finish  what they start,  and  so
on...                 -- Robert Byrne


