Branch: refs/heads/blead
  Home:   https://github.com/Perl/perl5
  Commit: f36bd8db363c0f7a01a69a3a1c604b122f2ec09b
      
https://github.com/Perl/perl5/commit/f36bd8db363c0f7a01a69a3a1c604b122f2ec09b
  Author: Joe McMahon <joe.mcma...@gmail.com>
  Date:   2023-10-18 (Wed, 18 Oct 2023)

  Changed paths:
    M MANIFEST
    M lib/perl5db.pl
    M lib/perl5db.t
    A lib/perl5db/t/gh-21350

  Log Message:
  -----------
  [FIX] Make debugger `l` range specs sane

Fixes 21350.

The pattern which matches valid line number ranges in the debugger is
far too complaisant when it comes to strings it will accept. A number of
completely nonsensical values are cheerfully accepted and produce
results that range from nothing at all to very odd results indeed:

 - specifying a float for a line number appends the fractional part of
   the float to ALL line numbers until something else sets the starting
   line number to an integer again.
 - Strings like `.$.$.$`, `.....`, and `22.$` are all acccepted. Some do
   nothing, some generate errors.
 - Negative line numbers are accepted but do nothing.

This tracks back to a catchall line-range pattern that was added in the
Perl *3* debugger at commit a687059cbaf. This pattern looks as if it
were added to eventually implement a number of things, including lines
relative to the current line, negative indexes into the magic source
code array, variable references, and so on. Several of these were
implemented elsewhere in the debugger, but the regex was never
simplified. This change breaks up that regex and cleans up a few others
(including code that internally generates possible negative line
numbers).

Changes to the old catchall regex:
 - Remove the conditional match of a leading `-`. Negative line numbers
   are not permitted, at all. The code that generated them internally
   has been fixed so it does not do so.
 - Remove `$` as a valid character in a linespec in the catchall regex.
   `$scalar` is handled explicitly in the _cmd_l_main code. There is no
   other documented use for bare '$' in a list linespec, even though it
   is accepted.
 - The linespec character class treated `.`,  and digits, as equally
   valid characters in a linespec. This led to the linespecs matching
   floating point numbers, IPv4 addresses, and various nonnumeric
   nonsense that did not translate to a valid line number. The combined
   character match was split into either a single period (handled
   already as "the current line") or a series of digits _only_ (a
   possible line number). This had to be done for both the starting and
   ending linespec.
 - \A and \z were added to the catchall regex to ensure that it matched
   the whole of the remaining line, or failed. This prevents things like
   `...` from being matched as equivalent to `.` alone.
 - Make sure the 'v' command creates good linespecs internally. As
   previously coded, this could generate linespecs that started with a
   negative number. Since we've outlawed negative numbers in linespecs,
   a `v` too close to the start of the file (as occurs in the tests for
   perl5db.pl) would cause the more-stringent range check to throw an
   error.  Instead, the 'v' command now checks the generated step back
   and sets it to 1 if it is less than 1. The existing test is
   sufficient to confirm that the new code in cmd_v fixes the issue.
 - Verify that floating-point line numbers are illegal: This was the
   original bug that triggered this change: the original regex accepted
   floating-point numbers, and because indexing an array in Perl with a
   float works due toimplicit type conversions, and the increment of a
   float by (integer) 1 doesn't downgrade the float to an int, the line
   numbers continued to be displayed with the fractional portion of the
   original number until something intervened with an explicit or
   implicit integer linespec.  We simply don't permit them to be
   specified as line numbers now.
 - Use \w for valid variable name characters: The debugger should eventually
   be UTF-8 compatible, but is not; this is a step along the way to get there.
   It makes the variable match in the cmd_l reference parsing code match
   proper Perl variable names, including UTF-8 "word" characters, and
   now also properly matches regex result variables ($1, $2, etc.). It
   explicitly rules out bare $, which the old regex would match,
   triggering an error. (For now, we've added the `/a` flag to the match
   to ensure only ASCII matches, since the rest of the code isn't
   prepared to handle non-ASCII numerics.)
 - A new perldb/t test file was added to verify that all the range
   cleanups work properly, along with new stanzas in the perl5db.pl test
   suite. It simply lets us start up the debugger and have a place to
   run the bad linespecs from to verify they do _not_ work.

Changes from @mauke's code review

 - Actually add the new test file to MANIFEST, d'oh.
 - Vastly simpler `_cmd_l_calc_initial_end_and_i` logic. Also more
   likely to actually be correct.
 - More precise variable name matching for `l $scalar`.
 - Purposefully limit line spec numbers to ASCII.
 - Update debugger version to pass porting/cmp_version.t.

Clean up calling sequences

The calling sequences were not always consistent, and some of the
changes to clean up the code ment that variables formerly passed in now
no longer needed to be. One sub was using a variable in the caller; this
was fixed to be passed in explicitly.

 - _cmd_l_handle_var_name now gets $subname passed in
 - _cmd_l_calc_initial_end_and_i no longer needs spec
 - _cmd_l_calc_initial_end_and_i chained teriary replaced with chanined
   defined-or


Reply via email to