On Tue, Feb 12 2019, brian m. carlson wrote:
> Gitweb has several hard-coded 40 values throughout it to check for
> values that are passed in or acquired from Git. To simplify the code,
> introduce a regex variable that matches either exactly 40 or exactly 64
> hex characters, and use this variable anywhere we would have previously
> hard-coded a 40 in a regex.
>
> Similarly, switch the code that looks for deleted diffinfo information
> to look for either 40 or 64 zeros, and update one piece of code to use
> this function. Finally, when formatting a log line, allow an
> abbreviated describe output to contain up to 64 characters.
This might be going a bit overboard but I tried this with a variant
where...
> +# A regex matching a valid object ID.
> +our $oid_regex = qr/(?:[0-9a-fA-F]{40}(?:[0-9a-fA-F]{24})?)/;
> +
Instead of this dense regex I did:

    my $sha1_len = 40;
    my $sha256_extra_len = 24;
    my $sha256_len = $sha1_len + $sha256_extra_len;

    sub oid_nlen_regex {
        my $len = shift;
        my $hchr = qr/[0-9a-fA-F]/;
        return qr/(?:(?:$hchr){$len})/;
    }

    our $oid_regex;
    {
        my $x = oid_nlen_regex($sha1_len);
        my $y = oid_nlen_regex($sha256_extra_len);
        $oid_regex = qr/(?:$x(?:$y)?)/;
    }

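As a rough sanity check of that composition, here is a self-contained sketch. The check_len helper and the test strings are made up for illustration, and the lengths are inlined instead of using the variables above; it only shows that the combined pattern, when anchored, accepts exactly 40 or 64 hex characters.

```perl
use strict;
use warnings;

# Same helper as above: a regex matching $len hex characters.
sub oid_nlen_regex {
    my $len = shift;
    my $hchr = qr/[0-9a-fA-F]/;
    return qr/(?:(?:$hchr){$len})/;
}

# 40 hex characters, optionally followed by 24 more (64 in total).
my $x = oid_nlen_regex(40);
my $y = oid_nlen_regex(24);
our $oid_regex = qr/(?:$x(?:$y)?)/;

# Hypothetical helper: does a string of $n hex characters match as a whole?
sub check_len {
    my $n = shift;
    my $input = "a" x $n;
    return $input =~ m/^$oid_regex\z/ ? 1 : 0;
}

foreach my $n (39, 40, 41, 64, 65) {
    printf "%2d hex chars: %s\n", $n, check_len($n) ? "match" : "no match";
}
```

Only the 40 and 64 cases should report a match; without the anchors the pattern would also match inside longer hex strings.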
Then most of the rest of this is the same, e.g.:
> - if ($input =~ m/^[0-9a-fA-F]{40}$/) {
But...
> @@ -2037,10 +2040,10 @@ sub format_log_line_html {
> (?<!-) # see strbuf_check_tag_ref(). Tags can't start with -
> [A-Za-z0-9.-]+
> (?!\.) # refs can't end with ".", see check_refname_format()
> - -g[0-9a-fA-F]{7,40}
> + -g[0-9a-fA-F]{7,64}
> |
> # Just a normal looking Git SHA1
> - [0-9a-fA-F]{7,40}
> + [0-9a-fA-F]{7,64}
> )
> \b
> }{
E.g. here we can call oid_nlen_regex("7,64") to produce this blurb.
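This works because the argument is interpolated straight into the {...} quantifier, so a string like "7,64" becomes a range. A minimal sketch, where is_abbrev and the sample strings are hypothetical and only there for illustration:

```perl
use strict;
use warnings;

# Same helper as above; $len may be a single count or a "min,max" range.
sub oid_nlen_regex {
    my $len = shift;
    my $hchr = qr/[0-9a-fA-F]/;
    return qr/(?:(?:$hchr){$len})/;
}

my $abbrev_regex = oid_nlen_regex("7,64");

# Hypothetical helper: is the whole string a 7..64 character hex abbreviation?
sub is_abbrev {
    my $s = shift;
    return $s =~ m/^$abbrev_regex\z/ ? 1 : 0;
}

foreach my $s ("abc123", "abc1234", "f" x 64, "f" x 65) {
    printf "%-3d chars: %s\n", length($s), is_abbrev($s) ? "match" : "no match";
}
```

The 6- and 65-character strings should be rejected, the 7- and 64-character ones accepted.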
> - if ($line =~ m/^index [0-9a-fA-F]{40},[0-9a-fA-F]{40}/) {
> + if ($line =~ m/^index $oid_regex,$oid_regex/) {
> - } elsif ($line =~ m/^index [0-9a-fA-F]{40}..[0-9a-fA-F]{40}/) {
> + } elsif ($line =~ m/^index $oid_regex..$oid_regex/) {
And here, maybe nobody cares, but we now implicitly accept mixed SHA-1 &
SHA-256 input. Whereas we could have a helper on top of the above code
like:

    sub oid_nlen_prefix_infix_regex {
        my $nlen = shift;
        my $prefix = shift;
        my $infix = shift;

        my $rx = oid_nlen_regex($nlen);

        return qr/^\Q$prefix\E$rx\Q$infix\E$rx$/;
    }

And then e.g.:

    } elsif ($line =~ oid_nlen_prefix_infix_regex($sha1_len, "index ", "..") ||
             $line =~ oid_nlen_prefix_infix_regex($sha256_len, "index ", ".."))
    {

So only accept SHA1..SHA1 or SHA256..SHA256, not SHA1..SHA256 or
SHA256..SHA1.
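A self-contained sketch of that behaviour follows. The same_hash_index_line wrapper and the fake IDs are assumptions for illustration only, not gitweb code. One thing it also shows: because the helper as written anchors with a trailing $, it rejects a line that has anything after the second ID (e.g. a trailing file mode), which the unanchored patterns in the patch accept. So the anchor would probably need to be dropped or relaxed before using this for real.

```perl
use strict;
use warnings;

sub oid_nlen_regex {
    my $len = shift;
    my $hchr = qr/[0-9a-fA-F]/;
    return qr/(?:(?:$hchr){$len})/;
}

# Helper as proposed above: prefix, ID, infix, ID, all of the same length.
sub oid_nlen_prefix_infix_regex {
    my ($nlen, $prefix, $infix) = @_;
    my $rx = oid_nlen_regex($nlen);
    return qr/^\Q$prefix\E$rx\Q$infix\E$rx$/;
}

# Hypothetical wrapper: accept SHA1..SHA1 or SHA256..SHA256 only.
sub same_hash_index_line {
    my $line = shift;
    return ($line =~ oid_nlen_prefix_infix_regex(40, "index ", "..") ||
            $line =~ oid_nlen_prefix_infix_regex(64, "index ", "..")) ? 1 : 0;
}

my $sha1   = "a" x 40;   # fake 40-character ID
my $sha256 = "b" x 64;   # fake 64-character ID

foreach my $line ("index ${sha1}..${sha1}",
                  "index ${sha256}..${sha256}",
                  "index ${sha1}..${sha256}",
                  "index ${sha256}..${sha1}",
                  "index ${sha1}..${sha1} 100644") {
    printf "%-8s %s\n", same_hash_index_line($line) ? "accept" : "reject", $line;
}
```

Only the first two lines should be accepted.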