Re: [PATCH v4] send-email: extract email-parsing code into a subroutine

2017-12-07 Thread Ævar Arnfjörð Bjarmason

On Thu, Dec 07 2017, Matthieu Moy jotted:

> Not terribly important, but your patch has trailing newlines. "git diff
> --staged --check" to see them. More below.
>
> PAYRE NATHAN p1508475  writes:
>
>> the part of code which parses the header a last time to prepare the
>> email and send it.
>
> The important point is not that it's the last time the code parses
> headers, so I'd drop the "a last time".
>
>> +my %parsed_email;
>> +$parsed_email{'body'} = '';
>> +while (my $line = <$c>) {
>> +next if $line =~ m/^GIT:/;
>> +parse_header_line($line, \%parsed_email);
>> +if ($line =~ /^\n$/i) {
>
> You don't need the /i (case-Insensitive) here, there are no letters to
> match.

Good catch, actually this can just be: /^$/. The $ syntax already
matches the ending newline, no need for /^\n$/.
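To illustrate the point about `$`, here is a minimal sketch; the helper name is_blank_line is made up for this example and is not part of the patch. In Perl, `$` without /m matches at the very end of the string or just before a final newline, so /^$/ accepts "\n". One difference worth knowing: /^$/ also accepts the empty string, which /^\n$/ does not.

```perl
use strict;
use warnings;

# Hypothetical helper, only for illustration: true when the line read from
# the compose file is the blank line separating headers from body.
sub is_blank_line {
    my ($line) = @_;
    # "$" matches at end of string or just before a trailing newline.
    return $line =~ /^$/ ? 1 : 0;
}

for my $line ("\n", "", "Subject: hi\n", " \n") {
    (my $shown = $line) =~ s/\n/\\n/g;    # make the newline visible
    printf "%-16s %s\n", "\"$shown\"",
        is_blank_line($line) ? "blank" : "not blank";
}
```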

>> +sub parse_header_line {
>> +my $lines = shift;
>> +my $parsed_line = shift;
>> +my $pattern1 = join "|", qw(To Cc Bcc);
>> +my $pattern2 = join "|",
>> +qw(From Subject Date In-Reply-To Message-ID MIME-Version
>> +Content-Type Content-Transfer-Encoding References);
>> +
>> +foreach (split(/\n/, $lines)) {
>> +if (/^($pattern1):\s*(.+)$/i) {
>> +$parsed_line->{lc $1} = [ parse_address_line($2) ];
>> +} elsif (/^($pattern2):\s*(.+)\s*$/i) {
>> +$parsed_line->{lc $1} = $2;
>> +}
>
> I don't think you need to list the possibilities in the "else" branch.
> Just matching /^([^:]*):\s*(.+)\s*$/i should do the trick.

Although you'll end up with a lot of stuff in the $parsed_line hash you
don't need, which makes dumping it for debugging verbose.

I also wonder about multi-line headers, but then again that probably
breaks already on e.g. Message-ID and References, but that's an
existing bug unrelated to this patch...
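For reference, one possible way to cope with multi-line (folded) headers is to unfold them before splitting on newlines. This is only a sketch, not something the patch does: the name unfold_headers is hypothetical, and the rule assumed is the usual one from RFC 5322 (a line break followed by a space or tab continues the previous header).

```perl
use strict;
use warnings;

# Hypothetical helper: join folded header lines back into one line by
# removing each line break that is followed by a space or tab.
sub unfold_headers {
    my ($headers) = @_;
    $headers =~ s/\r?\n(?=[ \t])//g;
    return $headers;
}

my $raw = "References: <a\@example.com>\n <b\@example.com>\nSubject: hi\n";
print unfold_headers($raw);
```

With the headers unfolded first, a per-line regex such as the ones in parse_header_line would see the whole References value on a single line.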

>> +$body = $body . $body_line;
>
> Or just: $body .= $body_line;


Re: [PATCH v4] send-email: extract email-parsing code into a subroutine

2017-12-07 Thread Matthieu Moy
Not terribly important, but your patch has trailing newlines. "git diff
--staged --check" to see them. More below.

PAYRE NATHAN p1508475  writes:

> the part of code which parses the header a last time to prepare the
> email and send it.

The important point is not that it's the last time the code parses
headers, so I'd drop the "a last time".

> + my %parsed_email;
> + $parsed_email{'body'} = '';
> + while (my $line = <$c>) {
> + next if $line =~ m/^GIT:/;
> + parse_header_line($line, \%parsed_email);
> + if ($line =~ /^\n$/i) {

You don't need the /i (case-Insensitive) here, there are no letters to
match.

> + if ($parsed_email{'mime-version'}) {
> + $need_8bit_cte = 0;

This $need_8bit_cte is a leftover of the old code, which processed the
headers in the order it found them in the message and had to remember
the content of MIME-Version while parsing Content-Type.

I believe you can apply this on top of your patch:

--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -709,7 +709,6 @@ EOT3
open $c, "<", $compose_filename
or die sprintf(__("Failed to open %s: %s"), $compose_filename, $!);
 
-   my $need_8bit_cte = file_has_nonascii($compose_filename);
my $in_body = 0;
my $summary_empty = 1;
if (!defined $compose_encoding) {
@@ -740,12 +739,10 @@ EOT3
"\n";
}
if ($parsed_email{'mime-version'}) {
-   $need_8bit_cte = 0;
print $c2 "MIME-Version: $parsed_email{'mime-version'}\n",
"Content-Type: $parsed_email{'content-type'};\n",
"Content-Transfer-Encoding: $parsed_email{'content-transfer-encoding'}\n";
-   }
-   if ($need_8bit_cte) {
+   } elsif (file_has_nonascii($compose_filename)) {
if ($parsed_email{'content-type'}) {
print $c2 "MIME-Version: 1.0\n",
 "Content-Type: $parsed_email{'content-type'};",

It reads much better: "If the original message already had a
MIME-Version header, then use that, else see if the file has non-ascii
characters and if so, use MIME-Version: 1.0".

Actually, you can even simplify further by factoring the if/else below:

> + if ($parsed_email{'content-type'}) {
> + print $c2 "MIME-Version: 1.0\n",
> +  "Content-Type: $parsed_email{'content-type'};",

(Suspicious ";", and suspicious absence of "\n" here, I don't think it's
intentional and I'm fixing it below, but correct me if I'm wrong)

> +  "Content-Transfer-Encoding: 8bit\n";
> + } else {

(Broken indentation, this is not aligned with the "if" above)

>   print $c2 "MIME-Version: 1.0\n",
>"Content-Type: text/plain; ",
> -"charset=$compose_encoding\n",
> +  "charset=$compose_encoding\n",
>"Content-Transfer-Encoding: 8bit\n";
>   }

This could become something like (untested):

} elsif (file_has_nonascii($compose_filename)) {
my $content_type = ($parsed_email{'content-type'} or
"text/plain; charset=$compose_encoding");
print $c2 "MIME-Version: 1.0\n",
  "Content-Type: $content_type\n",
  "Content-Transfer-Encoding: 8bit\n";
}
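
To make the suggested flow easier to try out in isolation, here is a self-contained sketch of the same decision. It is an assumption-laden rewrite, not code from git-send-email.perl: the function name mime_headers is hypothetical, it returns a string instead of printing to $c2, and the non-ASCII check is passed in as a flag rather than calling file_has_nonascii(). Note that Perl spells the keyword "elsif".

```perl
use strict;
use warnings;

# Hypothetical helper: build the MIME header block for the composed message.
#   $parsed       - hashref of lower-cased header names to values
#   $has_nonascii - true if the message contains non-ASCII characters
#   $encoding     - charset to advertise when no Content-Type was given
sub mime_headers {
    my ($parsed, $has_nonascii, $encoding) = @_;

    if ($parsed->{'mime-version'}) {
        # The message already declares MIME headers: pass them through.
        return "MIME-Version: $parsed->{'mime-version'}\n"
             . "Content-Type: $parsed->{'content-type'}\n"
             . "Content-Transfer-Encoding: $parsed->{'content-transfer-encoding'}\n";
    } elsif ($has_nonascii) {
        # No MIME-Version given, but 8-bit content: declare it ourselves.
        my $content_type = $parsed->{'content-type'}
            || "text/plain; charset=$encoding";
        return "MIME-Version: 1.0\n"
             . "Content-Type: $content_type\n"
             . "Content-Transfer-Encoding: 8bit\n";
    }
    return '';    # plain ASCII without MIME headers: nothing to add
}

print mime_headers({}, 1, 'UTF-8');
```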

> + open $c2, "<", $compose_filename . ".final"
> + or die sprintf(__("Failed to open %s.final: %s"), $compose_filename, $!);
> + close $c2;

What is this? Cut-and-paste mistake?

> +sub parse_header_line {
> + my $lines = shift;
> + my $parsed_line = shift;
> + my $pattern1 = join "|", qw(To Cc Bcc);
> + my $pattern2 = join "|",
> + qw(From Subject Date In-Reply-To Message-ID MIME-Version 
> + Content-Type Content-Transfer-Encoding References);
> + 
> + foreach (split(/\n/, $lines)) {
> + if (/^($pattern1):\s*(.+)$/i) {
> + $parsed_line->{lc $1} = [ parse_address_line($2) ];
> + } elsif (/^($pattern2):\s*(.+)\s*$/i) {
> + $parsed_line->{lc $1} = $2;
> + }

I don't think you need to list the possibilities in the "else" branch.
Just matching /^([^:]*):\s*(.+)\s*$/i should do the trick.
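
A rough sketch of what that could look like (untested against git-send-email itself): parse_header_lines and split_addresses are hypothetical names, and split_addresses is only a naive stand-in for the real parse_address_line(). The sketch uses a non-greedy (.+?) so that trailing whitespace is left out of the captured value, which is a small deviation from the regex above.

```perl
use strict;
use warnings;

# Naive stand-in for parse_address_line(): split on commas and trim.
sub split_addresses {
    my ($value) = @_;
    my @addresses;
    for my $part (split /,/, $value) {
        $part =~ s/^\s+|\s+$//g;
        push @addresses, $part if length $part;
    }
    return @addresses;
}

# Hypothetical variant of parse_header_line(): address headers become
# array refs, every other "Name: value" line is stored as a plain string.
sub parse_header_lines {
    my ($lines, $parsed) = @_;
    foreach (split /\n/, $lines) {
        if (/^(To|Cc|Bcc):\s*(.+?)\s*$/i) {
            $parsed->{lc $1} = [ split_addresses($2) ];
        } elsif (/^([^:]+):\s*(.+?)\s*$/) {
            $parsed->{lc $1} = $2;
        }
    }
    return $parsed;
}

my %parsed;
parse_header_lines("To: a\@example.com, b\@example.com\nX-Mailer: foo \n", \%parsed);
print "$_\n" for sort keys %parsed;
```

The generic branch stores every header it sees, so the hash will also contain keys the caller never reads.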

> + $body = $body . $body_line;

Or just: $body .= $body_line;

-- 
Matthieu Moy
https://matthieu-moy.fr/