On Wed, Oct 24, 2012 at 03:25:30PM -0400, Jeff King wrote:
> Right, but I was specifically worried about raw "=?", which is only an
> issue due to rfc2047 itself.
> 
> However, reading the patch again, we are already checking for that with
> is_rfc2047_quoted. It might miss the case where we have =? but not the
> rest of a valid encoded word, but any compliant parser should recognize
> that and leave it be.
> 
> So I think your original patch is actually correct.
> 
> [...]
> We have a possibly already-encoded header, and we would want to avoid
> double-encoding it.
> 
> In the first case, the "wants quoting" logic should be:
> 
>   is_rfc2047_quoted($subject) || /[^[:ascii:]]/
> 
> and in the latter case it would be:
> 
>   !is_rfc2047_quoted($subject) && /^[:ascii:]]/
> 

ok, I'm sending a version that just adds quote_subject() without
changing any logic, so now we still have in first case:

 /[^[:ascii:]]/

and in the latter case:
 
 !is_rfc2047_quoted($subject) && /^[:ascii:]]/


In the next patch I will just add matching for "=?" in 
subject_needs_rfc2047_quoting() and we will have:

   /=?/ || /[^[:ascii:]]/

and in the latter case:
 
   !is_rfc2047_quoted($subject) && (/=\?/ || /^[:ascii:]]/)

This will also add quoting for any rfc2047 quoted subject or any
other rfc2047-like subject, as you suggested.

Krzysiek
-- 
>From a70c5385f9b4da69a8ce00a1448f87f63bbd500d Mon Sep 17 00:00:00 2001
From: Krzysztof Mazur <krzys...@podlesie.net>
Date: Wed, 24 Oct 2012 22:46:00 +0200
Subject: [PATCH] git-send-email: introduce quote_subject()

The quote_rfc2047() always adds RFC2047 quoting and to avoid quoting ASCII
subjects, before calling quote_rfc2047() subject must be tested for non-ASCII
characters. To avoid this new quote_subject() function is introduced.
The quote_subject() performs this test and calls quote_rfc2047() only if
necessary.

Signed-off-by: Krzysztof Mazur <krzys...@podlesie.net>
---
 git-send-email.perl | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index efeae4c..eb1b876 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -657,9 +657,7 @@ EOT
                        $initial_subject = $1;
                        my $subject = $initial_subject;
                        $_ = "Subject: " .
-                               ($subject =~ /[^[:ascii:]]/ ?
-                                quote_rfc2047($subject, $compose_encoding) :
-                                $subject) .
+                               quote_subject($subject, $compose_encoding) .
                                "\n";
                } elsif (/^In-Reply-To:\s*(.+)\s*$/i) {
                        $initial_reply_to = $1;
@@ -907,6 +905,22 @@ sub is_rfc2047_quoted {
        $s =~ m/^(?:"[[:ascii:]]*"|=\?$token\?$token\?$encoded_text\?=)$/o;
 }
 
+sub subject_needs_rfc2047_quoting {
+       my $s = shift;
+
+       return ($s =~ /[^[:ascii:]]/);
+}
+
+sub quote_subject {
+       local $subject = shift;
+       my $encoding = shift || 'UTF-8';
+
+       if (subject_needs_rfc2047_quoting($subject)) {
+               return quote_rfc2047($subject, $encoding);
+       }
+       return $subject;
+}
+
 # use the simplest quoting being able to handle the recipient
 sub sanitize_address {
        my ($recipient) = @_;
@@ -1327,9 +1341,8 @@ foreach my $t (@files) {
                $body_encoding = $auto_8bit_encoding;
        }
 
-       if ($broken_encoding{$t} && !is_rfc2047_quoted($subject) &&
-                       ($subject =~ /[^[:ascii:]]/)) {
-               $subject = quote_rfc2047($subject, $auto_8bit_encoding);
+       if ($broken_encoding{$t} && !is_rfc2047_quoted($subject)) {
+               $subject = quote_subject($subject, $auto_8bit_encoding);
        }
 
        if (defined $author and $author ne $sender) {
-- 
1.8.0.4.ge8ddce6
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to