Hi John,

Please don't strip the Cc's!

> > Anyway, this code seems to be overly complicated in the first place.
> > What about this instead:
> >
> >         /^$/    { exit }
> >         END     { exit (not_mh) }
> >         !/^[A-Za-z-]+:[ \t]/    { not_mh=1 ; exit }
> >
> > Works for me and should be totally portable. John, can you please test
> > with Sun's /usr/xpg4/bin/awk for confirmation?
> 
> That passes through Sun awk ok, with the following test failure, which
> I assume is the one you expect to fail below.
> 
> [63] $ quilt mail --mbox mbox --prefix "test" --subject "This is a
> test" -m "Message" -- failed
> Unable to extract a subject header from patches/5.diff != ~

Yes, same one.

> > Note that the attached patch breaks the test suite. My analysis is that
> > the old awk code didn't work, and this hid a different bug in mail.in,
> > which we will now have to hunt for.
> 
> Agreed.

On further analysis, it appears that subject lines starting with "Re: "
or "Fwd: " will be erroneously detected as valid mail header lines.
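
To illustrate, here is a minimal sketch that feeds such a subject line
to the simplified awk code quoted above. The wrapper name
looks_like_header is hypothetical and only exists for this example; it
is not a function in mail.in.

```shell
# Hypothetical wrapper (not part of quilt): succeeds if every line of
# the first paragraph matches the simplified header pattern.
looks_like_header() {
    printf '%s\n' "$1" | awk '
        /^$/    { exit }
        END     { exit (not_mh) }
        !/^[A-Za-z-]+:[ \t]/    { not_mh = 1 ; exit }'
}

# A patch description whose first line is a subject starting with "Re: "
# passes the check, although it is not a mail header.
if looks_like_header "Re: fix the build"
then
    echo "treated as mail header"
fi
```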

Explicitly discarding "Re" and "Fwd" doesn't seem to be a good idea,
as different languages have different abbreviations for these (e.g.
German has "Aw:" for reply, French has "Tr:" for forward). Instead, we
could state that valid mail header keywords must be at least 4
characters long, except for "To", "Cc" and possibly "Bcc". Attached is
a patch which does this (combined with the previous fix) so the test
suite passes again.

A different (and possibly more robust) approach would be to consider
the first paragraph a mail header block only if at least one of "To:",
"Cc:" or "From:" is found there (in addition to the rule that every
line of that paragraph must look like a mail header line).
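
A rough, untested-against-Sun-awk sketch of that second approach could
look like the following. The function name first_para_is_mail_header
and the variable "anchor" are made up for this example and are not in
mail.in; folded (continuation) header lines are not handled.

```shell
# Hypothetical helper (not part of quilt): succeeds only if every line
# of the first paragraph looks like a header line AND at least one of
# To:, Cc: or From: was seen in that paragraph.
first_para_is_mail_header() {
    printf '%s\n' "$1" | awk '
        /^$/    { exit }
        END     { exit (not_mh || !anchor) }
        !/^[A-Za-z-]+:[ \t]/    { not_mh = 1 ; exit }
        { key = tolower($1)
          if (key == "to:" || key == "cc:" || key == "from:")
              anchor = 1
        }'
}

# A lone "Re: ..." subject line no longer qualifies, because none of
# the three anchoring headers is present.
first_para_is_mail_header "Re: fix the build" || echo "not a mail header"
```

The trade-off is that a description consisting only of, say, a
"Subject:" line would no longer be recognized as a header block.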

Comments?

-- 
Jean Delvare
Index: quilt/mail.in
===================================================================
RCS file: /cvsroot/quilt/quilt/quilt/mail.in,v
retrieving revision 1.20
diff -u -r1.20 mail.in
--- quilt/mail.in       1 Feb 2006 12:50:31 -0000       1.20
+++ quilt/mail.in       3 Feb 2006 09:39:51 -0000
@@ -220,15 +220,26 @@
                fi
 
                # Does the first paragraph look like a mail header?
+               # Valid mail header lines start with a keyword followed by
+               # a colon itself followed by a blank space (or tab). The
+               # keyword must be at least 4 characters long except for To, Cc
+               # and Bcc. This additional rule is required to differentiate
+               # subject lines starting with "Re:" or "Fwd:" from mail header
+               # lines.
                if [ [EMAIL PROTECTED] -eq 0 ]
                then
                        if echo "$header" | awk '
-                               /^$/    { exit (!mh || not_mh) }
-                               END     { exit (!mh || not_mh) }
-                               { if ($0 ~ /^[a-z]+:[ \t]/i)
-                                   mh = 1
-                                 else
-                                   not_mh = 1
+                               /^$/    { exit }
+                               END     { exit (not_mh) }
+                               /^[A-Za-z-]+:[ \t]/ {
+                                       if (length($1) >= 5 ||
+                                           tolower($1) == "to:" ||
+                                           tolower($1) == "cc:" ||
+                                           tolower($1) == "bcc:")
+                                               next;
+                               }
+                               { not_mh = 1
+                                 exit
                                }'
                        then
                                return 1