some more possible test fixups in perl@9452 (was Re: Perl5.7.* Unicode/EBCDIC status.)

Peter Prymmer Thu, 29 Mar 2001 18:09:05 -0800

I have glanced through RFC 2045 at:

   http://www.ietf.org/rfc/rfc2045.txt

and the _intent_ of the quoted-printable encoding seems clear from this
passage:

   There is a tradeoff between the desire for a compact and
   efficient encoding of largely- binary data and the desire for a
   somewhat readable encoding of data that is mostly, but not entirely,
   7bit.  For this reason, at least two encoding mechanisms are
   necessary: a more or less readable encoding (quoted-printable) and a
   "dense" or "uniform" encoding (base64).

Since the stated goal of the encoding is legibility, one could argue that
on EBCDIC a QP implementation that left the printable portion
of the /[[:ascii:]]/ set as is would be compliant.  Nevertheless, many
explicit examples are given that presuppose the ASCII numeric codes, and
there are also passages such as:

   The quoted-printable and base64 encodings transform their input from
   an arbitrary domain into material in the "7bit" range, thus making it
   safe to carry over restricted transports.  The specific definition of
   the transformations are given below.

   The proper Content-Transfer-Encoding label must always be used.
   Labelling unencoded data containing 8bit characters as "7bit" is not
   allowed, nor is labelling unencoded non-line-oriented data as
   anything other than "binary" allowed.

The only mention of EBCDIC occurs in this passage:

   NOTE: The quoted-printable encoding represents something of a
   compromise between readability and reliability in transport.  Bodies
   encoded with the quoted-printable encoding will work reliably over
   most mail gateways, but may not work perfectly over a few gateways,
   notably those involving translation into EBCDIC.  A higher level of
   confidence is offered by the base64 Content-Transfer-Encoding.  A way
   to get reasonably reliable transport through EBCDIC gateways is to
   also quote the US-ASCII characters

     !"#$@[\]^`{|}~

   according to rule #1.

So I have not yet decided what the MIME::QuotedPrint behavior for perl
ought to be on EBCDIC and would appreciate hearing from some RFC mavens on
the matter.
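
To make the RFC's suggestion concrete, here is a minimal sketch of quoting
the fourteen listed characters with their US-ASCII hex codes.  The helper
name quote_ebcdic_unsafe and the lookup table are hypothetical and are not
part of MIME::QuotedPrint; the codes are spelled out in a table so that the
result does not depend on the native ord().  It is meant to run on plain
text alongside the ordinary QP rules, not on text that has already been
encoded:

```perl
use strict;
use warnings;

# US-ASCII hex codes for the characters named in the RFC 2045 note.
# (Hypothetical table for illustration; not taken from MIME::QuotedPrint.)
my %ascii_hex = (
    '!' => '21', '"' => '22', '#' => '23', '$' => '24', '@' => '40',
    '[' => '5B', '\\' => '5C', ']' => '5D', '^' => '5E', '`' => '60',
    '{' => '7B', '|' => '7C', '}' => '7D', '~' => '7E',
);

# Replace each listed character with "=" followed by its ASCII hex code.
sub quote_ebcdic_unsafe {
    my ($text) = @_;
    $text =~ s/([!"#\$\@\[\\\]^`{|}~])/=$ascii_hex{$1}/g;
    return $text;
}

print quote_ebcdic_unsafe('user@example.org {x}'), "\n";
# prints: user=40example.org =7Bx=7D
```

Whether the encoder should do this by default on EBCDIC is exactly the open
question above; the sketch only shows what the RFC's note asks for.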

One way to address the trouble on OS/390 under CP IBM-1047 would be:

diff -ru perl.9452/t/lib/mimeqp.t perl/t/lib/mimeqp.t
--- perl.9452/t/lib/mimeqp.t    Thu Mar 29 18:21:03 2001
+++ perl/t/lib/mimeqp.t Thu Mar 29 18:21:22 2001
@@ -78,6 +78,17 @@
 for (@tests) {
     $testno++;
     ($plain, $encoded) = @$_;
+    if (ord('A') == 193) { 
+        $encoded =~ s/=3D/=7E/g;
+        $encoded = 'vVre kjWre norske tegn b8r Wres'
+            if $encoded eq "v=E5re kj=E6re norske tegn b=F8r =E6res";
+        $encoded = '=40=40'
+            if $encoded eq "=20=20";
+        $encoded = "\tt=05"
+            if $encoded eq "\tt=09";
+        $encoded = "test=40=40\ntest\n=05=40=05=40\n"
+            if $encoded eq "test=20=20\ntest\n=09=20=09=20\n";
+    }
     $x = encode_qp($plain);
     if ($x ne $encoded) {
        print "Encode test failed\n";
@@ -98,6 +109,17 @@
 }
 
 # Some extra testing for a case that was wrong until libwww-perl-5.09
+if (ord('A') == 193) { 
+print "not " unless decode_qp("foo  \n\nfoo =\n\nfoo=40\n\n") eq
+                                "foo\n\nfoo \nfoo \n\n";
+$testno++; print "ok $testno\n";
+
+# Same test but with "\r\n" terminated lines
+print "not " unless decode_qp("foo  \r\n\r\nfoo =\r\n\r\nfoo=40\r\n\r\n") eq
+                                "foo\r\n\r\nfoo \r\nfoo \r\n\r\n";
+$testno++; print "ok $testno\n";
+}
+else {
 print "not " unless decode_qp("foo  \n\nfoo =\n\nfoo=20\n\n") eq
                                 "foo\n\nfoo \nfoo \n\n";
 $testno++; print "ok $testno\n";
@@ -106,4 +128,5 @@
 print "not " unless decode_qp("foo  \r\n\r\nfoo =\r\n\r\nfoo=20\r\n\r\n") eq
                                 "foo\r\n\r\nfoo \r\nfoo \r\n\r\n";
 $testno++; print "ok $testno\n";
+}
 
End of Possible Patch (may be 1047 specific???)
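
On the "1047 specific" worry: one possible way to avoid hard-coding =40 and
=05 is to derive the expected escapes from the native code points.  This is
only a sketch, and it assumes that encode_qp emits the platform's native
code points for space and tab, which is what the patch above presumes.  The
helper name native_qp is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: build a quoted-printable escape from the native
# code point of a single character on the current platform.
sub native_qp {
    my ($char) = @_;
    return sprintf("=%02X", ord($char));
}

# Same platform check as the patch: 'A' is code point 193 on EBCDIC.
my $is_ebcdic = ord('A') == 193;

printf "%s platform: space is %s, tab is %s\n",
    ($is_ebcdic ? "EBCDIC" : "ASCII"), native_qp(' '), native_qp("\t");
```

This would not help with the Norwegian test string, whose expected value
depends on the code page in ways a single ord() call does not capture.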

While we're discussing how to patch tests, I might as well also bring up
pragma/warnings.t.  Enclosed below is one way to address it, but it hurts
maintainability because test number 72/73 would now be hard coded into the
test script.  An alternative might be to apply regular expressions to the
$results variable on EBCDIC machines, along the lines of:

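+    if (ord('A') == 193) {
+        # Swap x and z (via w) and lines 4 and 6 (via 5); /m lets $ match
+        # at the end of each line of the multi-line result.
+        $results =~ s/"main::z"/"main::w"/;
+        $results =~ s/line 6\.$/line 5./m;
+        $results =~ s/"main::x"/"main::z"/;
+        $results =~ s/line 4\.$/line 6./m;
+        $results =~ s/"main::w"/"main::x"/;
+        $results =~ s/line 5\.$/line 4./m;
+    }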

or somesuch. Preferences?

diff -ru perl.9452/t/pragma/warnings.t perl/t/pragma/warnings.t
--- perl.9452/t/pragma/warnings.t       Thu Mar 29 14:47:01 2001
+++ perl/t/pragma/warnings.t    Thu Mar 29 15:26:35 2001
@@ -110,6 +110,15 @@
            }
        }
     }
+    # Hashing order difference on EBCDIC turns expectation into:
+    #Name "main::x" used only once: possible typo at - line 4.
+    #Name "main::z" used only once: possible typo at - line 6.
+    # If tests are added prior to warn/perl - 4th test then change 72
+    # below.
+    if ($i == 72 && ord('A') == 193) { 
+        my ($results_line1,$results_line2,@remainder) = split("\n",$results);
+        $results = join("\n",$results_line2,$results_line1);
+    }
     if ( $results =~ s/^SKIPPED\n//) {
        print "$results\n" ;
     }
End of Possible patch.
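
A third possibility, different from both approaches above, would be to
compare the warning lines without regard to their order, so that neither a
test number nor the variable names need to be hard coded.  The following is
a rough sketch only: the function name same_lines_any_order is
hypothetical, and it has not been fitted into the actual warnings.t loop.

```perl
use strict;
use warnings;

# Hypothetical helper: true when both strings contain the same lines,
# regardless of the order in which the lines appear.
sub same_lines_any_order {
    my ($got, $expected) = @_;
    my $sorted_got      = join "\n", sort split /\n/, $got;
    my $sorted_expected = join "\n", sort split /\n/, $expected;
    return $sorted_got eq $sorted_expected;
}

my $expected = qq{Name "main::z" used only once: possible typo at - line 6.\n}
             . qq{Name "main::x" used only once: possible typo at - line 4.\n};
my $got      = qq{Name "main::x" used only once: possible typo at - line 4.\n}
             . qq{Name "main::z" used only once: possible typo at - line 6.\n};

print "not " unless same_lines_any_order($got, $expected);
print "ok\n";
```

The drawback is that such a comparison would also hide a genuine ordering
bug on any platform, so it may be best limited to the tests known to depend
on hashing order.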


Peter Prymmer