> The UNDEF is worrying, that says something could not represent
> the code point. Also what is that '=A0' when de-mimed ?
If I patch the t/lib/encode.t test that comes with the perl@10086
kit like so:
--- lib/encode.t.orig Tue Mar 27 15:22:41 2001
+++ lib/encode.t Mon May 14 21:55:52 2001
@@ -16,7 +16,7 @@
my @source = qw(ascii iso8859-1 cp1250);
my @destiny = qw(cp1047 cp37 posix-bc);
my @ebcdic_sets = qw(cp1047 cp37 posix-bc);
-plan test => 38+$n*@encodings + 2*@source*@destiny*@character_set +
2*@ebcdic_sets*256;
+plan test => 38+$n*@encodings + 2*@source*@destiny*@character_set +
+2*@ebcdic_sets*256 + 3*255;
my $str = join('',map(chr($_),0x20..0x7E));
my $cpy = $str;
ok(length($str),from_to($cpy,'iso8859-1','Unicode'),"Length Wrong");
@@ -110,7 +110,8 @@
}
# Spot check a few points in/out of utf8
-for my $i (0x41,128,256,0x20AC)
+# Yes the hex range covers the 128 decimal point redundantly.
+for my $i (0x00..0xff,128,256,0x20AC)
{
my $c = chr($i);
my $o = encode_utf8($c);
End of Patch.
Then the following 32 code points (listed in decimal) fail the first and
last of the three tests within the for loop at the tail end of encode.t:
65, 74, 106, 138, 139, 143, 144, 154, 155, 157, 159, 160, 170, 171,
175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 187, 188, 190,
202, 218, 234, 250
I just tested such a patched encode.t test on linux with the 10086 kit
and it passes all tests with no failures.
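
For anyone who wants to try this outside the test harness, here is a
minimal standalone sketch of the two checks that fail, based on the
messages in the test output ("decode_utf8 not inverse of encode_utf8" and
"utf8 decode by name broken"). It assumes an Encode module that exports
encode_utf8, decode_utf8 and decode; the helper name failing_round_trips
is made up for illustration and is not part of encode.t.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8 decode_utf8 decode);

# Hypothetical helper (not in encode.t): return the code points from the
# argument list that do not survive an encode/decode round trip.
sub failing_round_trips {
    my @failed;
    for my $i (@_) {
        my $c = chr($i);               # character under test
        my $o = encode_utf8($c);       # its UTF-8 octets
        my $d = decode_utf8($o);       # decode via the shortcut function
        my $n = decode('utf8', $o);    # decode by encoding name
        my $ok = defined $d && $d eq $c && defined $n && $n eq $c;
        push @failed, $i unless $ok;
    }
    return @failed;
}

# Same list of points as the patched loop.
my @failed = failing_round_trips(0x00 .. 0xff, 128, 256, 0x20AC);
print @failed
    ? "failed code points: @failed\n"
    : "all code points round-trip\n";
```

An undefined result is counted as a failure here, which matches the
<UNDEF> seen in the failing tests. The second test of the three in the
loop is not reproduced because it does not appear in the failure output.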
I have tried to include the output from the failed tests as an attachment
to this email. Be forewarned that it does contain 8 bit characters
although they all appear to be printable Latin-1 types on linux.
If you cannot unpack the file then you might want to glance at the tarball
available from:
http://www.best.com/~pvhp/os390/pepo/ebre.tar.gz
which contains some of this information in a cruder form.
not ok 2902
# Test 2902 got: <UNDEF> (lib/encode.t at line 118 fail #66)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 65)
not ok 2904
# Test 2904 got: <UNDEF> (lib/encode.t at line 120 fail #66)
# Expected: '�' (utf8 decode by name broken for 65)
not ok 2929
# Test 2929 got: <UNDEF> (lib/encode.t at line 118 fail #75)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 74)
not ok 2931
# Test 2931 got: <UNDEF> (lib/encode.t at line 120 fail #75)
# Expected: '�' (utf8 decode by name broken for 74)
not ok 3025
# Test 3025 got: <UNDEF> (lib/encode.t at line 118 fail #107)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 106)
not ok 3027
# Test 3027 got: <UNDEF> (lib/encode.t at line 120 fail #107)
# Expected: '�' (utf8 decode by name broken for 106)
not ok 3121
# Test 3121 got: <UNDEF> (lib/encode.t at line 118 fail #139)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 138)
not ok 3123
# Test 3123 got: <UNDEF> (lib/encode.t at line 120 fail #139)
# Expected: '�' (utf8 decode by name broken for 138)
not ok 3124
# Test 3124 got: <UNDEF> (lib/encode.t at line 118 fail #140)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 139)
not ok 3126
# Test 3126 got: <UNDEF> (lib/encode.t at line 120 fail #140)
# Expected: '�' (utf8 decode by name broken for 139)
not ok 3136
# Test 3136 got: <UNDEF> (lib/encode.t at line 118 fail #144)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 143)
not ok 3138
# Test 3138 got: <UNDEF> (lib/encode.t at line 120 fail #144)
# Expected: '�' (utf8 decode by name broken for 143)
not ok 3139
# Test 3139 got: <UNDEF> (lib/encode.t at line 118 fail #145)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 144)
not ok 3141
# Test 3141 got: <UNDEF> (lib/encode.t at line 120 fail #145)
# Expected: '�' (utf8 decode by name broken for 144)
not ok 3169
# Test 3169 got: <UNDEF> (lib/encode.t at line 118 fail #155)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 154)
not ok 3171
# Test 3171 got: <UNDEF> (lib/encode.t at line 120 fail #155)
# Expected: '�' (utf8 decode by name broken for 154)
not ok 3172
# Test 3172 got: <UNDEF> (lib/encode.t at line 118 fail #156)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 155)
not ok 3174
# Test 3174 got: <UNDEF> (lib/encode.t at line 120 fail #156)
# Expected: '�' (utf8 decode by name broken for 155)
not ok 3178
# Test 3178 got: <UNDEF> (lib/encode.t at line 118 fail #158)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 157)
not ok 3180
# Test 3180 got: <UNDEF> (lib/encode.t at line 120 fail #158)
# Expected: '�' (utf8 decode by name broken for 157)
not ok 3184
# Test 3184 got: <UNDEF> (lib/encode.t at line 118 fail #160)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 159)
not ok 3186
# Test 3186 got: <UNDEF> (lib/encode.t at line 120 fail #160)
# Expected: '�' (utf8 decode by name broken for 159)
not ok 3187
# Test 3187 got: <UNDEF> (lib/encode.t at line 118 fail #161)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 160)
not ok 3189
# Test 3189 got: <UNDEF> (lib/encode.t at line 120 fail #161)
# Expected: '�' (utf8 decode by name broken for 160)
not ok 3217
# Test 3217 got: <UNDEF> (lib/encode.t at line 118 fail #171)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 170)
not ok 3219
# Test 3219 got: <UNDEF> (lib/encode.t at line 120 fail #171)
# Expected: '�' (utf8 decode by name broken for 170)
not ok 3220
# Test 3220 got: <UNDEF> (lib/encode.t at line 118 fail #172)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 171)
not ok 3222
# Test 3222 got: <UNDEF> (lib/encode.t at line 120 fail #172)
# Expected: '�' (utf8 decode by name broken for 171)
not ok 3232
# Test 3232 got: <UNDEF> (lib/encode.t at line 118 fail #176)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 175)
not ok 3234
# Test 3234 got: <UNDEF> (lib/encode.t at line 120 fail #176)
# Expected: '�' (utf8 decode by name broken for 175)
not ok 3235
# Test 3235 got: <UNDEF> (lib/encode.t at line 118 fail #177)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 176)
not ok 3237
# Test 3237 got: <UNDEF> (lib/encode.t at line 120 fail #177)
# Expected: '�' (utf8 decode by name broken for 176)
not ok 3238
# Test 3238 got: <UNDEF> (lib/encode.t at line 118 fail #178)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 177)
not ok 3240
# Test 3240 got: <UNDEF> (lib/encode.t at line 120 fail #178)
# Expected: '�' (utf8 decode by name broken for 177)
not ok 3241
# Test 3241 got: <UNDEF> (lib/encode.t at line 118 fail #179)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 178)
not ok 3243
# Test 3243 got: <UNDEF> (lib/encode.t at line 120 fail #179)
# Expected: '�' (utf8 decode by name broken for 178)
not ok 3244
# Test 3244 got: <UNDEF> (lib/encode.t at line 118 fail #180)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 179)
not ok 3246
# Test 3246 got: <UNDEF> (lib/encode.t at line 120 fail #180)
# Expected: '�' (utf8 decode by name broken for 179)
not ok 3247
# Test 3247 got: <UNDEF> (lib/encode.t at line 118 fail #181)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 180)
not ok 3249
# Test 3249 got: <UNDEF> (lib/encode.t at line 120 fail #181)
# Expected: '�' (utf8 decode by name broken for 180)
not ok 3250
# Test 3250 got: <UNDEF> (lib/encode.t at line 118 fail #182)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 181)
not ok 3252
# Test 3252 got: <UNDEF> (lib/encode.t at line 120 fail #182)
# Expected: '�' (utf8 decode by name broken for 181)
not ok 3253
# Test 3253 got: <UNDEF> (lib/encode.t at line 118 fail #183)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 182)
not ok 3255
# Test 3255 got: <UNDEF> (lib/encode.t at line 120 fail #183)
# Expected: '�' (utf8 decode by name broken for 182)
not ok 3256
# Test 3256 got: <UNDEF> (lib/encode.t at line 118 fail #184)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 183)
not ok 3258
# Test 3258 got: <UNDEF> (lib/encode.t at line 120 fail #184)
# Expected: '�' (utf8 decode by name broken for 183)
not ok 3259
# Test 3259 got: <UNDEF> (lib/encode.t at line 118 fail #185)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 184)
not ok 3261
# Test 3261 got: <UNDEF> (lib/encode.t at line 120 fail #185)
# Expected: '�' (utf8 decode by name broken for 184)
not ok 3262
# Test 3262 got: <UNDEF> (lib/encode.t at line 118 fail #186)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 185)
not ok 3264
# Test 3264 got: <UNDEF> (lib/encode.t at line 120 fail #186)
# Expected: '�' (utf8 decode by name broken for 185)
not ok 3268
# Test 3268 got: <UNDEF> (lib/encode.t at line 118 fail #188)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 187)
not ok 3270
# Test 3270 got: <UNDEF> (lib/encode.t at line 120 fail #188)
# Expected: '�' (utf8 decode by name broken for 187)
not ok 3271
# Test 3271 got: <UNDEF> (lib/encode.t at line 118 fail #189)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 188)
not ok 3273
# Test 3273 got: <UNDEF> (lib/encode.t at line 120 fail #189)
# Expected: '�' (utf8 decode by name broken for 188)
not ok 3277
# Test 3277 got: <UNDEF> (lib/encode.t at line 118 fail #191)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 190)
not ok 3279
# Test 3279 got: <UNDEF> (lib/encode.t at line 120 fail #191)
# Expected: '�' (utf8 decode by name broken for 190)
not ok 3313
# Test 3313 got: <UNDEF> (lib/encode.t at line 118 fail #203)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 202)
not ok 3315
# Test 3315 got: <UNDEF> (lib/encode.t at line 120 fail #203)
# Expected: '�' (utf8 decode by name broken for 202)
not ok 3361
# Test 3361 got: <UNDEF> (lib/encode.t at line 118 fail #219)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 218)
not ok 3363
# Test 3363 got: <UNDEF> (lib/encode.t at line 120 fail #219)
# Expected: '�' (utf8 decode by name broken for 218)
not ok 3409
# Test 3409 got: <UNDEF> (lib/encode.t at line 118 fail #235)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 234)
not ok 3411
# Test 3411 got: <UNDEF> (lib/encode.t at line 120 fail #235)
# Expected: '�' (utf8 decode by name broken for 234)
not ok 3457
# Test 3457 got: <UNDEF> (lib/encode.t at line 118 fail #251)
# Expected: '�' (decode_utf8 not inverse of encode_utf8 for 250)
not ok 3459
# Test 3459 got: <UNDEF> (lib/encode.t at line 120 fail #251)
# Expected: '�' (utf8 decode by name broken for 250)
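
The test numbers above can be mapped back to code points with a little
arithmetic. This is a sketch that infers the numbering from the output
itself: code point 65 owns tests 2902 through 2904, so with three tests
per code point the loop would start at test 2902 - 3*65 = 2707. That base
and the helper name test_to_codepoint are assumptions for illustration,
and the mapping only applies to the 0x00..0xff part of the loop.

```perl
use strict;
use warnings;

# Assumption inferred from the output: code point 65 starts at test 2902.
my $TESTS_PER_POINT = 3;
my $BASE            = 2902 - $TESTS_PER_POINT * 65;    # 2707

# Hypothetical helper: map a test number to the code point it exercises
# and to which of the three tests in the loop body it is (1-based).
sub test_to_codepoint {
    my ($n) = @_;
    my $offset = $n - $BASE;
    return (int($offset / $TESTS_PER_POINT), $offset % $TESTS_PER_POINT + 1);
}

for my $n (2902, 2904, 3187, 3459) {
    my ($cp, $which) = test_to_codepoint($n);
    printf "test %d: code point %d (0x%02X), test %d of 3\n",
        $n, $cp, $cp, $which;
}
```

Under that assumption every failing test number in the listing lands on
test 1 or test 3 of its code point, which agrees with the first and last
tests being the ones that fail.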