I guess it would help if I posted my code and what it puts out.

This is probably the wrong list for this question, but is anyone willing to give me a clue why

$line =~ tr/+/ /;

would clip out the lead bytes of a shift-JIS string in a cgi script?

Come to think of it, I think it's being applied while the string is still hex-encoded, so it makes even less sense to me.

(I know, I should be letting the CGI module decode the url-encoded string. But I seem to be misunderstanding something fundamental here, which is why a newbies list would probably be better for this question.)

# The code that grabs the parameters:

my $qString = $ENV{'QUERY_STRING'};
my @list = split( '&', $qString, 10 );
my %queries = ();
foreach my $pair ( @list )
{       my ( $key, $value ) = split( '=', $pair, 2 );
        # Really should just give in and use CGI.
        #$key =~ tr/+/ /;
        $key =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack ("C", hex ($1))/eg;
        $queries{ $key . '_0' } = $value;
        
        my $value_y = my $value_tr = $value;
        $value_y =~ y/+/ /;
        $queries{ $key . '_y1' } = $value_y;
        $value_tr =~ tr/+/ /;
        $queries{ $key . '_tr1' } = $value_tr;
        
        $value_y =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack ("C", hex ($1))/eg;
        $queries{ $key . '_y' } = $value_y;
        $value_tr =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack ("C", hex ($1))/eg;
        $queries{ $key . '_tr' } = $value_tr;
        
        $value =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack ("C", hex ($1))/eg;
        $queries{ $key } = $value;
}
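
To check the decoding steps in isolation, outside the CGI environment, something like the following could be run from the command line. It is only a sketch: decode_component and hex_bytes are hypothetical helper names, not part of the script above, and the input is the chatter_0 value copied from the results below. It applies tr/+/ / and then the same pack-based %XX substitution, and prints the resulting bytes in hex so they can be compared against the hexdumps.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): decode one
# url-encoded component into raw bytes, '+' first, then %XX escapes.
sub decode_component {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack("C", hex($1))/eg;
    return $text;
}

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02x', ord($_) } split //, $bytes;
}

# The chatter_0 value from the results, still url-encoded.
my $raw = 'this+is+a+test.+%82%B1%82%EA%82%CD%83e%83X%83g%82%C5%82%B7%81B';

print hex_bytes( decode_component($raw) ), "\n";
```

If the lead bytes survive here but not in the cgi script, the loss is presumably happening somewhere other than in these two substitutions themselves.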

# For reference, the code being used to dump the parameters to html:

sub dumpQueries
{       if ( $debugParams )
        {
                print "language = $language, function = $function\n";

                print "<table border='1'>\n";

                my @keys = keys ( %queries );
                foreach my $key ( @keys )
                {       my $value = $queries{ $key };
                        if ( !defined ( $value ) )
                        {       $value = "UNDEF";
                        }
                        print "<tr><td align='right'>$key</td><td align='left'>$value</td></tr>\n";
                }

                print "</table>\n";
        }
}

# ------------results-------------
chatter this+is+a+test.+これはテストです。
function_tr     eキ
chatter_y1      this is a test. %82%B1%82%EA%82%CD%83e%83X%83g%82%C5%82%B7%81B
who_tr1 daddy
function        投稿する
who_y1  daddy
function_tr1    %93%8A%8De%82%B7%82%E9
who     daddy
who_0   daddy
who_tr  daddy
who_y   daddy
chatter_0       this+is+a+test.+%82%B1%82%EA%82%CD%83e%83X%83g%82%C5%82%B7%81B
chatter_y       this is a test. アヘeXgナキB
function_y      eキ
chatter_tr      this is a test. アヘeXgナキB
function_y1     %93%8A%8De%82%B7%82%E9
function_0      %93%8A%8De%82%B7%82%E9
chatter_tr1     this is a test. %82%B1%82%EA%82%CD%83e%83X%83g%82%C5%82%B7%81B
# ------------results-sorted-------------
chatter_0       this+is+a+test.+%82%B1%82%EA%82%CD%83e%83X%83g%82%C5%82%B7%81B
chatter_y1      this is a test. %82%B1%82%EA%82%CD%83e%83X%83g%82%C5%82%B7%81B
chatter_tr1     this is a test. %82%B1%82%EA%82%CD%83e%83X%83g%82%C5%82%B7%81B
chatter_y       this is a test. アヘeXgナキB
chatter_tr      this is a test. アヘeXgナキB
chatter this+is+a+test.+これはテストです。
function_0      %93%8A%8De%82%B7%82%E9
function_y1     %93%8A%8De%82%B7%82%E9
function_tr1    %93%8A%8De%82%B7%82%E9
function_y      eキ
function_tr     eキ
function        投稿する
who_0   daddy
who_y1  daddy
who_tr1 daddy
who_y   daddy
who_tr  daddy
who     daddy
# ------------results-html-------------
language = j, function = talk
<table border='1'>
<tr><td align='right'>chatter</td><td align='left'>this+is+a+test.+これはテストです。</td></tr>
<tr><td align='right'>function_tr</td><td align='left'>eキ</td></tr>
<tr><td align='right'>chatter_y1</td><td align='left'>this is a test. %82%B1%82%EA%82%CD%83e%83X%83g%82%C5%82%B7%81B</td></tr>
<tr><td align='right'>who_tr1</td><td align='left'>daddy</td></tr>
<tr><td align='right'>function</td><td align='left'>投稿する</td></tr>
<tr><td align='right'>who_y1</td><td align='left'>daddy</td></tr>
<tr><td align='right'>function_tr1</td><td align='left'>%93%8A%8De%82%B7%82%E9</td></tr>
<tr><td align='right'>who</td><td align='left'>daddy</td></tr>
<tr><td align='right'>who_0</td><td align='left'>daddy</td></tr>
<tr><td align='right'>who_tr</td><td align='left'>daddy</td></tr>
<tr><td align='right'>who_y</td><td align='left'>daddy</td></tr>
<tr><td align='right'>chatter_0</td><td align='left'>this+is+a+test.+%82%B1%82%EA%82%CD%83e%83X%83g%82%C5%82%B7%81B</td></tr>
<tr><td align='right'>chatter_y</td><td align='left'>this is a test. アヘeXgナキB</td></tr>
<tr><td align='right'>function_y</td><td align='left'>eキ</td></tr>
<tr><td align='right'>chatter_tr</td><td align='left'>this is a test. アヘeXgナキB</td></tr>
<tr><td align='right'>function_y1</td><td align='left'>%93%8A%8De%82%B7%82%E9</td></tr>
<tr><td align='right'>function_0</td><td align='left'>%93%8A%8De%82%B7%82%E9</td></tr>
<tr><td align='right'>chatter_tr1</td><td align='left'>this is a test. %82%B1%82%EA%82%CD%83e%83X%83g%82%C5%82%B7%81B</td></tr>
</table>
# ------------results-extract-------------
chatter this+is+a+test.+これはテストです。
chatter_tr      this is a test. アヘeXgナキB
function        投稿する
function_tr     eキ
# ------------results-extract-hexdump-lined-up-------------
00000000  63 68 61 74 74 65 72 09  |chatter.|
00000008  74 68 69 73 2b 69 73 2b  61 2b 74 65 73 74 2e 2b  |this+is+a+test.+|
00000018  82 b1 82 ea 82 cd 83 65  83 58 83 67 82 c5 82 b7  81 42 0a  |.......e.X.g.....B.|
0000002b  63 68 61 74 74 65 72 5f  74 72 09 |chatter_tr.|
00000036  74 68 69 73 20 69 73 20  61 20 74 65 73 74 2e 20  |this is a test. |
00000046  b1 cd  65 58 67 c5 b7 42 0a  |..eXg..B.|
0000004f  66 75 6e 63 74 69 6f 6e  09  |function.|
00000058  93 8a 8d 65 82 b7 82 e9  0a  |...e.....|
00000061  66 75 6e 63 74 69 6f 6e  5f 74 72 09  |function_tr.|
0000006d  65 b7 0a  |e..|
# ------------results-extract-hexdump-straight-------------
00000000  63 68 61 74 74 65 72 09  74 68 69 73 2b 69 73 2b  |chatter.this+is+|
00000010  61 2b 74 65 73 74 2e 2b  82 b1 82 ea 82 cd 83 65  |a+test.+.......e|
00000020  83 58 83 67 82 c5 82 b7  81 42 0a 63 68 61 74 74  |.X.g.....B.chatt|
00000030  65 72 5f 74 72 09 74 68  69 73 20 69 73 20 61 20  |er_tr.this is a |
00000040  74 65 73 74 2e 20 b1 cd  65 58 67 c5 b7 42 0a 66  |test. ..eXg..B.f|
00000050  75 6e 63 74 69 6f 6e 09  93 8a 8d 65 82 b7 82 e9  |unction....e....|
00000060  0a 66 75 6e 63 74 69 6f  6e 5f 74 72 09 65 b7 0a  |.function_tr.e..|
# ------------results-end-------------

00000018  82 b1 82 ea 82 cd 83 65  83 58 83 67 82 c5 82 b7  81 42 0a  |.......e.X.g.....B.|
00000046  b1 cd  65 58 67 c5 b7 42 0a  |..eXg..B.|

and

00000058  93 8a 8d 65 82 b7 82 e9  0a  |...e.....|
0000006d  65 b7 0a  |e..|

tell the tale.

Okay, so it looks like it isn't just stripping the lead bytes; every now and then I'm losing a full JIS character.
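
To see exactly which bytes go missing, the two before/after pairs above can be compared mechanically. The following is a rough sketch, not part of the script: dropped_bytes and is_sjis_lead are hypothetical helpers, and the comparison assumes the damaged string is an in-order subsequence of the intact one. The range test uses the standard Shift-JIS lead byte ranges, 0x81-0x9F and 0xE0-0xEF. In these two samples every missing byte falls in those ranges, including trail bytes such as ea, 8a, and e9 that happen to land there, which may be why whole characters seem to vanish now and then.

```perl
use strict;
use warnings;

# Hypothetical helper: given space-separated hex dumps of the intact and
# the damaged byte strings, return the bytes missing from the damaged one.
# Assumes the damaged bytes are an in-order subsequence of the intact ones.
sub dropped_bytes {
    my ($intact_hex, $damaged_hex) = @_;
    my @intact  = split ' ', $intact_hex;
    my @damaged = split ' ', $damaged_hex;
    my @dropped;
    for my $byte (@intact) {
        if ( @damaged && $damaged[0] eq $byte ) {
            shift @damaged;          # byte survived
        }
        else {
            push @dropped, $byte;    # byte was lost
        }
    }
    return @dropped;
}

# Hypothetical helper: true if the byte (given as two hex digits) is in
# the Shift-JIS lead byte ranges 0x81-0x9F or 0xE0-0xEF.
sub is_sjis_lead {
    my ($byte) = @_;
    my $n = hex $byte;
    return ( $n >= 0x81 && $n <= 0x9F ) || ( $n >= 0xE0 && $n <= 0xEF );
}

# The chatter and chatter_tr bytes from the hexdump.
my @dropped = dropped_bytes(
    '82 b1 82 ea 82 cd 83 65 83 58 83 67 82 c5 82 b7 81 42',
    'b1 cd 65 58 67 c5 b7 42',
);

for my $byte (@dropped) {
    my $kind = is_sjis_lead($byte) ? 'lead-byte range' : 'other';
    print "$byte  $kind\n";
}
```

The same call with the function and function_tr bytes ('93 8a 8d 65 82 b7 82 e9' against '65 b7') covers the second pair.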

Joel Rees
(waiting for a 3+GHz ARM processor to come out,
to test Steve's willingness to switch again.)

