I took another look at some garbled spam I seem to be picking up
regularly, which I had mistakenly assumed to be from a Korean source,
and it looks like Apple's mail app in 10.2.4 is _not_ handling 7-bit JIS
correctly. More later.

But, while I was checking that, I checked the following:

> I managed to find the kanji I asked the person about with the character palette
> description she gave, but it was or could be described otherwise as: Unicode
> 5782, JIS(X0213) 1-31-66, Shift JIS(X0208) 9082: 

tarasu/tareru (hang down)

> and it was mojibake'ed as
> ($BEZ(B)

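That kind of looks like seven-bit JIS. The $B is what is left of the
escape sequence (ESC $ B) that switches into 7-bit JIS when mixing it
with 7-bit ASCII, and the (B is what is left of the one (ESC ( B) that
switches back.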
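
For illustration, here is a minimal sketch of tracking those escape
sequences, assuming the ESC byte (0x1b) is still present in the input (in
the mojibake quoted here it has evidently been lost). The function name
extract_jis_payload is made up for this example, and only the four most
common ISO-2022-JP sequences are recognized.

```c
#include <assert.h>
#include <stddef.h>

#define ESC 0x1b

/* Copy only the double-byte (7-bit JIS) payload of an ISO-2022-JP string
 * into out, dropping escape sequences and single-byte text.
 * Returns the number of bytes written, not counting the final NUL.
 * Hypothetical helper, not part of any library. */
size_t extract_jis_payload( const char *in, char *out, size_t outSize )
{
        int inKanji = 0;        /* 0: single-byte mode, 1: double-byte mode */
        size_t n = 0;

        assert( in != NULL && out != NULL && outSize > 0 );
        while ( *in != '\0' )
        {
                if ( (unsigned char) in[ 0 ] == ESC )
                {
                        /* ESC $ B and ESC $ @ switch to double-byte JIS */
                        if ( in[ 1 ] == '$' && ( in[ 2 ] == 'B' || in[ 2 ] == '@' ) )
                        {       inKanji = 1; in += 3; continue;
                        }
                        /* ESC ( B and ESC ( J switch back to single-byte */
                        if ( in[ 1 ] == '(' && ( in[ 2 ] == 'B' || in[ 2 ] == 'J' ) )
                        {       inKanji = 0; in += 3; continue;
                        }
                        ++in;   /* unrecognized sequence: drop the ESC byte only */
                        continue;
                }
                if ( inKanji && n + 1 < outSize )
                        out[ n++ ] = *in;
                ++in;
        }
        out[ n ] = '\0';
        return n;
}
```

Fed the hotaru string from this thread with the ESC bytes restored, it
should hand back just the two payload bytes, 7V.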

EZ is the 7-bit JIS for tsuchi (earth, dirt). And BE is 7-bit JIS for
the "da" in "datou" (valid). Nope. Something else happened to that.
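
The arithmetic behind that check can be sketched as follows, assuming
plain JIS X 0208 row-cell (kuten) numbering, where each 7-bit JIS byte is
just the row or cell number plus 0x20. The function names are made up for
this example. The 31-66 part of the quoted 1-31-66 works out to the bytes
?b, while EZ works out to row 37, cell 58, so the two cannot be the same
character.

```c
#include <assert.h>

/* Split a 7-bit JIS byte pair into 1-based kuten row and cell.
 * Hypothetical helper for illustration. */
void jis_to_kuten( unsigned char j1, unsigned char j2, int *row, int *cell )
{
        assert( j1 >= 0x21 && j1 <= 0x7e && j2 >= 0x21 && j2 <= 0x7e );
        *row = j1 - 0x20;
        *cell = j2 - 0x20;
}

/* Inverse: build the 7-bit JIS byte pair from a 1-based row and cell. */
void kuten_to_jis( int row, int cell, unsigned char *j1, unsigned char *j2 )
{
        assert( row >= 1 && row <= 94 && cell >= 1 && cell <= 94 );
        *j1 = (unsigned char) ( row + 0x20 );
        *j2 = (unsigned char) ( cell + 0x20 );
}
```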

> Some other codes she sent, and hence probably in the same encoding,
> were $B7V(B and $Bj%(B both for hotaru.

7V is the 7-bit JIS for hotaru (firefly). j% is 7-bit JIS for a more
traditional rendering of hotaru.

Here's the meat of the source of a C tool I wrote to check:
---------------------------------------------------------
        for ( i = 0; i < kTermWidth - 1; i += 2 )
        {       unsigned long byte1 = (unsigned char) buf[ i ] - 0x21;  /* kuten */
                unsigned long byte2 = (unsigned char) buf[ i + 1 ] - 0x21;      /* kuten */
                if ( buf[ i ] == '\0' || buf[ i + 1 ] == '\0' ) /* end of string */
                        break;
                byte2 += 0x40;
                if ( ( byte1 & 1 ) == 1 )
                        byte2 += 94;
                if ( byte2 > 0x7e )
                        ++byte2;
                byte1 >>= 1;
                byte1 += 0x81;
                if ( byte1 > 0x9f )
                        byte1 += 0x40;
                buf[ i ] = (char) byte1;
                buf[ i + 1 ] = (char) byte2;
        }
        buf[ kTermWidth ] = '\0';       /* training wheels */
---------------------------------------------------------

(Yeah, C comes more naturally to me than Perl. Especially for this kind
of stuff. So shoot me.) It's missing the escape sequence and end-of-line
handling, among other things, but may be amusing to those interested in
the relationship between 7-bit JIS and shift-JIS.
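
For anyone who wants to try the arithmetic on single characters, here is a
minimal sketch that wraps the same steps as the loop above in a function
taking one byte pair. The name jis_to_sjis and its signature are made up
for this example, and it assumes both input bytes are in the 0x21 to 0x7e
range.

```c
#include <assert.h>

/* Convert one 7-bit JIS (JIS X 0208) byte pair to shift-JIS.
 * Hypothetical helper; same arithmetic as the loop quoted above. */
void jis_to_sjis( unsigned char j1, unsigned char j2,
                  unsigned char *s1, unsigned char *s2 )
{
        unsigned row, cell, lead, trail;

        assert( j1 >= 0x21 && j1 <= 0x7e && j2 >= 0x21 && j2 <= 0x7e );
        row = j1 - 0x21u;       /* zero-based kuten row */
        cell = j2 - 0x21u;      /* zero-based kuten cell */

        trail = cell + 0x40;    /* trail bytes start at 0x40 */
        if ( row & 1 )          /* two JIS rows share one lead byte; */
                trail += 94;    /* the second row goes in the upper half */
        if ( trail > 0x7e )     /* 0x7f is never used as a trail byte */
                ++trail;

        lead = ( row >> 1 ) + 0x81;     /* lead bytes start at 0x81 */
        if ( lead > 0x9f )              /* skip over 0xa0 through 0xdf */
                lead += 0x40;

        *s1 = (unsigned char) lead;
        *s2 = (unsigned char) trail;
}
```

With the values in this thread, 7V (hotaru) should come out as 8C75, ?b
(the 31-66 character) as 9082, and j% (the traditional hotaru) as E5A3.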

> 
> some other strings are:
> a$EAaD (this is the one I could decode)

Weird. All I can read out of that is kilogram told hits. Or, maybe just
the character "hayai" (early)?

> $B0T$B0U$B0G<>"<>n<>d<>c

Who's meaningful dark? Or perhaps the saba fish in the crucible?

Anyway, they _look_ sort of like 7-bit JIS, and the two you came up with
for hotaru are, in fact, 7-bit JIS for hotaru.

-- 
Joel Rees, programmer, Kansai Systems Group
Altech Corporation (Alpsgiken), Osaka, Japan
http://www.alpsgiken.co.jp
