For Chinese it is "obvious". "Word" means "character"
and a phrase is a sequence of words. A number is
specified as a phrase with words for the digits in the
positional decimal system, so repeated application of
[EMAIL PROTECTED] eventually gets you down (or up) to 3, and
# of its UTF-8 representation is 3.
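The byte count can be checked directly. This is only a sketch: it assumes the session source is UTF-8, and DIGITS is a name introduced here for illustration, not one of the essay's definitions.

```j
NB. DIGITS is a hypothetical name; each box holds the UTF-8 bytes of one digit character
DIGITS =: <;._1 ' 零 一 二 三 四 五 六 七 八 九'
# '三'        NB. 3 (bytes, not characters)
#&> DIGITS    NB. 3 3 3 3 3 3 3 3 3 3
```

Every single-character phrase therefore has length 3 under this count, and 3 is itself written with one character.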
----- Original Message -----
From: John Randall <[EMAIL PROTECTED]>
Date: Sunday, May 11, 2008 6:37
Subject: [Jchat] Iterated words and numbers
To: [email protected]
Another problem from HAKMEM.
Suppose we start with a number, write it out in words, take the
length, and keep going. For English, the result will always converge
to 4 (and, as far as I can tell, to 3 for Chinese).
Using us and zh from
http://www.jsoftware.com/jwiki/Essays/Number_in_Words
~. [EMAIL PROTECTED]:_"0 ] 1e3 ?. 1e4
4
~. [EMAIL PROTECTED]:_"0 ] 1e3 ?. 1e4
3
Why?
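A small table suggests where the 4 comes from. This is a sketch under stated assumptions: EN10 and len are hypothetical names that cover only 0 through 10, and they are not the us verb from the essay.

```j
NB. EN10: English names of 0..10 (hypothetical table for illustration)
EN10 =: ;: 'zero one two three four five six seven eight nine ten'
NB. len y: number of letters in the name of y, for y in 0..10
len =: #@>@{&EN10
len i. 11             NB. 4 3 3 5 4 4 3 5 5 4 3
~. len^:_"0 i. 11     NB. 4
```

In this table "four" is the only name whose letter count equals its own value, and every other entry reaches it within a few steps.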
Actually, if wchar is used instead of UTF-8, the answer will be 1, which is
more intuitive. Replacing the first three definitions with
ZH10=: 7&u:&.> <;._1 ' 零 一 二 三 四 五 六 七 八 九' NB. i.10
ZH4 =: 7&u:&.> '千';'百';'十';' ' NB. 10^3 2 1 0
ZHU =: 7&u:&.> '';'萬';'億';'兆' NB. 10^0 4 8 12
~. [EMAIL PROTECTED]:_"0 ] 1e3 ?. 1e4
1
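The fixed point can be seen on a single digit. In this sketch, one is a hypothetical name and the source is assumed to be UTF-8.

```j
one =: 7 u: '一'     NB. hypothetical name: the digit 1 converted to a wchar string
# one                NB. 1 (characters, not bytes)
3!:0 one             NB. 131072, the type code for unicode (wchar)
```

Once the iteration reaches a one-character phrase its length is 1, and 1 is itself written with one character, so 1 maps to itself.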
If your PC cannot display Chinese characters, you can still see why it iterates
to 1 with the following definitions:
ZH10=: <"(0) '0123456789' NB. i.10
ZH4 =: 'K';'H';'D';' ' NB. 10^3 2 1 0
ZHU =: '';'M';'Y';'S' NB. 10^0 4 8 12
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm