At 13:11 01/10/19 +0900, Soobok Lee wrote:
>Another answer for your concern.
>
>----- Original Message -----
>From: "James Seng/Personal" <[EMAIL PROTECTED]>
> >
> > The bigger concern I have with re-ordering remains the fact that
> > table mappings prove efficient with existing IDN names in some
> > registries *BUT* that does not indicate what performance would be like
> > in the future. We do not know what happens when the name space gets
> > saturated, and whether other names which would have been usable without
> > lsb become unusable due to lsb.
> >
>
>1) saturations in TLD namespaces would require longer names for which
>     REORDERING is designed to give greater benefits/compression ratio.

No. What James referred to is that saturation tends to fill up the
short name slots, and thus flatten the probability distribution.
I.e. if somebody doesn't get the name they wanted, chances are
that they go for something like xq.com, which is easy to
remember because it's short. Neither x nor q is a very frequent
letter.
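
A toy Python sketch of this flattening effect follows. Everything in it
is an assumption made for illustration and is not taken from the
REORDERING draft or from ACE-Z: the 4096-character table, the Zipf-like
"skewed" distribution, the base-36 digit count used as a stand-in for
encoding cost, and all function names. It compares a frequency-ordered
table with an arbitrary ordering, first under skewed usage and then
under completely flat usage.

```python
import random

def digits_needed(index, base=36):
    """Number of base-`base` digits needed to write a non-negative index."""
    count = 1
    while index >= base:
        index //= base
        count += 1
    return count

def expected_cost(probabilities, table_position):
    """Average digits per character when character i sits at table_position[i]."""
    return sum(p * digits_needed(pos)
               for p, pos in zip(probabilities, table_position))

def normalise(weights):
    """Scale weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

SIZE = 4096  # hypothetical table size

# Characters are indexed by frequency rank: 0 is the most frequent one.
skewed = normalise([1.0 / (rank + 1) for rank in range(SIZE)])  # Zipf-like
flat = normalise([1.0] * SIZE)                                  # saturated case

# Frequency order puts character i at position i; the arbitrary order is
# a fixed shuffle standing in for an ordering unrelated to frequency.
frequency_order = list(range(SIZE))
arbitrary_order = list(range(SIZE))
random.Random(0).shuffle(arbitrary_order)

def gain(probabilities):
    """Digits saved per character by frequency ordering over arbitrary ordering."""
    return (expected_cost(probabilities, arbitrary_order)
            - expected_cost(probabilities, frequency_order))

gain_skewed = gain(skewed)
gain_flat = gain(flat)
print("gain under skewed usage:", gain_skewed)
print("gain under flat usage:  ", gain_flat)
```

Under the skewed distribution the frequency-ordered table saves digits;
under the flat one the saving vanishes, because every ordering then has
the same average cost. Real usage would lie somewhere in between, and
the sketch says nothing about where.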


>2) future variations on character usage frequency in each script
>
>     2.0) the character frequency table are constructed from
>          Verisign GRS' ML.com testbeds.
>          Even for chinese han script, their
>           registrations came from China/TAIWAN/JAPAN/KOREA and other
>            non-asian squatters.
>          Each country of the 4 have their own different han character
>            usage patterns. The reordering table for han , therefore,
>           cannot  for the worst case, the mutual difference in
>           improvement ratios did not exceed  +- 2% around 20%.
>
>     2.1) this issue is already answered by latest REORDERING I-D 2.0
>          see the enclosed excerpts from it. The influence of this
>          frequency variations is marginal.

Well, there are always some variations with marginal influence,
and there are others with much larger influence.


>i believe that frequency fluctuation of han characters over time  is
>WITHIN the frequent set.

Well, it's extremely difficult to predict the future. You may be
right, or you may be wrong.


>INs and OUTs from the 4096 ones are rare and do not invalidate
>the most frequent 1024 and 2048 ones.
>Moreover, TC/SC/KC characters are put side-by-side

Can you explain that better? What about Japanese cases?


>to avoid country-specific biases in han reordering table.
>
>non-CJK scripts often have a small set of basic alphabets, and their
>character usage patterns are more stable than those for han/hangeul.

No, many other scripts are used for many more languages, with
quite different usage patterns. (A lot of Han usage in Japan,
and most of it in Korea, is due to loanwords from Chinese.)


>REORDERING does not recommend reordering on shared latin scripts,
>because latin characters are already encoded as is (in literal mode,
>the most efficient form). latin script for europeans (0.6 billion) is
>the most favored one in ACE-Z. There should be some compensation for
>non-europeans. Han script: 2 billion, Arabic: 0.7 billion, Hindi: 0.5 billion

The main advantage for these scripts is the use of ACE-Z,
when e.g. compared to UTF-8. Reordering is rather minor.


>This new frequency-based reordering is always more efficient than
>original lexicographical ordering in UCS
>even with some fluctuation in future script usage patterns.
>
>We are not pursuing elusive "perfection and optimal" solution.
>REORDERING tables cannot be modified once they are frozen as standards.

Very true.


>Therefore, REORDERING is a sub-optimal solution by its nature but will
>remain a valid and effective solution for a long time.

Whether it stays valid and effective for a long time is just something
you have to believe in; there is no way to prove it.


Regards,   Martin.
