Mark Davis wrote:

> What are also tricky are the 'almost' supersets, where there are only a few
> different characters. Those definitely cause problems because the difference
> in data is almost undetectable.

One example of what Mark is referring to is the pair ISO 8859-1 and ISO 8859-15.

The two share the same encoded characters at every code point except
0xA4, 0xA6, 0xA8, 0xB4, 0xB8, and 0xBC..0xBE.

So neither of the repertoires is a proper subset of the other,
but the two coded character sets share the vast majority
of their characters, including almost all of the common ones.
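
A quick way to check this is to decode every byte value under both
encodings and compare. The following is only a minimal sketch: it assumes
that Python's built-in "iso-8859-1" and "iso-8859-15" codecs faithfully
reflect the two standards, and the variable names are purely illustrative.

```python
# Decode all 256 byte values under each encoding.
# Assumption: Python's bundled codecs match the published mapping tables.
all_bytes = bytes(range(256))
latin1 = all_bytes.decode("iso-8859-1")
latin9 = all_bytes.decode("iso-8859-15")

# Code points where the two coded character sets disagree.
differing = [b for b in range(256) if latin1[b] != latin9[b]]

# Characters present in one repertoire but not in the other.
only_in_latin1 = set(latin1) - set(latin9)
only_in_latin9 = set(latin9) - set(latin1)

for b in differing:
    print(f"0x{b:02X}: U+{ord(latin1[b]):04X} vs U+{ord(latin9[b]):04X}")

# Both sets are non-empty, so neither repertoire contains the other.
print(len(only_in_latin1), len(only_in_latin9))
```

Text that never uses one of those eight byte values decodes identically
under either label, which is why the difference is so hard to detect
from the data alone.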

--Ken

