When thinking about using surrogate pairs of 16-bit Unicode characters to express a 21-bit Unicode character, I like to use the analogy of a medieval great field divided into strips for cultivation. A road runs along one edge of the field, perpendicular to the strips, so that someone can reach a particular place on a particular strip by following the road to the near end of the strip and then proceeding along the strip. The high-order member of the pair of 16-bit characters denotes which strip is being considered, and the low-order member denotes how far along the strip from the road one is located.

Suppose that one has a document, say a chapter from a novel, consisting of a sequence of Unicode characters that each need more than 16 bits, and one wishes to represent them using a sequence of 16-bit Unicode characters. Suppose, continuing my analogy, that all of the characters are located in the same strip of the great field, and that there are n characters in the sequence of 21-bit characters. Would the sequence of 16-bit characters contain 2n characters, n+1 characters, or some other number?

That is, once a 16-bit character indicating high-order bits has been used, is there a presumption that any number of following 16-bit characters indicating low-order bits all denote characters in the most recently indicated "great field strip"? Or does one need to use a high and low pair for each character from the great field, even if that means continually repeating the same high-order character?

I can imagine advantages for both types of usage. A persistence-of-meaning rule would save a lot of space in a 16-bit character file, yet cutting and pasting part of a document could cause problems if the high-order bits have not been stated within the section being cut and pasted.
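
For anyone wanting to test the question empirically, the following is a minimal sketch in Python. It assumes that Python's built-in "utf-16-be" codec follows the UTF-16 encoding form of the Unicode Standard, in which each character above U+FFFF is written as its own high and low surrogate. The helper names to_surrogates and utf16_units, and the choice of five characters starting at U+10400, are mine for illustration only.

```python
def to_surrogates(code_point):
    """Split a code point above U+FFFF into a (high, low) surrogate pair."""
    offset = code_point - 0x10000          # 20 bits remain after the subtraction
    high = 0xD800 + (offset >> 10)         # top 10 bits: which "strip"
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits: position along the strip
    return high, low


def utf16_units(text):
    """Return the 16-bit code units Python's UTF-16 encoder produces for text."""
    data = text.encode("utf-16-be")        # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


# n characters that all share the same high surrogate (the same "strip").
n = 5
text = "".join(chr(0x10400 + i) for i in range(n))
units = utf16_units(text)

print(len(units))                  # number of 16-bit units used for n characters
print([hex(u) for u in units])     # the units themselves, for inspection
```

Under these assumptions the count comes out as 2n rather than n+1: the high surrogate 0xD801 is repeated before every low surrogate, so the "strip" does not persist from one character to the next. Each strip in this scheme is 1024 positions long, since the low surrogate carries 10 bits.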
William Overington 22 February 2001