On Jun 28, 2011, at 10:27 AM, Guy Harris wrote:

>       when putting them into a textual representation of the protocol tree or 
> into columns or something else to be shown to humans, map them to UTF-8, with 
> anything that can't be mapped to UTF-8 - including, if the encoding is 
> putatively UTF-8, octet sequences that aren't valid UTF-8 sequences - shown 
> as the Unicode replacement character U+FFFD;

...and, for "for display" conversions, we might want to convert control 
characters to "Control Pictures" symbols: 0x0000 to 0x001F convert to 0x2400 
to 0x241F (␀, ␁, etc. through ␟), and 0x007F converts to 0x2421 (␡).  In the 
font in which this message is being displayed to me, those have the control 
character abbreviations displayed in really small letters, running diagonally 
from upper left to lower right.  Unfortunately, I see nothing for C1 control 
characters.
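
A minimal sketch of the two "for display" steps, in Python purely for 
illustration (this is not Wireshark code; the names for_display and 
CONTROL_PICTURES are made up for this example).  It assumes the input is a 
byte string in a known encoding, replaces anything undecodable with U+FFFD, 
then maps C0 controls and DEL to Control Pictures, leaving C1 controls alone:

```python
# Hypothetical helper names; a sketch, not an implementation from Wireshark.

# C0 controls 0x00..0x1F map to U+2400..U+241F; DEL (0x7F) maps to U+2421.
CONTROL_PICTURES = {code: 0x2400 + code for code in range(0x20)}
CONTROL_PICTURES[0x7F] = 0x2421


def for_display(raw: bytes, encoding: str = "utf-8") -> str:
    """Return a human-displayable string for raw bytes.

    Byte sequences that are invalid in the given encoding become U+FFFD,
    and C0 control characters plus DEL become Control Pictures symbols.
    C1 controls (U+0080..U+009F) are passed through unchanged.
    """
    # errors="replace" substitutes U+FFFD for undecodable input.
    text = raw.decode(encoding, errors="replace")
    # str.translate accepts a dict mapping code points to code points.
    return text.translate(CONTROL_PICTURES)


if __name__ == "__main__":
    print(for_display(b"GET /\r\n\x00\x7f\xff"))
```

Note that this also turns TAB, CR and LF into symbols, which is probably what 
you want in a column or a single-line tree item but perhaps not in a 
multi-line text pane.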
___________________________________________________________________________
Sent via:    Wireshark-dev mailing list <wireshark-dev@wireshark.org>
Archives:    http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
             mailto:wireshark-dev-requ...@wireshark.org?subject=unsubscribe
