The right argument 30101 is an integer, not literal2. 7 u: returns utf16, not literal2. utf16 has surrogate pairs, so the result of 7 u: must be rank-1. utf16 is an encoding, not a J data type.
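To make the surrogate-pair point concrete, here is a minimal sketch in Python (not J, and not a description of how J implements 7 u:) of the standard UTF-16 encoding rule. The function name utf16_code_units is made up for this illustration. A code point below 65536 yields one code unit and a code point above it yields two, which is why a result in utf16 has to be a list even when it happens to hold a single item.

```python
def utf16_code_units(code_point):
    """Return the UTF-16 code units for one Unicode code point as a list of ints.

    Hypothetical helper for illustration; it follows the standard UTF-16 rule.
    """
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %r" % (code_point,))
    if code_point < 0x10000:
        # Basic Multilingual Plane: one code unit, still returned as a list.
        return [code_point]
    # Supplementary plane: split the 20-bit offset into a high and a low surrogate.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)
    low = 0xDC00 + (offset & 0x3FF)
    return [high, low]


print(utf16_code_units(30101))   # [30101]
print(utf16_code_units(128512))  # [55357, 56832]
```

The list of length 1 for 30101 corresponds to the rank-1 result seen with 7 u: 30101, and the list of length 2 for 128512 is the surrogate pair.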
4 u: returns literal2 (a J data type), to which the concept of surrogate pairs does not apply. A literal2 value can be an atom. Try 7 u: 128512 to confirm that the result is a surrogate pair. Also, 9 u: 128512 is a literal4 atom.

Pre-j805, 7 u: with an integer argument is a domain error, so the behavior of j805 is incompatible. There will be a global parameter to restore the domain error so that it becomes compatible again. The same applies to 8 u: with an integer argument.

Pre-j805 only supports literal2. utf16 was first introduced in j805. Your confusion might come from mixing up literal2 and utf16.

On 2 Apr, 2017 12:55 am, "robert therriault" <[email protected]> wrote:

   u: 30101
疕
   datatype u: 30101
unicode
   $ u: 30101

   #$ u: 30101
0  NB. unicode (literal2) atom as expected

   4 u: 30101
疕
   datatype 4 u: 30101
unicode
   $ 4 u: 30101

   #$ 4 u: 30101
0  NB. unicode (literal2) atom as expected

   7 u: 30101
疕
   datatype 7 u: 30101
unicode
   $ 7 u: 30101
1  NB. unicode (literal2) list of length 1 is unexpected
   #$ 7 u: 30101
1  NB. rank 1 is unexpected

The dictionary suggests that with a right argument of literal2, if all values are <128 the result is converted to ASCII, and otherwise it is left as is. [0]

I believe that since the argument is > 128 the 'as is' case would apply and that no change in shape should occur, but Unicode is a tricky beast and I welcome enlightenment.

Cheers, bob

[0] http://www.jsoftware.com/help/dictionary/duco.htm

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
