Thanks Bill,
If the utf8 is at most 3 bytes that takes a layer of checking out of my utf
verb.
utf_vts_
3 : 0"1
if. y-:'' do. return. end.
try. ((utf@:((1<.#)}.]));~((3 u: ":)@: (7 u: a.{~ (1<.#) {. ]))) y
catch. try. ((utf@:((2<.#)}.]));~((3 u: ":)@: (7 u: a.{~ (2<.#) {. ]))) y
cat
I see. But J can only handle Unicode in the BMP, i.e. code points below
65536, which are at most 3 bytes in UTF-8.
u: 65536
|index error
| u:65536
Also the display width of a unicode character can vary from 0 to 2.
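A minimal sketch of the 3-byte point, for anyone who wants to check it. It assumes that 4 u: turns an integer code point into a 2-byte character and that 8 u: encodes Unicode as UTF-8; utf8bytes is a hypothetical helper name, not something from this thread. For 3101 (U+0C1D, the Telugu letter used elsewhere in the thread) the result should be the same three bytes that a. i. reports for that letter.

```j
NB. utf8bytes is a hypothetical helper name, used only for illustration
NB. assumes: 4 u: integer -> 2-byte character, 8 u: Unicode -> UTF-8 bytes
utf8bytes =: 3 : 'a. i. 8 u: 4 u: y'

utf8bytes 97       NB. code point of a, expected to be a single byte
utf8bytes 3101     NB. U+0C1D, expected to be three bytes
# utf8bytes 3101   NB. byte count for that code point
```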
Fri, 17 Jun 2016, robert therriault wrote:
> Yes there are certainly illegal utf8
Yes there are certainly illegal utf8 characters in 8 6$ 'ఝ' ,'a','ఝ', but what
I am attempting is to reveal the illegal characters for what they are, along
the lines of the shape and type display that I had used incorporating svg.
Once I have that information in a format that I can separate the i
But your s contains illegal utf8 characters.
isutf8=: 1:@(7&u:) ::0:
isutf8 'ఝ' ,'a','ఝ'
1
isutf8"1[ 8 6$ 'ఝ' ,'a','ఝ'
0 0 0 1 0 0 0 0
isutf8"1[ 8 7$ 'ఝ' ,'a','ఝ'
1 1 1 1 1 1 1 1
This is because the 3-character string is 7 bytes in UTF-8:
a.i.'ఝ' ,'a','ఝ'
224 176 157 97 224 176 157
8 6 $
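To see why only one row of the 6-wide table passes, it may help to look at the raw bytes of individual rows. This is only a sketch using the string from the thread, and it assumes the session or script is UTF-8 encoded so that the literal is the 7-byte string shown above. Reshaping 7 bytes cyclically into rows of 6 shifts each row's starting byte, so most rows begin or end partway through a 3-byte character.

```j
s =: 'ఝ' , 'a' , 'ఝ'
# s                    NB. 7 bytes for 3 characters
a. i. 0 { 8 6 $ s      NB. row 0: the last character is cut off after 2 of its 3 bytes
a. i. 3 { 8 6 $ s      NB. row 3: two complete 3-byte characters
```

With rows of 7 bytes, every row starts at the beginning of the string, which is consistent with all eight rows testing as valid.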
Thanks for all the suggestions everyone.
In the end I took a more explicit approach than I normally would, but it seems
to work.
I am not sure if this is useful for Henry, but it is one approach.
[s=. 8 6 $ 'ఝ' ,'a','ఝ'
ఝa��
�ఝa�
��ఝa
ఝఝ
aఝ��
�aఝ�
��aఝ
ఝa��
boxutf s
┌───┬─
Could using CTRL-SHIFT-V fix the formatting problem?
On Jun 17, 2016 12:47 PM, "Brian Schott" wrote:
> Thanks. I see.
> So your main purpose is to add this spacing "color" for easier human
> readability, right?
Thanks. I see.
So your main purpose is to add this spacing "color" for easier human
readability, right?
--
For information about J forums see http://www.jsoftware.com/forums.htm
The current linear representation (formatting feedback) for mean is
(+/ % #)
I propose that there be an extra space between % and #. Using _ to stand for
the extra space, since email formatting eats spaces:
(+/ %_ #)
A hook (f g) would get represented as
(f_ g)
The complex examples I gave earlier were meant to represent cases
Pascal,
I would perhaps understand your proposal better if you gave paired examples
of the present result and the proposed result and maybe pairs of present
inputs and proposed inputs if they differ, also.
--
(B=)
The basic proposal is that, for a dyadic verb, an extra space (2 in total)
appears between the verb and its y operand.
The formatting would apply to specific tines in forks and hooks as well.
lr/formatter is most useful in returning tacit definitions, but returns it in a
formatted way. (linearized s
Since 36 characters are available to write digits, we would not expect a
number written in a base to be extended: the trailing x is read as a digit.
10bx
33
datatype 10bx
integer
10b.x NB. floating point
3.3
J probably could permit an extended vector
{:0x 16b53973c0482de06eddf68d4531e9062c7
|ill-formed numbe
It seems that there are no extended hexadecimal literals in J.
The decimal ones work OK. Let's convert a long integer to its hex
representation:
'0123456789abcdef'{~16#.^:_1 (1111111x)
53973c0482de06eddf68d4531e9062c7
16b53973c0482de06eddf68d4531e9062c7 N
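One possible workaround, sketched here rather than taken from the thread, is to skip the 16b literal and evaluate the digits in base 16 with an extended base, so the result stays exact. The names hexdigits, h and n are hypothetical, and the sketch assumes lowercase hex digits only.

```j
NB. hexdigits, h and n are hypothetical names used only for illustration
hexdigits =: '0123456789abcdef'
h =: '53973c0482de06eddf68d4531e9062c7'

n =: 16x #. hexdigits i. h           NB. extended base gives an extended integer
datatype n                           NB. report the type of the result
h -: hexdigits {~ 16 #.^:_1 n        NB. round trip back to the hex string
```

The last line reuses the conversion from the message above, so it doubles as a check that no precision was lost along the way.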