Re: [Jprogramming] Unicode (UTF8) string deconstruction

2016-06-17 Thread robert therriault
Thanks Bill. If the utf8 is at most 3 bytes, that takes a layer of checking out of my utf verb. utf_vts_ 3 : 0"1 if. y-:'' do. return. end. try. ((utf@:((1<.#)}.]));~((3 u: ":)@: (7 u: a.{~ (1<.#) {. ]))) y catch. try. ((utf@:((2<.#)}.]));~((3 u: ":)@: (7 u: a.{~ (2<.#) {. ]))) y cat

Re: [Jprogramming] Unicode (UTF8) string deconstruction

2016-06-17 Thread bill lam
I see. But J can only handle unicode in the bmp, i.e. codepoints below 65536, which are at most 3 bytes in utf8. u: 65536 |index error | u:65536 Also the display width of a unicode character can vary from 0 to 2. On Fri, 17 Jun 2016, robert therriault wrote: > Yes there are certainly illegal utf8
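The claim that BMP code points need at most 3 bytes in UTF-8 can be checked at the boundaries. The following is a minimal sketch in Python rather than J; `utf8_len` is a hypothetical helper name, not something from the thread.

```python
# Hypothetical helper (not from the thread): UTF-8 length of one code point.
def utf8_len(codepoint):
    return len(chr(codepoint).encode("utf-8"))

# Boundary code points where the encoded length changes.
# U+FFFF is the last BMP code point; U+10000 is the first one outside it.
for cp in (0x7F, 0x80, 0x7FF, 0x800, 0xFFFF, 0x10000):
    print(f"U+{cp:04X}: {utf8_len(cp)} byte(s)")
```

The last BMP code point still fits in 3 bytes, and the first code point past the BMP is the first to need 4.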

Re: [Jprogramming] Unicode (UTF8) string deconstruction

2016-06-17 Thread robert therriault
Yes there are certainly illegal utf8 characters in 8 6$ 'ఝ' ,'a','ఝ', but what I am attempting is to reveal the illegal characters for what they are. Along the lines of the shape and type display that I had used incorporating svg. Once I have that information in a format that I can separate the i

Re: [Jprogramming] Unicode (UTF8) string deconstruction

2016-06-17 Thread bill lam
But your s contains illegal utf8 characters. isutf8=: 1:@(7&u:) ::0: isutf8 'ఝ' ,'a','ఝ' 1 isutf8"1[ 8 6$ 'ఝ' ,'a','ఝ' 0 0 0 1 0 0 0 0 isutf8"1[ 8 7$ 'ఝ' ,'a','ఝ' 1 1 1 1 1 1 1 1 Since the 3-character string is 7 bytes in utf8 a.i.'ఝ' ,'a','ఝ' 224 176 157 97 224 176 157 8 6 $
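The effect described above (reshaping a 7-byte string into rows of 6 cuts multi-byte sequences, while rows of 7 keep them whole) can be reproduced outside J. This is a sketch in Python, assuming a cyclic fill like J's dyadic `$`; `reshape_rows` and `is_utf8` are illustrative names, not the thread's verbs.

```python
from itertools import cycle, islice

def reshape_rows(data, rows, cols):
    """Cyclically repeat the bytes to fill rows*cols, then split into rows."""
    flat = bytes(islice(cycle(data), rows * cols))
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

def is_utf8(row):
    """Return 1 if the row decodes as strict UTF-8, else 0."""
    try:
        row.decode("utf-8")
        return 1
    except UnicodeDecodeError:
        return 0

s = "ఝaఝ".encode("utf-8")
print(list(s))                                     # the 7 bytes of the string
print([is_utf8(r) for r in reshape_rows(s, 8, 6)]) # rows of 6 split sequences
print([is_utf8(r) for r in reshape_rows(s, 8, 7)]) # rows of 7 stay aligned
```

With rows of 6, only the row that happens to start on a character boundary and hold two complete 3-byte sequences is valid; with rows of 7 every row is the original string.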

Re: [Jprogramming] Unicode (UTF8) string deconstruction

2016-06-17 Thread robert therriault
Thanks for all the suggestions everyone. In the end I took a more explicit approach than I normally would, but it seems to work. I am not sure if this is useful for Henry, but it is one approach. [s=. 8 6 $ 'ఝ' ,'a','ఝ' ఝa�� �ఝa� ��ఝa ఝఝ aఝ�� �aఝ� ��aఝ ఝa�� boxutf s ┌───┬─

Re: [Jprogramming] linear representation/ formatter recommended change

2016-06-17 Thread Don Guinn
Could using CTRL-SHIFT-V fix the formatting problem? On Jun 17, 2016 12:47 PM, "Brian Schott" wrote: > Thanks. I see. > So your main purpose to add this spacing "color" for easier human > readability? Right. > -- > For informatio

Re: [Jprogramming] linear representation/ formatter recommended change

2016-06-17 Thread Brian Schott
Thanks. I see. So is your main purpose to add this spacing "color" for easier human readability? Right. -- For information about J forums see http://www.jsoftware.com/forums.htm

Re: [Jprogramming] linear representation/ formatter recommended change

2016-06-17 Thread 'Pascal Jasmin' via Programming
The current linear representation/formatting feedback for mean is (+/ % #). I propose that there be an extra space between % and #. Using _ for the extra space, since email formatting eats spaces: (+/ %_ #). A hook (f g) would get represented as (f_ g). The complex examples I gave earlier were meant to represent cases

Re: [Jprogramming] linear representation/ formatter recommended change

2016-06-17 Thread Brian Schott
Pascal, I would perhaps understand your proposal better if you gave paired examples of the present result and the proposed result and maybe pairs of present inputs and proposed inputs if they differ, also. -- (B=)

[Jprogramming] linear representation/ formatter recommended change

2016-06-17 Thread 'Pascal Jasmin' via Programming
The basic proposal is that, for dyadic verbs, an extra space (2 total) appears between the verb and its y operand. The formatting would apply to specific tines in forks and hooks as well. lr/formatter is most useful in returning tacit definitions, but returns it in a formatted way. (linearized s

Re: [Jprogramming] Fwd: 16bHHHHHHHH...HHx - no way

2016-06-17 Thread David Lambert
Since 36 characters are available to write digits, we would not expect a number written in a base to be extended. 10bx 33 datatype 10bx integer 10b.x NB. floating point 3.3 J probably could permit an extended vector {:0x 16b53973c0482de06eddf68d4531e9062c7 |ill-formed numbe

[Jprogramming] Fwd: 16bHHHHHHHH...HHx - no way

2016-06-17 Thread Moon S
It seems that there are no extended hexadecimal literals in J. The decimal ones work ok. Let's convert a long integer to its hex representation '0123456789abcdef'{~16#.^:_1 (1111111x) 53973c0482de06eddf68d4531e9062c7 16b53973c0482de06eddf68d4531e9062c7 N
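The digit-expansion step above (base-16 digits of a long integer, then indexing into a digit alphabet) can be sketched in Python, whose integers are arbitrary precision. The names `to_base_digits` and `to_hex` are hypothetical; the hex string is the one quoted in the message, and the sketch only checks that it round-trips.

```python
HEX_DIGITS = "0123456789abcdef"

def to_base_digits(n, base=16):
    """Digits of a non-negative integer in the given base, most significant first."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, d = divmod(n, base)
        digits.append(d)
    return digits[::-1]

def to_hex(n):
    """Index each base-16 digit into the digit alphabet."""
    return "".join(HEX_DIGITS[d] for d in to_base_digits(n))

hex_text = "53973c0482de06eddf68d4531e9062c7"  # hex string quoted in the thread
n = int(hex_text, 16)           # parse the long hex literal exactly
print(n.bit_length())           # number of bits needed to hold the value
print(to_hex(n) == hex_text)    # round trip through the digit expansion
```

Parsing with `int(text, 16)` keeps every digit, so a value wider than 64 bits survives the round trip without loss.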