They do it that way because it's easier. UTF-8 is an ingenious way to
store text compactly and still maintain compatibility with ASCII. But it
behaves like a compressed format, much like ZIP: it is a variable-width
encoding, so the position of a character is not the position of a byte.
UTF-8 is a great way to store, display and transmit text, but it's tough
to do much in the way of manipulation. No one in their right mind would
try to manipulate data inside a ZIP file. One would unzip the file, do the
manipulation, then zip it back. It's the same with UTF-8. Even counting
characters is awkward without first converting to wide (16-bit) or 32-bit
characters.
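
To make the byte-versus-character point concrete, here is a rough sketch in
Python rather than J. It is only an illustration: Python bytes stand in for a
UTF-8 literal, a Python str stands in for wide characters, and the sample text
and index are made up.

```python
# Illustration only: bytes play the role of UTF-8 text, str the role of
# wide (decoded) characters. The sample text is invented for the example.
text = "┌─┐ab"                # three line-drawing characters, then ASCII
utf8 = text.encode("utf-8")   # the form used for storage and transmission

byte_count = len(utf8)                   # counts bytes
char_count = len(utf8.decode("utf-8"))   # counts characters after decoding

# A selection index given in characters (here 3, the letter "a") does not
# address the same position in the UTF-8 bytes.
char_index = 3
byte_offset = len(text[:char_index].encode("utf-8"))

print(byte_count, char_count, byte_offset)
```

Each line-drawing character takes three bytes in UTF-8, so the two counts
disagree and the character index has to be translated before it can be used
on the bytes.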

Look how hard it was to display boxed UTF-8 properly. Yet the old code
would probably have worked by simply converting to wide, building the
boxed display, then converting back to UTF-8.
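
A minimal sketch of that convert, build, convert-back pattern, again in Python
and not the actual J display code. The function name boxed is hypothetical, and
it assumes every character occupies one display column.

```python
def boxed(utf8_text: bytes) -> bytes:
    """Draw a box around UTF-8 text by working on decoded characters.

    Hypothetical helper for illustration; assumes one column per character.
    """
    lines = utf8_text.decode("utf-8").split("\n")  # "unzip" to characters
    width = max(len(line) for line in lines)       # width in characters, not bytes
    top = "┌" + "─" * width + "┐"
    bottom = "└" + "─" * width + "┘"
    body = ["│" + line.ljust(width) + "│" for line in lines]
    return "\n".join([top, *body, bottom]).encode("utf-8")  # back to UTF-8


print(boxed("é\nabc".encode("utf-8")).decode("utf-8"))
```

Because the padding is computed on characters, a line containing a multi-byte
character lines up with a pure ASCII line.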

Look at the verb "ucpcount". It converts to wide to get the count because
it is easy. By the way, it will need to be updated to "#@(9&u:)" when 805
becomes production, along with several other verbs starting with "u".

On Tue, Oct 25, 2016 at 1:02 AM, bill lam <[email protected]> wrote:

> Perhaps for efficiency. Just my guess.
>
> Mon, 24 Oct 2016, Henry Rich wrote:
> > Not mentioned, because I have no idea what the reasons might be.  I was just
> > trying to describe the behavior.  If you know the reasons, pray reveal them.
> >
> > Henry Rich
> >
> > On 10/24/2016 8:59 PM, bill lam wrote:
> > > While the behavior is well documented, I suspect there should be
> > > some specific reasons for preferring character index instead of
> > > byte index, but that was not mentioned.
> > >
> > > Mon, 24 Oct 2016, Henry Rich wrote:
> > > > It's already documented in the note at the end of
> > > >
> > > > http://code.jsoftware.com/wiki/Guides/Window_Driver/ChildClasses#edit
> > > >
> > > > & other controls.  I'm not saying that the info might not profitably be
> > > > repeated elsewhere.
> > > >
> > > > Henry Rich
> > > >
> > > > On 10/24/2016 11:32 AM, Don Guinn wrote:
> > > > > Added to the bug list. I vote for leaving it as it is and documenting the
> > > > > way it works. It's not that hard to deal with once one is aware of how it
> > > > > works.
> > > > >
> > > > > On Mon, Oct 24, 2016 at 5:30 AM, chris burke <[email protected]> wrote:
> > > > >
> > > > > > I agree with Don in that this just needs documenting, and is not a bug.
> > > > > >
> > > > > > On 24 October 2016 at 05:45, Bill <[email protected]> wrote:
> > > > > >
> > > > > > > Qt works on unicode characters, but j uses utf8 for interfacing with Qt so
> > > > > > > that I think this issue is a bug, it should report byte offset.  Please
> > > > > > > file a bug in jwiki for record.
> > > > > > >
> > > > > > > On 24 Oct, 2016, at 1:28 PM, Don Guinn <[email protected]> wrote:
> > > > > > >
> > > > > > > > The return from (wd 'sm get edit') contains the text behind the window and
> > > > > > > > select showing a selection. The text is literal and the select gives
> > > > > > > > indices of selected characters. If text contains any unicode characters,
> > > > > > > > like line drawing characters the indices are offset. The indices are not in
> > > > > > > > bytes, but characters. Converting the text to either literal2 or literal4
> > > > > > > > makes the select indices correct.
> > > > > > > >
> > > > > > > >
> > > > > > > > I don't think that this is an error, but one must convert the text to
> > > > > > > > literal2 or literal4 to get the index to work. Should the index reflect the
> > > > > > > > U8 selection index instead of the literal2 or literal4 indices? A little
> > > > > > > > inconsistency.
> > > > > > > >
> > > > > > > >
> > > > > > > > I haven't been able to find where this kind of information is covered, but
> > > > > > > > it may be there somewhere. Is this the way it's supposed to work? If so,
> > > > > > > > not a problem, but it needs to be documented. I learned how it works by
> > > > > > > > experimenting. There is no Discussion on the Window_Driver/Session_Manager
> > > > > > > > page, I can put in it what I have found so far, but I suspect that there is
> > > > > > > > a lot more that should be covered.
> > > > > > > > ----------------------------------------------------------------------
> > > > > > > > For information about J forums see http://www.jsoftware.com/forums.htm
> >
>
> --
> regards,
> ====================================================
> GPG key 1024D/4434BAB3 2008-08-24
> gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3
> gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3
>