Re: Unicode String Models

Mark Davis ☕️ via Unicode Sun, 09 Sep 2018 01:04:30 -0700

Thanks, excellent comments. While it is clear that some string models have
more complicated structures (with their own pros and cons), my focus was on
simple internal structures. The focus was also on immutable strings — and
the tradeoffs for mutable ones can be quite different — and that needs to
be clearer. I'll add some material about those two areas (with pointers to
sources where possible).


Mark


On Sat, Sep 8, 2018 at 9:20 PM John Cowan <[email protected]> wrote:

> This paper makes the default assumption that the internal storage of a
> string is a featureless array.  If this assumption is abandoned, it is
> possible to get O(1) indexes with fairly low space overhead.  The Scheme
> language has recently adopted immutable strings called "texts" as a
> supplement to its pre-existing mutable strings, and the sample
> implementation for this feature uses a vector of either native strings or
> bytevectors (char[] vectors in C/Java terms).  I would urge anyone
> interested in the question of storing and accessing immutable strings to read
> the following parts of SRFI 135 at <
> https://srfi.schemers.org/srfi-135/srfi-135.html>:  Abstract, Rationale,
> Specification / Basic concepts, and Implementation.  In addition, the
> design notes at <https://github.com/larcenists/larceny/wiki/ImmutableTexts>,
> though not up to date (in particular, UTF-16 internals are now allowed as
> an alternative to UTF-8), are of interest: unfortunately, the link to the
> span API has rotted.
>
> On Sat, Sep 8, 2018 at 12:53 PM Mark Davis ☕️ via Unicore <
> [email protected]> wrote:
>
>> I recently did some extensive revisions of a paper on Unicode string
>> models (APIs). Comments are welcome.
>>
>>
>> https://docs.google.com/document/d/1wuzzMOvKOJw93SWZAqoim1VUl9mloUxE0W6Ki_G23tw/edit#
>>
>> Mark
>>
>
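
For readers following the thread, here is a minimal sketch of the idea John describes: once the string is no longer a featureless array, a vector of fixed-size chunks gives O(1) indexing by code point. This is not the SRFI 135 sample implementation; the `Text` class name, the `CHUNK` size of 16, and the rule of storing ASCII-only chunks as bytes are all assumptions made for illustration. Python's own `str` already indexes in O(1), so the sketch only shows the shape of the structure.

```python
from typing import List, Union

CHUNK = 16  # code points per chunk; an arbitrary illustrative size


class Text:
    """Immutable text stored as a vector of fixed-size chunks (hypothetical)."""

    def __init__(self, s: str) -> None:
        chunks: List[Union[bytes, str]] = []
        for start in range(0, len(s), CHUNK):
            piece = s[start:start + CHUNK]
            # ASCII-only chunks are stored compactly as bytes (one byte per
            # code point); all other chunks stay as native strings.
            chunks.append(piece.encode("ascii") if piece.isascii() else piece)
        self._chunks = tuple(chunks)  # tuple: the text is never mutated
        self._length = len(s)

    def __len__(self) -> int:
        return self._length

    def __getitem__(self, i: int) -> str:
        if not 0 <= i < self._length:
            raise IndexError("text index out of range")
        # Every chunk except the last holds exactly CHUNK code points, so the
        # chunk number and the offset within it follow from one division.
        chunk = self._chunks[i // CHUNK]
        offset = i % CHUNK
        if isinstance(chunk, bytes):
            return chr(chunk[offset])
        return chunk[offset]
```

With this layout, `Text("some string")[i]` costs one division, one vector lookup, and one lookup inside the chunk, regardless of how long the text is. The space overhead is the chunk vector plus whatever each non-ASCII chunk costs in its native form.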
