On Dec 16, 2006, at 4:55 AM, Bryan Jurish wrote:
morning,
On 2006-12-16 01:40:03, Mathieu Bouchard <[EMAIL PROTECTED]>
appears to have written:
On Fri, 15 Dec 2006, Hans-Christoph Steiner wrote:
An advantage of the list-of-bytes approach is that, because each
character can be represented by a rather large integer, it can
quickly be extended to work on lists of characters, given a
[utf8decode] and [utf8encode] to turn bytes into characters and
back; also it's a method that is available now and reuses the
existing list objects; and it's a method that supports \0 (NUL)
characters.
Disadvantages are that it takes more time to convert to C strings
and back, it isn't readable as text in .pd files, it takes up to 4
times more space in .pd files and exactly 4 times more space in RAM
(in the case that just iso-latin-1 is used), and you can't make
lists of strings like that.
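
To make the bytes-versus-characters distinction concrete, here is a minimal sketch in Python (not Pd). The function names only mirror the [utf8decode] and [utf8encode] objects mentioned above; they are hypothetical stand-ins and say nothing about how such objects would actually be implemented.

```python
def string_to_bytes(text):
    """Encode text as a list of UTF-8 byte values (each 0-255)."""
    return list(text.encode("utf-8"))


def utf8decode(byte_list):
    """Turn a list of UTF-8 bytes into a list of character code points."""
    return [ord(ch) for ch in bytes(byte_list).decode("utf-8")]


def utf8encode(codepoints):
    """Turn a list of character code points back into UTF-8 bytes."""
    return list("".join(chr(cp) for cp in codepoints).encode("utf-8"))


# "café" followed by a NUL and "!": the NUL is just another list element.
byte_list = string_to_bytes("caf\u00e9\x00!")
chars = utf8decode(byte_list)
round_trip = utf8encode(chars)
```

The byte list has one more element than the character list, because the accented character takes two bytes in UTF-8 but is a single code point.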
i count (sizeof(int)+sizeof(float)-1)*strlen(message) wasted bytes
per string object, not counting the selector. as i think we've
discussed before, using ieee floats, which should be able to
losslessly encode a 24 bit integer, that can be tweaked down to
(sizeof(int)+sizeof(float)-1)*strlen(message)/3 on average, but on
my system (32 bit floats), that still amounts to one wasted byte
per character for the representation, and it's hellishly cryptic to
boot.
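
The 24-bit trick mentioned above can be sketched as follows, in Python rather than Pd or C. This assumes 32-bit IEEE floats, which represent every integer up to 2^24 exactly; the helper names (pack3, unpack3, through_float32) are made up for illustration, and the original length has to be carried separately because the last group is zero-padded.

```python
import struct


def through_float32(value):
    """Round-trip a number through a 32-bit IEEE float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def pack3(byte_list):
    """Pack three bytes into each value, as a 24-bit integer."""
    padded = byte_list + [0] * (-len(byte_list) % 3)
    return [
        float((padded[i] << 16) | (padded[i + 1] << 8) | padded[i + 2])
        for i in range(0, len(padded), 3)
    ]


def unpack3(floats, length):
    """Recover the original bytes; `length` trims the zero padding."""
    out = []
    for f in floats:
        n = int(f)
        out += [(n >> 16) & 0xFF, (n >> 8) & 0xFF, n & 0xFF]
    return out[:length]


message = list(b"hello pd")
stored = [through_float32(v) for v in pack3(message)]
recovered = unpack3(stored, len(message))
```

Eight bytes end up in three float values here instead of eight, at the price of a representation that is even harder to read.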
(By the time we can have real strings, we can have nested lists,
and the other way around, because they'd use the same mechanisms.
Whether it's better to make them two types or one type is a good
question.)
... but then again, what else are ascii 0x1c-0x1f (28-31 =
{fs,gs,rs,us}) for? it's another ugly hack, would reserve some of
the ascii range, and would require additional parsing objects
(potentially constructable with [list]), but it's a possibility,
should anyone actually need nested lists as strings...
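
A rough sketch of that separator hack, in Python rather than as Pd objects: RS (0x1e) splits the flat byte list into sublists and US (0x1f) splits each sublist into items. Which separator plays which role is an assumption made here for illustration, as are all the function names; the items themselves must not contain the reserved bytes.

```python
RS, US = 0x1E, 0x1F  # ASCII record separator and unit separator


def split_on(byte_list, sep):
    """Split a flat list of bytes at every occurrence of `sep`."""
    parts, current = [], []
    for b in byte_list:
        if b == sep:
            parts.append(current)
            current = []
        else:
            current.append(b)
    parts.append(current)
    return parts


def encode_nested(records):
    """Flatten a list of lists of ASCII strings into one byte list."""
    flat = []
    for i, record in enumerate(records):
        if i:
            flat.append(RS)
        for j, field in enumerate(record):
            if j:
                flat.append(US)
            flat.extend(field.encode("ascii"))
    return flat


def decode_nested(flat):
    """Inverse of encode_nested."""
    return [
        [bytes(field).decode("ascii") for field in split_on(record, US)]
        for record in split_on(flat, RS)
    ]


nested = [["osc", "send"], ["freq", "440"]]
flat = encode_nested(nested)
decoded = decode_nested(flat)
```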
please don't get me wrong: i'm all in favor of "real" strings,
nested lists, and associative arrays - i wrote [pdstring] because i
needed to send some generated text over OSC to someone who could
only interpret ascii values: i'm glad if it's helpful to anyone
besides myself, and i don't see much difficulty in adding support
for low-level c-type string operations ([toupper], [tolower], at
some later point maybe even regexes), but i can't bring myself to
believe that the list-of-bytes approach is really the "right" way
to do it, although i don't have a better idea at the moment...
One advantage of this approach is that many C string functions like
toupper, tolower, strcat, strcmp, etc. would be pretty easy to
implement in Pd, rather than C. A regexp object in C would be pretty
straightforward.
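
To give an idea of how little is involved, here are list-of-bytes versions of a few of those functions, sketched in Python as a stand-in for the per-element arithmetic a Pd patch would do. They assume plain ASCII, and the names are made up for this example.

```python
def list_toupper(byte_list):
    """Map ASCII a-z (97-122) to A-Z; leave every other byte alone."""
    return [b - 32 if 97 <= b <= 122 else b for b in byte_list]


def list_tolower(byte_list):
    """Map ASCII A-Z (65-90) to a-z; leave every other byte alone."""
    return [b + 32 if 65 <= b <= 90 else b for b in byte_list]


def list_strcat(a, b):
    """Concatenation is just list append."""
    return a + b


def list_strcmp(a, b):
    """Negative, zero, or positive, in the manner of C's strcmp."""
    for x, y in zip(a, b):
        if x != y:
            return x - y
    return len(a) - len(b)


hello = list(b"Hello")
upper = list_toupper(hello)
lower = list_tolower(hello)
joined = list_strcat(hello, list(b" Pd"))
```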
How about using a selector "string" for these lists? I suppose that
could cause mayhem since it would make the list into a selector
series and run into all the vagaries of handling them.
.hc
------------------------------------------------------------------------
Man has survived hitherto because he was too ignorant to know how to
realize his wishes. Now that he can realize them, he must either
change them, or perish. -William Carlos Williams
_______________________________________________
PD-dev mailing list
PD-dev@iem.at
http://lists.puredata.info/listinfo/pd-dev