On Fri, 16 Jan 2009, Bryan Jurish wrote:

UTF-8 also does a pretty good job of compactly representing latin
character sets for natural language data, where non-ASCII characters
tend to be relatively infrequent anyways.  UTF-16 and UTF-32 are pretty
wasteful in these cases.  (Of course, I'm biting my own tail with this
point, since the [pdstring] representation is even more wasteful than
UTF-32 ;-)

Well, RAM is on sale at a very big mail-order store, where you can get 4096 MB as two sticks of DDR2-800 memory for 29,99 CAD, which is 18,24 EUR.

I don't think that the goal is to be compact, nor that you really have much choice here. The goal is for Pd users to be able to manipulate the characters of a string however they want, in a way that is fairly easy to use (well, at least that's the goal I infer from the idea of using lists as strings!). If you then decide not to depend on any other Pd library and to leverage existing Pd 0.39+ as much as you can, you have to use Pd's lists, and then it's 64 or 128 bits per char (one atom per character).
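
To put rough numbers on that, here is a minimal sketch. It assumes an atom is a type tag plus a word-sized union; fake_atom is a hypothetical stand-in, not the real t_atom from m_pd.h, and both helper functions are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for a Pd atom: a type tag plus a union wide
   enough to hold a float or a pointer. This is NOT the real t_atom. */
typedef struct {
    int tag;
    union { float f; void *p; } w;
} fake_atom;

/* Bytes needed to hold n characters as a list, one atom per character. */
size_t list_string_bytes(size_t n) { return n * sizeof(fake_atom); }

/* Bytes needed for the same n characters in UTF-8 when all are ASCII. */
size_t utf8_ascii_bytes(size_t n) { return n; }
```

On typical 32-bit builds sizeof(fake_atom) comes out as 8 bytes, and on typical 64-bit builds as 16 bytes because of pointer size and padding, which is where the 64 or 128 bits per char comes from.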

And then, in theory, Pd could adopt any internal representation, as long as file I/O and socket I/O are done the way they need to be done.

... except if you're building or reading a persistent index for a
large file, in which case tell() & seek() are likely to be a wee bit
faster than parsing and counting variable-length-encoded characters ...
right.
... or calling malloc(), or doing pretty much any other low-level fiddly
stuff ...
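
The indexing cost mentioned above can be sketched as follows. This is only an illustration, assuming well-formed UTF-8 input; utf8_offset and utf32_offset are hypothetical helper names, not part of Pd or of any library.

```c
#include <stddef.h>

/* Byte offset of the n-th character (0-based) in a UTF-8 buffer.
   Assumes well-formed UTF-8. It has to scan from the start, so the
   cost grows with the offset. Returns len if n is past the end. */
size_t utf8_offset(const unsigned char *s, size_t len, size_t n)
{
    size_t i = 0;
    while (i < len && n > 0) {
        i++;                                    /* step over the lead byte */
        while (i < len && (s[i] & 0xC0) == 0x80)
            i++;                                /* step over continuation bytes */
        n--;
    }
    return i;
}

/* The same question for a fixed-width encoding such as UTF-32:
   a single multiplication, so a seek can jump straight there. */
size_t utf32_offset(size_t n) { return n * 4; }
```

With a fixed-width encoding the offset can be handed directly to a seek; with UTF-8 you either scan or keep a separate index.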

It doesn't matter much, as Pd patches wouldn't be calling malloc(). Furthermore, I expect that you have, or would have, a function for converting a list to a C string in the proper encoding, so that externals that want to use your strings don't have to do for(i=0;...) a[i]=b[i] all of the time. It would also be a good opportunity to introduce optional encoding conversion.
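
Such a conversion function might look roughly like the sketch below. It is not [pdstring]'s actual code: list_to_utf8 is a hypothetical name, a plain float array stands in for the atoms of a Pd list, and UTF-8 is picked as the target encoding only as an example.

```c
#include <stddef.h>

/* Encode n code points, held as floats the way a Pd list of numbers
   would hold them, into a NUL-terminated UTF-8 string in out.
   Returns the number of bytes written (not counting the NUL), or
   (size_t)-1 if out is too small or a value is not a valid code point. */
size_t list_to_utf8(const float *list, size_t n, char *out, size_t outsize)
{
    unsigned char *p = (unsigned char *)out;
    size_t o = 0;
    if (outsize == 0) return (size_t)-1;
    for (size_t i = 0; i < n; i++) {
        /* Range check written so that NaN is rejected too. */
        if (!(list[i] >= 0.0f && list[i] <= 1114111.0f)) return (size_t)-1;
        unsigned long c = (unsigned long)list[i];
        if (c >= 0xD800 && c <= 0xDFFF) return (size_t)-1;  /* surrogates */
        size_t need = c < 0x80 ? 1 : c < 0x800 ? 2 : c < 0x10000 ? 3 : 4;
        if (o + need + 1 > outsize) return (size_t)-1;      /* keep room for NUL */
        if (need == 1) {
            p[o++] = (unsigned char)c;
        } else if (need == 2) {
            p[o++] = (unsigned char)(0xC0 | (c >> 6));
            p[o++] = (unsigned char)(0x80 | (c & 0x3F));
        } else if (need == 3) {
            p[o++] = (unsigned char)(0xE0 | (c >> 12));
            p[o++] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
            p[o++] = (unsigned char)(0x80 | (c & 0x3F));
        } else {
            p[o++] = (unsigned char)(0xF0 | (c >> 18));
            p[o++] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
            p[o++] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
            p[o++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    p[o] = '\0';
    return o;
}
```

An extern would call it once per incoming list, e.g. list_to_utf8(values, count, buf, sizeof buf), instead of copying elements by hand. Other target encodings could be selected by an extra argument.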

 _ _ __ ___ _____ ________ _____________ _____________________ ...
| Mathieu Bouchard - tél:+1.514.383.3801, Montréal, Québec