[EMAIL PROTECTED] writes:
> I read "the backend is by and large an ASCII, null-terminated-string
> engine" with "we use UTF-8 [for varlena strings?]" as, a lot of the
> code assumes varlena strings are '\0' terminated, and an assumption
> on my part, that the varlena strings are not stored in the backend
> with a '\0' terminator, therefore, they require being copied out,
> terminated with a '\0', before they can be used?

There are places where we have to do that, the worst from a performance
viewpoint being in string comparison: we have to null-terminate both
values before we can pass them to strcoll().  One of the large bits that
would have to be done before we could even contemplate using UCS2/UCS4
is getting rid of our dependence on strcoll, since its API is
null-terminated-string.

> How much effort (past discussions that I've missed from a decade ago?
> hehe) has been put into determining whether a zero-copy architecture,
> or really, a minimum copy architecture, would address some of these
> bottlenecks? Am I dreaming? :-)

We've already done it in places, for instance the new implementation of
"virtual tuples" in TupleTableSlots eliminates a lot of copying of
pass-by-reference values.

			regards, tom lane
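
To make the copy-then-compare cost in the strcoll() case concrete, here is a minimal sketch in C. It is not the backend's code: CountedStr, counted_to_cstring and counted_strcoll are hypothetical names, and the struct is only a stand-in for a length-counted value whose bytes are not followed by '\0' (it does not reproduce the real varlena layout). Only malloc, memcpy, free and strcoll are real library calls.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a length-counted string: the bytes at
 * data[0..len-1] are NOT followed by a '\0' terminator. */
typedef struct
{
    size_t      len;
    const char *data;
} CountedStr;

/* Copy a counted string into a freshly allocated, null-terminated buffer.
 * Returns NULL if the allocation fails. */
static char *
counted_to_cstring(const CountedStr *s)
{
    char *buf = malloc(s->len + 1);

    if (buf == NULL)
        return NULL;
    memcpy(buf, s->data, s->len);
    buf[s->len] = '\0';
    return buf;
}

/* Locale-aware comparison of two counted strings.
 * strcoll() only accepts null-terminated input, so both operands are
 * copied and terminated first; that is two allocations and two copies
 * per comparison. *ok is set to 0 on allocation failure, 1 otherwise. */
static int
counted_strcoll(const CountedStr *a, const CountedStr *b, int *ok)
{
    char *abuf;
    char *bbuf;
    int   result = 0;

    assert(a != NULL && b != NULL && ok != NULL);

    abuf = counted_to_cstring(a);
    bbuf = counted_to_cstring(b);
    *ok = (abuf != NULL && bbuf != NULL);
    if (*ok)
        result = strcoll(abuf, bbuf);   /* uses the current LC_COLLATE */

    free(abuf);                         /* free(NULL) is a no-op */
    free(bbuf);
    return result;
}
```

A caller would pass two CountedStr values and check *ok before trusting the result. Note that a value containing an embedded '\0' would be compared only up to that byte, which is another consequence of the null-terminated API.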