Greg Ewing <[EMAIL PROTECTED]> wrote:
>
> Josiah Carlson wrote:
> > Because all text objects are internally
> > represented in their minimal 'encoding', equal text objects will always be
> > in the same encoding.
>
> That places a burden on all creators of strings to ensure
> that they are in the
Greg Ewing <[EMAIL PROTECTED]> writes:
> That places a burden on all creators of strings to ensure
> that they are in the minimal format, which could be
> inconvenient for some operations, e.g. taking a substring
> could require making an extra pass to re-code the data.
Yes, but taking a substrin
"Martin v. Löwis" <[EMAIL PROTECTED]> writes:
> You could play tricks with ob_size to save this field:
>
> - ob_size < 0: 8-bit data; length is abs(ob_size)
> - ob_size > 0, (ob_size & 1)==0: 16-bit data, length is ob_size/2
> - ob_size > 0, (ob_size & 1)==1: 32-bit data, length is ob_size/2
I wo
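
The ob_size convention quoted above can be sketched in C as follows. This is only an illustration under assumptions: the names StrLayout and decode_ob_size are hypothetical and do not come from the thread, and the quoted scheme does not say what ob_size == 0 means, so this sketch lets it fall into the 16-bit branch with length 0.

```c
#include <stddef.h>

/* Hypothetical result type: code-unit width in bytes plus length in units. */
typedef struct {
    int width;         /* 1, 2, or 4 bytes per code unit */
    ptrdiff_t length;  /* number of code units */
} StrLayout;

/* Decode the quoted convention:
 *   ob_size < 0            -> 8-bit data,  length is abs(ob_size)
 *   ob_size > 0, even      -> 16-bit data, length is ob_size / 2
 *   ob_size > 0, odd       -> 32-bit data, length is ob_size / 2
 * Assumption: ob_size == 0 is not covered by the quoted list; here it is
 * treated as an empty 16-bit string. Overflow of -ob_size is ignored. */
static StrLayout decode_ob_size(ptrdiff_t ob_size)
{
    StrLayout r;
    if (ob_size < 0) {
        r.width = 1;
        r.length = -ob_size;
    } else if ((ob_size & 1) == 0) {
        r.width = 2;
        r.length = ob_size / 2;
    } else {
        r.width = 4;
        r.length = ob_size / 2;
    }
    return r;
}
```

One consequence visible in the sketch: an empty 8-bit string has no distinct representation, since its ob_size would be 0.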
Josiah Carlson wrote:
>> That places a burden on all creators of strings to ensure
>> that they are in the minimal format, which could be
>> inconvenient for some operations, e.g. taking a substring
>> could require making an extra pass to re-code the data.
>
> If Martin says it's not a big deal
Nick Coghlan wrote:
> The choice of latin-1 is deliberate and non-arbitrary. The reason for the
> choice is that the ordinals 0-255 in latin-1 map to the Unicode code points
> 0-255:
That's true, but that this makes a good choice for a special case
doesn't follow. Instead, frequency of occurre
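
To illustrate the property Nick describes (latin-1 byte values equal the Unicode code points 0-255), here is a minimal sketch in C. The function name widen_latin1 is hypothetical; the point is only that widening a latin-1 buffer to 32-bit code points needs no lookup table, just zero-extension of each byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Widen n latin-1 bytes into 32-bit code points.
 * Because latin-1 ordinals coincide with code points 0-255,
 * each byte is simply zero-extended. */
static void widen_latin1(const uint8_t *src, size_t n, uint32_t *dst)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i];
    }
}
```

Any other 8-bit encoding would need a per-byte table lookup at this step, which is the cost the quoted argument refers to.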
Marcin 'Qrczak' Kowalczyk wrote:
>> You could play tricks with ob_size to save this field:
>>
>> - ob_size < 0: 8-bit data; length is abs(ob_size)
>> - ob_size > 0, (ob_size & 1)==0: 16-bit data, length is ob_size/2
>> - ob_size > 0, (ob_size & 1)==1: 32-bit data, length is ob_size/2
>
> I wonde
Martin v. Löwis wrote:
> Nick Coghlan wrote:
>> The choice of latin-1 is deliberate and non-arbitrary. The reason for the
>> choice is that the ordinals 0-255 in latin-1 map to the Unicode code points
>> 0-255:
>
> That's true, but that this makes a good choice for a special case
> doesn't fol
Nick Coghlan wrote:
> If an 8-bit encoding other than latin-1 is used for the internal buffer,
> then every comparison operation would have to decode the string to
> Unicode in order to compare code points.
>
> It seems much simpler to me to ensure that what is stored internally is
> *always* th
"Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
>
> Nick Coghlan wrote:
> > If an 8-bit encoding other than latin-1 is used for the internal buffer,
> > then every comparison operation would have to decode the string to
> > Unicode in order to compare code points.
> >
> > It seems much simpler to
"Martin v. Löwis" <[EMAIL PROTECTED]> writes:
> Just try implementing comparison some time. You can end up implementing
> the same algorithm six times at least, once for each pair (1,1), (1,2),
> (1,4), (2,2), (2,4), (4,4). If the algorithm isn't symmetric (i.e.
> you can't reduce (2,1) to (1,2)),
Martin v. Löwis wrote:
> Just try implementing comparison some time. You can end up implementing
> the same algorithm six times at least, once for each pair (1,1), (1,2),
> (1,4), (2,2), (2,4), (4,4).
#define UnicodeStringComparisonFunction(TYPE1, TYPE2) \
/* code to implement it here */
Unico
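
The macro idea in the snippet above is cut off, so what follows is not the poster's code but a hedged sketch of how one macro could generate the six width pairs Martin lists. The names DEFINE_COMPARE and compare_W1_W2 are hypothetical. It assumes every code unit is a whole code point (no surrogate pairs in the 16-bit buffers), so comparing units compares code points.

```c
#include <stddef.h>
#include <stdint.h>

/* Generate a three-way comparison (-1, 0, 1) for buffers whose code units
 * have types TYPE1 and TYPE2. Units are widened to 32 bits before comparing.
 * Assumption: one code unit == one code point in every buffer. */
#define DEFINE_COMPARE(NAME, TYPE1, TYPE2)                      \
    static int NAME(const TYPE1 *a, size_t alen,                \
                    const TYPE2 *b, size_t blen)                \
    {                                                           \
        size_t n = alen < blen ? alen : blen;                   \
        for (size_t i = 0; i < n; i++) {                        \
            uint32_t ca = (uint32_t)a[i];                       \
            uint32_t cb = (uint32_t)b[i];                       \
            if (ca != cb)                                       \
                return ca < cb ? -1 : 1;                        \
        }                                                       \
        return (alen > blen) - (alen < blen);                   \
    }

/* One instantiation per pair named in the quoted message. */
DEFINE_COMPARE(compare_1_1, uint8_t,  uint8_t)
DEFINE_COMPARE(compare_1_2, uint8_t,  uint16_t)
DEFINE_COMPARE(compare_1_4, uint8_t,  uint32_t)
DEFINE_COMPARE(compare_2_2, uint16_t, uint16_t)
DEFINE_COMPARE(compare_2_4, uint16_t, uint32_t)
DEFINE_COMPARE(compare_4_4, uint32_t, uint32_t)
```

Because this comparison is symmetric, a (2,1) call can be served by swapping the arguments to compare_1_2 and negating the result; an asymmetric algorithm would need the extra instantiations the quoted message warns about.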