Tatsuo Ishii <[EMAIL PROTECTED]> writes:
> I'm afraid we have to make it larger, rather than smaller, for 8.3. For
> example, 0x82f5 in SHIFT_JIS_2004 (new in 8.3) becomes a *pair* of
> 3-byte UTF_8 sequences (0x00e3818b and 0x00e3829a). See
> util/mb/Unicode/shift_jis_2004_to_utf8_combined.map for more details.

> So the worst case is now 6, rather than 3.

Yipes.

> Can we add a column to pg_conversion which represents the "growth
> rate"? This would reduce the rate for most encodings to much less
> than 6.

We need to do something, but the pg_conversion catalog seems a bad place
to put the info --- don't we have places that need to be able to do
conversion without catalog access?

Perhaps better would be to redefine the API for the conversion functions
so that they palloc their own result space.  Then each conversion
function would have to know the maximum growth rate for its particular
conversion.  This change would also make it feasible for a conversion
function to prescan the data and determine an exact output size, if that
seemed worthwhile because the potential growth rate was too extreme.
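
To make that concrete, here is a rough sketch of the two allocation
strategies, not actual backend code. Everything in it is hypothetical:
the ConvResult struct, the toy_* names, and a toy "conversion" in which
the byte 'x' expands to the six UTF_8 bytes from the example above while
every other byte is copied unchanged. It uses malloc so that it compiles
standalone; a real conversion function would palloc instead.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type: the conversion function owns the allocation. */
typedef struct ConvResult
{
    unsigned char *data;    /* malloc'd here; would be palloc'd in the backend */
    size_t      len;        /* bytes written, not counting the trailing NUL */
} ConvResult;

/* Worst-case growth, known only to this particular (toy) conversion. */
#define TOY_MAX_GROWTH 6

/* The pair of 3-byte sequences quoted above: e3 81 8b, e3 82 9a. */
static const unsigned char toy_expansion[TOY_MAX_GROWTH] =
{0xe3, 0x81, 0x8b, 0xe3, 0x82, 0x9a};

/* Shared conversion loop: returns the number of bytes written to dst. */
static size_t
toy_fill(unsigned char *dst, const unsigned char *src, size_t len)
{
    size_t      i;
    size_t      o = 0;

    for (i = 0; i < len; i++)
    {
        if (src[i] == 'x')
        {
            memcpy(dst + o, toy_expansion, TOY_MAX_GROWTH);
            o += TOY_MAX_GROWTH;
        }
        else
            dst[o++] = src[i];
    }
    dst[o] = '\0';
    return o;
}

/* Prescan: compute the exact output size without writing anything. */
static size_t
toy_output_size(const unsigned char *src, size_t len)
{
    size_t      i;
    size_t      n = 0;

    for (i = 0; i < len; i++)
        n += (src[i] == 'x') ? TOY_MAX_GROWTH : 1;
    return n;
}

/* Strategy 1: allocate for the worst case (overflow check omitted). */
static ConvResult
toy_convert_worst_case(const unsigned char *src, size_t len)
{
    ConvResult  r;

    r.data = malloc(len * TOY_MAX_GROWTH + 1);
    assert(r.data != NULL);
    r.len = toy_fill(r.data, src, len);
    return r;
}

/* Strategy 2: prescan first, then allocate exactly what is needed. */
static ConvResult
toy_convert_exact(const unsigned char *src, size_t len)
{
    ConvResult  r;
    size_t      need = toy_output_size(src, len);

    r.data = malloc(need + 1);
    assert(r.data != NULL);
    r.len = toy_fill(r.data, src, len);
    assert(r.len == need);
    return r;
}
```

Either way the caller never needs to know the growth rate; it just
receives a buffer and a length. The prescan version pays for a second
pass over the input in exchange for not over-allocating by a factor of
six on ordinary data.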

                        regards, tom lane
