> On Tue, 2007-09-11 at 14:50 +0900, Tatsuo Ishii wrote:
> > > On Tue, 2007-09-11 at 12:29 +0900, Tatsuo Ishii wrote:
> > > > Please show me concrete examples how I could introduce a
> > > > vulnerability using this kind of convert() usage.
> > >
> > > Try the sequence below. Then, try to dump and then reload the
> > > database. When you try to reload it, you will get an error:
> > >
> > > ERROR: invalid byte sequence for encoding "UTF8": 0xbd
> >
> > I know this could be a problem (like chr() with invalid byte pattern).
> > What I really want to know is, read query something like this:
> >
> > SELECT * FROM japanese_table ORDER BY convert(japanese_text using utf8_to_euc_jp);
>
> I guess I don't quite understand the question.
>
> I agree that ORDER BY convert() must be safe in the C locale, because it
> just passes the strings to strcmp().
>
> Are you saying that we should not remove convert() until we can support
> multiple locales in one database?
>
> If we make convert() operate on bytea and return bytea, as Tom
> suggested, would that solve your use case?
The problem is that the above use case is just one of those I can think of.
Another use case is something like this:

SELECT sum(octet_length(convert(text_column using utf8_to_euc_jp))) FROM mytable;

to find the total byte length of the text column if it were encoded in
EUC_JP. So I'm not sure we could change convert() to return bytea
without complaints from users...
--
Tatsuo Ishii
SRA OSS, Inc. Japan