Hi

2016-01-20 7:20 GMT+01:00 Tatsuo Ishii <is...@postgresql.org>:

> > 2016-01-20 3:47 GMT+01:00 Tatsuo Ishii <is...@postgresql.org>:
> >
> >> test=# select format('%I', t) from t1;
> >>   format
> >> ----------
> >>  aaa
> >>  "AAA"
> >>  "あいう"
> >> (3 rows)
> >>
> >> Why does the text value in the third row need to be double quoted?
> >> (Note that it consists of multibyte characters.) The same applies to
> >> quote_ident().
> >>
> >> Elsewhere, we accept identifiers made of multibyte characters without
> >> double quotes (non-delimited identifiers).
> >>
> >> test=# create table t2(あいう text);
> >> CREATE TABLE
> >> test=# insert into t2 values('aaa');
> >> INSERT 0 1
> >> test=# select あいう from t2;
> >>  あいう
> >> --------
> >>  aaa
> >> (1 row)
> >
> > format uses the same routine as quote_ident, so quote_ident should be
> > fixed first.
>
> Yes, I had that in my mind too.
>
> Attached is the proposed patch to fix the bug.
> Regression tests passed.
>
> Here is an example after the patch. Note that the third row is not
> quoted any more.
>
> test=#  select format('%I', あいう) from t2;
>  format
> --------
>  aaa
>  "AAA"
>  あああ
> (3 rows)
>
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp
>
> diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
> index 3783e97..b93fc27 100644
> --- a/src/backend/utils/adt/ruleutils.c
> +++ b/src/backend/utils/adt/ruleutils.c
> @@ -9405,7 +9405,7 @@ quote_identifier(const char *ident)
>          * would like to use <ctype.h> macros here, but they might yield unwanted
>          * locale-specific results...
>          */
> -       safe = ((ident[0] >= 'a' && ident[0] <= 'z') || ident[0] == '_');
> +       safe = ((ident[0] >= 'a' && ident[0] <= 'z') || ident[0] == '_' || IS_HIGHBIT_SET(ident[0]));
>
>         for (ptr = ident; *ptr; ptr++)
>         {
> @@ -9413,7 +9413,8 @@ quote_identifier(const char *ident)
>
>                 if ((ch >= 'a' && ch <= 'z') ||
>                         (ch >= '0' && ch <= '9') ||
> -                       (ch == '_'))
> +                       (ch == '_') ||
> +                       (IS_HIGHBIT_SET(ch)))
>                 {
>                         /* okay */
>                 }
>
>
This patch is simple. I remember being surprised, a few months ago, to
learn that we allow any multibyte character in an unquoted identifier.
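
For anyone skimming the thread, here is a rough standalone sketch of the
character test as it would look after the patch. It is not the real
quote_identifier(): the helper name ident_is_safe_unquoted is hypothetical,
the keyword lookup that the real function also performs is left out, and
IS_HIGHBIT_SET is redefined locally on the assumption that it matches the
macro in c.h.

```c
#include <stdbool.h>

/* Local stand-in for PostgreSQL's macro (assumed equivalent to c.h). */
#define IS_HIGHBIT_SET(ch) ((unsigned char) (ch) & 0x80)

/*
 * Hypothetical helper: returns true if every byte of ident is allowed in
 * an unquoted identifier under the patched rules. The real
 * quote_identifier() additionally checks for SQL keywords; that step is
 * omitted here.
 */
static bool
ident_is_safe_unquoted(const char *ident)
{
    const char *ptr;

    /* First byte: lowercase ASCII letter, underscore, or high-bit byte. */
    if (!((ident[0] >= 'a' && ident[0] <= 'z') ||
          ident[0] == '_' ||
          IS_HIGHBIT_SET(ident[0])))
        return false;

    /* Remaining bytes: the same set plus ASCII digits. */
    for (ptr = ident; *ptr; ptr++)
    {
        char ch = *ptr;

        if (!((ch >= 'a' && ch <= 'z') ||
              (ch >= '0' && ch <= '9') ||
              ch == '_' ||
              IS_HIGHBIT_SET(ch)))
            return false;
    }

    return true;
}
```

With this sketch, "aaa" and a UTF-8 encoded "あいう" pass the test, while
"AAA" still fails and would therefore still be quoted.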

+1

Pavel
