Re: [HACKERS] pg_trgm version 1.2

2015-07-20 Thread Alexander Korotkov
On Wed, Jul 15, 2015 at 12:31 AM, Jeff Janes wrote: > On Tue, Jul 7, 2015 at 6:33 AM, Alexander Korotkov < > a.korot...@postgrespro.ru> wrote: > >> >> >> See Tom Lane's comment about downgrade scripts. I think just remove it is >> a right solution. >> > > The new patch removes the downgrade path

Re: [HACKERS] pg_trgm version 1.2

2015-07-14 Thread Jeff Janes
On Tue, Jul 7, 2015 at 6:33 AM, Alexander Korotkov < a.korot...@postgrespro.ru> wrote: > > > See Tom Lane's comment about downgrade scripts. I think just remove it is > a right solution. > The new patch removes the downgrade path and the ability to install the old version. (If anyone wants an ea

Re: [HACKERS] pg_trgm version 1.2

2015-07-07 Thread Alexander Korotkov
On Tue, Jun 30, 2015 at 11:28 PM, Jeff Janes wrote: > On Tue, Jun 30, 2015 at 2:46 AM, Alexander Korotkov < > a.korot...@postgrespro.ru> wrote: > >> On Sun, Jun 28, 2015 at 1:17 AM, Jeff Janes wrote: >> >>> This patch implements version 1.2 of contrib module pg_trgm. >>> >>> This supports the tr

Re: [HACKERS] pg_trgm version 1.2

2015-06-30 Thread Tom Lane
Jeff Janes writes: > On Tue, Jun 30, 2015 at 2:46 AM, Alexander Korotkov < > a.korot...@postgrespro.ru> wrote: >> pg_trgm--1.1.sql and pg_trgm--1.1--1.2.sql are useful for debug, but do you >> expect them in final commit? As I can see in other contribs we have only >> last version and upgrade scrip

Re: [HACKERS] pg_trgm version 1.2

2015-06-30 Thread Jeff Janes
On Tue, Jun 30, 2015 at 2:46 AM, Alexander Korotkov < a.korot...@postgrespro.ru> wrote: > On Sun, Jun 28, 2015 at 1:17 AM, Jeff Janes wrote: > >> This patch implements version 1.2 of contrib module pg_trgm. >> >> This supports the triconsistent function, introduced in version 9.4 of >> the server

Re: [HACKERS] pg_trgm version 1.2

2015-06-30 Thread Alexander Korotkov
On Sun, Jun 28, 2015 at 1:17 AM, Jeff Janes wrote: > This patch implements version 1.2 of contrib module pg_trgm. > > This supports the triconsistent function, introduced in version 9.4 of the > server, to make it faster to implement indexed queries where some keys are > common and some are rare.

Re: [HACKERS] pg_trgm version 1.2

2015-06-29 Thread Merlin Moncure
On Mon, Jun 29, 2015 at 7:23 AM, Merlin Moncure wrote: > On Sat, Jun 27, 2015 at 5:17 PM, Jeff Janes wrote: >> V1.1: Time: 1743.691 ms --- after repeated execution to warm the cache >> >> V1.2: Time: 2.839 ms --- after repeated execution to warm the cache > > Wow! I'm going to test this.

Re: [HACKERS] pg_trgm version 1.2

2015-06-29 Thread Merlin Moncure
On Sat, Jun 27, 2015 at 5:17 PM, Jeff Janes wrote: > This patch implements version 1.2 of contrib module pg_trgm. > > This supports the triconsistent function, introduced in version 9.4 of the > server, to make it faster to implement indexed queries where some keys are > common and some are rare.

Re: [HACKERS] pg_trgm Memory Allocation logic

2015-03-09 Thread Beena Emerson
Hello, > If you manually set RPADDING 2 in trgm.h, then it will, but the > allocation probably should use LPADDING/RPADDING to get it right, rather > than assume the max values. Yes you are right. For RPADDING = 2, the current formula is suitable but for RPADDING = 1, a lot of extra space is all
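The allocation question in this thread can be checked with a small counting sketch. It is not the pg_trgm source: `count_trigrams` and `allocated_slots` are hypothetical helpers, and the sketch assumes that words are runs of ASCII alphanumerics and that a padded word of n characters yields n - 2 trigrams. Under those assumptions the quoted formula `(slen / 2 + 1) * 3` is tight for a right padding of 2 and leaves spare slots for a right padding of 1.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of trigrams produced for string s when every
 * word is padded with lpad leading and rpad trailing blanks.
 * Assumption: a word is a run of ASCII alphanumerics, and a padded word of
 * n characters yields n - 2 trigrams.
 */
size_t
count_trigrams(const char *s, int lpad, int rpad)
{
    size_t total = 0;
    size_t wordlen = 0;

    assert(s != NULL);
    assert(lpad >= 0 && rpad >= 0 && lpad + rpad >= 2);

    for (;; s++)
    {
        if (*s != '\0' && isalnum((unsigned char) *s))
            wordlen++;
        else
        {
            /* End of a word (or of the string): account for its trigrams. */
            if (wordlen > 0)
                total += wordlen + (size_t) lpad + (size_t) rpad - 2;
            wordlen = 0;
            if (*s == '\0')
                break;
        }
    }
    return total;
}

/* Number of trigram slots reserved by the formula quoted in the thread. */
size_t
allocated_slots(size_t slen)
{
    return (slen / 2 + 1) * 3;
}
```

For the five-byte string "a b c", the sketch counts 9 trigrams with a right padding of 2 and 6 with a right padding of 1, against 9 reserved slots in both cases.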

Re: [HACKERS] pg_trgm Memory Allocation logic

2015-03-09 Thread Heikki Linnakangas
On 03/09/2015 03:33 PM, Tom Lane wrote: Beena Emerson writes: In the pg_trgm module, within function generate_trgm, the memory for trigrams is allocated as follows: trg = (TRGM *) palloc(TRGMHDRSIZE + sizeof(trgm) * (slen / 2 + 1) *3); I have been trying to understand why this is so becau

Re: [HACKERS] pg_trgm Memory Allocation logic

2015-03-09 Thread Tom Lane
Beena Emerson writes: > In the pg_trgm module, within function generate_trgm, the memory for trigrams > is allocated as follows: > trg = (TRGM *) palloc(TRGMHDRSIZE + sizeof(trgm) * (slen / 2 + 1) *3); > I have been trying to understand why this is so because it seems to be > allocating more spa

Re: [HACKERS] pg_trgm Memory Allocation logic

2015-03-09 Thread Heikki Linnakangas
On 03/09/2015 02:54 PM, Alvaro Herrera wrote: Beena Emerson wrote: In the pg_trgm module, within function generate_trgm, the memory for trigrams is allocated as follows: trg = (TRGM *) palloc(TRGMHDRSIZE + sizeof(trgm) * (slen / 2 + 1) *3); I have been trying to understand why this is so becau

Re: [HACKERS] pg_trgm Memory Allocation logic

2015-03-09 Thread Alvaro Herrera
Beena Emerson wrote: > In the pg_trgm module, within function generate_trgm, the memory for trigrams > is allocated as follows: > > trg = (TRGM *) palloc(TRGMHDRSIZE + sizeof(trgm) * (slen / 2 + 1) *3); > > I have been trying to understand why this is so because it seems to be > allocating more s

Re: [HACKERS] pg_trgm partial-match

2013-01-16 Thread Fujii Masao
On Fri, Nov 23, 2012 at 2:11 AM, Fujii Masao wrote: > On Mon, Nov 19, 2012 at 10:56 AM, Tomas Vondra wrote: >> I've done a quick review of the current patch: > > Thanks for the commit! > > As Alexander pointed out upthread, another infrastructure patch is required > before applying this patch. So

Re: [HACKERS] pg_trgm partial-match

2012-11-22 Thread Fujii Masao
On Mon, Nov 19, 2012 at 10:56 AM, Tomas Vondra wrote: > I've done a quick review of the current patch: Thanks for the commit! As Alexander pointed out upthread, another infrastructure patch is required before applying this patch. So I will implement the infra patch first. Regards, -- Fujii Ma

Re: [HACKERS] pg_trgm partial-match

2012-11-22 Thread Fujii Masao
On Mon, Nov 19, 2012 at 7:55 PM, Alexander Korotkov wrote: > On Mon, Nov 19, 2012 at 10:05 AM, Alexander Korotkov > wrote: >> >> On Thu, Nov 15, 2012 at 11:39 PM, Fujii Masao >> wrote: >>> >>> Note that we cannot do a partial-match if KEEPONLYALNUM is disabled, >>> i.e., if query key contains mu

Re: [HACKERS] pg_trgm partial-match

2012-11-19 Thread Alexander Korotkov
On Mon, Nov 19, 2012 at 10:05 AM, Alexander Korotkov wrote: > On Thu, Nov 15, 2012 at 11:39 PM, Fujii Masao wrote: > >> Note that we cannot do a partial-match if KEEPONLYALNUM is disabled, >> i.e., if query key contains multibyte characters. In this case, byte >> length of >> the trigram string mi

Re: [HACKERS] pg_trgm partial-match

2012-11-18 Thread Alexander Korotkov
Hi! On Thu, Nov 15, 2012 at 11:39 PM, Fujii Masao wrote: > Note that we cannot do a partial-match if KEEPONLYALNUM is disabled, > i.e., if query key contains multibyte characters. In this case, byte > length of > the trigram string might be larger than three, and its CRC is used as a > trigram k

Re: [HACKERS] pg_trgm partial-match

2012-11-18 Thread Tomas Vondra
On 15.11.2012 20:39, Fujii Masao wrote: > Hi, > > I'd like to propose to extend pg_trgm so that it can compare a partial-match > query key to a GIN index. IOW, I'm thinking to implement the 'comparePartial' > GIN method for pg_trgm. > > Currently, when the query key is less than three characters,

Re: [HACKERS] pg_trgm: unicode string not working

2011-06-14 Thread Robert Haas
On Tue, Jun 14, 2011 at 1:15 AM, Tom Lane wrote: > I'm not sure that pg_upgrade is a good vehicle for dispensing such > advice, anyway.  At least in the Red Hat packaging, end users will never > read what it prints, unless maybe it fails outright and they're trying > to debug why. In my experienc

Re: [HACKERS] pg_trgm: unicode string not working

2011-06-14 Thread Florian Pflug
On Jun14, 2011, at 07:15 , Tom Lane wrote: > Robert Haas writes: >> On Mon, Jun 13, 2011 at 7:47 PM, Bruce Momjian wrote: >>> No, it does not. Under what circumstances should I issue a suggestion >>> to reindex, and what should the text be? > >> It sounds like GIN indexes need to be reindexed a

Re: [HACKERS] pg_trgm: unicode string not working

2011-06-13 Thread Tom Lane
Robert Haas writes: > On Mon, Jun 13, 2011 at 7:47 PM, Bruce Momjian wrote: >> No, it does not.  Under what circumstances should I issue a suggestion >> to reindex, and what should the text be? > It sounds like GIN indexes need to be reindexed after upgrading from < > 9.1 to >= 9.1. Only if you

Re: [HACKERS] pg_trgm: unicode string not working

2011-06-13 Thread Bruce Momjian
Robert Haas wrote: > On Mon, Jun 13, 2011 at 7:47 PM, Bruce Momjian wrote: > > Robert Haas wrote: > >> On Sun, Jun 12, 2011 at 8:40 AM, Florian Pflug wrote: > >> > Note that this restriction was removed in postgres 9.1 which > >> > is currently in beta. However, GIN indices must be re-created > >

Re: [HACKERS] pg_trgm: unicode string not working

2011-06-13 Thread Robert Haas
On Mon, Jun 13, 2011 at 7:47 PM, Bruce Momjian wrote: > Robert Haas wrote: >> On Sun, Jun 12, 2011 at 8:40 AM, Florian Pflug wrote: >> > Note that this restriction was removed in postgres 9.1 which >> > is currently in beta. However, GIN indices must be re-created >> > with REINDEX after upgradin

Re: [HACKERS] pg_trgm: unicode string not working

2011-06-13 Thread Bruce Momjian
Robert Haas wrote: > On Sun, Jun 12, 2011 at 8:40 AM, Florian Pflug wrote: > > Note that this restriction was removed in postgres 9.1 which > > is currently in beta. However, GIN indices must be re-created > > with REINDEX after upgrading from 9.0 to leverage that > > improvement. > > Does pg_upg

Re: [HACKERS] pg_trgm: unicode string not working

2011-06-12 Thread Robert Haas
On Sun, Jun 12, 2011 at 8:40 AM, Florian Pflug wrote: > Note that this restriction was removed in postgres 9.1 which > is currently in beta. However, GIN indices must be re-created > with REINDEX after upgrading from 9.0 to leverage that > improvement. Does pg_upgrade know about this? -- Robert

Re: [HACKERS] pg_trgm: unicode string not working

2011-06-12 Thread Florian Pflug
Hi Next time, please post questions regarding the usage of postgres to the -general list, not to -hackers. The purpose of -hackers is to discuss the development of postgres proper, not the development of applications using postgres. On Jun12, 2011, at 13:33 , Sushant Sinha wrote: > I am using pg_

Re: [HACKERS] pg_trgm

2010-05-30 Thread Tom Lane
Greg Stark writes: > There seem to be three behaviours on the table here: You're neglecting 4) Let the user decide whether he wants pg_trgm to consider word elements to be "alphanumerics" or "any non-space". The main problem I have with Tatsuo's patch is that it forecloses any reasonably upward

Re: [HACKERS] pg_trgm

2010-05-30 Thread Greg Stark
On Sun, May 30, 2010 at 3:41 PM, Tom Lane wrote: > I don't think it's unreasonable to insist that behavioral changes be > made in an upward compatible fashion ... especially ones that seem at > least as likely to break some current usages as to enable new usages. Fwiw I don't think we've traditio

Re: [HACKERS] pg_trgm

2010-05-30 Thread Tatsuo Ishii
> > > This is in 9.0, because 8.4 doesn't recognize the \u escape syntax. If > > > you run this in 8.4, you're just comparing a sequence of ASCII letters > > > and digits. > > > > Hum. Still I prefer 8.4's behavior since anything is better than > > returning NaN. It seems 9.0 does not have any es

Re: [HACKERS] pg_trgm

2010-05-30 Thread Tom Lane
Tatsuo Ishii writes: >> This is still ignoring the point: arbitrarily changing the module's >> longstanding standard behavior isn't acceptable. You need to provide >> a way for the user to control the behavior. (Once you've done that, >> I think it can be just either "alnum" or "!isspace", but m

Re: [HACKERS] pg_trgm

2010-05-30 Thread Peter Eisentraut
On sön, 2010-05-30 at 11:05 +0900, Tatsuo Ishii wrote: > > > Wait. This works fine for me with stock pg_trgm. locale is C and > > > encoding is UTF8. What version of PostgreSQL are you using? Mine is > > > 8.4.4. > > > > This is in 9.0, because 8.4 doesn't recognize the \u escape syntax. If > > yo

Re: [HACKERS] pg_trgm

2010-05-29 Thread Tatsuo Ishii
> > Wait. This works fine for me with stock pg_trgm. locale is C and > > encoding is UTF8. What version of PostgreSQL are you using? Mine is > > 8.4.4. > > This is in 9.0, because 8.4 doesn't recognize the \u escape syntax. If > you run this in 8.4, you're just comparing a sequence of ASCII letter

Re: [HACKERS] pg_trgm

2010-05-29 Thread Tatsuo Ishii
> This is still ignoring the point: arbitrarily changing the module's > longstanding standard behavior isn't acceptable. You need to provide > a way for the user to control the behavior. (Once you've done that, > I think it can be just either "alnum" or "!isspace", but maybe some > other behavior

Re: [HACKERS] pg_trgm

2010-05-29 Thread Tom Lane
Tatsuo Ishii writes: > After thinking a little bit more, I think following patch would not > break existing behavior and also adopts multibyte + C locale case. What > do you think? This is still ignoring the point: arbitrarily changing the module's longstanding standard behavior isn't acceptable.

Re: [HACKERS] pg_trgm

2010-05-29 Thread Greg Stark
On Sat, May 29, 2010 at 9:13 AM, Tatsuo Ishii wrote: > ! #define iswordchr(c)  (lc_ctype_is_c()? \ > !                                                               ((*(c) & > 0x80)? !t_isspace(c) : (t_isalpha(c) || t_isdigit(c))) : \ > Surely isspace(c) will always be false for non-ascii charac

Re: [HACKERS] pg_trgm

2010-05-29 Thread Tatsuo Ishii
> > It's not a practical solution for people working with prebuilt Postgres > > versions, which is most people. I don't object to finding a way to > > provide a "not-space" behavior instead of an "is-alnum" behavior, > > but as noted upthread a GUC isn't the right way. How do you feel > > about a

Re: [HACKERS] pg_trgm

2010-05-27 Thread Peter Eisentraut
On fre, 2010-05-28 at 10:04 +0900, Tatsuo Ishii wrote: > > I think the problem at hand has nothing at all to do with agglutination > > or CJK-specific issues. You will get the same problem with other > > languages *if* you set a locale that does not adequately support the > > characters in use. E

Re: [HACKERS] pg_trgm

2010-05-27 Thread Tatsuo Ishii
> It's not a practical solution for people working with prebuilt Postgres > versions, which is most people. I don't object to finding a way to > provide a "not-space" behavior instead of an "is-alnum" behavior, > but as noted upthread a GUC isn't the right way. How do you feel > about a new set o

Re: [HACKERS] pg_trgm

2010-05-27 Thread Tatsuo Ishii
> I think the problem at hand has nothing at all to do with agglutination > or CJK-specific issues. You will get the same problem with other > languages *if* you set a locale that does not adequately support the > characters in use. E.g., Russian with locale C and encoding UTF8: > > select simil

Re: [HACKERS] pg_trgm

2010-05-27 Thread Tatsuo Ishii
> Tatsuo Ishii writes: > > similarity -> generate_trgm -> find_word -> iswordchr -> t_isalpha -> > > isalpha > > > if locale is C and USE_WIDE_UPPER_LOWER defined which is the case in > > most modern OSs. > > Quite. And *if locale is C then only standard ASCII letters are letters*. > You may n

Re: [HACKERS] pg_trgm

2010-05-27 Thread Tom Lane
Tatsuo Ishii writes: > Or you could just #undef KEEPONLYALNUM in trgm.h. But I'm not sure > this is the right thing for you. It's not a practical solution for people working with prebuilt Postgres versions, which is most people. I don't object to finding a way to provide a "not-space" behavior i

Re: [HACKERS] pg_trgm

2010-05-27 Thread Tom Lane
Tatsuo Ishii writes: > similarity -> generate_trgm -> find_word -> iswordchr -> t_isalpha -> isalpha > if locale is C and USE_WIDE_UPPER_LOWER defined which is the case in > most modern OSs. Quite. And *if locale is C then only standard ASCII letters are letters*. You may not like that but it's

Re: [HACKERS] pg_trgm

2010-05-27 Thread Tatsuo Ishii
> > Problem with pg_trgm is, it uses isascii() etc. to recognize a letter, > > which will skip any non ASCII range character in C locale. > > The only place I see that is in those ISPRINTABLE macros, which are only > used in show_trgm(), which is just a debugging function. It could stand > to be

Re: [HACKERS] pg_trgm

2010-05-27 Thread Tom Lane
Tatsuo Ishii writes: > Problem with pg_trgm is, it uses isascii() etc. to recognize a letter, > which will skip any non ASCII range character in C locale. The only place I see that is in those ISPRINTABLE macros, which are only used in show_trgm(), which is just a debugging function. It could st

Re: [HACKERS] pg_trgm

2010-05-27 Thread Tatsuo Ishii
> What I can't help wondering as I'm reading this discussion is - > Tatsuo-san said upthread that he has a problem with pg_trgm that he > does not have with full text search. So what is full text search > doing differently than pg_trgm? Problem with pg_trgm is, it uses isascii() etc. to recognize

Re: [HACKERS] pg_trgm

2010-05-27 Thread Tatsuo Ishii
> I think the problem at hand has nothing at all to do with agglutination > or CJK-specific issues. You will get the same problem with other > languages *if* you set a locale that does not adequately support the > characters in use. E.g., Russian with locale C and encoding UTF8: > > select simil

Re: [HACKERS] pg_trgm

2010-05-27 Thread Robert Haas
On Thu, May 27, 2010 at 2:01 PM, Peter Eisentraut wrote: > On fre, 2010-05-28 at 00:46 +0900, Tatsuo Ishii wrote: >> > I don't know about Japanese, but the locale approach works just fine for >> > other agglutinative languages.  I would rather suspect that it is the >> > trigram approach that migh

Re: [HACKERS] pg_trgm

2010-05-27 Thread Peter Eisentraut
On fre, 2010-05-28 at 00:46 +0900, Tatsuo Ishii wrote: > > I don't know about Japanese, but the locale approach works just fine for > > other agglutinative languages. I would rather suspect that it is the > > trigram approach that might be rather useless for such languages, > > because you are goi

Re: [HACKERS] pg_trgm

2010-05-27 Thread Tatsuo Ishii
> So I think a GUC is broken because pg_trgm has index opclasses and > any indexes built using one setting will be broken if the GUC is > changed. > > Perhaps we need two sets of functions (which presumably call the same > implementation with a flag to indicate which definition to use). Then > y
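The "same implementation with a flag" idea quoted above can be sketched in a few lines. This is an illustration only, not code from pg_trgm: `WordChrMode`, `is_word_char` and `count_words` are hypothetical names, and the sketch classifies single bytes with the C library under the default "C" locale, where bytes above 0x7F are neither alphanumeric nor whitespace.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag selecting which definition of "word character" to use. */
typedef enum
{
    WORDCHR_ALNUM,      /* letters and digits only */
    WORDCHR_NOT_SPACE   /* anything that is not whitespace */
} WordChrMode;

/* One implementation, two behaviors, chosen by the caller. */
bool
is_word_char(unsigned char c, WordChrMode mode)
{
    if (mode == WORDCHR_ALNUM)
        return isalnum(c) != 0;
    return c != '\0' && !isspace(c);
}

/* Count the words of s under the chosen definition. */
size_t
count_words(const char *s, WordChrMode mode)
{
    size_t words = 0;
    bool   in_word = false;

    assert(s != NULL);
    for (; *s != '\0'; s++)
    {
        bool w = is_word_char((unsigned char) *s, mode);

        if (w && !in_word)
            words++;            /* a new word starts here */
        in_word = w;
    }
    return words;
}
```

With the three-byte UTF-8 sequence "\xE6\x97\xA5" followed by " abc", the alphanumeric definition sees one word and the not-space definition sees two, which is the difference the thread argues about. Because the two definitions split text differently, an index built under one would not match queries evaluated under the other, which is why a per-function or per-opclass choice is safer than a setting that can change after the index is built.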

Re: [HACKERS] pg_trgm

2010-05-27 Thread Tatsuo Ishii
> I don't know about Japanese, but the locale approach works just fine for > other agglutinative languages. I would rather suspect that it is the > trigram approach that might be rather useless for such languages, > because you are going to get a lot of similarity hits for the affixes. I'm not su

Re: [HACKERS] pg_trgm

2010-05-27 Thread Peter Eisentraut
On tor, 2010-05-27 at 23:20 +0900, Tatsuo Ishii wrote: > Anyway locale is completely useless for finding word vs non-character > an agglutinative language such as Japanese. I don't know about Japanese, but the locale approach works just fine for other agglutinative languages. I would rather susp

Re: [HACKERS] pg_trgm

2010-05-27 Thread Greg Stark
On Thu, May 27, 2010 at 3:52 PM, Tom Lane wrote: > I think a more appropriate type of fix would be to expose the > KEEPONLYALNUM option as a GUC, or some other way of letting the > user decide what he wants. > So I think a GUC is broken because pg_trgm has index opclasses and any indexes built

Re: [HACKERS] pg_trgm

2010-05-27 Thread Tom Lane
Tatsuo Ishii writes: > ! #define iswordchr(c) (t_isalpha(c) || t_isdigit(c) || > (lc_ctype_is_c() && !t_isspace(c))) This seems entirely arbitrary. It might "fix" things in your view but it will break the longstanding behavior for other people. I think a more appropriate type of fix wou

Re: [HACKERS] pg_trgm

2010-05-27 Thread Tatsuo Ishii
> Well, that doesn't mean that the answer is to use C locale ;-) Of course it's up to user whether to use C locale or not. I just want pg_trgm work with C locale as well. > However, you could possibly think about making this bit of code > more flexible: > > #ifdef KEEPONLYALNUM > #define iswordc

Re: [HACKERS] pg_trgm

2010-05-27 Thread Tom Lane
Tatsuo Ishii writes: > Anyway locale is completely useless for finding word vs non-character > an agglutinative language such as Japanese. Well, that doesn't mean that the answer is to use C locale ;-) However, you could possibly think about making this bit of code more flexible: #ifdef KEEPON

Re: [HACKERS] pg_trgm

2010-05-27 Thread Tatsuo Ishii
> Exactly what do you consider to be the missing functionality? > You need a notion of word vs non-word character from somewhere, > and the locale setting is the standard place to get that. The > core text search functionality behaves the same way. No. Text search works fine with multibyte + C lo

Re: [HACKERS] pg_trgm

2010-05-27 Thread Tom Lane
Tatsuo Ishii writes: >> It's not a problem, it's just pilot error, or possibly inadequate >> documentation. pg_trgm uses the locale's definition of "alpha", >> "digit", etc. In C locale only basic ASCII letters and digits will be >> recognized as word constituents. > That means there is no chan

Re: [HACKERS] pg_trgm

2010-05-27 Thread Tatsuo Ishii
> > Yes, pg_trgm seems to have problems with multibyte + C locale. > > It's not a problem, it's just pilot error, or possibly inadequate > documentation. pg_trgm uses the locale's definition of "alpha", > "digit", etc. In C locale only basic ASCII letters and digits will be > recognized as word

Re: [HACKERS] pg_trgm

2010-05-27 Thread Tom Lane
Tatsuo Ishii writes: > What is your locale? >> It was en_EN.UTF-8. Interesting. With C it fails... > Yes, pg_trgm seems to have problems with multibyte + C locale. It's not a problem, it's just pilot error, or possibly inadequate documentation. pg_trgm uses the locale's definition of "alpha", "

Re: [HACKERS] pg_trgm

2010-05-27 Thread Tatsuo Ishii
> > What is your locale? > It was en_EN.UTF-8. Interesting. With C it fails... Yes, pg_trgm seems to have problems with multibyte + C locale. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp

Re: [HACKERS] pg_trgm

2010-05-27 Thread Andres Freund
On Thursday 27 May 2010 14:40:41 Tatsuo Ishii wrote: > > > No, it doesn't. > > > Encoding is EUC_JP, locale is C. Included is the script to reproduce > > > the problem. > > > > test=# select show_trgm('日本語'); > > > > show_trgm > > > > --- > >

Re: [HACKERS] pg_trgm

2010-05-27 Thread Tatsuo Ishii
> > No, it doesn't. > > Encoding is EUC_JP, locale is C. Included is the script to reproduce > > the problem. > test=# select show_trgm('日本語'); > show_trgm > --- > {0x8194c0,0x836e53,0x1dc363,0x1e22e9} > (1 row) > > Time: 0.44

Re: [HACKERS] pg_trgm

2010-05-27 Thread Andres Freund
Hi, On Thursday 27 May 2010 13:53:37 Tatsuo Ishii wrote: > > It's already multibyte safe since 8.4 > > No, it doesn't. > Encoding is EUC_JP, locale is C. Included is the script to reproduce > the problem. test=# select show_trgm('日本語'); show_trgm --

Re: [HACKERS] pg_trgm

2010-05-27 Thread Tatsuo Ishii
> It's already multibyte safe since 8.4 No, it doesn't. $ psql test Pager usage is off. psql (8.4.4) Type "help" for help. test=# select similarity('abc', 'abd'); -- OK similarity 0.33 (1 row) test=# select similarity('日本語', '日本後'); -- NG similarity
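The ASCII case in the session above can be reproduced with a small sketch of trigram similarity. It is a simplified illustration, not the pg_trgm implementation: `Trgm`, `make_trigrams` and `trigram_similarity` are hypothetical names, the input is assumed to be a single ASCII word, and the sketch assumes a padding of two leading blanks and one trailing blank with similarity computed as shared trigrams divided by distinct trigrams in either word.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_TRGM 64

typedef struct
{
    char c[3];
} Trgm;

/*
 * Build the distinct trigrams of one ASCII word, lower-cased and padded with
 * two leading blanks and one trailing blank (an assumption of this sketch).
 * Returns the number of trigrams written to out.
 */
size_t
make_trigrams(const char *word, Trgm *out)
{
    char   padded[MAX_TRGM + 4];
    size_t len = strlen(word);
    size_t n = 0;

    assert(len >= 1 && len + 1 <= MAX_TRGM);

    padded[0] = ' ';
    padded[1] = ' ';
    for (size_t i = 0; i < len; i++)
        padded[2 + i] = (char) tolower((unsigned char) word[i]);
    padded[2 + len] = ' ';

    /* A padded word of len + 3 characters has len + 1 trigram positions. */
    for (size_t i = 0; i < len + 1; i++)
    {
        int seen = 0;

        for (size_t j = 0; j < n; j++)
            if (memcmp(out[j].c, padded + i, 3) == 0)
                seen = 1;
        if (!seen)
            memcpy(out[n++].c, padded + i, 3);
    }
    return n;
}

/* Shared trigrams divided by the number of distinct trigrams in either word. */
double
trigram_similarity(const char *a, const char *b)
{
    Trgm   ta[MAX_TRGM];
    Trgm   tb[MAX_TRGM];
    size_t na = make_trigrams(a, ta);
    size_t nb = make_trigrams(b, tb);
    size_t shared = 0;

    for (size_t i = 0; i < na; i++)
        for (size_t j = 0; j < nb; j++)
            if (memcmp(ta[i].c, tb[j].c, 3) == 0)
            {
                shared++;
                break;
            }
    return (double) shared / (double) (na + nb - shared);
}
```

Under these assumptions 'abc' and 'abd' each have four trigrams and share two of them, so the sketch returns 2 / 6, in line with the value reported for the first query. The multibyte case discussed in the thread is outside the scope of this byte-oriented sketch.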

Re: [HACKERS] pg_trgm

2010-05-27 Thread Teodor Sigaev
Anyone working on making contrib/pg_trgm multibyte encoding aware? If not, I'm interested in the work. It's already multibyte safe since 8.4 -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ --