Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-03-22 Thread Artur Zakirov
I attached the patch, which fixes the pg_trgm documentation. On 19.03.2016 01:18, Artur Zakirov wrote: 2016-03-18 23:46 GMT+03:00 Jeff Janes: <% and <<-> are not documented at all. Is that a deliberate choice? Since they were added as convenience functi

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-03-18 Thread Artur Zakirov
2016-03-18 23:46 GMT+03:00 Jeff Janes : > > > <% and <<-> are not documented at all. Is that a deliberate choice? > Since they were added as convenience functions for the user, I think > they really need to be in the user documentation. > I can send a patch a little bit later. I documented %> an

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-03-18 Thread Jeff Janes
On Mon, Mar 14, 2016 at 9:27 AM, Artur Zakirov wrote: > On 14.03.2016 18:48, David Steele wrote: >> >> Hi Jeff, >> >> On 2/25/16 5:00 PM, Jeff Janes wrote: >> >>> But, It doesn't sound like I am going to win that debate. Given that, >>> I don't think we need a different name for the function. I'm

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-03-15 Thread Artur Zakirov
On 15.03.2016 17:28, David Steele wrote: On 3/14/16 12:27 PM, Artur Zakirov wrote: On 14.03.2016 18:48, David Steele wrote: Hi Jeff, On 2/25/16 5:00 PM, Jeff Janes wrote: But, It doesn't sound like I am going to win that debate. Given that, I don't think we need a different name for the fun

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-03-15 Thread David Steele
On 3/14/16 12:27 PM, Artur Zakirov wrote: > On 14.03.2016 18:48, David Steele wrote: >> Hi Jeff, >> >> On 2/25/16 5:00 PM, Jeff Janes wrote: >> >>> But, It doesn't sound like I am going to win that debate. Given that, >>> I don't think we need a different name for the function. I'm fine with >>> e

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-03-14 Thread Artur Zakirov
On 14.03.2016 18:48, David Steele wrote: Hi Jeff, On 2/25/16 5:00 PM, Jeff Janes wrote: But, It doesn't sound like I am going to win that debate. Given that, I don't think we need a different name for the function. I'm fine with explaining the word-boundary subtlety in the documentation, and

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-03-14 Thread David Steele
Hi Jeff, On 2/25/16 5:00 PM, Jeff Janes wrote: But, It doesn't sound like I am going to win that debate. Given that, I don't think we need a different name for the function. I'm fine with explaining the word-boundary subtlety in the documentation, and keeping the function name itself simple.

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-02-25 Thread Jeff Janes
On Fri, Jan 29, 2016 at 6:15 AM, Teodor Sigaev wrote: >> The behavior of this function is surprising to me. >> >> select substring_similarity('dog' , 'hotdogpound') ; >> >> substring_similarity >> -- >> 0.25 >> > Substring search was designed to search simil

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-02-17 Thread Artur Zakirov
On 12.02.2016 20:56, Teodor Sigaev wrote: On Thu, Feb 11, 2016 at 9:56 AM, Teodor Sigaev wrote: 1 - sml_limit to similarity_limit. sml_threshold is difficult to write I think, similarity_limit is simpler. It seems to me that threshold is the right word by meaning. sml_threshold is my choice.

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-02-12 Thread Teodor Sigaev
On Thu, Feb 11, 2016 at 9:56 AM, Teodor Sigaev wrote: 1 - sml_limit to similarity_limit. sml_threshold is difficult to write I think, similarity_limit is simpler. It seems to me that threshold is the right word by meaning. sml_threshold is my choice. Why abbreviate it like that? Nobody's go

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-02-12 Thread Robert Haas
On Thu, Feb 11, 2016 at 9:56 AM, Teodor Sigaev wrote: >> 1 - sml_limit to similarity_limit. sml_threshold is difficult to write I >> think, >> similarity_limit is simpler. > > It seems to me that threshold is the right word by meaning. sml_threshold is my > choice. Why abbreviate it like that? N

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-02-11 Thread Teodor Sigaev
1 - sml_limit to similarity_limit. sml_threshold is difficult to write I think, similarity_limit is simpler. It seems to me that threshold is the right word by meaning. sml_threshold is my choice. 2 - subword_similarity() to word_similarity(). Agree, according to Mike Rylander opinion in this

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-02-11 Thread Artur Zakirov
On 11.02.2016 16:35, Mike Rylander wrote: On Thu, Feb 11, 2016 at 8:11 AM, Teodor Sigaev wrote: I have attached a new version of the patch. It fixes errors in the operators <->> and %>: - operator <->> did not pass the regression test in CentOS 32 bit (gcc 4.4.7 20120313). - operator %> did not pass

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-02-11 Thread Artur Zakirov
On 11.02.2016 16:11, Teodor Sigaev wrote: I have attached a new version of the patch. It fixes errors in the operators <->> and %>: - operator <->> did not pass the regression test in CentOS 32 bit (gcc 4.4.7 20120313). - operator %> did not pass the regression test in FreeBSD 32 bit (gcc 4.2.1 200708

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-02-11 Thread Mike Rylander
On Thu, Feb 11, 2016 at 8:11 AM, Teodor Sigaev wrote: >> I have attached a new version of the patch. It fixes errors in the operators >> <->> and >> %>: >> - operator <->> did not pass the regression test in CentOS 32 bit (gcc >> 4.4.7 >> 20120313). >> - operator %> did not pass the regression test in

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-02-11 Thread Teodor Sigaev
I have attached a new version of the patch. It fixes errors in the operators <->> and %>: - operator <->> did not pass the regression test in CentOS 32 bit (gcc 4.4.7 20120313). - operator %> did not pass the regression test in FreeBSD 32 bit (gcc 4.2.1 20070831). It was because of variable optimizati

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-02-11 Thread Teodor Sigaev
The behavior of this function is surprising to me. select substring_similarity('dog' , 'hotdogpound') ; substring_similarity -- 0.25 Substring search was designed to search for a similar word in a string: contrib_regression=# select substring_similarity('dog' ,

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-02-10 Thread Artur Zakirov
On 02.02.2016 15:45, Artur Zakirov wrote: On 01.02.2016 20:12, Artur Zakirov wrote: I have changed the patch: 1 - trgm2.data was corrected, duplicates were deleted. 2 - I have added operators <<-> and <->> with GiST index support. A regression test will pass only with the patch http://www.po

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-02-02 Thread Artur Zakirov
On 01.02.2016 20:12, Artur Zakirov wrote: I have changed the patch: 1 - trgm2.data was corrected, duplicates were deleted. 2 - I have added operators <<-> and <->> with GiST index support. A regression test will pass only with the patch http://www.postgresql.org/message-id/capphfdt19fwqxaryjk

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-02-01 Thread Artur Zakirov
On 29.01.2016 18:58, Artur Zakirov wrote: On 29.01.2016 18:39, Alvaro Herrera wrote: Teodor Sigaev wrote: The behavior of this function is surprising to me. select substring_similarity('dog' , 'hotdogpound') ; substring_similarity -- 0.25 Substring s

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-01-29 Thread Artur Zakirov
On 29.01.2016 18:39, Alvaro Herrera wrote: Teodor Sigaev wrote: The behavior of this function is surprising to me. select substring_similarity('dog' , 'hotdogpound') ; substring_similarity -- 0.25 Substring search was designed to search for a similar word in

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-01-29 Thread Alvaro Herrera
Teodor Sigaev wrote: > >The behavior of this function is surprising to me. > > > >select substring_similarity('dog' , 'hotdogpound') ; > > > > substring_similarity > >-- > > 0.25 > > > Substring search was designed to search for a similar word in a string: > contrib_re

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-01-29 Thread Artur Zakirov
On 29.01.2016 17:15, Teodor Sigaev wrote: The behavior of this function is surprising to me. select substring_similarity('dog' , 'hotdogpound') ; substring_similarity -- 0.25 Substring search was designed to search for a similar word in a string: contrib_regres

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-01-29 Thread Teodor Sigaev
The behavior of this function is surprising to me. select substring_similarity('dog' , 'hotdogpound') ; substring_similarity -- 0.25 Substring search was designed to search for a similar word in a string: contrib_regression=# select substring_similarity('dog' ,
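
A rough sketch of why the quoted call could return 0.25, written in Python rather than taken from the patch. It assumes pg_trgm's documented padding rule (two spaces before a word, one after) and assumes the substring score is the best Jaccard ratio between the query's trigram set and any contiguous run of the target's trigrams. The names trigrams() and substring_similarity_sketch() are hypothetical, and the brute-force window scan only handles single-word inputs; it is not the algorithm the patch implements.

```python
def trigrams(word):
    """Ordered trigrams of one word, padded pg_trgm-style (assumed rule)."""
    padded = "  " + word.lower() + " "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def substring_similarity_sketch(needle, haystack):
    """Best Jaccard ratio between needle's trigram set and any contiguous
    window of haystack's ordered trigrams (brute force, illustration only)."""
    needle_set = set(trigrams(needle))
    hay = trigrams(haystack)
    best = 0.0
    for start in range(len(hay)):
        for end in range(start + 1, len(hay) + 1):
            window = set(hay[start:end])
            shared = len(needle_set & window)
            union = len(needle_set | window)
            best = max(best, shared / union)
    return best


# 'dog' yields four trigrams; only "dog" itself occurs inside 'hotdogpound',
# so the best window shares 1 trigram out of a union of 4.
print(substring_similarity_sketch("dog", "hotdogpound"))  # prints 0.25
```

Under these assumptions the padded trigrams "  d", " do" and "og " never appear inside 'hotdogpound', which is why the score stays low even though 'dog' is a literal substring.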

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-01-29 Thread Teodor Sigaev
Sure. I attached two patches. But note that pg_trgm.limit should be used with this command: SHOW "pg_trgm.limit"; If you use this command: SHOW pg_trgm.limit; you will get the error: ERROR: syntax error at or near "limit" LINE 1: SHOW pg_trgm.limit; ^ This is because

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-01-29 Thread Oleg Bartunov
On Fri, Jan 29, 2016 at 1:11 PM, Alvaro Herrera wrote: > Artur Zakirov wrote: > > > What is the status of this patch? In commitfest it is "Needs review". > > "Needs review" means it needs a reviewer to go over it and, uh, review > it. Did I send an email to you prodding you to review patches? I sent >

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-01-29 Thread Alvaro Herrera
Artur Zakirov wrote: > What is the status of this patch? In commitfest it is "Needs review". "Needs review" means it needs a reviewer to go over it and, uh, review it. Did I send an email to you prodding you to review patches? I sent such an email to several people from PostgresPro, but I don't rememb

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-01-29 Thread Artur Zakirov
On 21.01.2016 00:25, Alvaro Herrera wrote: Artur Zakirov wrote: I don't quite understand why aren't we using a custom GUC variable here. These already have SHOW and SET support ... Added GUC variables: - pg_trgm.limit - pg_trgm.substring_limit I added these variables to the documentation. sho

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-01-20 Thread Alvaro Herrera
Artur Zakirov wrote: > >I don't quite understand why aren't we using a custom GUC variable here. > >These already have SHOW and SET support ... > > > > Added GUC variables: > - pg_trgm.limit > - pg_trgm.substring_limit > I added these variables to the documentation. > show_limit() and set_limit()

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-01-15 Thread Jeff Janes
On Fri, Dec 18, 2015 at 11:43 AM, Artur Zakirov wrote: > Hello. > > PostgreSQL has a contrib module named pg_trgm. It is used for fuzzy text > search. It provides some functions and operators for determining the > similarity of the given texts using trigram matching. > > At the moment, in pg_tr

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-01-15 Thread Artur Zakirov
On 12.01.2016 02:31, Alvaro Herrera wrote: I gave a quick look through the patch and noticed a few minor things while trying to understand it. I think the test corpus isn't particularly interesting for how big it is. I'd rather have (a) a small corpus (say 100 words) with which to do detailed r

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-01-13 Thread Tom Lane
Jeff Janes writes: > In the meantime, I had a question about bumping the version to 1.3. > Version 1.2 of pg_trgm has never been included in a community release > (because it didn't make the 9.5 cutoff). So should we really bump the > version to 1.3, or just merge the changes here directly into

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-01-13 Thread Jeff Janes
On Sat, Dec 26, 2015 at 9:12 PM, Jeff Janes wrote: > On Fri, Dec 18, 2015 at 11:43 AM, Artur Zakirov > wrote: >> Hello. >> >> PostgreSQL has a contrib module named pg_trgm. It is used for fuzzy text >> search. It provides some functions and operators for determining the >> similarity of the gi

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-01-12 Thread Teodor Sigaev
! float4 tmpsml = cnt_sml(qtrg, key, *recheck); /* strange bug at freebsd 5.2.1 and gcc 3.3.3 */ ! res = (*(int *) &tmpsml == *(int *) &nlimit || tmpsml > nlimit) ? true : false; What's the co

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-01-11 Thread Tom Lane
Alvaro Herrera writes: >> + >> show_substring_limit()show_substring_limit >> + >> set_substring_limit(real)set_substring_limit > I don't quite understand why aren't we using a custom GUC variable here. Presumably this is following the existing set_limit() precedent in pg_trgm. But

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-01-11 Thread Alvaro Herrera
I gave a quick look through the patch and noticed a few minor things while trying to understand it. I think the test corpus isn't particularly interesting for how big it is. I'd rather have (a) a small corpus (say 100 words) with which to do detailed regression testing, and (b) some larger docume

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2015-12-26 Thread Jeff Janes
On Fri, Dec 18, 2015 at 11:43 AM, Artur Zakirov wrote: > Hello. > > PostgreSQL has a contrib module named pg_trgm. It is used for fuzzy text > search. It provides some functions and operators for determining the > similarity of the given texts using trigram matching. > > At the moment, in pg_tr

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2015-12-19 Thread Alexander Korotkov
On Fri, Dec 18, 2015 at 10:53 PM, Artur Zakirov wrote: > On 18.12.2015 22:43, Artur Zakirov wrote: > >> Hello. >> >> PostgreSQL has a contrib module named pg_trgm. It is used for fuzzy >> text search. It provides some functions and operators for determining the >> similarity of the given texts

Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2015-12-18 Thread Artur Zakirov
On 18.12.2015 22:43, Artur Zakirov wrote: Hello. PostgreSQL has a contrib module named pg_trgm. It is used for fuzzy text search. It provides some functions and operators for determining the similarity of the given texts using trigram matching. Sorry, I have forgotten to mark previous me