Re: [HACKERS] Prefix support for synonym dictionary

2009-08-06 Thread Teodor Sigaev

> 1. The docs should be clarified a little. For instance, it should have a
> link back to the definition of a prefix search (12.3.2). I included my
> doc suggestions as an attachment.

Thank you, merged.


> 2. dsynonym_init() uses findwrd() in a slightly confusing (and perhaps
> fragile) way. After calling findwrd(), the end pointer is pointing at
> either the end of the string, or the *; depending on whether the string
> ends in * and whether flags is NULL. I only mention this because I had
> to take a more careful look to see what was happening. Perhaps add a
> comment to make it more clear?

Added comments:
/*
 * Finds the next whitespace-delimited word within the 'in' string.
 * Returns a pointer to the first character of the word, and a pointer
 * to the next byte after the last character in the word (in *end).
 * Character '*' at the end of word will not be treated as word
 * character if flags is not null.
 */
static char *
findwrd(char *in, char **end, uint16 *flags)
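
The contract in that comment can be illustrated with a small standalone
sketch. This is not the code from the patch: it is ASCII-only, it uses
isspace() where the real function presumably uses multibyte-aware helpers,
and the names findwrd_sketch and SKETCH_PREFIX are made up for the example.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PREFIX 0x01		/* hypothetical flag value for this sketch */

/*
 * Returns the start of the next whitespace-delimited word in 'in', or NULL
 * if only whitespace remains. *end is set to the byte after the word.
 * If flags is not NULL and the word ends in '*', the asterisk is excluded
 * from the word (so *end points at the '*') and SKETCH_PREFIX is reported.
 */
static char *
findwrd_sketch(char *in, char **end, uint16_t *flags)
{
	char	   *start;
	char	   *lastchar;

	/* Skip leading whitespace. */
	while (*in && isspace((unsigned char) *in))
		in++;

	/* Nothing but whitespace left: there is no word. */
	if (*in == '\0')
	{
		*end = NULL;
		return NULL;
	}

	start = lastchar = in;

	/* Advance to the first whitespace byte or the terminator. */
	while (*in && !isspace((unsigned char) *in))
	{
		lastchar = in;
		in++;
	}

	if (flags != NULL && *lastchar == '*')
	{
		/* Trailing '*' is a prefix marker, not part of the word. */
		*flags = SKETCH_PREFIX;
		*end = lastchar;
	}
	else
	{
		if (flags != NULL)
			*flags = 0;
		*end = in;
	}

	return start;
}
```

With the line "indices index*", a caller would pass flags = NULL for the
first word (so a trailing asterisk there would stay part of the word) and a
real flags pointer for the second word, which is where the two possible
positions of the end pointer come from.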




> 3. The patch looks for the special byte '*'. I think that's fine,
> because we depend on the files being in UTF-8 encoding, where it's the
> same byte. However, I thought it was worth mentioning in case we want to
> support other encodings for text search files later.

tsearch_readline() converts the file's UTF-8 contents into the server
encoding. pgsql supports only server encodings that are supersets of ASCII,
so it's safe to use the asterisk with any encoding.
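
A minimal sketch of why a plain byte comparison is enough, assuming an
encoding such as UTF-8 in which every byte of a multibyte character is
0x80 or above, so the byte 0x2A can only ever be a real asterisk. The
function name strip_prefix_marker is hypothetical and does not come from
the patch.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Removes a trailing '*' from 'word' in place and reports whether one was
 * found. Only the last byte is inspected; under the assumption above it
 * cannot be the tail of a multibyte character if it equals '*'.
 */
static bool
strip_prefix_marker(char *word)
{
	size_t		len = strlen(word);

	if (len > 0 && word[len - 1] == '*')
	{
		word[len - 1] = '\0';
		return true;
	}
	return false;
}
```

For example, a three-letter Cyrillic word followed by '*' occupies seven
bytes in UTF-8, and only the last of them compares equal to '*'.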


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


synonym_prefix-0.2.gz
Description: Unix tar archive

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Prefix support for synonym dictionary

2009-08-06 Thread Robert Haas
2009/8/6 Teodor Sigaev <teo...@sigaev.ru>:
>> 1. The docs should be clarified a little. For instance, it should have a
>> link back to the definition of a prefix search (12.3.2). I included my
>> doc suggestions as an attachment.
>
> Thank you, merged.
>
>> 2. dsynonym_init() uses findwrd() in a slightly confusing (and perhaps
>> fragile) way. After calling findwrd(), the end pointer is pointing at
>> either the end of the string, or the *; depending on whether the string
>> ends in * and whether flags is NULL. I only mention this because I had
>> to take a more careful look to see what was happening. Perhaps add a
>> comment to make it more clear?
>
> Added comments:
> /*
>  * Finds the next whitespace-delimited word within the 'in' string.
>  * Returns a pointer to the first character of the word, and a pointer
>  * to the next byte after the last character in the word (in *end).
>  * Character '*' at the end of word will not be treated as word
>  * character if flags is not null.
>  */
> static char *
> findwrd(char *in, char **end, uint16 *flags)
>
>> 3. The patch looks for the special byte '*'. I think that's fine,
>> because we depend on the files being in UTF-8 encoding, where it's the
>> same byte. However, I thought it was worth mentioning in case we want to
>> support other encodings for text search files later.
>
> tsearch_readline() converts the file's UTF-8 contents into the server
> encoding. pgsql supports only server encodings that are supersets of ASCII,
> so it's safe to use the asterisk with any encoding.

Jeff,

Based on these comments, do you want to go ahead and mark this Ready
for Committer?

https://commitfest.postgresql.org/action/patch_view?id=133

...Robert



Re: [HACKERS] Prefix support for synonym dictionary

2009-08-06 Thread Jeff Davis
On Thu, 2009-08-06 at 12:19 -0400, Robert Haas wrote:
> Based on these comments, do you want to go ahead and mark this Ready
> for Committer?

Done, thanks Teodor.

However, on the commitfest page, the patches got updated in the wrong
places: prefix support and filtering dictionary support are pointing
at each other's patches.

Regards,
Jeff Davis






Re: [HACKERS] Prefix support for synonym dictionary

2009-08-06 Thread Robert Haas
On Thu, Aug 6, 2009 at 12:53 PM, Jeff Davis <pg...@j-davis.com> wrote:
> On Thu, 2009-08-06 at 12:19 -0400, Robert Haas wrote:
>> Based on these comments, do you want to go ahead and mark this Ready
>> for Committer?
>
> Done, thanks Teodor.
>
> However, on the commitfest page, the patches got updated in the wrong
> places: prefix support and filtering dictionary support are pointing
> at each other's patches.

Fixed.

...Robert



Re: [HACKERS] Prefix support for synonym dictionary

2009-08-05 Thread Robert Haas
On Sun, Aug 2, 2009 at 3:05 PM, Jeff Davis <pg...@j-davis.com> wrote:
> The patch looks good.
>
> Comments:
>
> 1. The docs should be clarified a little. For instance, it should have a
> link back to the definition of a prefix search (12.3.2). I included my
> doc suggestions as an attachment.
>
> 2. dsynonym_init() uses findwrd() in a slightly confusing (and perhaps
> fragile) way. After calling findwrd(), the end pointer is pointing at
> either the end of the string, or the *; depending on whether the string
> ends in * and whether flags is NULL. I only mention this because I had
> to take a more careful look to see what was happening. Perhaps add a
> comment to make it more clear?
>
> 3. The patch looks for the special byte '*'. I think that's fine,
> because we depend on the files being in UTF-8 encoding, where it's the
> same byte. However, I thought it was worth mentioning in case we want to
> support other encodings for text search files later.

Oleg,

Are you planning to update this patch this week?  If not I will set it
to Returned with Feedback.

Thanks,

...Robert



Re: [HACKERS] Prefix support for synonym dictionary

2009-08-05 Thread Jeff Davis
On Wed, 2009-08-05 at 12:34 -0400, Robert Haas wrote:
> Oleg,
>
> Are you planning to update this patch this week?  If not I will set it
> to Returned with Feedback.

My only comments were related to docs and comments, and I supplied a
patch as a suggested fix for the docs. Also, the patch is very small.

I'd hate to hold it up over such a minor issue, and it seems like a
useful feature. If Oleg is unavailable, would you mind just having a
second reviewer look at the patch to see if they agree with my suggestions,
and then marking it ready for committer review?

Regards,
Jeff Davis




Re: [HACKERS] Prefix support for synonym dictionary

2009-08-02 Thread Jeff Davis
Hi,

The patch looks good.

Comments:

1. The docs should be clarified a little. For instance, it should have a
link back to the definition of a prefix search (12.3.2). I included my
doc suggestions as an attachment.

2. dsynonym_init() uses findwrd() in a slightly confusing (and perhaps
fragile) way. After calling findwrd(), the end pointer is pointing at
either the end of the string, or the *; depending on whether the string
ends in * and whether flags is NULL. I only mention this because I had
to take a more careful look to see what was happening. Perhaps add a
comment to make it more clear?

3. The patch looks for the special byte '*'. I think that's fine,
because we depend on the files being in UTF-8 encoding, where it's the
same byte. However, I thought it was worth mentioning in case we want to
support other encodings for text search files later.

Regards,
Jeff Davis



*** textsearch.sgml	2009-08-02 11:22:38.0 -0700
--- textsearch.sgml.new	2009-08-02 11:22:27.0 -0700
***************
*** 2290,2315 ****
 </para>
 
 <para>
! Star sign <literal>*</literal> at the end of definition word indicates,
! that definition word is a prefix and <function>to_tsquery()</function>
! function will transform that definition to the prefix search format.
! Notice, it is ignored in <function>to_tsvector()</function>.
 </para>
  
  <programlisting>
-  cat $SHAREDIR/tsearch_data/synonym_sample.syn
- postgres    pgsql
- postgresql  pgsql
- postgre pgsql
- gogle   googl
- indices index*
-  cat $SHAREDIR/tsearch_data/synonym_sample.syn
  postgres    pgsql
  postgresql  pgsql
  postgre pgsql
  gogle   googl
  indices index*
  
  =# create text search dictionary syn( template=synonym,synonyms='synonym_sample');
  =# select ts_lexize('syn','indices');
   ts_lexize
--- 2290,2317 ----
 </para>
 
 <para>
! An asterisk (<literal>*</literal>) at the end of definition word indicates
! that definition word is a prefix, and <function>to_tsquery()</function>
! function will transform that definition to the prefix search format (see
! <xref linkend="textsearch-parsing-queries">).
! Notice that it is ignored in <function>to_tsvector()</function>.
 </para>
  
+<para>
+ Contents of <filename>$SHAREDIR/tsearch_data/synonym_sample.syn</>:
+</para>
  <programlisting>
  postgres    pgsql
  postgresql  pgsql
  postgre pgsql
  gogle   googl
  indices index*
+ </programlisting>
  
+<para>
+ Results:
+</para>
+ <programlisting>
  =# create text search dictionary syn( template=synonym,synonyms='synonym_sample');
  =# select ts_lexize('syn','indices');
   ts_lexize
***************
*** 2324,2329 ****
--- 2326,2338 ----
  
   'index':*
  (1 row)
+ 
+ =# select 'indexes are very useful'::tsvector;
+ tsvector 
+ -
+  'are' 'indexes' 'useful' 'very'
+ (1 row)
+ 
  =# select 'indexes are very useful'::tsvector @@ to_tsquery('tst','indices');
   ?column?
  --
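
The documentation example above relies on the prefix form 'index':*
matching the lexeme 'indexes'. A rough sketch of that matching rule,
assuming a simple byte-wise comparison; the function name lexeme_matches is
made up for illustration and is not PostgreSQL's implementation.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Compares a query lexeme against a document lexeme. With is_prefix set
 * (the ':*' form), the document lexeme only has to start with the query
 * lexeme; otherwise the two must be identical.
 */
static bool
lexeme_matches(const char *query, bool is_prefix, const char *lexeme)
{
	if (is_prefix)
		return strncmp(query, lexeme, strlen(query)) == 0;
	return strcmp(query, lexeme) == 0;
}
```

Under this rule the synonym entry "indices index*" lets a query for
'indices' find a document containing 'indexes', while the same entry
without the asterisk would require the lexeme 'index' exactly.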
