[HACKERS] patch for space around the FragmentDelimiter

2009-03-01 Thread Sushant Sinha
FragmentDelimiter is an argument for ts_headline function to separates
different headline fragments. The default delimiter is  ... .
Currently if someone specifies the delimiter as an option to the
function, no extra space is added around the delimiter. However, it does
not look good without space around the delimter. 

Since the option parsing function removes any space around the  given
value, it is not possible to add any desired space. The attached patch
adds space when a FragmentDelimiter is specified.

QUERY:

SELECT ts_headline('english', '
Day after day, day after day,
  We stuck, nor breath nor motion,
As idle as a painted Ship
  Upon a painted Ocean.
Water, water, every where
  And all the boards did shrink;
Water, water, every where,
  Nor any drop to drink.
S. T. Coleridge (1772-1834)
', to_tsquery('english', 'Coleridge  stuck'),
'MaxFragments=2,FragmentDelimiter=***');

OLD RESULT
ts_headline 

 after day, day after day,
   We bstuck/b, nor breath nor motion,
 As idle as a painted Ship
   Upon a painted Ocean.
 Water, water, every where
   And all the boards did shrink;
 Water, water, every where***drop to drink.
 S. T. bColeridge/b
(1 row)




NEW RESULT after the patch

 ts_headline  
--
 after day, day after day,
   We bstuck/b, nor breath nor motion,
 As idle as a painted Ship
   Upon a painted Ocean.
 Water, water, every where
   And all the boards did shrink;
 Water, water, every where *** drop to drink.
 S. T. bColeridge/b



Index: src/backend/tsearch/wparser_def.c
===
RCS file: /home/sushant/devel/pgrep/pgsql/src/backend/tsearch/wparser_def.c,v
retrieving revision 1.20
diff -c -r1.20 wparser_def.c
*** src/backend/tsearch/wparser_def.c	15 Jan 2009 16:33:59 -	1.20
--- src/backend/tsearch/wparser_def.c	2 Mar 2009 06:00:02 -
***
*** 2082,2087 
--- 2082,2088 
  	int			shortword = 3;
  	int			max_fragments = 0;
  	int			highlight = 0;
+ 	int			len;
  	ListCell   *l;
  
  	/* config */
***
*** 2105,2111 
  		else if (pg_strcasecmp(defel-defname, StopSel) == 0)
  			prs-stopsel = pstrdup(val);
  		else if (pg_strcasecmp(defel-defname, FragmentDelimiter) == 0)
! 			prs-fragdelim = pstrdup(val);
  		else if (pg_strcasecmp(defel-defname, HighlightAll) == 0)
  			highlight = (pg_strcasecmp(val, 1) == 0 ||
  		 pg_strcasecmp(val, on) == 0 ||
--- 2106,2116 
  		else if (pg_strcasecmp(defel-defname, StopSel) == 0)
  			prs-stopsel = pstrdup(val);
  		else if (pg_strcasecmp(defel-defname, FragmentDelimiter) == 0)
! 		{
! 			len = strlen(val) + 2 + 1;/* 2 for spaces and 1 for end of string */
! 			prs-fragdelim = palloc(len * sizeof(char));
! 			snprintf(prs-fragdelim, len,  %s , val);
! 		}
  		else if (pg_strcasecmp(defel-defname, HighlightAll) == 0)
  			highlight = (pg_strcasecmp(val, 1) == 0 ||
  		 pg_strcasecmp(val, on) == 0 ||
Index: src/test/regress/expected/tsearch.out
===
RCS file: /home/sushant/devel/pgrep/pgsql/src/test/regress/expected/tsearch.out,v
retrieving revision 1.15
diff -c -r1.15 tsearch.out
*** src/test/regress/expected/tsearch.out	17 Oct 2008 18:05:19 -	1.15
--- src/test/regress/expected/tsearch.out	2 Mar 2009 02:02:38 -
***
*** 624,630 
   body
   bSea/b view wow ubfoo/b bar/u iqq/i
   a href=http://www.google.com/foo.bar.html; target=_blankYES nbsp;/a
!   ff-bg
   script
  document.write(15);
   /script
--- 624,630 
   body
   bSea/b view wow ubfoo/b bar/u iqq/i
   a href=http://www.google.com/foo.bar.html; target=_blankYES nbsp;/a
!  ff-bg
   script
  document.write(15);
   /script
***
*** 712,726 
Nor any drop to drink.
  S. T. Coleridge (1772-1834)
  ', to_tsquery('english', 'Coleridge  stuck'), 'MaxFragments=2,FragmentDelimiter=***');
! ts_headline 
! 
   after day, day after day,
 We bstuck/b, nor breath nor motion,
   As idle as a painted Ship
 Upon a painted Ocean.
   Water, water, every where
 And all the boards did shrink;
!  Water, water, every where***drop to drink.
   S. T. bColeridge/b
  (1 row)
  
--- 712,726 
Nor any drop to drink.
  S. T. Coleridge (1772-1834)
  ', to_tsquery('english', 'Coleridge  stuck'), 'MaxFragments=2,FragmentDelimiter=***');
!  ts_headline  
! --
   after day, day after day,
 We bstuck/b, nor breath nor motion,
   As idle as a painted Ship
 Upon a painted Ocean.
   Water, water, every where
 And all the boards did shrink;
!  Water, water, every where *** drop to drink.
   S. T. bColeridge/b
  (1 row)
  

-- 
Sent via pgsql-hackers mailing list 

Re: [HACKERS] patch for space around the FragmentDelimiter

2009-03-01 Thread Tom Lane
Sushant Sinha sushant...@gmail.com writes:
 FragmentDelimiter is an argument for ts_headline function to separates
 different headline fragments. The default delimiter is  ... .
 Currently if someone specifies the delimiter as an option to the
 function, no extra space is added around the delimiter. However, it does
 not look good without space around the delimter. 

Maybe not to you, for the particular delimiter you happen to be working
with, but it doesn't follow that spaces are always appropriate.

 Since the option parsing function removes any space around the  given
 value, it is not possible to add any desired space. The attached patch
 adds space when a FragmentDelimiter is specified.

I think this is a pretty bad idea.  Better would be to document how to
get spaces into the delimiter, ie, use double quotes:

... FragmentDelimiter =  ...  ...

Hmm, actually, it looks to me that the documentation already shows this,
in the example of the default values.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch for space around the FragmentDelimiter

2009-03-01 Thread Sushant Sinha
yeah you are right. I did not know that you can pass space using double
quotes.

-Sushant.

On Sun, 2009-03-01 at 20:49 -0500, Tom Lane wrote:
 Sushant Sinha sushant...@gmail.com writes:
  FragmentDelimiter is an argument for ts_headline function to separates
  different headline fragments. The default delimiter is  ... .
  Currently if someone specifies the delimiter as an option to the
  function, no extra space is added around the delimiter. However, it does
  not look good without space around the delimter. 
 
 Maybe not to you, for the particular delimiter you happen to be working
 with, but it doesn't follow that spaces are always appropriate.
 
  Since the option parsing function removes any space around the  given
  value, it is not possible to add any desired space. The attached patch
  adds space when a FragmentDelimiter is specified.
 
 I think this is a pretty bad idea.  Better would be to document how to
 get spaces into the delimiter, ie, use double quotes:
 
   ... FragmentDelimiter =  ...  ...
 
 Hmm, actually, it looks to me that the documentation already shows this,
 in the example of the default values.
 
   regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers