Attached are two patches: 1. documentation, 2. regression tests for headline with fragments.
-Sushant.

On Tue, 2008-07-15 at 13:29 +0400, Teodor Sigaev wrote:
> > Attached a new patch that:
> >
> > 1. fixes previous bug
> > 2. better handles the case when cover size is greater than the MaxWords.
>
> Looks good, I'll make some tests with real-world application.
>
> > I have not yet added the regression tests. The regression test suite
> > seemed to be only ensuring that the function works. How many tests
> > should I be adding? Is there any other place that I need to add
> > different test cases for the function?
>
> Just add 3-5 selects to src/test/regress/sql/tsearch.sql with checking basic
> functionality and corner cases like
>  - there is no covers in text
>  - Cover(s) is too big
>  - and so on
>
> Add some words in documentation too, pls.

Index: doc/src/sgml/textsearch.sgml
===================================================================
RCS file: /home/postgres/devel/pgsql-cvs/pgsql/doc/src/sgml/textsearch.sgml,v
retrieving revision 1.44
diff -c -r1.44 textsearch.sgml
*** doc/src/sgml/textsearch.sgml 16 May 2008 16:31:01 -0000 1.44
--- doc/src/sgml/textsearch.sgml 16 Jul 2008 02:37:28 -0000
***************
*** 1100,1105 ****
--- 1100,1117 ----
  </listitem>
  <listitem>
  <para>
+ <literal>MaxFragments</literal>: maximum number of text excerpts
+ or fragments that matches the query words. It also triggers a
+ different headline generation function than the default one. This
+ function finds text fragments with as many query words as possible.
+ Each fragment will be of at most MaxWords and will not have words
+ of size less than or equal to ShortWord at the start or end of a
+ fragment. If all query words are not found in the document, then
+ a single fragment of MinWords will be displayed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
  <literal>HighlightAll</literal>: Boolean flag; if
  <literal>true</literal> the whole document will be highlighted.
  </para>
***************
*** 1109,1115 ****
  Any unspecified options receive these defaults:

  <programlisting>
! StartSel=<b>, StopSel=</b>, MaxWords=35, MinWords=15, ShortWord=3, HighlightAll=FALSE
  </programlisting>
  </para>

--- 1121,1127 ----
  Any unspecified options receive these defaults:

  <programlisting>
! StartSel=<b>, StopSel=</b>, MaxFragments=0, MaxWords=35, MinWords=15, ShortWord=3, HighlightAll=FALSE
  </programlisting>
  </para>

Index: src/test/regress/sql/tsearch.sql
===================================================================
RCS file: /home/postgres/devel/pgsql-cvs/pgsql/src/test/regress/sql/tsearch.sql,v
retrieving revision 1.9
diff -c -r1.9 tsearch.sql
*** src/test/regress/sql/tsearch.sql 16 May 2008 16:31:02 -0000 1.9
--- src/test/regress/sql/tsearch.sql 16 Jul 2008 03:45:24 -0000
***************
*** 208,213 ****
--- 208,253 ----
  </html>',
  to_tsquery('english', 'sea&foo'), 'HighlightAll=true');

+ --Check if headline fragments work
+ SELECT ts_headline('english', '
+ Day after day, day after day,
+ We stuck, nor breath nor motion,
+ As idle as a painted Ship
+ Upon a painted Ocean.
+ Water, water, every where
+ And all the boards did shrink;
+ Water, water, every where,
+ Nor any drop to drink.
+ S. T. Coleridge (1772-1834)
+ ', to_tsquery('english', 'ocean'), 'MaxFragments=1');
+
+ --Check if more than one fragments are displayed
+ SELECT ts_headline('english', '
+ Day after day, day after day,
+ We stuck, nor breath nor motion,
+ As idle as a painted Ship
+ Upon a painted Ocean.
+ Water, water, every where
+ And all the boards did shrink;
+ Water, water, every where,
+ Nor any drop to drink.
+ S. T. Coleridge (1772-1834)
+ ', to_tsquery('english', 'Coleridge & stuck'), 'MaxFragments=2');
+
+ --Fragments when there all query words are not in the document
+ SELECT ts_headline('english', '
+ Day after day, day after day,
+ We stuck, nor breath nor motion,
+ As idle as a painted Ship
+ Upon a painted Ocean.
+ Water, water, every where
+ And all the boards did shrink;
+ Water, water, every where,
+ Nor any drop to drink.
+ S. T. Coleridge (1772-1834)
+ ', to_tsquery('english', 'ocean & seahorse'), 'MaxFragments=1');
+
+
  --Rewrite sub system

  CREATE TABLE test_tsquery (txtkeyword TEXT, txtsample TEXT);
Index: src/test/regress/expected/tsearch.out
===================================================================
RCS file: /home/postgres/devel/pgsql-cvs/pgsql/src/test/regress/expected/tsearch.out,v
retrieving revision 1.14
diff -c -r1.14 tsearch.out
*** src/test/regress/expected/tsearch.out 16 May 2008 16:31:02 -0000 1.14
--- src/test/regress/expected/tsearch.out 16 Jul 2008 03:47:46 -0000
***************
*** 632,637 ****
--- 632,705 ----
  </html>
  (1 row)

+ --Check if headline fragments work
+ SELECT ts_headline('english', '
+ Day after day, day after day,
+ We stuck, nor breath nor motion,
+ As idle as a painted Ship
+ Upon a painted Ocean.
+ Water, water, every where
+ And all the boards did shrink;
+ Water, water, every where,
+ Nor any drop to drink.
+ S. T. Coleridge (1772-1834)
+ ', to_tsquery('english', 'ocean'), 'MaxFragments=1');
+ ts_headline
+ -----------------------------------
+ ... stuck, nor breath nor motion,
+ As idle as a painted Ship
+ Upon a painted <b>Ocean</b>.
+ Water, water, every where
+ And all the boards did shrink;
+ Water, water, every where,
+ Nor any drop
+ (1 row)
+
+ --Check if more than one fragments are displayed
+ SELECT ts_headline('english', '
+ Day after day, day after day,
+ We stuck, nor breath nor motion,
+ As idle as a painted Ship
+ Upon a painted Ocean.
+ Water, water, every where
+ And all the boards did shrink;
+ Water, water, every where,
+ Nor any drop to drink.
+ S. T. Coleridge (1772-1834)
+ ', to_tsquery('english', 'Coleridge & stuck'), 'MaxFragments=2');
+ ts_headline
+ -------------------------------------------
+ ... after day, day after day,
+ We <b>stuck</b>, nor breath nor motion,
+ As idle as a painted Ship
+ Upon a painted Ocean.
+ Water, water, every where
+ And all the boards did shrink;
+ Water, water... every where,
+ Nor any drop to drink.
+ S. T. <b>Coleridge</b>
+ (1 row)
+
+ --Fragments when there all query words are not in the document
+ SELECT ts_headline('english', '
+ Day after day, day after day,
+ We stuck, nor breath nor motion,
+ As idle as a painted Ship
+ Upon a painted Ocean.
+ Water, water, every where
+ And all the boards did shrink;
+ Water, water, every where,
+ Nor any drop to drink.
+ S. T. Coleridge (1772-1834)
+ ', to_tsquery('english', 'ocean & seahorse'), 'MaxFragments=1');
+ ts_headline
+ ------------------------------------
+
+ Day after day, day after day,
+ We stuck, nor breath nor motion,
+ As idle as
+ (1 row)
+
  --Rewrite sub system
  CREATE TABLE test_tsquery (txtkeyword TEXT, txtsample TEXT);
  \set ECHO none