[GENERAL] Tsearch2 index silently fails on PG 7.3.2

2005-03-20 Thread Justin L. Kennedy

The short question is why does this:

select to_tsvector('default', coalesce(name, '') ||' '|| 
coalesce(description, '') ||' '|| coalesce(keywords,'')) from link_items;

give different results than this:

update link_items set linksfti=to_tsvector('default', coalesce(name, '') 
||' '|| coalesce(description, '') ||' '|| coalesce(keywords,''));
select linksfti from link_items;


Here are more details:

I am working with Tsearch2 on a server with version string:
PostgreSQL 7.3.2 on i686-pc-linux-gnu, compiled by GCC 2.96

I have a table with the following schema:
CREATE TABLE link_items
(
  link_id int4,
  name varchar(255),
  url varchar(255),
  description text,
  spanish int4,
  spanishurl varchar(255),
  lastmod date,
  visible int4,
  state varchar(25),
  promisepractice int4,
  keywords text,
  linksfti tsvector
) 
WITH OIDS;
ALTER TABLE link_items OWNER TO gate;

I want linksfti to hold the search engine's indexing data (indexed on 
'name', 'description', and 'keywords'), so I run the following command:

update link_items set linksfti=to_tsvector('default', coalesce(name, '') 
||' '|| coalesce(description, '') ||' '|| coalesce(keywords,''));

The results are pretty empty.  Most have empty strings for data, other 
only index one or two items in the 3 input columns.

For example, after running, my table looks like:
name;description;keywords;linksfti
American Occupational Therapy Association (AOTA) ;Nationally recognized 
professional association for over 60,000 occupational therapists and 
occupational therapy assistants. ;Rehabilitation Professional 
Associations and Councils;'60':1 '000':2
American Physical Therapy Association (APTA);Represents more than 
70,000 physical therapists, physical therapist assistants, and students of 
physical therapy. ;Rehabilitation Professional Associations and 
Councils;'70':1 '000':2
U.S. Deaf Ski  Snowboard Association;Winter sports for people who are 
deaf  relevant links.;Recreation Winter Sports;'u.s':1
Texas Adaptive Aquatics;Adaptive water skiing program for people with 
physical and/or mental disabilities. ;Recreation Water 
Sports;'and/or':1
World T.E.A.M. Sports;Inclusive sports activities.;Recreation Team 
Sports;'t.e.a.m':1
Tennessee;Official State Web Site;Legal State Agencies;
Project Vote Smart;By entering zip code, users get list of all their 
elected officials. Links to elected officials' and candidates' web sites, 
etc. ;Government / Public Policy General;
TRIPOD Captioned Films;Captioned Films for people who are deaf or hard 
of hearing.;Recreation Captioned Movies;


When don't do it as an UPDATE and just print the contents to the screen, I 
get the full expected output:

select name, description, keywords, to_tsvector('default', coalesce(name, 
'') ||' '|| coalesce(description, '') ||' '|| coalesce(keywords,'')) from 
link_items;

United States of America Deaf Track and Field;Promotes training of  
track athletes who are deaf and coaches who are deaf and hearing. 
;Recreation Track;'of':3,11 'and':7,17,22 'are':15,20 'who':14,19 
'deaf':5,16,21 'hear':23 'unit':1 'coach':18 'field':8 'state':2 
'track':6,12,25 'train':10 'athlet':13 'promot':9 'america':4 
'recreat':24
Adventure Pursuit, Inc.;Adventure Pursuit is a group of volunteers who 
like spending time with all kinds of people and focus on adventure sports 
like kayaking.;Recreation Water Sports;'a':7 'is':6 'of':9,18 'on':22 
'all':16 'and':20 'inc':3 'who':11 'kind':17 'like':12,25 'time':14 
'with':15 'focus':21 'group':8 'kayak':26 'peopl':19 'spend':13 
'sport':24,29 'water':28 'volunt':10 'pursuit':2,5 'recreat':27 
'adventur':1,4,23


Using pgAdminIII, I copied (default backup/restore) the database from our 
production server and put in on my personal desktop (Windows 2000, PgSQL 
8.0.0) and re-ran the update query and it gave proper results.

Is it a known issue with 7.3.2, and is there a workaround without 
upgrading the server to 8.0.0?  We will upgrade in a few months, but we 
can't take the server offline now because we have too many websites that 
depend on it.

-- 
Justin Kennedy
Systems Analyst I



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] Tsearch2 index silently fails on PG 7.3.2

2005-03-20 Thread Oleg Bartunov
I don't remember such problem ? What's your tsearch2 setup ?
Oleg
On Thu, 17 Mar 2005, Justin L. Kennedy wrote:
The short question is why does this:
select to_tsvector('default', coalesce(name, '') ||' '||
coalesce(description, '') ||' '|| coalesce(keywords,'')) from link_items;
give different results than this:
update link_items set linksfti=to_tsvector('default', coalesce(name, '')
||' '|| coalesce(description, '') ||' '|| coalesce(keywords,''));
select linksfti from link_items;
Here are more details:
I am working with Tsearch2 on a server with version string:
PostgreSQL 7.3.2 on i686-pc-linux-gnu, compiled by GCC 2.96
I have a table with the following schema:
CREATE TABLE link_items
(
 link_id int4,
 name varchar(255),
 url varchar(255),
 description text,
 spanish int4,
 spanishurl varchar(255),
 lastmod date,
 visible int4,
 state varchar(25),
 promisepractice int4,
 keywords text,
 linksfti tsvector
)
WITH OIDS;
ALTER TABLE link_items OWNER TO gate;
I want linksfti to hold the search engine's indexing data (indexed on
'name', 'description', and 'keywords'), so I run the following command:
update link_items set linksfti=to_tsvector('default', coalesce(name, '')
||' '|| coalesce(description, '') ||' '|| coalesce(keywords,''));
The results are pretty empty.  Most have empty strings for data, other
only index one or two items in the 3 input columns.
For example, after running, my table looks like:
name;description;keywords;linksfti
American Occupational Therapy Association (AOTA) ;Nationally recognized
professional association for over 60,000 occupational therapists and
occupational therapy assistants. ;Rehabilitation Professional
Associations and Councils;'60':1 '000':2
American Physical Therapy Association (APTA);Represents more than
70,000 physical therapists, physical therapist assistants, and students of
physical therapy. ;Rehabilitation Professional Associations and
Councils;'70':1 '000':2
U.S. Deaf Ski  Snowboard Association;Winter sports for people who are
deaf  relevant links.;Recreation Winter Sports;'u.s':1
Texas Adaptive Aquatics;Adaptive water skiing program for people with
physical and/or mental disabilities. ;Recreation Water
Sports;'and/or':1
World T.E.A.M. Sports;Inclusive sports activities.;Recreation Team
Sports;'t.e.a.m':1
Tennessee;Official State Web Site;Legal State Agencies;
Project Vote Smart;By entering zip code, users get list of all their
elected officials. Links to elected officials' and candidates' web sites,
etc. ;Government / Public Policy General;
TRIPOD Captioned Films;Captioned Films for people who are deaf or hard
of hearing.;Recreation Captioned Movies;
When don't do it as an UPDATE and just print the contents to the screen, I
get the full expected output:
select name, description, keywords, to_tsvector('default', coalesce(name,
'') ||' '|| coalesce(description, '') ||' '|| coalesce(keywords,'')) from
link_items;
United States of America Deaf Track and Field;Promotes training of
track athletes who are deaf and coaches who are deaf and hearing.
;Recreation Track;'of':3,11 'and':7,17,22 'are':15,20 'who':14,19
'deaf':5,16,21 'hear':23 'unit':1 'coach':18 'field':8 'state':2
'track':6,12,25 'train':10 'athlet':13 'promot':9 'america':4
'recreat':24
Adventure Pursuit, Inc.;Adventure Pursuit is a group of volunteers who
like spending time with all kinds of people and focus on adventure sports
like kayaking.;Recreation Water Sports;'a':7 'is':6 'of':9,18 'on':22
'all':16 'and':20 'inc':3 'who':11 'kind':17 'like':12,25 'time':14
'with':15 'focus':21 'group':8 'kayak':26 'peopl':19 'spend':13
'sport':24,29 'water':28 'volunt':10 'pursuit':2,5 'recreat':27
'adventur':1,4,23
Using pgAdminIII, I copied (default backup/restore) the database from our
production server and put in on my personal desktop (Windows 2000, PgSQL
8.0.0) and re-ran the update query and it gave proper results.
Is it a known issue with 7.3.2, and is there a workaround without
upgrading the server to 8.0.0?  We will upgrade in a few months, but we
can't take the server offline now because we have too many websites that
depend on it.

Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match