Re: [HACKERS] tsearch patch and namespace pollution
Bruce Momjian wrote:
> I would be happy if all text search functions began with 'ts', 'ts_', or
> 'to_ts', and if we don't clean this up now, it is going to be harder in
> the future.

+1 from me. \df is also much more useful then.

> I think users can expect some migration for text search in 8.3 as a
> benefit of getting into core and be dump-able.

I guess so. Especially if you change some functions, they will have to change source code anyway. So you might as well clean up all functions that don't fit into a sound naming scheme.

Best Regards
Michael Paesold
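As a rough illustration of the naming rule quoted above, here is a minimal Python sketch that flags function names not starting with one of the proposed prefixes. The helper names (follows_ts_naming, unprefixed) and the sample list are hypothetical and chosen only for illustration; the sample uses a few names of the kind discussed in this thread. Note that 'ts' already covers 'ts_', and that a literal prefix test like this would also flag names such as plainto_tsquery, so it is stricter than what anyone here has actually proposed.

```python
# Prefixes taken from the rule quoted above; "ts" also covers "ts_".
ALLOWED_PREFIXES = ("ts", "to_ts")


def follows_ts_naming(name: str) -> bool:
    """Return True if the function name starts with an allowed prefix."""
    return name.startswith(ALLOWED_PREFIXES)


def unprefixed(names):
    """Return the sorted, de-duplicated names that break the prefix rule."""
    return sorted({n for n in names if not follows_ts_naming(n)})


# Illustrative sample only, not the full list from the patch.
sample = ["stat", "parse", "token_type", "to_tsvector",
          "tsquery_and", "tsvector_concat"]

print(unprefixed(sample))
```

Run against a full list of function names, a check like this would give a quick first cut of which ones might need renaming; the final decision still needs human judgment.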
[HACKERS] tsearch patch and namespace pollution
I find the following additions to pg_proc in the current tsearch2 patch:

proc                                     | prorettype
-----------------------------------------+------------
pg_ts_parser_is_visible(oid)             | boolean
pg_ts_dict_is_visible(oid)               | boolean
pg_ts_template_is_visible(oid)           | boolean
pg_ts_config_is_visible(oid)             | boolean
tsvectorin(cstring)                      | tsvector
tsvectorout(tsvector)                    | cstring
tsvectorsend(tsvector)                   | bytea
tsqueryin(cstring)                       | tsquery
tsqueryout(tsquery)                      | cstring
tsquerysend(tsquery)                     | bytea
gtsvectorin(cstring)                     | gtsvector
gtsvectorout(gtsvector)                  | cstring
tsvector_lt(tsvector,tsvector)           | boolean
tsvector_le(tsvector,tsvector)           | boolean
tsvector_eq(tsvector,tsvector)           | boolean
tsvector_ne(tsvector,tsvector)           | boolean
tsvector_ge(tsvector,tsvector)           | boolean
tsvector_gt(tsvector,tsvector)           | boolean
tsvector_cmp(tsvector,tsvector)          | integer
length(tsvector)                         | integer
strip(tsvector)                          | tsvector
setweight(tsvector,char)                 | tsvector
tsvector_concat(tsvector,tsvector)       | tsvector
vq_exec(tsvector,tsquery)                | boolean
qv_exec(tsquery,tsvector)                | boolean
tt_exec(text,text)                       | boolean
ct_exec(character varying,text)          | boolean
tq_exec(text,tsquery)                    | boolean
cq_exec(character varying,tsquery)       | boolean
tsquery_lt(tsquery,tsquery)              | boolean
tsquery_le(tsquery,tsquery)              | boolean
tsquery_eq(tsquery,tsquery)              | boolean
tsquery_ne(tsquery,tsquery)              | boolean
tsquery_ge(tsquery,tsquery)              | boolean
tsquery_gt(tsquery,tsquery)              | boolean
tsquery_cmp(tsquery,tsquery)             | integer
tsquery_and(tsquery,tsquery)             | tsquery
tsquery_or(tsquery,tsquery)              | tsquery
tsquery_not(tsquery)                     | tsquery
tsq_mcontains(tsquery,tsquery)           | boolean
tsq_mcontained(tsquery,tsquery)          | boolean
numnode(tsquery)                         | integer
querytree(tsquery)                       | text
rewrite(tsquery,tsquery,tsquery)         | tsquery
rewrite(tsquery,text)                    | tsquery
rewrite_accum(tsquery,tsquery[])         | tsquery
rewrite_finish(tsquery)                  | tsquery
rewrite(tsquery[])                       | tsquery
stat(text)                               | record
stat(text,text)                          | record
rank(real[],tsvector,tsquery,integer)    | real
rank(real[],tsvector,tsquery)            | real
rank(tsvector,tsquery,integer)           | real
rank(tsvector,tsquery)                   | real
rank_cd(real[],tsvector,tsquery,integer) | real
rank_cd(real[],tsvector,tsquery)         | real
rank_cd(tsvector,tsquery,integer)        | real
rank_cd(tsvector,tsquery)                | real
token_type(oid)                          | record
token_type(text)                         | record
parse(oid,text)                          | record
parse(text,text)                         | record
lexize(oid,text)                         | text[]
lexize(text,text)                        | text[]
headline(oid,text,tsquery,text)          | text
headline(oid,text,tsquery)               | text
headline(text,text,tsquery,text)         | text
headline(text,text,tsquery)              | text
headline(text,tsquery,text)              | text
headline(text,tsquery)                   | text
to_tsvector(oid,text)                    | tsvector
to_tsvector(text,text)                   | tsvector
to_tsquery(oid,text)                     | tsquery
to_tsquery(text,text)                    | tsquery
plainto_tsquery(oid,text)                | tsquery
plainto_tsquery(text,text)               | tsquery
to_tsvector(text)                        | tsvector
to_tsquery(text)                         | tsquery
plainto_tsquery(text)                    | tsquery
tsvector_update_trigger()                | trigger
get_ts_config_oid(text)                  | oid
get_current_ts_config()                  | oid
(82 rows)

(This list omits functions with INTERNAL arguments, as those are of no particular concern to users.)

While most of these are probably OK, I'm disturbed by the prospect that we are commandeering names as generic as parse or stat with argument types as generic as text. I think we need to put a ts_ prefix on some of these. Specifically, I find these names totally unacceptable without a ts_ prefix:

stat(text)       | record
stat(text,text)  | record
token_type(oid)  | record
token_type(text)
Re: [HACKERS] tsearch patch and namespace pollution
Tom Lane wrote:
> I find the following additions to pg_proc in the current tsearch2 patch:

It seems a lot of these are useless and just bloat. I will mark a few:

> proc                                     | prorettype
> -----------------------------------------+------------
> pg_ts_parser_is_visible(oid)             | boolean
> pg_ts_dict_is_visible(oid)               | boolean
> pg_ts_template_is_visible(oid)           | boolean
> pg_ts_config_is_visible(oid)             | boolean

Why would anyone look these up via OID rather than name?

> tsvectorin(cstring)                      | tsvector
> tsvectorout(tsvector)                    | cstring
> tsvectorsend(tsvector)                   | bytea
> tsqueryin(cstring)                       | tsquery
> tsqueryout(tsquery)                      | cstring
> tsquerysend(tsquery)                     | bytea
> gtsvectorin(cstring)                     | gtsvector
> gtsvectorout(gtsvector)                  | cstring
> tsvector_lt(tsvector,tsvector)           | boolean
> tsvector_le(tsvector,tsvector)           | boolean
> tsvector_eq(tsvector,tsvector)           | boolean
> tsvector_ne(tsvector,tsvector)           | boolean
> tsvector_ge(tsvector,tsvector)           | boolean
> tsvector_gt(tsvector,tsvector)           | boolean
> tsvector_cmp(tsvector,tsvector)          | integer
> length(tsvector)                         | integer
> strip(tsvector)                          | tsvector
> setweight(tsvector,char)                 | tsvector
> tsvector_concat(tsvector,tsvector)       | tsvector
> vq_exec(tsvector,tsquery)                | boolean
> qv_exec(tsquery,tsvector)                | boolean
> tt_exec(text,text)                       | boolean
> ct_exec(character varying,text)          | boolean
> tq_exec(text,tsquery)                    | boolean
> cq_exec(character varying,tsquery)       | boolean
> tsquery_lt(tsquery,tsquery)              | boolean
> tsquery_le(tsquery,tsquery)              | boolean
> tsquery_eq(tsquery,tsquery)              | boolean
> tsquery_ne(tsquery,tsquery)              | boolean
> tsquery_ge(tsquery,tsquery)              | boolean
> tsquery_gt(tsquery,tsquery)              | boolean
> tsquery_cmp(tsquery,tsquery)             | integer
> tsquery_and(tsquery,tsquery)             | tsquery
> tsquery_or(tsquery,tsquery)              | tsquery
> tsquery_not(tsquery)                     | tsquery
> tsq_mcontains(tsquery,tsquery)           | boolean
> tsq_mcontained(tsquery,tsquery)          | boolean
> numnode(tsquery)                         | integer
> querytree(tsquery)                       | text
> rewrite(tsquery,tsquery,tsquery)         | tsquery
> rewrite(tsquery,text)                    | tsquery
> rewrite_accum(tsquery,tsquery[])         | tsquery
> rewrite_finish(tsquery)                  | tsquery
> rewrite(tsquery[])                       | tsquery
> stat(text)                               | record
> stat(text,text)                          | record
> rank(real[],tsvector,tsquery,integer)    | real
> rank(real[],tsvector,tsquery)            | real
> rank(tsvector,tsquery,integer)           | real
> rank(tsvector,tsquery)                   | real
> rank_cd(real[],tsvector,tsquery,integer) | real
> rank_cd(real[],tsvector,tsquery)         | real
> rank_cd(tsvector,tsquery,integer)        | real
> rank_cd(tsvector,tsquery)                | real

Do we really need this many ranking functions?

> token_type(oid)                          | record

Again, why by OID?

> token_type(text)                         | record
> parse(oid,text)                          | record
> parse(text,text)                         | record
> lexize(oid,text)                         | text[]
> lexize(text,text)                        | text[]
> headline(oid,text,tsquery,text)          | text
> headline(oid,text,tsquery)               | text
> headline(text,text,tsquery,text)         | text
> headline(text,text,tsquery)              | text
> headline(text,tsquery,text)              | text
> headline(text,tsquery)                   | text
> to_tsvector(oid,text)                    | tsvector
> to_tsvector(text,text)                   | tsvector
> to_tsquery(oid,text)                     | tsquery

Why OID again for the configuration? I just don't see the use case, and it is bloat and causes confusion.

> to_tsquery(text,text)                    | tsquery
> plainto_tsquery(oid,text)                | tsquery
> plainto_tsquery(text,text)               | tsquery

Again, OID. I asked Oleg about this and he said:

    Bruce, just remove oid argument specification from documentation.

So I think we can go ahead and remove cases where the configuration name or object is specified by OID. I have already removed them from the documentation, and I thought the patch had them removed too, but I guess not. Admittedly this API has been in flux.

> to_tsvector(text)                        | tsvector
> to_tsquery(text)                         | tsquery