Re: [GENERAL] Parallel Query should be a top priority
Interesting, that Stonebraker in his interview said about parallel query processing http://searchenterpriselinux.techtarget.com/qna/0,289202,sid39_gci1025832,00.html Putting aside Larry Ellison, would you say, anything should have been done differently? Stonebraker: We made a couple of significant mistakes. The one I most would like to have back was Informix made a nice run in the early 1990s selling parallel query processing and they were really fast and routinely beat Oracle in performance bakeoffs. Informix horizontally partitioned databases and spread them out over different processors and used multiple processors on a single query very efficiently. That was technology that Ingres started developing in 1987 and then Ingres decided to cancel that initiative, so that's one I'd like to have back. Another initiative that failed was that Ingres put a fair amount of money into writing a distributed database system and there just wasn't much of a market for distributed databases. I would have killed that one and kept alive the parallel query processing effort. Ultimately Informix got squashed by Oracle anyway. It's not clear this would have made a whole lot of difference in the outcome. On Mon, 28 Mar 2005, Bruno Wolff III wrote: On Sun, Mar 27, 2005 at 23:58:35 -0500, Mike Mascari mascarm@mascari.com wrote: Without parallel query, the *only* way to decrease the execution time of a single query whose data has been fully cached is to buy the latest-and-greatest which is increasing in speed at decreasing rates, rather than scaling up the number of processors in a single box. A speed barrier to PostgreSQL's ability to execute a single query is fast approaching. I think that is a bit extreme. For some queries you will be able to parallelize accross mutliple back ends and realize some speedup. I would also think that this argument could also apply to cases where the data is on several sets of disks and you wanted to be reading from both sets at once rather than serially. I love PostgreSQL, and with tablespaces, PITR, nested transactions, and more PLs than one knows what do with, it's my favorite database from a usability standpoint. But in terms of performance, the one missing piece to the performance puzzle is parallel query. Consider parallel processing a single query should be moved out from under Miscellaneous on the TODO list and re-categorized as the formerly existent URGENT feature... I think there are other things that could be done to improve optimization that will benefit more people than parallelized queries. Those are really only useful to people where the database is being used by a handful (less than the number of processors and/or disk channels) of users concurrently, who are making long running queries and waiting for the results. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] psql variables
On Fri, 25 Mar 2005, Patrick TJ McPhee wrote: In article [EMAIL PROTECTED], Paul Cunningham [EMAIL PROTECTED] wrote: % I use a bash script (similar to following example) to update tables. % % psql -v passed_in_var=\'some_value\' -f script_name % % Is it possible to pass a value back from psql to the bash script? If you run it like this $(psql -v passed_in_var=\'some_value\' -f script_name) and arrange for all the script output to be in the form var1=value 1 var2=value 2 var3=value 3 ... then var1, var2, etc will be set in bash (or ksh, or the posix shell). Note that there are no spaces around the equals signs. btw, how to expand psql variable in -c ? For example, this works as expected psql discovery-test -v a1=\'message_parts\' -f tt.sql but not psql discovery-test -v a1=\'message_parts\' -c select * from iostat where relname=:a1; ERROR: parse error at or near : at character 36 LINE 1: select * from iostat where relname=:a1; Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] 8.0.1 in a non-standard location and tsearch2
Did you try 'make clean' first ? On Thu, 24 Mar 2005, Ben wrote: I'm trying to install tsearch2 into an empty database on a new 8.0.1 postgres install. The machine already has an older 7.4 install of postgres on it, so I gave configure a --prefix=/usr/local/pg801 option. Postgres installed and started fine (after changing the port), and I was able to create my new empty database without issues. Now comes the part where I fail to install tsearch2. I go to the contrib/tsearch2 directory, run make and make install without issues. make installcheck tries to connect to the older postgres install (I don't see an option to set the port it attempts to use), so I try to pipe tsearch2.sql into the new database. It starts working fine, and then says: ERROR: could not find function tsvector_cmp in file /usr/local/pgsql/lib/tsearch2.so which is interesting, because it's not trying to use /usr/local/pg801/ like it's supposed to. Thoughts? ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] question about 8.1 and stored procedures
On Sun, 20 Mar 2005, Bruce Momjian wrote: Tony Caduto wrote: Hi, I read in a article/interview on http://madpenguin.org/cms/html/62/3677.html that work was being done on improving/adding support for sql standard compliant stored procs/functions Does anyone know exactly what that means? Does it mean that Postgres will have stored procs that can have input and output params? Yes. I think that keyword here is 'sql standard compliant', not stored procedures itself, because we have them for a long time and with support of dozen languages. Or I miss something ? Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] question about 8.1 and stored procedures
On Sun, 20 Mar 2005, Joshua D. Drake wrote: Oleg Bartunov wrote: On Sun, 20 Mar 2005, Bruce Momjian wrote: Tony Caduto wrote: Hi, I read in a article/interview on http://madpenguin.org/cms/html/62/3677.html that work was being done on improving/adding support for sql standard compliant stored procs/functions Does anyone know exactly what that means? Does it mean that Postgres will have stored procs that can have input and output params? Yes. I think that keyword here is 'sql standard compliant', not stored procedures itself, because we have them for a long time and with support of dozen languages. Or I miss something ? What I read from this is, when will PostgreSQL have stored procedures like Oracle. Thus the IN/OUT parameter statement. I mean original Josh's interview An example of what people are working on right now is SQL standard compliant stored procedures. We have procedures now, but they're not compliant with the standard syntax. Nothing about Oracle unless Oracle has standard compliant stored procedures. My understanding is that 8.1 will have a much more mature implementation of stored procedures versus UDFs (Which we have had forever). What's the difference between UDF and stored procedure ? Sincerely, Joshua D. Drake Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] question about 8.1 and stored procedures
On Sun, 20 Mar 2005, Joshua D. Drake wrote: My understanding is that 8.1 will have a much more mature implementation of stored procedures versus UDFs (Which we have had forever). What's the difference between UDF and stored procedure ? Here are a couple of GGIYF references: http://builder.com.com/5100-6388-1045463.html http://blogs.pingpoet.com/vbguru/archive/2004/04/29/535.aspx They are similar but they offer different functionality. At least in the sense of the other databases. Hmm, the only real difference I see - is that SP are precompiled. I think we should clearly outline what is SP and what is UDF and do we are working on SP or just improving and extending our functions. Sincerely, Joshua D. Drake Sincerely, Joshua D. Drake Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Tsearch2 index silently fails on PG 7.3.2
I don't remember such problem ? What's your tsearch2 setup ? Oleg On Thu, 17 Mar 2005, Justin L. Kennedy wrote: The short question is why does this: select to_tsvector('default', coalesce(name, '') ||' '|| coalesce(description, '') ||' '|| coalesce(keywords,'')) from link_items; give different results than this: update link_items set linksfti=to_tsvector('default', coalesce(name, '') ||' '|| coalesce(description, '') ||' '|| coalesce(keywords,'')); select linksfti from link_items; Here are more details: I am working with Tsearch2 on a server with version string: PostgreSQL 7.3.2 on i686-pc-linux-gnu, compiled by GCC 2.96 I have a table with the following schema: CREATE TABLE link_items ( link_id int4, name varchar(255), url varchar(255), description text, spanish int4, spanishurl varchar(255), lastmod date, visible int4, state varchar(25), promisepractice int4, keywords text, linksfti tsvector ) WITH OIDS; ALTER TABLE link_items OWNER TO gate; I want linksfti to hold the search engine's indexing data (indexed on 'name', 'description', and 'keywords'), so I run the following command: update link_items set linksfti=to_tsvector('default', coalesce(name, '') ||' '|| coalesce(description, '') ||' '|| coalesce(keywords,'')); The results are pretty empty. Most have empty strings for data, other only index one or two items in the 3 input columns. For example, after running, my table looks like: name;description;keywords;linksfti American Occupational Therapy Association (AOTA) ;Nationally recognized professional association for over 60,000 occupational therapists and occupational therapy assistants. ;Rehabilitation Professional Associations and Councils;'60':1 '000':2 American Physical Therapy Association (APTA);Represents more than 70,000 physical therapists, physical therapist assistants, and students of physical therapy. ;Rehabilitation Professional Associations and Councils;'70':1 '000':2 U.S. Deaf Ski Snowboard Association;Winter sports for people who are deaf relevant links.;Recreation Winter Sports;'u.s':1 Texas Adaptive Aquatics;Adaptive water skiing program for people with physical and/or mental disabilities. ;Recreation Water Sports;'and/or':1 World T.E.A.M. Sports;Inclusive sports activities.;Recreation Team Sports;'t.e.a.m':1 Tennessee;Official State Web Site;Legal State Agencies; Project Vote Smart;By entering zip code, users get list of all their elected officials. Links to elected officials' and candidates' web sites, etc. ;Government / Public Policy General; TRIPOD Captioned Films;Captioned Films for people who are deaf or hard of hearing.;Recreation Captioned Movies; When don't do it as an UPDATE and just print the contents to the screen, I get the full expected output: select name, description, keywords, to_tsvector('default', coalesce(name, '') ||' '|| coalesce(description, '') ||' '|| coalesce(keywords,'')) from link_items; United States of America Deaf Track and Field;Promotes training of track athletes who are deaf and coaches who are deaf and hearing. ;Recreation Track;'of':3,11 'and':7,17,22 'are':15,20 'who':14,19 'deaf':5,16,21 'hear':23 'unit':1 'coach':18 'field':8 'state':2 'track':6,12,25 'train':10 'athlet':13 'promot':9 'america':4 'recreat':24 Adventure Pursuit, Inc.;Adventure Pursuit is a group of volunteers who like spending time with all kinds of people and focus on adventure sports like kayaking.;Recreation Water Sports;'a':7 'is':6 'of':9,18 'on':22 'all':16 'and':20 'inc':3 'who':11 'kind':17 'like':12,25 'time':14 'with':15 'focus':21 'group':8 'kayak':26 'peopl':19 'spend':13 'sport':24,29 'water':28 'volunt':10 'pursuit':2,5 'recreat':27 'adventur':1,4,23 Using pgAdminIII, I copied (default backup/restore) the database from our production server and put in on my personal desktop (Windows 2000, PgSQL 8.0.0) and re-ran the update query and it gave proper results. Is it a known issue with 7.3.2, and is there a workaround without upgrading the server to 8.0.0? We will upgrade in a few months, but we can't take the server offline now because we have too many websites that depend on it. Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] postgresql8.0 and postgis1.0.0
On Wed, 16 Feb 2005, Pritesh Shah wrote: I understand that this is due to the following: libpostgis.so.8.0 is now liblwgeom.so.1.0 and also histogram2d_in is now lwhistogram2d_in histogram2d_out is now lwhistogram2d_out and so on. Now my problem is there are a lot of databases that use the postgis stuff (like the histogram2d_in) which has changed from the older version to the newer version. What do i do to overcome this problem?? Can somebody help me out with this?? if these functions are internal POSGIS api, you need to remove all defintions come from posgis.sql (match version !) from you dump, then 1. createdb dbname 2. psql dbname posgis.sql (new version) 3. psql dbname your_dump.sql if these functions are part of external api, you're out of luck, you should do search/replace in your dump, probably checking arguments. Cheers, Pritesh ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Creating an index-type for LIKE '%value%'
On Tue, 8 Feb 2005, Larry Rosenman wrote: On Tue, 8 Feb 2005, Oleg Bartunov wrote: On Mon, 7 Feb 2005, Larry Rosenman wrote: Oleg Bartunov wrote: Larry, I pointed you to pg_trgm module mostly following Martijn's suggestions. Now, I see you need another our module - ltree, see http://www.sai.msu.su/~megera/postgres/gist/ltree/ for details. I maybe dense, but could you give me an example? test=# \d tt Table public.tt Column | Type | Modifiers +---+--- domain | ltree | Indexes: ltree_idx gist (domain) test=# select * from tt where domain ~ '*.ru'::lquery; domain - astronet.ru mail.ru pgsql.ru (3 rows) I'm not seeing it for some reason :). Thanks, LER Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Creating an index-type for LIKE '%value%'
On Tue, 8 Feb 2005, Larry Rosenman wrote: It doesn't seem to like pieces with hyphens ('-') in the name, when I try To update blacklist set new_domain_lt=text2ltree(domain) I get a Syntax error (apparently for the hyphens). Try change definition of ISALNUM on ltree.h #define ISALNUM(x) ( isalnum((unsigned char)(x)) || (x) == '_' ) this was already discussed http://www.pgsql.ru/db/mw/msg.html?mid=2034299 Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Creating an index-type for LIKE '%value%'
On Tue, 8 Feb 2005, Larry Rosenman wrote: Oleg Bartunov wrote: On Tue, 8 Feb 2005, Larry Rosenman wrote: It doesn't seem to like pieces with hyphens ('-') in the name, when I try To update blacklist set new_domain_lt=text2ltree(domain) I get a Syntax error (apparently for the hyphens). Try change definition of ISALNUM on ltree.h #define ISALNUM(x) ( isalnum((unsigned char)(x)) || (x) == '_' ) this was already discussed http://www.pgsql.ru/db/mw/msg.html?mid=2034299 Thanks! Now, how can I make it always case-insensitive? from http://www.sai.msu.su/~megera/postgres/gist/ltree/ It is possible to use several modifiers at the end of a label: @ Do case-insensitive label matching * Do prefix matching for a label % Don't account word separator '_' in label matching, that is 'Russian%' would match 'Russian_nations', but not 'Russian' Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] Creating an index-type for LIKE '%value%'
On Tue, 8 Feb 2005, Larry Rosenman wrote: Oleg Bartunov wrote: On Tue, 8 Feb 2005, Larry Rosenman wrote: Oleg Bartunov wrote: On Tue, 8 Feb 2005, Larry Rosenman wrote: It doesn't seem to like pieces with hyphens ('-') in the name, when I try To update blacklist set new_domain_lt=text2ltree(domain) I get a Syntax error (apparently for the hyphens). Try change definition of ISALNUM on ltree.h #define ISALNUM(x) ( isalnum((unsigned char)(x)) || (x) == '_' ) this was already discussed http://www.pgsql.ru/db/mw/msg.html?mid=2034299 Thanks! Now, how can I make it always case-insensitive? from http://www.sai.msu.su/~megera/postgres/gist/ltree/ It is possible to use several modifiers at the end of a label: @ Do case-insensitive label matching * Do prefix matching for a label % Don't account word separator '_' in label matching, that is 'Russian%' would match 'Russian_nations', but not 'Russian' Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 Does that apply to each node, or the entire string? I'd like to not have to parse the lquery string and make each node following it with an @. I'm a little bit tired :), if you want case insenstive for the whole node, you could use built-in fuinction 'lower(text)' ! use text2ltree(lower(text)) LER Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] Creating an index-type for LIKE '%value%'
Read http://www.sai.msu.su/~megera/postgres/gist/pg_trgm/README.pg_trgm Oleg On Mon, 7 Feb 2005, Martijn van Oosterhout wrote: On Mon, Feb 07, 2005 at 09:28:24AM -0800, CG wrote: As I was exploring ways to optimize my application's use of the database, which has to run the horrid SELECT * FROM table WHERE field LIKE '%value%'; in places, I thought this solution could be built upon to allow for an easier deployment. snip AFAICT, the right way to do this would be to create an index type which would take care of splitting the varchar field, and to have the query planner use the index in a SELECT when a LIKE '%value%' is used in the WHERE clause. Tsearch2 is fantastic, but it works best for fields that contain words. I have to sift through alphanumeric identification numbers. Seems to me to depends quite a bit or your problem domain. How big are the string's you're searching. If you're not searching on word boundaries like tsearch, you'd need to split on every char. Say you split on three character blocks. So the string Hello World would need entries for: Hel, ell, llo, lo , o W, Wo, Wor, orl, rld For N character strings you'd need N-2 entries. If you're storing entire documents it's not practical. But if all your strings are maybe 15 characters long (maybe serial numbers), it might be practical. I havn't looked at tsearch but maybe you can customise it to your needs. If you can redefine the split function you could make it work appropriately. Then you can define the ~~ operator (which is LIKE) to call tsearch. This in just off the top of my head, but maybe it can work. Hope this helps, Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] Creating an index-type for LIKE '%value%'
On Mon, 7 Feb 2005, Larry Rosenman wrote: Oleg Bartunov wrote: Read http://www.sai.msu.su/~megera/postgres/gist/pg_trgm/README.pg_trgm Oleg On Mon, 7 Feb 2005, Martijn van Oosterhout wrote: Would you have a suggestion to index the following query: SELECT domain,message,'1' as truth FROM blacklist WHERE somedomain ~* '(?:.+\.|)' || domain || '\$') The somedomain is actually a constant passed in from Exim (it's the sender's righthand Side of an E-Mail address). I'm looking to see if the domain name is in my blacklist. I may just be SOL, but I figured I'd ask. Larry, I pointed you to pg_trgm module mostly following Martijn's suggestions. Now, I see you need another our module - ltree, see http://www.sai.msu.su/~megera/postgres/gist/ltree/ for details. Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Eeek! Major problem after reload with tsearch2
This is know issue with OIDS. You,probably, needed to apply regprocedure_7.4.patch.gz patch from http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/ Oleg On Thu, 3 Feb 2005, Karl Denninger wrote: Ok, I found out what was going on with the tsearch2 module and reloading after an upgrade to 8.0.1 from 7.4.1 The data now loads cleanly, and selects are fine. HOWEVER, when I attempt an update, I issue the following SQL command (this table has a tsearch2 vector in it:) update post set invisible='0' where ordinal='2843' And get back: ERROR: cache lookup failed for function 36476 CONTEXT: SQL statement select lt.tokid, map.dict_name from public.pg_ts_cfgmap as map, public.pg_ts_cfg as cfg, public.token_type( $1 ) as lt where lt.alias = map.tok_alias and map.ts_name = cfg.ts_name and cfg.oid= $2 order by lt.tokid desc; Ai! A reindex did nothing. What did I miss? Looks like there's something missing, but what?! -- Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Eeek! Major problem after reload with tsearch2
On Thu, 3 Feb 2005, Karl Denninger wrote: The patch is in the 8.0.1 version of Tsearch2 already. The problem is that I have a dump from a 7.4.1 database taken with the 8.0.1 pg_dumpall that I need to be able to get back online on 8.0.x. you had to apply patch to 7.4.1 db before dumping or use regprocedure_update.sql to update your live 7.4.1 database ! pg_dump (pg_dumpall) has nothing with the problem. After applying patch and dumping db you have OID-free db and should have no problem ! Is the only option to find all the functions in the tsearch.sql file, drop them by hand, remove all the tsearch2 index fields, then reload tsearch2.sql and re-create the indices? That's not impossible, but a shizload of work, as there's no good way that I can see to drop all the tsearch2 functions in a single step (e.g. I'd have to go through the tsearch2.sql file individually, find all the entries, drop them, etc.) Another possibility Does a pg_dumpall dump functions as well? It appears not from the documentation - so if I drop the columns and then dump the database, I should have a clean dump without the OID stuff in it. If I then re-init and reload the data, I should then be able to do so without the tsearch2.sql stuff. I can then reload the tsearch2.sql functions and re-create the indices. Sound plausible? No, if you have OIDs in db pg_dump* will dump them. Try regprocedure_update.sql and read http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/tsearch-V2-intro.html If you have 7.4.1 db running you might go way described earlier. - Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] Problem resolved (tsearch2 inhibiting migration)
On Thu, 3 Feb 2005, Karl Denninger wrote: On Thu, Feb 03, 2005 at 10:20:47PM -0500, Tom Lane wrote: Karl Denninger [EMAIL PROTECTED] writes: I agree with this - what would be even better would be a way to create 'subclasses' for things like this, which could then be 'included' easily. We could decree that a contrib module's script should create a schema and shove everything it makes into that schema. Then DROP SCHEMA CASCADE is all you need to get rid of it. However, you'd probably end up having to add this schema to your search path to use the module conveniently. regards, tom lane I would prefer that vastly over what I had to deal with this time. Until I discovered the untsearch2.sql script I was seriously considering the trouble of backing this out BY HAND either in a dump or in the online database. As I already pointed you should apply regprocedure_update.sql to your 7.4.1 database before dumping to get OIDs free tsearch2 functions ! This is the way most people dump their db without any problem. Unfortunately, this was not documented in 7.4 readme, but it does in current documentations, web site and discussed in mailing list. Either would have been a stupendous amount of work. Please consider this change in approach - someone else has to have been bit in the butt by this one other than me by now. -- Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] GiST index not used for ORDER BY?
On Thu, 27 Jan 2005, Max wrote: Hi, I'm setting up a simple search engine using Tsearch2. The basic idea is: a user enters a search query and a maximum of 1000 results is returned, sorted by date, newest first. At the moment the table holding the searchable data has 1.1 million entries. It works great when the search only produces a few hundred results. However when people search on a common word with 10.000+ results, there's a performance problem. The database design looks like this: CREATE TABLE posts_index ( startdate INT NOT NULL, idxFTI tsvector, ); Where startdate is a unix timestamp, and idxFTI is a tsvector with the data to be searched. Since only 1000 results need to be returned sorted by date (newest first), I hoped to solve the problem by installing the btree_gist extension and adding the following index: CREATE INDEX idxFTI_idx2 ON posts_index USING gist(idxFTI,(-startdate)); However the -startdate portion of the index doesn't seem to be used: -- EXPLAIN SELECT startdate, headline(subject,q) AS subject FROM posts_index i, to_tsquery('default', '_SEARCH_TERM_') AS q WHERE idxfti @@ q ORDER BY (-i.startdate) LIMIT 1000; I assume you already vacuum your db. Hmm, seems you need to rewrite your query. EXPLAIN SELECT startdate, headline(subject,q) AS subject FROM ( SELECT startdate, subject from posts_index i, to_tsquery('default', '_SEARCH_TERM_') AS q WHERE idxfti @@ q ORDER BY (-i.startdate) LIMIT 1000) as foo; I bet your query will be much faster. In your query all founded tuples should be read from disk to calculate headline(), while in my query maximum 1000 tuples will be read. So, performance gain could be noticeable, for example, if search returns 10,000 tuples, my query will be 10x faster than yours :) I think this is what you observed. QUERY PLAN Limit (cost=5152014.10..5152016.60 rows=1000 width=126) - Sort (cost=5152014.10..5155079.61 rows=1226201 width=126) Sort Key: (- i.startdate) - Nested Loop (cost=0.00..4912754.84 rows=1226201 width=126) - Function Scan on q (cost=0.00..12.50 rows=1000 width=32) - Index Scan using idxfti_idx2 on posts_index i (cost=0.00..4891.27 rows=1227 width=253) Index Cond: (i.idxfti @@ outer.q) Any suggestions? Regards, Max ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED]) Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] GiST index not used for ORDER BY?
On Thu, 27 Jan 2005, Max wrote: Hi, At 09:54 PM 1/27/2005, you wrote: On Thu, 27 Jan 2005, Max wrote: I'm setting up a simple search engine using Tsearch2. The basic idea is: a user enters a search query and a maximum of 1000 results is returned, sorted by date, newest first. At the moment the table holding the searchable data has 1.1 million entries. It works great when the search only produces a few hundred results. However when people search on a common word with 10.000+ results, there's a performance problem. CREATE TABLE posts_index ( startdate INT NOT NULL, idxFTI tsvector, ); CREATE INDEX idxFTI_idx2 ON posts_index USING gist(idxFTI,(-startdate)); I assume you already vacuum your db. Yes, I did vacuum analyze it. And he does use the first part of the index (idxFTI), just not the second part (-startdate). Hmm, seems you need to rewrite your query. EXPLAIN SELECT startdate, headline(subject,q) AS subject FROM ( SELECT startdate, subject from posts_index i, to_tsquery('default', '_SEARCH_TERM_') AS q WHERE idxfti @@ q ORDER BY (-i.startdate) LIMIT 1000) as foo; I bet your query will be much faster. In your query all founded tuples should be read from disk to calculate headline(), while in my query maximum 1000 tuples will be read. So, performance gain could be noticeable, for example, if search returns 10,000 tuples, my query will be 10x faster than yours :) I think this is what you observed. Thanks for your help, however headline() doesn't seem the problem. Here's an EXPLAIN ANALYZE using your query and a common word as SEARCH_TERM: -- QUERY PLAN Subquery Scan foo (cost=5368809.49..5368824.49 rows=1000 width=181) (actual time=363455.642..363510.277 rows=1000 loops=1) - Limit (cost=5368809.49..5368811.99 rows=1000 width=126) (actual time=363454.387..363455.983 rows=1000 loops=1) - Sort (cost=5368809.49..5372006.34 rows=1278741 width=126) (actual time=363454.380..363455.471 rows=1000 loops=1) Sort Key: (- i.startdate) - Nested Loop (cost=0.00..5118844.92 rows=1278741 width=126) (actual time=0.140..354003.773 rows=343974 loops=1) - Function Scan on q (cost=0.00..12.50 rows=1000 width=32) (actual time=0.015..0.018 rows=1 loops=1) - Index Scan using idxfti_idx2 on posts_index i (cost=0.00..5099.65 rows=1279 width=253) (actual time=0.111..353068.267 rows=343974 loops=1) Index Cond: (i.idxfti @@ outer.q) Total runtime: 363571.960 ms It still seems to rather sort 343.974 rows and take over 5 minutes to complete, than use the index for the date. While searching on less common words takes less than a second.Omitting headline() completely doesn't changes anything either. So it must be something else. strange. Why did you omit select ? So, search returns 343.974 rows. Am I right ? try select * from YOUR_TABLE limit 343974; then you'll see how much time requires just for reading results. Regards, Max Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] Lower case
Vladimir, there is pgsql-ru-general mailing list for russian speaking people Oleg On Wed, 26 Jan 2005, Vladimir S. Petukhov wrote: Hi! Sorry for my English.. I want to do case-insensitivity search, like this ... WHERE lower (column_name) LIKE lower (%value%); This work fine for English.. But i need search for Russian words, lower() operator does not work with Russian (non-English) chars, but ORDER works fine... ??? What's wrong? Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] Cube
On Mon, 24 Jan 2005 [EMAIL PROTECTED] wrote: Hi, I have just installed V 8.0 on XP and I have discovered some interesting functions related to cubes and crosstabs. Navigating on the site I was not able to find any information/documentation. Please help. Why not use search engine ? For example, http://www.pgsql.ru/db/pgsearch/index.html?q=crosstabs Thank you Regards Fabrizio ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] visualizing B-tree index coverage
Excuse me for bothering but what kind of search engine you developed. Does it looks like sets comparing ? Oleg On Tue, 25 Jan 2005, TJ O'Donnell wrote: Since I'm using a multi-column index, I can greatly influence the nature of the index created, depending on which columns I use and how many. I'm searching for an optimal set of columns that creates an index that, for sure does not have every value the same, nor only two values. Instead, I want to see how well I've spread the index out over the data (if that phrasing makes sense). More specifically, I have character data representing molecular structures. I've written (rather slow) search functions. I can create any number of columns that fingerprint each structure, e.g. # Carbon atoms, # N atoms, # single bonds, etc. I expect my fingerprints will not be unique (fingerprint may be a poor analogy), but rather will classify similar structures together. I create a multi-column index on these counts and get about 2-3 times speedup using 13 columns right now. For example: select count(smiles) from structure where oe_matches(smiles,'c1c1CC(=O)NC') about 15 sec. select count(smiles) from structure where (_c, _n, _o, _s, _p, _halo, _arom_c, _arom_n, _arom_o, _arom_s, _atoms, _single_bonds, _other_bonds) = ( 3,1,1,0,0,0, 6,0,0,0, 11,4,7 ) and oe_matches(smiles,'c1c1CC(=O)NC') about 6 seconds when the (_c, etc.) is a multi-column index. The data isn't inherently structured in any way that invites some particular number of columns for indexing. I don't want to use too many, nor too few columns. I also want to optimize the nature(which atom types, bond types, etc.) of the count columns. While I could do this and use the speedup as the measure of success, I think that if my B-tree were covering the data well, I would get the best results. Covering means finding that optimal situation where there is not one index for all rows and also not a unique index for every row - something inbetween would be ideal, or is that basically a wrong idea? TJ Useful explanation of PostgreSQL index format: http://www.faqs.org/docs/ppbook/c13329.htm I think you are aiming for the wrong thing. The worst possible index is one with every value the same. The second worst (still basically useless) is one with only two values. The greater the differentiation of the data, the more workload is reduced on a search. Since it isn't a straight binary tree, I don't think that having highly dissimilar data in the index should be a problem. Do you have data or experience that shows otherwise? -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of TJ O'Donnell Sent: Tuesday, January 25, 2005 2:19 PM To: pgsql-general@postgresql.org Cc: [EMAIL PROTECTED] Subject: [GENERAL] visualizing B-tree index coverage Does anyone know of a tools that allows one to visualize the tree created by a multi-column B-tree index? A picture of a tree with branches, showing how branchy the tree is would be great. I'm wondering how well I've clustered the data in my table using the multi-column index. In other words, do my multi-columns sufficiently but not overly discriminate rows from each other? Do I have too many with the same index? (not enough branches) Do I have a unique index for each row? (way too many branches) Thanks, TJ ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] tsearch2 dump and restore
On Fri, 21 Jan 2005, Sean Davis wrote: Just a clarification: in 8.0.0, tsearch2-containing databases are correctly dumped and restored without the patch mentioned in the docs? Yes, it does. Thanks, Sean ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Change type
On Tue, 18 Jan 2005, Vladimir S. Petukhov wrote: Hi! Sorry for my English.. I want to dinamcly change type of column. If possible, of course.Trivial idia - create new temporary column, try to write value from old columt to temporarity (if type conersion ok - this made using select/update command and type conversion checks on client's side), then delete old column, and rename temporary column.. But this decision looked quite ill... Do you have better idea? hmm, you may use generic type like 'varchar' to store your data -- c p , p p ,c p. p p! Now playing: 03 - Lara And Reyes - Exotico.mp3 AutoGenerated by fortune xmms... ---(end of broadcast)---TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] PostgreSQL 8.0.0 Release Candidate 2
On Wed, 22 Dec 2004, Scott Marlowe wrote: The new site launched, and the download url has changed, it is now: http://www.postgresql.org/download/mirrors-ftp I don't see my mirror (Russia) ftp.ru.postgresql.org ! (or maybe http://wwwmaster.postgresql.org/download/mirrors-ftp, but I'm not sure. They both work for now.) On Tue, 2004-12-21 at 21:37, Marc G. Fournier wrote: Its been 3 weeks since our first Release Candidate, and we're down to the finally stretch for Full Release, which is looking good for happening before the New Year. A current list of *known* supported platforms can be found at: http://developer.postgresql.org/supported-platforms.html We're always looking to improve that list, and with this RC looking to be the last one before release, we encourage anyone that is running a platform not listed to please report on any success or failures with Release Candidate 2. As always, this release is available on all mirrors, as listed at: http://www.postgresql.org/mirrors-ftp.html For those using Bittorrent, David Fetter has updated the .torrents, which can be downloaded from: http://bt.postgresql.org Please report any bug reports with this Release Candidate to: pgsql-bugs@postgresql.org Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] NewsForge Poll: Favorite open source database?
On Thu, 16 Dec 2004, Marcelo Cid wrote: http://www.newsforge.com/pollBooth.pl?qid=54 I see strange line below results: (You've already voted.) I don't believe such results because I do know I didn't been here :) http://www.newsforge.com/pollBooth.pl?qid=54section=mainpageaid=4 section=mainpageaid=4 Marcelo Cid Analista de Sistemas PRODAM/DIDET/DAA/VEA 5080-9227 Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] Index on geometry and timestamp
On Tue, 7 Dec 2004, Werdin Jens wrote: Hi, I need an index on a postgis-point and a timestamp. I'm using an GiST index on the geometry. But the creation of an GiST index on geometry and timestamp seems to be impossible, because GiST doesn't support Timestamps. Is there a possibility to solve my problem? yes, use contrib/btree_gist Greetings Jens Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL]
On Mon, 6 Dec 2004, Konstantin Danilov wrote: Hello List! PostgreSQL 8 does not correctly sort words written in cyrillic. It is a bug I suppose :) Probably, it's a problem of your setup. Here is what I have: regression=# select version(); version --- PostgreSQL 8.0.0rc1 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2.3 (1 row) regression=# select upper(''),'' ''; upper | ?column? ---+-- | t (1 row) ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] List archives search function broken
On Sat, 4 Dec 2004, Michael Fuhr wrote: Using the list archive search function currently fails with 503 Service Unavailable. Should messages about list archive problems go to pgsql-general, or would it be better to use one of the other lists like bugs, hackers, or www? I think complaints should go to -www list. btw, www.pgsql.ru is working and I think it could be as 'backstop' for main search engine, for example on server error there could link to http://www.pgsql.ru/db/pgsearch/index.html?set=archives Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [pgsql-www] [GENERAL] List archives search function broken
John, you could always rely on your script where you check if db handler is alive and if not then show gentle error message with link to www.pgsql.ru Oleg On Sun, 5 Dec 2004, John Hansen wrote: Using the list archive search function currently fails with 503 Service Unavailable. Should messages about list archive problems go to pgsql-general, or would it be better to use one of the other lists like bugs, hackers, or www? I think complaints should go to -www list. btw, www.pgsql.ru is working and I think it could be as 'backstop' for main search engine, for example on server error there could link to http://www.pgsql.ru/db/pgsearch/index.html?set=archives Not a bad idea, but I won't be able to implement that, as the frontend does not allow per virtual host error pages. - Unless someone is willing to write a patch for pound (http://www.apsis.ch/pound). Btw, this would have been caught earlier, if not for the fact that I've been without an internet connection @ home for the past few days. Db backend server OS had shut down for yet to be determined reasons. Kind Regards, John ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] Tsearch2 and Unicode?
Markus, it'd be nice if you (or somebody) wrtite a note about unicode, so it could be added to tsearch2 documentation. It will help people and save time and hair :) Oleg On Mon, 22 Nov 2004, Markus Wollny wrote: Hi! I dug through my list-archives - I actually used to have the very same problem that you described: special chars being swallowed by tsearch2-functions. The source of the problem was that I had INITDB'ed my cluster with [EMAIL PROTECTED] as locale, whereas my databases used Unicode encoding. This does not work correctly. I had to dump, initdb to the correct UTF-8-locale (de_DE.UTF-8 in my case) and reload to get tsearch2 to work correctly. You may find the original discussion here: http://archives.postgresql.org/pgsql-general/2004-07/msg00620.php If you wish to find out which locale was used during INITDB for your cluster, you may use the pg_controldata program that's supplied with PostgreSQL. Kind regards Markus -Ursprüngliche Nachricht- Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im Auftrag von Dawid Kuroczko Gesendet: Mittwoch, 17. November 2004 17:17 An: Pgsql General Betreff: [GENERAL] Tsearch2 and Unicode? I'm trying to use tsearch2 with database which is in 'UNICODE' encoding. It works fine for English text, but as I intend to search Polish texts I did: insert into pg_ts_cfg('default_polish', 'default', 'pl_PL.UTF-8'); (and I updated other pg_ts_* tables as written in manual). However, Polish-specific chars are being eaten alive, it seems. I.e. doing select to_tsvector('default_polish', body) from messages; results in list of words but with national chars stripped... I wonder, am I doing something wrong, or just tsearch2 doesn't grok Unicode, despite the locales setting? This also is a good question regarding ispell_dict and its feelings regarding Unicode, but that's another story. Assuming Unicode unsupported means I should perhaps... oh, convert the data to iso8859 prior feeding it to_tsvector()... interesting idea, but so far I have failed to actually do it. Maybe store the data as 'bytea' and add a column with encoding information (assuming I don't want to recreate whole database with new encoding, and that I want to use unicode for some columns (so I don't have to keep encoding with every text everywhere...). And while we are at it, how do you feel -- an extra column with tsvector and its index -- would it be OK to keep it away from my data (so I can safely get rid of them if need be)? [ I intend to keep index of around 2 000 000 records, few KBs of text each ]... Regards, Dawid Kuroczko ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] TSearch2: Problems with compound words and stop words
On Wed, 17 Nov 2004, Timo Haberkern wrote: sorry for the late answer, i was on holyday, see my remarks below Oleg Bartunov wrote: On Fri, 5 Nov 2004, Timo Haberkern wrote: Oleg, i use TSearch2 with PostgreSQL 7.4.6 and i applied the compoundword patch yesterday. The configuration changed a little bit but the result is the same. I get no compound words. I'm using the locale de_DE with encoding ISO8859-1 for the database. I think i spell is working correctly except the compound words. If i try SELECT lexize('de_ispell', 'springt') i get lexize {springen,springen} which seems correct. But a SELECT lexize('de_ispell', 'Autobahn') results in lexize {autobahn} i would expect {auto,bahn, autobahn} Hmm, have you checked 'Autobahn' in ispell dictionary ? Does dictionary you used supports 'Z' flag for compound words ? Autobahn is in the ispell dictionary. What does a ispell dictionary need to support the Z flag??? Try ispell -C Autobahn search 'compound' in 'man ispell' for details. the problem exists only if ispell *does* splits word correctly while tsearch2 doesn't. You should find correct ispell dictionary for german or create it yourself. You may consult monzilla.net http://staff.science.uva.nl/~christof/monzilla/research/project-dr.html Timo The new configuration after the compound word patch: Seems you overestimate my capabilities :) Actions dict_name http://www.rotex-service.com/phppgadmin/display.php?database=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=2sortdir=ascstrings=expandedpage=1 dict_init http://www.rotex-service.com/phppgadmin/display.php?database=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=3sortdir=ascstrings=expandedpage=1 dict_initoption http://www.rotex-service.com/phppgadmin/display.php?database=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=4sortdir=ascstrings=expandedpage=1 dict_lexize http://www.rotex-service.com/phppgadmin/display.php?database=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=5sortdir=ascstrings=expandedpage=1 dict_comment http://www.rotex-service.com/phppgadmin/display.php?database=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=6sortdir=ascstrings=expandedpage=1 Edit http://www.rotex-service.com/phppgadmin/display.php?action=confeditrowstrings=expandedpage=1key%5Bdict_name%5D=simpledatabase=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=sortdir= Delete http://www.rotex-service.com/phppgadmin/display.php?action=confdelrowstrings=expandedpage=1key%5Bdict_name%5D=simpledatabase=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=sortdir= simple dex_init(text) /NULL/ dex_lexize(internal,internal,integer) Simple example of dictionary. Edit http://www.rotex-service.com/phppgadmin/display.php?action=confeditrowstrings=expandedpage=1key%5Bdict_name%5D=en_stemdatabase=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=sortdir= Delete http://www.rotex-service.com/phppgadmin/display.php?action=confdelrowstrings=expandedpage=1key%5Bdict_name%5D=en_stemdatabase=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=sortdir= en_stem snb_en_init(text) /usr/local/pgsql/share/contrib/english.stop snb_lexize(internal,internal,integer) English Stemmer. Snowball. Edit http://www.rotex-service.com/phppgadmin/display.php?action=confeditrowstrings=expandedpage=1key%5Bdict_name%5D=ru_stemdatabase=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=sortdir= Delete http://www.rotex-service.com/phppgadmin/display.php?action=confdelrowstrings=expandedpage=1key%5Bdict_name%5D=ru_stemdatabase=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=sortdir= ru_stem snb_ru_init(text) /usr/local/pgsql/share/contrib/russian.stop snb_lexize(internal,internal,integer) Russian Stemmer. Snowball. Edit
Re: [GENERAL] TSearch2: Problems with compound words and stop words
Timo, take a look into .aff file and search 'compoundwords'. german ispell file I got from http://j3e.de/ispell/igerman98/ has no support for compound words: 'compoundwords off' Norwegian, for example, has: compoundwords controlled z compoundmin 4 Oleg On Wed, 17 Nov 2004, Oleg Bartunov wrote: On Wed, 17 Nov 2004, Timo Haberkern wrote: sorry for the late answer, i was on holyday, see my remarks below Oleg Bartunov wrote: On Fri, 5 Nov 2004, Timo Haberkern wrote: Oleg, i use TSearch2 with PostgreSQL 7.4.6 and i applied the compoundword patch yesterday. The configuration changed a little bit but the result is the same. I get no compound words. I'm using the locale de_DE with encoding ISO8859-1 for the database. I think i spell is working correctly except the compound words. If i try SELECT lexize('de_ispell', 'springt') i get lexize {springen,springen} which seems correct. But a SELECT lexize('de_ispell', 'Autobahn') results in lexize {autobahn} i would expect {auto,bahn, autobahn} Hmm, have you checked 'Autobahn' in ispell dictionary ? Does dictionary you used supports 'Z' flag for compound words ? Autobahn is in the ispell dictionary. What does a ispell dictionary need to support the Z flag??? Try ispell -C Autobahn search 'compound' in 'man ispell' for details. the problem exists only if ispell *does* splits word correctly while tsearch2 doesn't. You should find correct ispell dictionary for german or create it yourself. You may consult monzilla.net http://staff.science.uva.nl/~christof/monzilla/research/project-dr.html Timo The new configuration after the compound word patch: Seems you overestimate my capabilities :) Actions dict_name http://www.rotex-service.com/phppgadmin/display.php?database=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=2sortdir=ascstrings=expandedpage=1 dict_init http://www.rotex-service.com/phppgadmin/display.php?database=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=3sortdir=ascstrings=expandedpage=1 dict_initoption http://www.rotex-service.com/phppgadmin/display.php?database=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=4sortdir=ascstrings=expandedpage=1 dict_lexize http://www.rotex-service.com/phppgadmin/display.php?database=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=5sortdir=ascstrings=expandedpage=1 dict_comment http://www.rotex-service.com/phppgadmin/display.php?database=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=6sortdir=ascstrings=expandedpage=1 Edit http://www.rotex-service.com/phppgadmin/display.php?action=confeditrowstrings=expandedpage=1key%5Bdict_name%5D=simpledatabase=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=sortdir= Delete http://www.rotex-service.com/phppgadmin/display.php?action=confdelrowstrings=expandedpage=1key%5Bdict_name%5D=simpledatabase=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=sortdir= simple dex_init(text) /NULL/ dex_lexize(internal,internal,integer) Simple example of dictionary. Edit http://www.rotex-service.com/phppgadmin/display.php?action=confeditrowstrings=expandedpage=1key%5Bdict_name%5D=en_stemdatabase=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=sortdir= Delete http://www.rotex-service.com/phppgadmin/display.php?action=confdelrowstrings=expandedpage=1key%5Bdict_name%5D=en_stemdatabase=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=sortdir= en_stem snb_en_init(text) /usr/local/pgsql/share/contrib/english.stop snb_lexize(internal,internal,integer) English Stemmer. Snowball. Edit http://www.rotex-service.com/phppgadmin/display.php?action=confeditrowstrings=expandedpage=1key%5Bdict_name%5D=ru_stemdatabase=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=sortdir= Delete http://www.rotex-service.com/phppgadmin/display.php?action=confdelrowstrings=expandedpage=1key%5Bdict_name%5D=ru_stemdatabase=selina_rotexschema
Re: [GENERAL] Tsearch2 and Unicode?
Dawid, unfortunately, tsearch2 doesn't support unicode yet. If you keep tsvector separately from data than you'll need one more join. Oleg On Wed, 17 Nov 2004, Dawid Kuroczko wrote: I'm trying to use tsearch2 with database which is in 'UNICODE' encoding. It works fine for English text, but as I intend to search Polish texts I did: insert into pg_ts_cfg('default_polish', 'default', 'pl_PL.UTF-8'); (and I updated other pg_ts_* tables as written in manual). However, Polish-specific chars are being eaten alive, it seems. I.e. doing select to_tsvector('default_polish', body) from messages; results in list of words but with national chars stripped... I wonder, am I doing something wrong, or just tsearch2 doesn't grok Unicode, despite the locales setting? This also is a good question regarding ispell_dict and its feelings regarding Unicode, but that's another story. Assuming Unicode unsupported means I should perhaps... oh, convert the data to iso8859 prior feeding it to_tsvector()... interesting idea, but so far I have failed to actually do it. Maybe store the data as 'bytea' and add a column with encoding information (assuming I don't want to recreate whole database with new encoding, and that I want to use unicode for some columns (so I don't have to keep encoding with every text everywhere...). And while we are at it, how do you feel -- an extra column with tsvector and its index -- would it be OK to keep it away from my data (so I can safely get rid of them if need be)? [ I intend to keep index of around 2 000 000 records, few KBs of text each ]... Regards, Dawid Kuroczko ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] TSearch2: Problems with compound words and stop words
Timo, I forward your message to openfts mailing list. Also, could you specify if locale settings are correct for your database and what dictionary you have downloaded. Oleg On Fri, 5 Nov 2004, Timo Haberkern wrote: Hi there, i have some troubles with my TSearch2 Installation. I have done this installation as described in http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_compound_words http://www.sai.msu.su/%7Emegera/oddmuse/index.cgi/Tsearch_V2_compound_words I used the german myspell dictionary from http://lingucomponent.openoffice.org/spell_dic.html and converted it with my2ispell Nearly everything is working fine so far, except two problems: 1.) The stopword-file seems to be ignored: If i try it with SELECT to_tsvector(default_german, ein Haus) i get ein:1 haus:2 ein should be a Stopword for german (and is defined the german.stop file as well) 2.) The compound words feature doesnt work too. I have tried a lot of words, i.e. Fehlermeldung with SELECT to_tsvector(default_german, Fehlermeldung) i only get fehlermeldung:1 but i would expect fehler and meldung as seperated entries. Is there anything wrong with the dictonary or my configuration? My current configuration: pg_ts_cfg: default default C default_russian default ru_RU.KOI8-R simple default NULL default_german default de_DE.ISO8859-1 pg_ts_cfgmap: default_german host{simple} default_german hword {simple} default_german int {simple} default_german nlhword {simple} default_german nlpart_hword{simple} default_german nlword {simple} default_german part_hword {simple} default_german sfloat {simple} default_german uint{simple} default_german uri {simple} default_german url {simple} default_german version {simple} default_german word{simple} default_german lpart_hword {de_ispell,german_snowball} default_german lword {de_ispell,german_snowball} default_german lhword {de_ispell,german_snowball} pg_ts_dict: de_ispell | 17166 | DictFile=/usr/local/pgsql/share/contrib/dictonary/german.dict, AffFile=/usr/local/pgsql/share/contrib/dictonary/german.aff, StopFile=/usr/local/pgsql/share/contrib/dictonary/german.stop | 17167 | NULL german_snowball | 17357 | NULL | 17162 | Snowball stemmer for german Can anyone help me? regards Timo ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] TSearch2: Problems with compound words and stop words
Timo, please, check you apply patch for compound word support. What is version of postgresql ? Does ispell dict works for non-compound words ? Oleg On Fri, 5 Nov 2004, Timo Haberkern wrote: Hi there, i have some troubles with my TSearch2 Installation. I have done this installation as described in http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_compound_words http://www.sai.msu.su/%7Emegera/oddmuse/index.cgi/Tsearch_V2_compound_words I used the german myspell dictionary from http://lingucomponent.openoffice.org/spell_dic.html and converted it with my2ispell Nearly everything is working fine so far, except two problems: 1.) The stopword-file seems to be ignored: If i try it with SELECT to_tsvector(default_german, ein Haus) i get ein:1 haus:2 ein should be a Stopword for german (and is defined the german.stop file as well) 2.) The compound words feature doesnt work too. I have tried a lot of words, i.e. Fehlermeldung with SELECT to_tsvector(default_german, Fehlermeldung) i only get fehlermeldung:1 but i would expect fehler and meldung as seperated entries. Is there anything wrong with the dictonary or my configuration? My current configuration: pg_ts_cfg: default default C default_russian default ru_RU.KOI8-R simple default NULL default_german default de_DE.ISO8859-1 pg_ts_cfgmap: default_german host{simple} default_german hword {simple} default_german int {simple} default_german nlhword {simple} default_german nlpart_hword{simple} default_german nlword {simple} default_german part_hword {simple} default_german sfloat {simple} default_german uint{simple} default_german uri {simple} default_german url {simple} default_german version {simple} default_german word{simple} default_german lpart_hword {de_ispell,german_snowball} default_german lword {de_ispell,german_snowball} default_german lhword {de_ispell,german_snowball} pg_ts_dict: de_ispell | 17166 | DictFile=/usr/local/pgsql/share/contrib/dictonary/german.dict, AffFile=/usr/local/pgsql/share/contrib/dictonary/german.aff, StopFile=/usr/local/pgsql/share/contrib/dictonary/german.stop | 17167 | NULL german_snowball | 17357 | NULL | 17162 | Snowball stemmer for german Can anyone help me? regards Timo ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] TSearch2: Problems with compound words and stop words
%3Dpg_ts_dictreturn_desc=Backsortkey=sortdir= ispell_template spell_init(text) /NULL/ spell_lexize(internal,internal,integer) ISpell interface. Must have .dict and .aff files Edit http://www.rotex-service.com/phppgadmin/display.php?action=confeditrowstrings=expandedpage=1key%5Bdict_name%5D=synonymdatabase=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=sortdir= Delete http://www.rotex-service.com/phppgadmin/display.php?action=confdelrowstrings=expandedpage=1key%5Bdict_name%5D=synonymdatabase=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=sortdir= synonym syn_init(text) /NULL/ syn_lexize(internal,internal,integer) Example of synonym dictionary Edit http://www.rotex-service.com/phppgadmin/display.php?action=confeditrowstrings=expandedpage=1key%5Bdict_name%5D=de_ispelldatabase=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=sortdir= Delete http://www.rotex-service.com/phppgadmin/display.php?action=confdelrowstrings=expandedpage=1key%5Bdict_name%5D=de_ispelldatabase=selina_rotexschema=publictable=pg_ts_dictreturn_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dictreturn_desc=Backsortkey=sortdir= de_ispell spell_init(text) DictFile=/usr/local/pgsql/share/contrib/dictonary/german_comb.dict, AffFile=/usr/local/pgsql/share/contrib/dictonary/german_comb.aff, StopFile=/usr/local/pgsql/share/contrib/dictonary/german.stop spell_lexize(internal,internal,integer) /NULL/ Timo Oleg Bartunov wrote: Timo, please, check you apply patch for compound word support. What is version of postgresql ? Does ispell dict works for non-compound words ? Oleg On Fri, 5 Nov 2004, Timo Haberkern wrote: Hi there, i have some troubles with my TSearch2 Installation. I have done this installation as described in http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_compound_words http://www.sai.msu.su/%7Emegera/oddmuse/index.cgi/Tsearch_V2_compound_words I used the german myspell dictionary from http://lingucomponent.openoffice.org/spell_dic.html and converted it with my2ispell Nearly everything is working fine so far, except two problems: 1.) The stopword-file seems to be ignored: If i try it with SELECT to_tsvector(default_german, ein Haus) i get ein:1 haus:2 ein should be a Stopword for german (and is defined the german.stop file as well) 2.) The compound words feature doesnt work too. I have tried a lot of words, i.e. Fehlermeldung with SELECT to_tsvector(default_german, Fehlermeldung) i only get fehlermeldung:1 but i would expect fehler and meldung as seperated entries. Is there anything wrong with the dictonary or my configuration? My current configuration: pg_ts_cfg: defaultdefaultC default_russiandefaultru_RU.KOI8-R simpledefaultNULL default_germandefaultde_DE.ISO8859-1 pg_ts_cfgmap: default_germanhost{simple} default_germanhword{simple} default_germanint{simple} default_germannlhword{simple} default_germannlpart_hword{simple} default_germannlword{simple} default_germanpart_hword{simple} default_germansfloat{simple} default_germanuint{simple} default_germanuri{simple} default_germanurl{simple} default_germanversion{simple} default_germanword{simple} default_germanlpart_hword{de_ispell,german_snowball} default_germanlword{de_ispell,german_snowball} default_germanlhword{de_ispell,german_snowball} pg_ts_dict: de_ispell | 17166| DictFile=/usr/local/pgsql/share/contrib/dictonary/german.dict, AffFile=/usr/local/pgsql/share/contrib/dictonary/german.aff, StopFile=/usr/local/pgsql/share/contrib/dictonary/german.stop| 17167 | NULL german_snowball| 17357 | NULL| 17162 | Snowball stemmer for german Can anyone help me? regards Timo ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED]) Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia
Re: [GENERAL] Tsearch2 trigger firing...
On Sat, 16 Oct 2004, Net Virtual Mailing Lists wrote: Hello, Thank you to Oleg for your help with this earlier! It resolved it very nicely! I still have one remaining issue which I can't figure out, perhaps best explained with an example: CREATE TABLE sometable ( titleTEXT, body TEXT, footer TEXT, all_fti TSVECTOR ); UPDATE sometable SET all_fti = setweight(to_tsvector(title), 'A') || setweight(to_tsvector(body),'B') || setweight(to_tsvector(footer), 'C'); INSERT INTO sometable (title, body, footer) VALUES ('something in the title', 'something in the body', 'something in the footer'); INSERT INTO sometable (title, body, footer) VALUES ('anything in the title', 'anything in the body', 'anything in the footer'); INSERT INTO sometable (title, body, footer) VALUES ('whatever in the title', 'whatever in the body', 'whatever in the footer'); .. What I can't figure out is how to make those last 3 inserts automatically update all_fti via a trigger... The documentation would show something like: CREATE TRIGGER sometable_update_fti BEFORE UPDATE OR INSERT ON sometable FOR EACH ROW EXECUTE PROCEDURE tsearch2(all_fti, title, body, footer); .. but as expected, this does not take into account the result of the setweight function Unfortunately, I am not very skilled with Postgres's triggers (yet) and I can't find any documentation on how to go about this... Any thoughts?. You're right, tsearch2 trigger doesn't supports tsvector as argument yet. The problem is in rather complex argument recognition. I don't know any workaround, so take a look on sources or wait until we'll have spare time to implement it. - Greg ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Tsearch2 trigger firing...
On Fri, 15 Oct 2004, Net Virtual Mailing Lists wrote: Hello, I have a table that uses tsearch2 and, of course, and index and trigger to keep everything updated. Something like: CREATE TABLE sometable ( id SERIAL, someinteger INTEGER sometext TEXT, sometext2TEXT, sometext3TEXT, sometext_fti TSVECTOR ); There are two issues: #1. I need a way to be able to do a full-text search any combination of sometext, sometext2, and sometext3... So far the only way I've been able .. is there some way to get the best of both worlds here or am I trying to jam a square peg into a round hole?... Hmm, probably :) There is a hint on http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_Notes under section Restricted search. #2. When doing an update on the above table, such as: UPDATE sometable SET someinteger=0, it ends up firing off the triggers which of course takes a long time to update... Is there someway to make the trigger only fire if a the field it is tied to is updated?This is not a no idea, sorry Thanks as always! - Greg ---(end of broadcast)--- TIP 8: explain analyze is your friend Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] vacuum full for all databases
On Thu, 2 Sep 2004, Richard Huxton wrote: Ilia Chipitsine wrote: Dear Sirs I'm about to write plpgsql function which will vacuum full all existing databases. Below is an example how to get list of databases. What should I write instead of raise notice ? raise notice ''datname = %'',list.datname; Something like: EXECUTE ''VACUUM FULL '' || list.datname; vacuumdb --all Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] Finally tsearch works ... somehow... remain a few
Marcel, On Thu, 19 Aug 2004, Marcel Boscher wrote: For now i am almost statisfied with my tsearch2 installation war over night somehow it seems to work, finally... what problems with installation ? Any additions,corrections to docs ? 3 probably easy questions remain... 1.) Is it possible to index already filled tables? sure. It's written many times. Check to_tsvector() function 2.)Can i have seperated indexes for different columns in a table why not, you may have separate indices or one index with different weights assigned to each column (up tp 4 columns) 3.) Can i create an extra Table just for my indexes that is not included in my data table? I don't understand this. Example, please. Seems, you need to reread documentation. http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/tsearch-V2-intro.html http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_Notes Greetings Marcel ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] Finally tsearch works ... somehow... remain a few
On Thu, 19 Aug 2004, Gaetano Mendola wrote: Oleg Bartunov wrote: Marcel, On Thu, 19 Aug 2004, Marcel Boscher wrote: For now i am almost statisfied with my tsearch2 installation war over night somehow it seems to work, finally... What does have tsearch2 that htdig doesn't have ( for index document I mean ) ? Most important - index consistency and native access to document metadata. Regards Gaetano Mendola ---(end of broadcast)--- TIP 8: explain analyze is your friend Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] Installing FullTextSearchTool tsearch2
Marcel, it's very difficult from you message where do you lost. pgsql version, OS version, cut'n paste of commands you run and output would be fine. To install tsearch2 most people need (as postgresql superuser): 1. install postgresql and headers 2. cd contrib/tsearch2 3. make; make install; make installcheck Note, tsearch2 is just a plain contrib module and installation is the same as for other modules. Oleg On Wed, 18 Aug 2004, Marcel Boscher wrote: Hello everybody, i tried to J.U.S.T install the FullTextSearchTool tsearch2 under the guidiance of : http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/ http://www.sai.msu.su/%7Emegera/postgres/gist/tsearch/V2/ and i'm almost losing my mind... runningagainstwallheadsfirst after i installed postgre which works fine i change to the folder /contrib/tsearch2/ there i guessed to use MAKE as it is not mentioned in the manual seemed to work... from then everything i tried to execute as guided on their website i get loads and pagelong error messages varying in many different error messages most are current transaction is aborted til end stuff over to syntax error at or near default at character 1 or could not access file$libdir/tsearch2: no such file or directory does anybody have a short instruction manual with only the syntax on how to install the fulltextcrap without 1 words around telling lies of how wonderful and easy this god made tool is? ... Gods are here to see: http://www.sai.msu.su/~megera/postgres/gist/oleg-teodor-1.jpg http://www.sai.msu.su/%7Emegera/postgres/gist/tsearch/V2/ Just a simple 10 line instruction on what to do after installing postgre the fashioned old way and now wanting to add tsearch2 Any help would be appreciated Thx in advance ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [OpenFTS-general] AW: [GENERAL] tsearch2, ispell, utf-8 and
On Wed, 21 Jul 2004, Markus Wollny wrote: Hi! -Urspr?ngliche Nachricht- Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im Auftrag von Markus Wollny Gesendet: Mittwoch, 21. Juli 2004 17:04 An: Oleg Bartunov Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED] Betreff: [OpenFTS-general] AW: [GENERAL] tsearch2, ispell, utf-8 and german special characters The issue with the unrecognized stop-word 'ein' which is converted by to_tsvector to 'eint' remains however. Now here's as much detail as I can provide: Ispell is Version 3.1.20 10/10/95, patch 1. I've just upgraded Ispell to the latest version (International Ispell Version 3.2.06 08/01/01), but that didn't help; by now I think it might be something to do with a german language peculiarity or with something in the german dictionary. In german.med, there is an entry ispell itself don't used in tsearch2, only dict,aff files ! eint/EGPVWX So the ts_vector output is just a bit like a wrong guess. Doesn't it evaluate the stopword-list first before doing the lookup in the Ispell-dictionary? yes. There is very usefull function for debugging I always recommend to use - ts_debug. See my notes (http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_Notes) for examples. Kind regards Markus Wollny --- This SF.Net email is sponsored by BEA Weblogic Workshop FREE Java Enterprise J2EE developer tools! Get your free copy of BEA WebLogic Workshop 8.1 today. http://ads.osdn.com/?ad_idG21alloc_id040op?k ___ OpenFTS-general mailing list [EMAIL PROTECTED] https://lists.sourceforge.net/lists/listinfo/openfts-general Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[GENERAL] new version of contrib/pg_trgm is available !
Hello, new version of contrib/pg_trgm (former trgm) is available for download from http://www.sai.msu.su/~megera/postgres/gist/pg_trgm One bug was fixed, upgrade is simple, no index rebuilding is required. contrib/pg_trgm module provides fuzzy search with GiST index support based on trigram statistics. Documentation kindly provided by Christopher Kings-Lynne is available from http://www.sai.msu.su/~megera/postgres/gist/pg_trgm/README.pg_trgm Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[GENERAL] new version of tsearch2 introduction is available
Hello, Andrew Kopciuch has submit new version of his Introduction to tsearch2, which is available from http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/tsearch-V2-intro.html He added explanation of how to backup and restore databases that features tsearch2. Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [GENERAL] Cache lookup failure for pg_restore?
Hi, Below is a email from Andrew Kopciuch who tested the patch and wrote some instruction. sql file is available from http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/regprocedure_update.sql Andrew, could you help Andre with his problem ? Oleg -- Date: Mon, 10 May 2004 16:24:36 -0600 From: Andrew J. Kopciuch [EMAIL PROTECTED] To: Oleg Bartunov [EMAIL PROTECTED] Subject: Re: patch for tsearch2 is available, please test Oleg: Andrew, could you, please, test patch and write instruction ? I've done some testing with this patch : I tried it myself and it works well, backup/restore works fine. I think, the main problem will be upgrade of tsearch2 and restore data. I dump sql and data separately, then install new tsearch2 with patch, created db, load tsearch2.sql, edit tsearch2 configuration and load data. After that, dump/reload should works fine. I have created an SQL file (attached) that I used to simply alter the current table definitions, and update the data prior to dumping. This way the upgrade is instant ... and from this point on ... simply dumping the database and restoring can be done in typical fashion. pg_dump dbname dbname.sql createdb dbname psql dbname dbname.sql Could you look through the file jsut to double check. It may be useful to other people for upgrading. I will go through the Introduction and make some modifications when I can later tonight ... or tomorrow. I will add sections regarding the patch, and leave the current documentation (I don't know why anyone _wouldn't_ apply the patch ... but nothing would surprise me). I will also add instructions about the dump / restore now. It's just like any other DB dump and restore procedure. Andy On Wed, 12 May 2004, Gellert, Andre wrote: Hello, Recently, we discover how to avoid problem with OIDs backup/restore in tsearch2. Check http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/ for a little patch (regprocedure_7.4.patch.gz). It won't work on existed tsearch2 installation, though, but will help in future. I do have the same problem, after reimporting with the correct command sequence, with no errors (I edit the schema to get rid of duplicate functions) , ( AND: DELETE from pg_ts_dict; DELETE from pg_ts_parser ; DELETE from pg_ts_cfg; DELETE from pg_ts_cfgmap ; to avoid duplicate keys - that point is missing in the howto ?) I can connect to the DB , but cannot use the tsquery function e.g. . Example: www2=# select set_curcfg('default'); ERROR: cache lookup failed for function 138031386 www2=# select to_tsquery('cdrohling') ; ERROR: could not find tsearch config by locale www2=# select set_curcfg('default'); set_curcfg (1 row) www2=# select to_tsquery('cdrohling') ; server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. ! --- I think this is the same problem. The patch is for 7.4.[0|1|2] , not 7.4, i guess. Is there a chance to rebuild a backup from a DB , which runs tsearch2 with the unpatched tsearch2-schema ? My idea: Isn't this a thing to be mentioned in the docs ? Restoring a DB could be a point :-) Andre ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] Cache lookup failure for pg_restore?
Recently, we discover how to avoid problem with OIDs backup/restore in tsearch2. Check http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/ for a little patch (regprocedure_7.4.patch.gz). It won't work on existed tsearch2 installation, though, but will help in future. Oleg On Sun, 9 May 2004, Denis Braekhus wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Razvan Surdulescu wrote: | After I restore a Postgres database (using pg_restore), I get the | following error message when I try to run a simple UPDATE query: | | ERROR: cache lookup failed for function 70529 Hi Razvan, Just to add to what Tom has already said, this is most certainly because of your tsearch/gist usage. Check the Tsearch2 site [1] for a lot of interesting documentation on Tsearch2. For a quite nice howto on backups and restores of databases with Tsearch2 see the Tsearch2 Intro document [2] There is actually now a patch [3] to tsearch (only for 7.4 though) which is supposed to improve dumping and reloading of tsearch2 databases. [1] http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/ [2] http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/tsearch-V2-intro.html [3] http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/regprocedure_7.4.patch.gz Best Regards - -- Denis -BEGIN PGP SIGNATURE- Version: GnuPG v1.2.2-nr2 (Windows XP) iD8DBQFAniUSvsCA6eRGOOARAtJpAKCt4Wcrea3bIxu8fXw/5ZNFACdohwCfZPDf UuCk1dXLx8SCS4/qMniC2z4= =871m -END PGP SIGNATURE- ---(end of broadcast)--- TIP 8: explain analyze is your friend Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [GENERAL] Postgre and Web Request
On Thu, 29 Apr 2004, Tatsuo Ishii wrote: Depending on your web development environment (java, php, .NET) etc, you should be able to use some mechanism that will provide a pool of connections to the database. Each request does not open a new connection (and then release it), but insteads gets a connection from the pool to use, and returns it back to the pool when done. Where can I find some examples for connection pooling with php? Or must I just use persistence connections? Use pgpool (ftp://ftp.sra.co.jp/pub/cmd/postgres/pgpool/pgpool-1.1.tar.gz). Tatsuo, I just tried pgpool (not replication yet) and noticed warnings appear in pgsql.log: (I just changed port number in my perl script to ) Apr 29 19:19:59 mira postgres[363]: [4-1] WARNING: there is no transaction in progress Oleg -- Tatsuo Ishii ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] making tsearch2 dictionaries
On Tue, 17 Feb 2004, Ben wrote: On Tue, 2004-02-17 at 03:15, Oleg Bartunov wrote: Do you want '100' or 'hundred' will be fully equivalent ? So, if you search '100' you will find document with 'hundred'. Interesting, that you will find '123', because '123' will be 'one hundred twenty three'. Yeah, for a general case of documents I'm not sure how accurate it would make things, but I'm trying to index music artist names and song titles, where I'd get things like 3 Dog Night or is that Three Dog Night? :) What's the problem ? You may configure which dictionaries and in what order should be used for given type of token (pg_ts_cfgmap table). Aha, I got your problem: Once word is recognized by synonym dictionary it will not pass to next dictionary ! This is how tsearch2 is working with any dictionary. Yep, that's my problem. :) And it seems that if I could pass the normal words into an ispell dictionary before passing them on to the en_stem dictionary, I'd get spell checking for free. Unless there's a better way to give did you mean: your search spelled correctly? results? If ispell dictionary recognizes a word, that word will not pass to en_stem. We know how to add query spelling feature to tsearch2, just waiting for sponsorships :) meanwhile, you could use our trgm module, which implements trigram based spelling correction. You need to maintain separate table with all words of interests (say, from tsvectors) and search query words in that table using bestmatch finction. I know doing this would increase the size of the generated ts_vector, but for my case, where what I'm indexing is generally only a few words anyway, that's not an issue. As it is, I'm already going to get rid of the stop words file, so that I can actually find things like The Who. How hard do you think it would be to change up the behavior to make this happen? I What do you want from parser ? I want to be able to recognize symbols, such as the degree () and vulgar half () symbols. You mean '(TA)', '(TH)' ? I think it's not very difficult. What'd be a token type ( parenthesis_word :?) Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] making tsearch2 dictionaries
On Tue, 17 Feb 2004, Ben wrote: On Tue, 17 Feb 2004, Oleg Bartunov wrote: If ispell dictionary recognizes a word, that word will not pass to en_stem. We know how to add query spelling feature to tsearch2, just waiting for sponsorships :) meanwhile, you could use our trgm module, which implements trigram based spelling correction. You need to maintain separate table with all words of interests (say, from tsvectors) and search query words in that table using bestmatch finction. Hm, I'll take a look at this approach. I take it you think piping dictionary output to more dictionaries in the chain is a bad idea? :) it's unpredictable and I still don't get your idea of pipilining, but in general, I have nothing agains it. What do you want from parser ? I want to be able to recognize symbols, such as the degree () and vulgar half () symbols. You mean '(TA)', '(TH)' ? I think it's not very difficult. What'd be a token type ( parenthesis_word :?) uh, not sure how you got (TA) and (TH)... if you look at the original message with utf-8 unicode encoding, the sympols come out fine. Or, maybe you'd just have better luck pointing a browser at a page like Yup:) http://homepages.comnet.co.nz/~r-mahoney/bca_text/utf8.html. I want to be able to recognize a subset of these symbols, and I'd want another dictionary I'd make to handle the symbol token to return both the symbol and the common name as lexemes, in case people spell out the symbol instead of entering it. Aha, the same way as we handle complex words with hyphen - we return the whole word and its parts. So you need to introduce new type of token in parser and use synonym dictionary which in one's turn will returns the symbol token and human readable word. Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] PostgreSQL License
PostgreSQL has BSD license. Tha means do what do you want :) Oleg On Tue, 10 Feb 2004, Artemy wrote: What about the PostgreSQL License. If I use the database for the commercial purpose, Do I have to purchase the license?. P.S. For example MySQL Databse have the commercial license for that. Best Regards Artjom Smekalin ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Search across multiple sources
Why not use schema and single search table contains indices from different schemes (use trigger to update search table) On Sun, 1 Feb 2004, Merrall, Graeme wrote: I don't think there's an easy way to do this but I thought I better ask just in case. I'm trying to come up with a way to search across a number of databases without resorting to lots of horrible scripts. In one database I have a lot of news stories from our news provider while in another database I have a lot of user entered content. Ideally I'd like to search across both databases via a single web-based search form. The obvious way is to create a tsearch index/table in both databases and then to connect to each one in turn and to merge the results together but that doesn't seem like the best solution and potentially there could be issues with ranking and so on. Is it possible to create a single search database which can store search data from a number of different databases. This is all on the same server of course. Another option I looked at was to use an external search tool lke mnogosearch (http://www.mnogosearch.ru/doc/msearch-extended-indexing.html#htdb) although since I've used tsearch elsewhere it would be nice to use it here as well. Cheers, Graeme ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [GENERAL] New PostgreSQL search resource
On Fri, 16 Jan 2004, Joshua D. Drake wrote: Hello, Took an hour today and made the 7.3.4, 7.4.1 and Practical PostgreSQL documentation all searchable using OpenFTS and Tsearch2. You can take a look at: http://www.commandprompt.com/community/ I'd appreciate if you mention somewhere OpenFTS utilization, so people could recognize it. But I'm not insisting ;) Sincerely, Joshua Drake Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Hierarchical queries
Look at contrib/ltree http://www.sai.msu.su/~megera/postgres/gist/ltree Oleg On Fri, 9 Jan 2004 [EMAIL PROTECTED] wrote: Hello everybody! Does someone know how to build hierarchical queries to the postgresql? I have a table with tree in it (id, parent) and need to find a way from any point of the tree to any other point. And i would like to have a list of all steps from point A to point B to make some changes on each step (this is required by the algorythm). Here is an example: treetable (where tree is stored): idparent data int4 int4 varchar(255) 0 0root 1 0root's chield 1 2 0root's chield 2 3 1root's chield 1 chield 1 4 1root's chield 1 chield 2 5 2root's chield 2 chield 1 6 2root's chield 2 chield 2 And i want to get something like this: start point root's chield 2 chield 2 finish root's chield 1 chield 1 And the result i need: idparent data 6 2root's chield 2 chield 2 2 0root's chield 2 0 0root 1 0root's chield 1 4 1root's chield 1 chield 2 i know that it is possible in Oracle but what about postgres? Best regards, Anton Nikiforov ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] [GENERAL] Announce: Search PostgreSQL related resources
On Tue, 6 Jan 2004, Rajesh Kumar Mallah wrote: Hi, Could you please tell how the did you mean feature was implemented when the serach term has a typo. it's based on trigrams similarity and words statistics. The search engine is good . Regds mallah. Oleg Bartunov wrote: Hi there, I'm pleased to present pilot version of http://www.pgsql.ru - search system on postgresql related resources. Currently, we have crawled 27 sites, new resources are welcome. It has multi-languages interface (russian, english) but more languages could be added. We plan to add searchable archive of mailing lists (a'la fts.postgresql.org), russian documentation and WIKI for online documentation, tips, etc. We are welcome your feedback and comments. We need design solution, icons. This project is hosted at Sternberg Astronomical Institute, Moscow University and supported by Russian Foundation for Basic Research and Delta-Soft LLC. Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] website doc search is extremely SLOW
Try www.pgsql.ru. I just released pilot version with full text searching postgresql related resources. Search for security invoker takes 0.03 sec :) Oleg On Thu, 1 Jan 2004, ezra epstein wrote: Yup, So slow in fact that I never use it. I did once or twice and gave up. It is ironic! I only come to the online docs when I already know the where part of my search and just go to that part or section. For everything else, there's google! SECURITY INVOKER site:postgresql.org Searched pages from postgresql.org for SECURITY INVOKER. Results 1 - 10 of about 141. Search took 0.23 seconds. Ahhh, that's better. Or use site:www.postgresql.org to avoid the archive listings, etc. == Ezra Epstein D. Dante Lorenso [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Trying to use the 'search' in the docs section of PostgreSQL.org is extremely SLOW. Considering this is a website for a database and databases are supposed to be good for indexing content, I'd expect a much faster performance. I submitted my search over two minutes ago. I just finished this email to the list. The results have still not come back. I only searched for: SECURITY INVOKER Perhaps this should be worked on? Dante ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] [HACKERS] Announce: Search PostgreSQL related resources
On Mon, 5 Jan 2004, Marek Lewczuk wrote: Dave Cramer wrote: connection failed :( works for me... :-) (poland) We have small downtime because of upgrading server software, so this may be a reason for the problem. We're in stage of optimizing crawler because some sites are very-very ugly, for example, our crawler have discovered 2 millions URLs on http://ems-hitech.com/pgmanager/ ! 99.99 % of URLs are just 404 (document not found), but server does return 200 code )\:) So we have to explicitly exclude these pages. btw, archives.postgresql.org doesn't returns modification date in header. This prevent crawler to optimize downloading process. So, there are many problems, but we hope soon we'll tune crawling process. I estimate average time to refresh index about 1 week. Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[GENERAL] Announce: Search PostgreSQL related resources
Hi there, I'm pleased to present pilot version of http://www.pgsql.ru - search system on postgresql related resources. Currently, we have crawled 27 sites, new resources are welcome. It has multi-languages interface (russian, english) but more languages could be added. We plan to add searchable archive of mailing lists (a'la fts.postgresql.org), russian documentation and WIKI for online documentation, tips, etc. We are welcome your feedback and comments. We need design solution, icons. This project is hosted at Sternberg Astronomical Institute, Moscow University and supported by Russian Foundation for Basic Research and Delta-Soft LLC. Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: Mnogosearch (Was: Re: [GENERAL] website doc search is ... )
On Thu, 1 Jan 2004, Marc G. Fournier wrote: On Thu, 1 Jan 2004, Bruce Momjian wrote: Marc G. Fournier wrote: 186_archives=# \d ndict7 Table public.ndict7 Column | Type | Modifiers -+-+ url_id | integer | not null default 0 word_id | integer | not null default 0 intag | integer | not null default 0 Indexes: n7_url btree (url_id) n7_word btree (word_id) The slowdown is the LIKE condition, as the ndict[78] word_id conditions return near instantly when run individually, and when I run the 'url/LIKE' condition, it takes forever ... Does it help to CLUSTER url.url? Is your data being loaded in so identical values used by LIKE are next to each other? Just tried CLUSTER, and no difference, but ... chat'd with Dave on ICQ this evening, and was thinking of something ... and it comes back to something that I mentioned awhile back ... Taking the ndict8 query that I originally presented, now post CLUSTER, and an explain analyze looks like: QUERY PLAN --- Hash Join (cost=13918.23..26550.58 rows=17 width=8) (actual time=4053.403..83481.769 rows=13415 loops=1) Hash Cond: (outer.url_id = inner.rec_id) - Index Scan using n8_word on ndict8 (cost=0.00..12616.09 rows=3219 width=8) (actual time=113.645..79163.431 rows=15533 loops=1) Index Cond: (word_id = 417851441) - Hash (cost=13913.31..13913.31 rows=1968 width=4) (actual time=3920.597..3920.597 rows=0 loops=1) - Seq Scan on url (cost=0.00..13913.31 rows=1968 width=4) (actual time=3.837..2377.853 rows=304811 loops=1) Filter: ((url || ''::text) ~~ 'http://archives.postgresql.org/%%'::text) Total runtime: 83578.572 ms (8 rows) Now, if I knock off the LIKE, so that I'm returning all rows from ndict8, join'd to all the URLs that contain them, you get: QUERY PLAN --- Nested Loop (cost=0.00..30183.13 rows=3219 width=8) (actual time=0.299..1217.116 rows=15533 loops=1) - Index Scan using n8_word on ndict8 (cost=0.00..12616.09 rows=3219 width=8) (actual time=0.144..458.891 rows=15533 loops=1) Index Cond: (word_id = 417851441) - Index Scan using url_rec_id on url (cost=0.00..5.44 rows=1 width=4) (actual time=0.024..0.029 rows=1 loops=15533) Index Cond: (url.rec_id = outer.url_id) Total runtime: 1286.647 ms (6 rows) So, there are 15333 URLs that contain that word ... now, what I want to find out is how many of those 15333 URLs contain 'http://archives.postgresql.org/%%', which is 13415 ... what's the need for such query ? Are you trying to restrict search to archives ? Why not just have site attribute for document and use simple join ? The problem is that right now, we look at the LIKE first, giving us ~300k rows, and then search through those for those who have the word matching ... is there some way of reducing the priority of the LIKE part of the query, as far as the planner is concerned, so that it will resolve the = first, and then work the LIKE on the resultant set, instead of the other way around? So that the query is only checking 15k records for the 13k that match, instead of searching through 300k? I'm guessing that the reason that the LIKE is taking precidence(sp?) is because the URL table has less rows in it then ndict8? Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 8: explain analyze is your friend Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] tsearch2 column update produces word too long error
On Fri, 21 Nov 2003, Markus Wollny wrote: Hello! I'm currently testing deployment of tsearch2 on our forum table. The table is huge in itself - some 2GB of data without the indexes. I have got PostgreSQL 7.4RC2 running on a test machine, installed tsearch2 to my database, added the new column to the table and tried to update it in the recommended fashion: UPDATE ct_com_board_message SET ftindex=to_tsvector('default',coalesce(user_login,'') ||' '|| coalesce(title,'') ||' '|| coalesce(text,'')); It does run for a while but at some point I get ERROR: word is too long; I guess that this is caused by some idiot user(s) writing some Joycean nonsense (but most probably without the literary value) or drivelling about their holidays in Llanfairpwllgwyngyllgogerychwyrndrobwantysiliogogogoch (Wales). Now what could I do in order to intercept this error? Word length is limited by 2K. What's exactly the word tsearch2 complained on ? 'Llanfairpwllgwyngyllgogerychwyrndrobwantysiliogogogoch' is fine :) btw, don't forget to configure properly dictionaries, so you don't have a lot of unique words. Kind regards Markus ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] tsearch2 column update produces word too long
On Fri, 21 Nov 2003, Markus Wollny wrote: Hello! Von: Oleg Bartunov [mailto:[EMAIL PROTECTED] Gesendet: Freitag, 21. November 2003 13:06 An: Markus Wollny Cc: [EMAIL PROTECTED] Word length is limited by 2K. What's exactly the word tsearch2 complained on ? 'Llanfairpwllgwyngyllgogerychwyrndrobwantysiliogogogoch' is fine :) This was a silly example, I know - it is a long word, but not too long to worry a machine. The offending word will surely be much longer, but as a matter of fact, I cannot think of any user actually typing a 2k+ string without any spaces in between. I'm not sure on which word tsearch2 complained, it doesn't tell and even logging did not provide me with any more detail: 2003-11-21 14:06:44 [26497] ERROR: 42601: word is too long LOCATION: parsetext_v2, ts_cfg.c:294 STATEMENT: UPDATE ct_com_board_message SET ftindex=to_tsvector('default',coalesce(user_login,'') ||' '|| coalesce(title,'') ||' '|| coalesce(text,'')); Is there some way to find the exact position? I'm afraid you need to hack ts_cfg.c:294 yourself to print the word which's bugging you :) btw, don't forget to configure properly dictionaries, so you don't have a lot of unique words. I won't forget that; I justed wanted to run a quick-off first test before diving deeper into Ispell and other issues which are as yet a bit of a mystery to me. Kind Regards Markus Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] tsearch2 and gist index bloat
On Thu, 6 Nov 2003, George Essig wrote: Thanks for the reply. For this project, I can update the data and reindex during off-peak hours. I was just surprised to see the size of the index double after heavy write activity. This is not tsearch specific problem. It was discussed several times, ] see index bloat subject in archives. Oleg George Essig --- Joshua D. Drake [EMAIL PROTECTED] wrote: Hello, I don't know if you can do this with a gist index but try using the REINDEX command. J George Essig wrote: --- George Essig [EMAIL PROTECTED] wrote: I have installed tsearch2 and have noticed that the gist index used to do searches grows and grows as I update rows, delete rows, or run VACUUM FULL ANALYZE. Below are some details: There are 110,873 rows in this table and 13398 unique words indexed by ts_in. Using oid2name, I monitored the size of the index ts_in as I performed different operations: 154 MB After the index was created. 190 MB After updating 40,422 rows. 243 MB After VACUUM FULL 275 MB After deleting 40,422 rows again VACUUM FULL Sorry, I mis-reported the index sizes. They are about 1/10 the size: 15 MB After the index was created. 19 MB After updating 40,422 rows. 24 MB After VACUUM FULL 27 MB After deleting 40,422 rows again VACUUM FULL I still have a problem that the index size grows and grows and eventually searches slow to a crawl. George Essig ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 8: explain analyze is your friend Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] tsearch2 and aspell
On Mon, 20 Oct 2003, Pavel Stehule wrote: Hello Can I use tsearch2 with aspell? I didn't find any info about it, and I don't know anything about difference between ispell and aspell. there is one difference between ispell and aspell dictionaries - ispell has affix file which contains rules for word-formation, so we could use it for morphology (well, sort of). AFAIK, aspell has no affix compression. In tsearch2 we're trying to solve rather opposite task than spell checkers. Thank You Pavel Stehule ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED]) Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Tsearch2 Causing Backend Crash
Hmm, it's weird. Could you provide us with backtrace ? Oleg On Tue, 30 Sep 2003 [EMAIL PROTECTED] wrote: After applying the patches supplied so far and also trying the lastest stable tar.gz for tsearch2 ( downloaded 24th of september) I am still experiencing the same issue as previously described: I try to do a SELECT to_tsvector( 'default', 'some text' ) The backend crashes. SELECT to_tsvector( 'default', string of whitespace ) does not crash Any more advice or suggestions? Thanks! Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] PostgreSQL versus MySQL
On Thu, 18 Sep 2003, Joshua D. Drake wrote: Hello, I think the below just about says it all: http://www.commandprompt.com/images/mammoth_versus_dolphin_500.jpg Cool ! Sincerely, Joshua Drake Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [GENERAL] PostgreSQL versus MySQL
On Thu, 18 Sep 2003, Andrew L. Gould wrote: On Thursday 18 September 2003 04:45 pm, Scott Holmes wrote: Andrew L. Gould wrote: On Thursday 18 September 2003 04:04 pm, Sean Chittenden wrote: I think the below just about says it all: http://www.commandprompt.com/images/mammoth_versus_dolphin_500.jpg Not exactly the kind of image I'd like to project, especially since I care about dolphins (at least non-iconified dolphins) We're among friends; and, quite frankly, I needed a good laugh today. I don't think any of us plan run the image up a flag pole. Exactly my impression. Good laugh and gigh spirits for this day. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Trying to create a GiST index in 7.3
Hi there, I'm back from vacation and clearing my mbox. I intended to write documentation about GiST, but other things grab attention :) There is quite short intro in Russian http://www.sai.msu.su/~megera/postgres/gist/doc/gist-inteface-r.shtml and a bunch of GiST modules http://www.sai.msu.su/~megera/postgres/gist/ so you may learn by examples. Oleg On Wed, 17 Sep 2003, Christopher Murtagh wrote: On Fri, 2003-08-08 at 16:08, Tom Lane wrote: Dmitry Tkach [EMAIL PROTECTED] writes: I am trying to create a custom GiST index in 7.3, but getting an error, ... I have done all the setup that was required in 7.2.4: You should not be using the 7.2 methods anymore --- there is a CREATE OPERATOR CLASS, use that instead. (See the contrib gist classes for examples.) I'm having the same problem as Dmitry, but I've been unable to find a solution. I've looked everywhere googleable for info on setting up GiST indexes, but haven't found any info that doesn't look like post-doc papers on the theory of indexability. I'd be happy with an RTFM response, if I could just find TFM. :-) Any info would be much appreciated. Cheers, Chris Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] tsearch2 in 7.4beta1 compile problem
On Thu, 14 Aug 2003, Tom Lane wrote: Oleg Bartunov [EMAIL PROTECTED] writes: On Wed, 13 Aug 2003, Jeff Davis wrote: I cd to the tsearch2 directory and typed make, however I get an error that yy_current_buffer is an undeclared identifier in wordparser/parser.c (which is apparently autogenerated with flex from parser.l). This is a FAQ. Don't use flex 2.5.31 Downgrade to stable 2.5.4. Still, it would be better if it worked than not. (All the core code does seem to work with flex 2.5.31 now; only contrib is behind.) ok. I recall discussion several months ago about 2.5.31 version. So, we oficially support it ? AFAICT tsearch2's incompatibility is in the redefined YY_INPUT macro, which seems of no value for Postgres anyway. Can't we take that out? We'll see. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] tsearch2 in 7.4beta1 compile problem
On Fri, 15 Aug 2003, Tom Lane wrote: Oleg Bartunov [EMAIL PROTECTED] writes: On Thu, 14 Aug 2003, Tom Lane wrote: Still, it would be better if it worked than not. (All the core code does seem to work with flex 2.5.31 now; only contrib is behind.) ok. I recall discussion several months ago about 2.5.31 version. So, we oficially support it ? I wouldn't say that, exactly --- if anyone has any problems with 2.5.31 I'll be the first to say use 2.5.4. (2.5.31 doesn't even compile on my primary machine.) But I assume the flex guys will fix their little problems soon, and that before PG 7.4 reaches end of life the newer flex behavior will be standard. So I think it behooves us to update our code to be compatible. The core code all works with either 2.5.4 or 2.5.31 now, and I'd like to see contrib doing the same. (cube and seg are broken, but I'll work on fixing those if you'll take care of tsearch and tsearch2.) ok, I see your arguments. Teodor is working on that issue. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] Tsearch limitations
On Mon, 11 Aug 2003 [EMAIL PROTECTED] wrote: Oleg, I understand (i think) how the parser breaks up the input into words and builds ts_vector's. And i understand how to do queries as described into the documentation. (I have read it!) SELECT * FROM vectors WHERE vector @@ to_tsquery('(leads|forks) ! crawl') But i haven't seen any mention of if i add the word: cathedral if there is any query which will match if I search for thed. No, tsearch2 is a word oriented search. It doesn't supports substring search. The documentation seems to say that this cannot be done - but i'd just like to check. Tsearch2 does everything i want except this. remember that the search operator @@ finds only exact matches between query lexemes and vector lexemes if they are not exactly the same string, they will not be considered a match Mat, there are several function you may use to see (please, read documentation): apod=# select to_tsvector('Hi my email addres is [EMAIL PROTECTED]' ); to_tsvector 'hi':1 'addr':4 'email':3 '[EMAIL PROTECTED]':6 (1 row) or, even better apod=# select * from ts_debug('Hi my email addres is [EMAIL PROTECTED] com'); ts_name | tok_type | description |token | dict_name |tsvector -+--+-+--+ ---+ default_russian | lword| Latin word | Hi | { en_stem} | 'hi' default_russian | lword| Latin word | my | { en_stem} | default_russian | lword| Latin word | email| { en_stem} | 'email' default_russian | lword| Latin word | addres | { en_stem} | 'addr' default_russian | lword| Latin word | is | { en_stem} | default_russian | email| Email | [EMAIL PROTECTED] | { simple} | '[EMAIL PROTECTED]' (6 rows) You may write your own parser or preprocess text before tsearch. Oleg On Mon, 11 Aug 2003, Mat wrote: Can Tsearch be used to return substring matches? i.e Text to search: Hi my email addres is [EMAIL PROTECTED] Query psql would match the email address? Query mail would also match? Query reeu would also match? Or is tsearch not suitable for this type of query? should i use FTI instead? Thanks. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] german tsearch in 7.2.4
Latest version of tsearch which works with 7.2.4 is in contrib directory. German stemmer and stop words could be downloaded from http://snowball.tartarus.org/german/stemmer.html I'm afraid this version of tsearch requires a lot of work :) New version is available from www.sai.msu.su/~megera/postgres/gist/tsearch and it works with 7.3.X and above Oleg On Tue, 22 Jul 2003, Thomas Beutin wrote: Hello, what is the latest version of tsearch working in postgresql 7.2.4 (unfortunately i cannot upgrade at this time) and how can i add support for the german language to this version? Are german files (stopwords etc.) ready for download out there? Can i use german and english support of this version of tsearch at the same time in the same database? Any help welcome! Thanks in advance, -tb Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[GENERAL] Re: [ANNOUNCE] Request for speakers at O'Reilly conference
This is a good opportunity to promote PostgreSQL and I think we should utilize it. I'm not sure I'll have a chance to take part in the conference, but I'd like somebody describe GiST extension and presents tutorial how to use it. We could prepare some materials and examples from real life. I think Gene Selkov might be a good person Regards, Oleg On Sun, 4 Feb 2001, Bruce Momjian wrote: I have been asked to help prepare a list of PostgreSQL speakers for a future O'Reilly Open Source conference. If you are interested in being a presenter, please see the following URL: http://candle.pha.pa.us/oreilly/ Submissions are due Febuary 17th. Sorry for the short notice. Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
Re: [HACKERS] Re: [GENERAL] How do I activate and change the postgres user's password?
Hi, followin this thread, I think It would be useful to allow user to connect to database he owned (created) without password even if pg_hba.conf is configured with password requirement to this database. Or owner of database could maintain list of users/groups whom he granted trusted connection. After user connects usual grant priviliges could works. Currently it's a pain to work with authentification system - I have to input my password every time I use psql and moreover I had to specify it in perl scripts I developed. Sometimes it's not easy to maintain secure file permissions espec. if several developers share common work. Any user (even not postgres user) could use stealed password to connects to your database. In my proposal, security is rely on local login security. You already passed password control. There are another checks like priviliges. You write your scripts without hardcoded passwords ! Of course this could be just an option in case you need "paranoic" security. Having more granulated privilege types as Mysql does would only make my proposal more secure. You're allowed to connect, but owner of database could restrict you even list of tables, indices et. all. Regards, Oleg PS. I didn't find any plans to improve authen. in TODO On Wed, 13 Oct 1999, Peter Eisentraut wrote: Date: Wed, 13 Oct 1999 21:56:15 +0200 (CEST) From: Peter Eisentraut [EMAIL PROTECTED] To: Lincoln Yeoh [EMAIL PROTECTED] Cc: [EMAIL PROTECTED], [EMAIL PROTECTED] Subject: [HACKERS] Re: [GENERAL] How do I activate and change the postgres user's password? On Oct 13, Lincoln Yeoh mentioned: Then I have problems logging in as ANY user. Couldn't figure out what the default password for the postgres user was. Only after some messing around I found that I could log on as the postgres user with the password \N. Not obvious, at least to me. There is a todo item for the postgres user to have a password by default. I'm not sure though how that would be done. Probably in initdb. (?) I only guessed it after looking at the pg_pwd file and noticing a \N there. Is this where the passwords are stored? By the way should they be stored in the clear and in a 666 permissions file? How about hashing them with some salt? I had this on my personal things-to-consider-working-on list but I don't see an official todo item. I am personally not sure why this is not done but authentication and security are not most people's specialty around here. (including me) 1) There is no obvious way to specify the password for users when you create a user using the supplied shell script createuser. One has to resort to psql and stuff. Aah. Another misguided user. Some people are of the opinion that using the createuser scripts is a bad idea because it gives you the wrong impression of how things work. (All createuser does is call psql.) Of course, we could somehow put a password prompt in there, I'll put that on the above mentioned list. 2) Neither is there an obvious and easy way to change the user's password. alter user joe with password "foo"; I'm not sure how obvious it is but it's certainly easy. 3) You can specify a password for a user by using pg_passwd and stick it into a separate password file, but then there really is no link between createuser and pg_passwd. This shows how bad the idea of the scripts was in the first place. I find the bundled scripts and their associated documentation make things very nonintuitive when one switches from a blind trust postgres to an authenticated postgres. So that would put your vote in the "drop altogether" column? Voting is still in progress! -Peter -- Peter Eisentraut Sernanders vaeg 10:115 [EMAIL PROTECTED] 75262 Uppsala http://yi.org/peter-e/Sweden _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
[GENERAL] Re: [INTERFACES] problem with LOAD
This is a known problem with FreeBSD-3.1 and PostgreSQL I had too with 6.5 cvs version. I suppose you use FreeBSD elf ? Take a look to ports for freebsd-elf specific patches or just change src/Makefile.shlib ifeq ($(PORTNAME), freebsd) ifdef BSD_SHLIB install-shlib-dep := install-shlib ifdef ELF_SYSTEM shlib := lib$(NAME)$(DLSUFFIX).$(SO_MAJOR_VERSION) LDFLAGS_SL:= -x -shared -soname $(shlib) else shlib := lib$(NAME)$(DLSUFFIX).$(SO_MAJOR_VERSION).$(SO_MINOR_VERSION) LDFLAGS_SL:= -x -Bshareable -Bforcearchive endif CFLAGS += $(CFLAGS_SL) endif endif and makefiles/Makefile.freebsd ifdef ELF_SYSTEM LDFLAGS+= -export-dynamic endif %.so: %.o ifdef ELF_SYSTEM $(LD) -x -shared -o $@ $ else $(LD) -x -r -o $.obj $ @echo building shared object $@ @rm -f $@.pic @${AR} cq $@.pic order $.obj | tsort ${RANLIB} $@.pic @rm -f $@ $(LD) -x -Bshareable -Bforcearchive -o $@ $@.pic endif This works for me ! Regards, Oleg PS. These patches are already applied to current 6.5 sources. On Thu, 20 May 1999, abdelkrim wrote: Date: Thu, 20 May 1999 18:06:30 + From: abdelkrim [EMAIL PROTECTED] To: [EMAIL PROTECTED], [EMAIL PROTECTED] Subject: [INTERFACES] problem with LOAD hello every body i have some problem with LOAD command postgres= LOAD '/usr/local/pgsql/complex.so'; pqReadData() -- backend closed the channel unexpectedly. This probably means the backend terminated abnormally before or while processing the request. We have lost the connection to the backend, so further processing is impossible. Terminating. $ I build complex.so by: gcc -I../include -I../backend -O2 -m486 -pipe -Wall -Wmissing-prototypes -I../interfaces/libpq -I../../include -c complex.c -o complex.o ld -x -r -o complex.o.obj complex.o ranlib complex.so.pic ld -x -Bshareable -o complex.so complex.so.pic I use FreeBSD-3.1 with PostgreSQL 6.4.2 thanks _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83