[GENERAL] tsearch dictionary list?
Hi all. I'm new to the PostgreSQL world, and I have to extend an existing product for Thailand. That product has some features based on tsearch, and I was wondering if there is an existing dictionary for that language. I failed to find any reference to such a dictionary on the web, and of course I don't speak Thai at all! In fact, is there somewhere a list of existing tsearch dictionaries? Are Japanese, Chinese, or other "exotic" languages supported by tsearch?

Thanks for your attention,
Daniel Chiaramello

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Using ISpell dictionary - headaches...
Hi everybody. Well... I have a problem when trying to install and use an ISpell dictionary (the Thai one, to be more precise) with the tsearch feature.

_What I am trying to do_

I have a table containing a "title" field, and I want to fill a "vectors" field with the following command:

    UPDATE thai_table SET vectors = to_tsvector('thai_utf8', coalesce(title,''));

_How I installed the Thai dictionary_

I installed the "th_TH.dic" and "th_TH.aff" files (downloaded from http://wiki.services.openoffice.org/wiki/Dictionaries) in a "/usr/local/share/dicts/ispell/" folder, and I executed the following commands:

    SET search_path = public;
    BEGIN;
    INSERT INTO pg_ts_dict (dict_name, dict_init, dict_initoption, dict_lexize, dict_comment)
    VALUES (
        'th_spell_utf8',
        'spell_init(internal)',
        'DictFile="/usr/local/share/dicts/ispell/th_TH.dic",AffFile="/usr/local/share/dicts/ispell/th_TH.aff"',
        'spell_lexize(internal,internal,integer)',
        'Thai ISpell dict utf8 encoding'
    );
    INSERT INTO pg_ts_cfg (ts_name, prs_name, locale) VALUES ('thai_utf8', 'default', 'th_TH.utf8');
    INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'email', '{simple}');
    INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'url', '{simple}');
    INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'host', '{simple}');
    INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'sfloat', '{simple}');
    INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'version', '{simple}');
    INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'uri', '{simple}');
    INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'file', '{simple}');
    INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'float', '{simple}');
    INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'int', '{simple}');
    INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'uint', '{simple}');
    INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'lword', '{th_spell_utf8,simple}');
    INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'nlword', '{th_spell_utf8,simple}');
    INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'word', '{th_spell_utf8,simple}');
    INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'part_hword', '{th_spell_utf8,simple}');
    INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'nlpart_hword', '{th_spell_utf8,simple}');
    INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES ('thai_utf8', 'lpart_hword', '{th_spell_utf8,simple}');
    COMMIT;

_What my problem is_

The problem is that, when I execute the request to fill my "vectors" field, psql crashes (the message below was originally in French):

    The connection to the server was lost unexpectedly. The server may have
    stopped abnormally before or during the handling of the request.
    The connection to the server has been lost. Attempting reset: Failed.

I have no idea what may cause that, nor what I could look for to find out how to solve it. It *may* be because I'm using psql 8.0.3 and not the latest version (but I'm stuck with that version). I'm just hoping that one of you has met a similar problem and successfully solved it, or maybe you know a site where an ISpell dictionary installation is detailed step by step, so that I can check whether I did something wrong somewhere...

Many thanks for your attention,
Daniel Chiaramello
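A way to narrow down a backend crash like this (a hypothetical debugging step, not something from the thread) is to exercise the dictionary on a single token before running the bulk UPDATE. In the tsearch2 era this could be done with the contrib module's lexize() function; if lexize() alone brings the backend down, the problem is in loading the .dic/.aff files, not in the table data:

```sql
-- Sanity check: ask the new dictionary to lexize one word.
-- 'th_spell_utf8' is the dictionary registered in pg_ts_dict above.
SELECT lexize('th_spell_utf8', 'test');

-- Check the parser and configuration independently of the ISpell files:
SELECT to_tsvector('thai_utf8', 'test');
```

If the first statement crashes while the second succeeds with a configuration that avoids the ISpell dictionary, the affix/dictionary files (or their encoding) are the likely culprit.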
[GENERAL] [TextSearch] syntax error while parsing affix file
Hello everybody. I am using Postgres 8.3.5, and I am trying to install a Bulgarian ISpell dictionary (the OpenOffice one) for the text search features. I converted the dictionary encoding to UTF-8, and I installed it in the "tsearch_data" folder. But when I try to create the dictionary, I get a syntax error (the error message below was originally in French):

    CREATE TEXT SEARCH DICTIONARY bulgarian_ispell (
        TEMPLATE = ispell,
        DictFile = bulgarian_utf8,
        AffFile = bulgarian_utf8,
        StopWords = english
    );
    ERROR:  syntax error
    CONTEXT:  line 24 of configuration file "/usr/share/pgsql/tsearch_data/bulgarian_utf8.affix": ". > А"

Extract of the file around that line:

    flag *A:
        . > А        (this is line 24)
        . > АТА
        . > И
        . > ИТЕ

The file has Unix line endings (I suspected something like that, since the "CONTEXT" error line was split over 2 lines). I'm really lost on how I can go further with the Bulgarian dictionary... Could you help me, please?

Thanks for your attention!
Daniel Chiaramello
[GENERAL] [TSearch2] Chinese dictionary?
Hello. I am looking for a Chinese dictionary for the TSearch2 PostgreSQL feature, with no success yet. I found a reference on a Chinese web page: http://code.google.com/p/nlpbamboo/wiki/TSearch2 With the help of translate.google.com, I thought I had managed to figure out what to do: I installed "nlpbamboo" on my Red Hat server, but I failed to follow the next steps because of compilation errors while trying to make "postgres/bamboo". Anyway, even if I had successfully followed the different steps described, I would still have lacked the most important part: the dictionary files themselves... Has one of you managed to install a Chinese TSearch2 dictionary on Postgres?

Thanks for your attention,
Daniel Chiaramello
Re: [GENERAL] [TextSearch] syntax error while parsing affix file
Teodor Sigaev wrote:
>> iconv -f windows-1251 -t utf-8 bulgarian.dic >bulgarian_utf8.dict
>> iconv -f windows-1251 -t utf-8 bulgarian.aff >bulgarian_utf8.affix
>> The locale of the database is fr_FR, and its encoding is UTF8.
> I believe that characters 'И', 'А' (non-ASCII) and other Cyrillic ones are not acceptable for the French locale :(

I was able to install a Thai dictionary - why would such a dictionary be OK and not a Bulgarian one? Which locale should I use to make my database multi-language compatible? I would never have suspected a locale problem... Ouch!

Daniel
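For reference, a hedged sketch of the usual way out of this locale trap (not spelled out in the thread): in 8.3 the locale is fixed for the whole cluster at initdb time, and the C locale with UTF-8 encoding sidesteps locale-specific character classification entirely; from 8.4 on, CREATE DATABASE also accepts per-database collation and ctype settings:

```sql
-- 8.3: the locale is cluster-wide, chosen when the cluster is created, e.g.
--   initdb --locale=C --encoding=UTF8 -D /var/lib/pgsql/data
-- With LC_CTYPE=C there is no French (or any) locale-specific
-- classification to conflict with Cyrillic or Thai characters.

-- 8.4 and later: collation/ctype can instead be set per database
-- ("multilang" is just an illustrative name):
CREATE DATABASE multilang
    ENCODING   'UTF8'
    LC_COLLATE 'C'
    LC_CTYPE   'C'
    TEMPLATE   template0;
```

The trade-off is that C collation sorts strictly by byte value, so language-aware ORDER BY needs to be handled separately.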
Re: [GENERAL] [TextSearch] syntax error while parsing affix file
Teodor Sigaev wrote:
>> I am using Postgres 8.3.5, and I am trying to install a Bulgarian ISpell dictionary (the OpenOffice one) for the text search features.
>>     flag *A:
>>         . > А        (this is line 24)
>>         . > АТА
>>         . > И
>>         . > ИТЕ
> OpenOffice or ISpell? Please provide:
> - a link to download the dictionary
> - the locale and encoding settings of your db

The dictionary is the ISpell one I got from the http://wiki.services.openoffice.org/wiki/Dictionaries list. Here is a direct link for it: http://heanet.dl.sourceforge.net/sourceforge/bgoffice/ispell-bg-4.1.tar.gz

I converted its encoding from windows-1251 to UTF-8 before running the CREATE TEXT SEARCH DICTIONARY:

    iconv -f windows-1251 -t utf-8 bulgarian.dic >bulgarian_utf8.dict
    iconv -f windows-1251 -t utf-8 bulgarian.aff >bulgarian_utf8.affix

The locale of the database is fr_FR, and its encoding is UTF8.

Thanks!
Daniel
[GENERAL] [TSearch2] Chinese dictionary?
(reposted, in case the people who could help me missed my post from 2 weeks ago) Hello. I am looking for a Chinese dictionary for the TSearch2 PostgreSQL feature, with no success yet. I found a reference on a Chinese web page: http://code.google.com/p/nlpbamboo/wiki/TSearch2 With the help of translate.google.com, I thought I had managed to figure out what to do: I installed "nlpbamboo" on my Red Hat server, but I failed to follow the next steps because of compilation errors while trying to make "postgres/bamboo". Anyway, even if I had successfully followed the different steps described, I would still have lacked the most important part: the dictionary files themselves... Has one of you managed to install a Chinese TSearch2 dictionary on Postgres?

Thanks for your attention,
Daniel Chiaramello
Re: [GENERAL] chinese parser for text search !
Hello Oleg and others. I also found that reference, but failed to find the corresponding Chinese dictionary it mentions. And when I tried to compile nlpbamboo, it failed. Has one of you tried (and succeeded) to use Tsearch for Chinese?

Thanks for your attention,
Daniel

Oleg Bartunov wrote:
> Hi there, there is a Chinese parser for tsearch2 available from http://code.google.com/p/nlpbamboo/wiki/TSearch2 under the BSD license. It'd be nice to have it for the Chinese text search configuration we are currently completely missing, as well as for the Japanese language (any guess?). Unfortunately, we are unable to verify this parser, so I'm asking for help. We need the usual extensive testing (memory leaks, portability, etc.), a regression test, and a patch for the documentation. Regards, Oleg
>
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia
> Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83
Re: [GENERAL] Fulltext search configuration
Hi Mohamed. I don't know where you got the dictionary - I unsuccessfully tried the OpenOffice one (the Ayaspell one) myself, and I had no Arabic stopwords file. Renaming the file is supposed to be enough (I did it successfully for the Thai dictionary) - the ".aff" file becoming the ".affix" one. When I tried to create the dictionary:

    CREATE TEXT SEARCH DICTIONARY ar_ispell (
        TEMPLATE = ispell,
        DictFile = ar_utf8,
        AffFile = ar_utf8,
        StopWords = english
    );

I got an error (originally in French):

    ERROR:  wrong affix file format for flag
    CONTEXT:  line 42 of configuration file "/usr/share/pgsql/tsearch_data/ar_utf8.affix": "PFX Aa Y 40

Do you get an error when creating your dictionary?

Daniel

Mohamed wrote:
> I have run into some problems here. I am trying to implement Arabic full-text search on three columns. To create a dictionary I have a Hunspell dictionary and an Arabic stop file.
>
>     CREATE TEXT SEARCH DICTIONARY hunspell_dic (
>         TEMPLATE = ispell,
>         DictFile = hunarabic,
>         AffFile = hunarabic,
>         StopWords = arabic
>     );
>
> 1) The problem is that the Hunspell package contains a .dic and a .aff file, but the configuration requires a .dict and a .affix file. I have tried to change the endings, but with no success.
> 2) ts_lexize('hunspell_dic', 'ARABIC WORD') returns nothing.
> 3) How can I convert my .dic and .aff to valid .dict and .affix files?
> 4) I have read that when using dictionaries, if a word is not recognized by any dictionary it will not be indexed. I find that troublesome. I would like everything but the stop words to be indexed. I guess this might be a step I am not ready for yet, but just wanted to put it out there.
>
> Also, I would like to know what the process of a full-text search implementation looks like, from config to search: create a dictionary, then a text search configuration, add the dictionary to the configuration, index the columns with GIN or GiST... What does a search look like? Does it match against the GIN/GiST index? Has that index been built up using the dictionary/configuration, or is the dictionary only used on search phrases?
>
> / Moe
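To see which dictionary (if any) actually handles each token, and therefore what ends up in the index, 8.3's ts_debug() and ts_lexize() are handy. A generic illustration (using the built-in 'english' configuration as a stand-in for a custom one, not taken from the thread):

```sql
-- Show, token by token, which dictionaries in the configuration are
-- consulted and which lexemes the first matching one produces.
SELECT alias, token, dictionaries, lexemes
FROM ts_debug('english', 'The quick brown foxes');

-- Probe a single dictionary directly. NULL means "not recognized"
-- (the token falls through to the next dictionary in the chain);
-- an empty array means "recognized as a stop word".
SELECT ts_lexize('english_stem', 'foxes');
```

This also answers item 4 in spirit: listing the `simple` dictionary after the ISpell one in the configuration, as the Thai setup earlier in this thread does, makes unrecognized words indexable instead of dropped.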
[GENERAL] [Newbie] UPDATE based on other table content
Hello. I have a very basic question about the following "problem". I have these tables:

    product      (id, qty)
    intermediate (id, product_id)
    orders       (intermediate_id)

I want to update the "qty" field of the "product" table by incrementing it each time there is an order in the "orders" table referencing a given product through the "intermediate" table. I tried the following request:

    UPDATE product SET qty = qty+1
    FROM intermediate, orders
    WHERE orders.intermediate_id=intermediate.id
      AND intermediate.product_id=product.id;

But of course it does what was predictable - i.e. the "qty" field is incremented only once, even if more than one entry references a given product. That's not what I was hoping for... What would be the "good" way to write that UPDATE?

Thanks for your attention!
Daniel Chiaramello
Re: [GENERAL] [Newbie] UPDATE based on other table content
Never mind, I finally found how:

    UPDATE product SET qty = qty+s_count
    FROM (
        SELECT intermediate.product_id, count(*) AS s_count
        FROM intermediate, orders
        WHERE orders.intermediate_id=intermediate.id
        GROUP BY intermediate.product_id
    ) AS summary
    WHERE summary.product_id = product.id;

Sorry for the disturbance!
Daniel

Daniel Chiaramello wrote:
> [original question, asking how to increment product.qty once per row in "orders" referencing that product, snipped]
Re: [GENERAL] [Newbie] UPDATE based on other table content
Thom Brown wrote:
> 2009/10/29 Daniel Chiaramello:
>> [solution with the "summary" subquery, snipped]
>
> Your solution looks like it would perform a cumulative calculation. Surely you'd want qty = s_count? In any case, wouldn't you be better off not having a quantity column at all, and just calculating it with either a query or a view? Example:
>
>     SELECT product.id, COUNT(intermediate.product_id)
>     FROM product
>     LEFT JOIN intermediate ON product.id = intermediate.product_id
>     GROUP BY product.id
>     ORDER BY product.id
>
> Or include a product name in the product table to get more meaningful output. You'd then get output like:
>
>     id  name    count
>     1   Orange  5
>     2   Apples  7
>     3   Pears   2
>     4   Kiwi    0
>
> If you don't want the ones for which there have been no orders (or whatever your intermediate table is for), use an INNER JOIN instead. Regards, Thom

Thanks for the answer, but no, it really is adding the number of entries to my "qty" field: the "orders" table is a transient one and is cleaned regularly. Of course, the example I gave is a simplified one (there are no orders or products; I chose those names to ease the understanding of my problem) - in reality, the problem is much more complicated than that :) But thanks for the answer anyway.

Daniel
[GENERAL] dblink and commit
Hello. I am wondering how things happen with transactions using dblink. I am doing the following (simplified, of course) on ServerA:

    BEGIN;
    insert into TableA_ServerA (...)
        select * from dblink(ServerB, 'select * from tableA_ServerB');
    insert into TableB_ServerA (...)
        select * from dblink(ServerB, 'select * from tableB_ServerB');
    dblink(ServerB, 'delete from tableA_ServerB');
    dblink(ServerB, 'delete from tableB_ServerB');
    COMMIT;

What happens if the connection is lost while executing the last dblink? Is the data from tableA_ServerB deleted (which I would like to avoid)? If it is deleted, is there a way to handle this - do the 2 local inserts and the 2 remote deletes (the normal case), but do nothing at all if something goes wrong at any point?

Thanks for your attention,
Daniel
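A hedged sketch of one answer (not from the thread): statements sent through dblink commit on the remote server independently of the local transaction, so the remote deletes can survive a local rollback. Atomicity across two servers needs something like two-phase commit: run the remote work in an explicit remote transaction, PREPARE TRANSACTION it, and only COMMIT PREPARED after the local COMMIT succeeds. Assuming a named connection 'ServerB' already opened with dblink_connect (and 'sync_batch_1' is just an illustrative transaction identifier), the shape is roughly:

```sql
-- Sketch only: requires max_prepared_transactions > 0 on ServerB, and an
-- external agent to resolve (commit or roll back) the prepared transaction
-- if we crash between the two commits below.
SELECT dblink_exec('ServerB', 'BEGIN');

BEGIN;
-- ... the two local INSERT ... SELECT * FROM dblink(...) statements ...
SELECT dblink_exec('ServerB', 'DELETE FROM tableA_ServerB');
SELECT dblink_exec('ServerB', 'DELETE FROM tableB_ServerB');

-- Park the remote deletes durably without committing them yet:
SELECT dblink_exec('ServerB', 'PREPARE TRANSACTION ''sync_batch_1''');
COMMIT;  -- local commit

-- Only now make the remote deletes permanent:
SELECT dblink_exec('ServerB', 'COMMIT PREPARED ''sync_batch_1''');
```

If the connection drops before PREPARE TRANSACTION, the remote transaction simply rolls back and nothing is deleted; the remaining window between the two commits is what the external resolver has to cover.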