[GENERAL] tsearch dictionary list?

2008-07-18 Thread Daniel Chiaramello

Hi all.

I'm new to postgresql world, and I have to extend an existing product 
for Thailand - that product has some features based on tsearch, and I 
was wondering if there is an existing dictionary for that language... I 
failed to find any reference of such dictionary on the web, and of 
course I don't speak thailandese at all!


In fact, is there somewhere a list of existing tsearch dictionaries? Are 
japanese, chinese, or other "exotic" languages supported by tsearch?


Thanks for your attention,
Daniel Chiaramello

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Using ISpell dictionary - headaches...

2008-07-22 Thread Daniel Chiaramello

Hi everybody.

Well... I have a problem when trying to install and use an ISpell 
dictionary (the Thai one to be more precise) with the tsearch feature.


_What I am trying to do_

I have a table containing a "title" field, and I want to fill a "vector" 
field with the following command:
*UPDATE thai_table SET vectors = to_tsvector('thai_utf8', 
coalesce(title,''));*


_How I installed the Thai dictionary_

I installed the "th_TH.dic" and the "th_TH.aff" files (downloaded from 
http://wiki.services.openoffice.org/wiki/Dictionaries) in a 
"/usr/local/share/dicts/ispell/" folder, and I executed the following 
commands:


SET search_path = public;
BEGIN;

INSERT INTO pg_ts_dict (dict_name, dict_init, dict_initoption, 
dict_lexize, dict_comment)

VALUES (
   'th_spell_utf8',
   'spell_init(internal)',
   
'DictFile="/usr/local/share/dicts/ispell/th_TH.dic",AffFile="/usr/local/share/dicts/ispell/th_TH.aff"',

   'spell_lexize(internal,internal,integer)',
   'Thai ISpell dict utf8 encoding'
   );

INSERT INTO pg_ts_cfg (ts_name, prs_name, locale) VALUES ('thai_utf8', 
'default', 'th_TH.utf8');


INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES 
('thai_utf8', 'email', '{simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES 
('thai_utf8', 'url', '{simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES 
('thai_utf8', 'host', '{simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES 
('thai_utf8', 'sfloat', '{simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES 
('thai_utf8', 'version', '{simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES 
('thai_utf8', 'uri', '{simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES 
('thai_utf8', 'file', '{simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES 
('thai_utf8', 'float', '{simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES 
('thai_utf8', 'int', '{simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES 
('thai_utf8', 'uint', '{simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES 
('thai_utf8', 'lword', '{th_spell_utf8,simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES 
('thai_utf8', 'nlword', '{th_spell_utf8,simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES 
('thai_utf8', 'word', '{th_spell_utf8,simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES 
('thai_utf8', 'part_hword', '{th_spell_utf8,simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES 
('thai_utf8', 'nlpart_hword', '{th_spell_utf8,simple}');
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name) VALUES 
('thai_utf8', 'lpart_hword', '{th_spell_utf8,simple}');


COMMIT;

_What my problem is_

The problem is that, when i execute the request to fill my "vectors" 
field, psql crashes...


la connexion au serveur a été coupée à l'improviste
   Le serveur s'est peut-être arrêté anormalement
   avant ou durant le traitement de la requête.
La connexion au serveur a été perdue. Tentative de réinitialisation: Echec.
!>

(it means: the connection with the server has been cut unexpectedly. The 
server may have stop abnormaly before or during the request handling. 
The connection with the server has been lost. Trying to 
reinitialization: Failed)


I have no idea on what may cause that, nor what I could look for to find 
idea on how to solve that.


It *may* be because I'm using psql 8.0.3 and not the latest version (but 
I'm stucked with that version), i'm just hoping that one of you have met 
similar problem and have successfully solved it, or maybe if you know a 
site where an Ispell dictionary installation is detailed step by step so 
that I can check if I did something wrong somewhere...


Many thanks for your attention,
Daniel Chiaramello


[GENERAL] [TextSearch] syntax error while parsing affix file

2008-11-28 Thread Daniel Chiaramello

Hello everybody.

I am using Postrges 8.3.5, and I am trying to install a bulgarian ISpell 
dictionary (the OpenOffice one) for Textsearch features.


I converted the dictionary encoding to UTF-8, and I installed it in the 
"tsearch_data" folder.


But when I try to create the dictionary, I have a syntax error:

CREATE TEXT SEARCH DICTIONARY bulgarian_ispell (
TEMPLATE = ispell,
DictFile = bulgarian_utf8,
AffFile = bulgarian_utf8,
StopWords = english
);
ERREUR: erreur de syntaxe
CONTEXTE : ligne 24 du fichier de configuration « 
/usr/share/pgsql/tsearch_data/bulgarian_utf8.affix » : « . > А

»

(it means ERROR: syntax error, CONTEXT: line 24 of configuration file ...)


Extract of the file arount that line:

flag *A:
. > А (this is line 24)
. > АТА
. > И
. > ИТЕ

The file has Unix end_of_lines (I suspected something like that since 
the "CONTEXT" error line was split on 2 lines).


I'm really lost on how I can go further with the bulgarian dictionary... 
Could you help me, please?


Thanks for your attention!
Daniel Chiaramello

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] [TSearch2] Chinese dictionary?

2008-12-01 Thread Daniel Chiaramello

Hello.

I am looking for a Chinese dictionary for TSearch2 Postgresql feature, 
with no success yet.


I found a reference on a chinese webpage:

http://code.google.com/p/nlpbamboo/wiki/TSearch2

With the help of translate.google.com, I thought managed to figure out 
what to do: I installed "nlpbamboo" on my Redhat server, but I failed to 
follow the next steps because of compilation errors trying to make 
"postgres/bamboo"


Anyways, even if i had successfully followed the different steps 
described, I would still have lacked the most important part: the 
dictionary files themselves...


Has one of you managed to instaleld a chinese TSearch2 dictionary on 
Postgres?



Thanks for your attention,
Daniel Chiaramello

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [TextSearch] syntax error while parsing affix file

2008-12-02 Thread Daniel Chiaramello

Teodor Sigaev a écrit :

iconv -f windows-1251 -t utf-8 bulgarian.dic >bulgarian_utf8.dict
iconv -f windows-1251 -t utf-8 bulgarian.aff >bulgarian_utf8.affix

The locale of the database is fr_FR, and its encoding is UTF8.
I believe that characters 'И', 'А' (non-ascii) and other cyrillic ones 
are not acceptable for french locale :(


I was able to install a thailandese dictionary - why would such 
dictionary be ok and not a bulgarian one?
Which locale should I use to enable my database to be multi-language 
compatible?


I would never have suspected a locale problem... Ouch!

Daniel

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [TextSearch] syntax error while parsing affix file

2008-12-02 Thread Daniel Chiaramello

Teodor Sigaev a écrit :
I am using Postrges 8.3.5, and I am trying to install a bulgarian 
ISpell dictionary (the OpenOffice one) for Textsearch features.



flag *A:
. > А (this is line 24)
. > АТА
. > И
. > ИТЕ

OpenOffice or ISpell? Pls, provide:
- link to download of dictionary
- Locale and encoding setting of your db

The dictionary is the ISpell one I got from 
http://wiki.services.openoffice.org/wiki/Dictionaries list.


Here is a direct link for it:
http://heanet.dl.sourceforge.net/sourceforge/bgoffice/ispell-bg-4.1.tar.gz

I converted its encoding from windows-1251 to UTF-8 before running the 
CREATE TEXT SEARCH DICTIONARY:


iconv -f windows-1251 -t utf-8 bulgarian.dic >bulgarian_utf8.dict
iconv -f windows-1251 -t utf-8 bulgarian.aff >bulgarian_utf8.affix

The locale of the database is fr_FR, and its encoding is UTF8.

Thanks!
Daniel

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] (TSearch2] Chinese dictionary?

2008-12-12 Thread Daniel Chiaramello

(reposted, in case people who could help me have missed my 2-weeks-ago post)

Hello.

I am looking for a Chinese dictionary for TSearch2 Postgresql feature, 
with no success yet.


I found a reference on a chinese webpage:

http://code.google.com/p/nlpbamboo/wiki/TSearch2

With the help of translate.google.com, I thought I had managed to figure 
out what to do: I installed "nlpbamboo" on my Redhat server, but I 
failed to follow the next steps because of compilation errors trying to 
make "postgres/bamboo"


Anyways, even if I had successfully followed the different steps 
described, I would still have lacked the most important part: the 
dictionary files themselves...


Has one of you managed to installed a chinese TSearch2 dictionary on 
Postgres?



Thanks for your attention,
Daniel Chiaramello



Re: [GENERAL] chinese parser for text search !

2009-01-29 Thread Daniel Chiaramello

Hello Oleg and others.

I also found that reference, but failed to find the corresponding 
Chinese dictionary it mentions.

And when I tried to compile nlpbamboo, it fails.

Has one of you tried (and succeeded) to use Tsearch for Chinese?

Thanks for your attention,
Daniel

Oleg Bartunov a écrit :

Hi there,

there is a chinese parser for tsearch2 available from 
http://code.google.com/p/nlpbamboo/wiki/TSearch2 under BSD license.


It'd be nice to have it for chinese text search configuration we are 
currently

completely missing, as well as for japanese language ( any guess ?)

Unfortunately, we unable to verify this parser, so I'm asking about help.
We need usual extensive testing (memory leaking, portability, etc),
regression test and patch for documentation.

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Fulltext search configuration

2009-02-02 Thread Daniel Chiaramello

Hi Mohamed.

I don't know where you get the dictionary - I unsuccessfully tried the 
OpenOffice one by myself (the Ayaspell one), and I had no arabic 
stopwords file.


Renaming the file is supposed to be enough (I did it successfully for 
Thailandese dictionary) - the ".aff'" file becoming the ".affix" one.

When I tried to create the dictionary:

CREATE TEXT SEARCH DICTIONARY ar_ispell (
   TEMPLATE = ispell,
   DictFile = ar_utf8,
   AffFile = ar_utf8,
   StopWords = english
);

I had an error:

ERREUR:  mauvais format de fichier affixe pour le drapeau
CONTEXTE : ligne 42 du fichier de configuration « 
/usr/share/pgsql/tsearch_data/ar_utf8.affix » : « PFX Aa  Y   40


(which means Bad format of Affix file for flag, line 42 of configuration 
file)


Do you have an error when creating your dictionary?

Daniel

Mohamed a écrit :

I have ran into some problems here.

I am trying to implement arabic fulltext search on three columns.

To create a dictionary I have a hunspell dictionary and and arabic 
stop file.


CREATE TEXT SEARCH DICTIONARY hunspell_dic (
TEMPLATE = ispell,
DictFile = hunarabic,
AffFile = hunarabic,
StopWords = arabic
);


1) The problem is that the hunspell contains a .dic and a .aff
file but the configuration requeries a .dict and .affix file. I
have tried to change the endings but with no success.

2) ts_lexize('hunspell_dic', 'ARABIC WORD') returns nothing

3) How can I convert my .dic and .aff to valid .dict and .affix ? 


4) I have read that when using dictionaries, if a word is not
recognized by any dictionary it will not be indexed. I find that
troublesome. I would like everything but the stop words to be
indexed. I guess this might be a step that I am not ready for yet,
but just wanted to put it out there.



Also I would like to know how the process of the fulltext search 
implementation looks like, from config to search.


Create dictionary, then a text configuration, add dic to 
configuration, index columns with gin or gist ...


How does a search look like? Does it match against the gin/gist index. 
Have that index been built up using the dictionary/configuration, or 
is the dictionary only used on search frases? 


/ Moe








[GENERAL] [Newbie] UPDATE based on other table content

2009-10-29 Thread Daniel Chiaramello

Hello.

I have a very basic question, relative to the following "problem".

I have the following tables:

product
   id
   qty

intermediate
   id
   product_id

orders
   intermediate_id

I want to update the "qty" field of the "product" table by incrementing 
it each time there is an order in the "orders" table, referencing a 
given product through the "intermediate" table.


I tried the following request:

UPDATE
   qty = qty+1
FROM
   intermediate,
   orders
WHERE
   orders.intermediate_id=intermediate.id AND
   intermediate.product_id=product.id
;

But of course it does what was predictable - ie the qty "field" is 
incremented only once, even if more than one entry is referencing a 
given product. But it's not what I was hoping...


What would be the "good" solution to do that UPDATE?

Thanks for your attention!
Daniel Chiaramello

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [Newbie] UPDATE based on other table content

2009-10-29 Thread Daniel Chiaramello

Never mind, I found how finally:

UPDATE
   product
SET
   qty = qty+s_count
FROM (
   SELECT
   intermediate.product_id,
   count(*) AS s_count
   FROM
   intermediate,
   orders
   WHERE
   orders.intermediate_id=intermediate.id
   GROUP BY
   intermediate.product_id
   ) AS summary
WHERE
   summary.product_id = product.id
;

Sorry for disturbance!
Daniel

Daniel Chiaramello a écrit :

Hello.

I have a very basic question, relative to the following "problem".

I have the following tables:

product
   id
   qty

intermediate
   id
   product_id

orders
   intermediate_id

I want to update the "qty" field of the "product" table by 
incrementing it each time there is an order in the "orders" table, 
referencing a given product through the "intermediate" table.


I tried the following request:

UPDATE
   qty = qty+1
FROM
   intermediate,
   orders
WHERE
   orders.intermediate_id=intermediate.id AND
   intermediate.product_id=product.id
;

But of course it does what was predictable - ie the qty "field" is 
incremented only once, even if more than one entry is referencing a 
given product. But it's not what I was hoping...


What would be the "good" solution to do that UPDATE?

Thanks for your attention!
Daniel Chiaramello




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [Newbie] UPDATE based on other table content

2009-10-29 Thread Daniel Chiaramello

Thom Brown a écrit :


...

2009/10/29 Daniel Chiaramello :
  

Never mind, I found how finally:

UPDATE
  product
SET
  qty = qty+s_count
FROM (
  SELECT
  intermediate.product_id,
  count(*) AS s_count
  FROM
  intermediate,
  orders
  WHERE
  orders.intermediate_id=intermediate.id
  GROUP BY
  intermediate.product_id
  ) AS summary
WHERE
  summary.product_id = product.id
;

Sorry for disturbance!
Daniel



Your solution looks like it would perform a cumulative calculation.
Surely you'd want qty = s_count?

In any case, wouldn't you be better off not having a quantity column
at all and just calculate it with either a query or a view?

Example:

SELECT product.id, COUNT(intermediate.product_id)
FROM product
LEFT JOIN intermediate ON product.id = intermediate.product_id
GROUP BY product.id
ORDER BY product.id

Or include a product name in the product table to get more meaningful
output.  You'd then get an output like:

id  namecount
1   Orange  5
2   Apples  7
3   Pears   2
4   Kiwi0

If you don't want ones for which there have been no orders for (or
whatever your intermediate table is for), use an INNER JOIN instead.

Regards

Thom

  
Thanks for anwer, but no, it's really adding the number of entries to my 
"qty" field. The "orders" table is a transient one and is cleaned regularly.
Of course, the example I gave is a simplified one (there are no orders 
or products, I chose these names to ease the understanding of my 
problem) - in reality, the problem is much complicated than that :)


But thanks for answer anyways.

Daniel

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] dblink and commit

2009-10-30 Thread Daniel Chiaramello

Hello.

I an wondering how things happen, related to transactions using dblink.

I am doing the following thing (simplified, of course), on ServerA:


BEGIN

insert into TableA_ServerA (...)
   select * from dblink(ServerB, select * from tableA_ServerB)

insert into TableB_ServerA (...)
   select * from dblink(ServerB, select * from tableB_ServerB)

dblink(serverB, delete from tableA_ServerB)
dblink(serverB, delete from tableB_ServerB)

COMMIT


What happens if connection is lost when executing the last dblink? Are 
the data from tableA_serverB deleted (what I would like to avoid)?


If they are deleted, is there a way to handle this - do the 2 local 
insert and the 2 remote delete (normal case), but do nothing at all if 
something went bad at one time?


Thanks for your attention,
Daniel

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general