[HACKERS] unaccent

nngodinh Wed, 18 Sep 2002 02:50:05 -0700

Greetings,

Since I use the txtidx data type together with GiST indexing to build a word
index over a very large Unicode database, I've implemented a PostgreSQL
function that uses libunac to remove accents from TEXT fields.


The resulting text is in UTF-8, but you can change the charset in the sources
(use an iconv charset name).

Get libunac from: http://www.nongnu.org/unac/ (it uses iconv)

Extract the attached archive and compile it (make), then move pg_unac.so to
your PostgreSQL shared library directory.

Register it in PostgreSQL:

CREATE FUNCTION unac(TEXT) RETURNS TEXT AS 'path_to_pg_unac.so' LANGUAGE C;

What about integrating an unaccent library directly into tsearch? It would be
useful for French search engines, for instance.
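
As a rough illustration of why unaccenting helps search, here is a minimal
Python sketch. It is not the pg_unac code and does not call libunac: it
approximates unaccenting with Unicode NFD decomposition followed by removal
of combining marks, and the helper names unaccent() and matches() are
hypothetical, chosen only for this example.

```python
import unicodedata


def unaccent(text: str) -> str:
    """Approximate unaccenting: decompose to NFD, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(query: str, document: str) -> bool:
    """Accent- and case-insensitive substring match (illustrative only)."""
    return unaccent(query).lower() in unaccent(document).lower()


if __name__ == "__main__":
    print(unaccent("été à la crème brûlée"))   # ete a la creme brulee
    print(matches("creme", "Crème brûlée"))    # True
```

With this approach a query typed without accents still finds accented
words, provided the indexed text and the query are unaccented the same way.
Characters that have no canonical decomposition, such as the ligature "œ",
pass through this sketch unchanged.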

Bye.

Nhan NGO DINH



pg_unac-1.0.tar.gz
Description: application/gzip-compressed

