Re: [GENERAL] [tsearch2] Problem with case sensitivity (or with creating own dictionary)

2013-08-07 Thread Krzysztof xaru Rajda
OK, just to be sure I understand everything: first I should install the
postgresql-contrib extension. Then a contrib/dict_int directory will
appear with the dict_int source code inside, which I can modify. After
that, I'll be able to install the modified dictionary, and it should
work properly, like the ispell or snowball dictionaries. Finally, if
everything goes well, I'll share a little tutorial on the wiki :)


Am I right, or isn't it that easy?

Regards,
xaru




On 2013-08-05 18:37, Oleg Bartunov wrote:

Please take a look at contrib/dict_int and create your own dict_noop.
It should be easy. I think you could document it and share it
with people (wiki.postgresql.org?), since other people have been
interested in a noop dictionary. Also, don't forget to modify
your text search configuration; use ts_debug(), it will help you.

Regards,
Oleg

On Sat, 3 Aug 2013, Krzysztof xaru Rajda wrote:


Hello,

I have run into the following problem. My goal is to extract links from text
using tsearch2. Everything seemed to work well until I got some YouTube
links: they contain both lowercase and uppercase letters, and the tsearch
parser lowercases everything (from http://youtube.com/Y6dsHDX I got
http://youtube.com/y6dshdx, which does not work). I went through the
PostgreSQL docs, and it seems that each of the default dictionaries (simple,
ispell, snowball) lowercases lexemes during normalization, and there is no
option to disable it.


I started looking for tutorials on how to create my own dictionary or
modify an existing one (I mean a dictionary like snowball, with my own
source code, not just one created by a 'CREATE TEXT SEARCH
DICTIONARY...' query), but everything I found is badly out of date and
uses mechanisms that are deprecated in the latest versions of Postgres (I'm
working on v9.2), like 'contrib/gendict' here:
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/custom-dict.html 

So now I have no idea what to do about my case sensitivity problem. Is
there any way to overcome it other than creating my own dictionary?
If not, how do I create one on Postgres 9.2?


Regards,
xaru





Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] [tsearch2] Problem with case sensitivity (or with creating own dictionary)

2013-08-05 Thread Krzysztof xaru Rajda

Hello,

I have run into the following problem. My goal is to extract links from text
using tsearch2. Everything seemed to work well until I got some YouTube
links: they contain both lowercase and uppercase letters, and the tsearch
parser lowercases everything (from http://youtube.com/Y6dsHDX I got
http://youtube.com/y6dshdx, which does not work). I went through the
PostgreSQL docs, and it seems that each of the default dictionaries (simple,
ispell, snowball) lowercases lexemes during normalization, and there is no
option to disable it.
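
To make the problem concrete, here is a purely illustrative model in Python. It is not PostgreSQL code: both function names are hypothetical, and the first one only imitates the lowercasing step described above, while the second imitates a dictionary that passes tokens through untouched.

```python
from typing import List


def lowercasing_lexize(token: str) -> List[str]:
    """Hypothetical model of a dictionary that folds tokens to lower case."""
    return [token.lower()]


def noop_lexize(token: str) -> List[str]:
    """Hypothetical model of a no-op dictionary: the token is returned as-is."""
    return [token]


if __name__ == "__main__":
    url = "http://youtube.com/Y6dsHDX"
    # The lowercased form no longer identifies the same video.
    print(lowercasing_lexize(url)[0])
    # The no-op form keeps the original, case-sensitive path.
    print(noop_lexize(url)[0])
```

The only difference between the two is the call to lower(), which is the behaviour a custom dictionary would have to leave out.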


I started looking for tutorials on how to create my own dictionary or
modify an existing one (I mean a dictionary like snowball, with my own
source code, not just one created by a 'CREATE TEXT SEARCH
DICTIONARY...' query), but everything I found is badly out of date and
uses mechanisms that are deprecated in the latest versions of Postgres (I'm
working on v9.2), like 'contrib/gendict' here:
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/custom-dict.html 



So now I have no idea what to do about my case sensitivity problem. Is
there any way to overcome it other than creating my own dictionary?
If not, how do I create one on Postgres 9.2?


Regards,
xaru




Re: [GENERAL] [tsearch2] Problem with case sensitivity (or with creating own dictionary)

2013-08-05 Thread Oleg Bartunov

Please take a look at contrib/dict_int and create your own dict_noop.
It should be easy. I think you could document it and share it
with people (wiki.postgresql.org?), since other people have been
interested in a noop dictionary. Also, don't forget to modify
your text search configuration; use ts_debug(), it will help you.
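
A rough sketch of that configuration step, written as a small Python helper that only builds the SQL statements. It assumes a dictionary named dict_noop has already been built and created in the database; that name, the configuration name links, and the helper itself are hypothetical. The url and url_path token types are the ones the default parser emits for links.

```python
from typing import List


def noop_config_sql(config: str = "links",
                    dictionary: str = "dict_noop",
                    sample: str = "http://youtube.com/Y6dsHDX") -> List[str]:
    """Build SQL that maps URL tokens to a (hypothetical) no-op dictionary.

    Values are interpolated without escaping, so this is only suitable
    for trusted, hand-written names in an illustration like this one.
    """
    return [
        # Start from a copy of the built-in 'simple' configuration.
        f"CREATE TEXT SEARCH CONFIGURATION {config} (COPY = simple);",
        # Route URL-like tokens to the no-op dictionary.
        f"ALTER TEXT SEARCH CONFIGURATION {config} "
        f"ALTER MAPPING FOR url, url_path WITH {dictionary};",
        # Inspect which dictionary handles each token and what it returns.
        f"SELECT alias, token, dictionaries, lexemes "
        f"FROM ts_debug('{config}', '{sample}');",
    ]


if __name__ == "__main__":
    for statement in noop_config_sql():
        print(statement)
```

Running the printed statements in psql and reading the lexemes column of the ts_debug() output should show whether the case of the link survives.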

Regards,
Oleg

On Sat, 3 Aug 2013, Krzysztof xaru Rajda wrote:


Hello,

I have run into the following problem. My goal is to extract links from text
using tsearch2. Everything seemed to work well until I got some YouTube
links: they contain both lowercase and uppercase letters, and the tsearch
parser lowercases everything (from http://youtube.com/Y6dsHDX I got
http://youtube.com/y6dshdx, which does not work). I went through the
PostgreSQL docs, and it seems that each of the default dictionaries (simple,
ispell, snowball) lowercases lexemes during normalization, and there is no
option to disable it.


I started looking for tutorials on how to create my own dictionary or
modify an existing one (I mean a dictionary like snowball, with my own
source code, not just one created by a 'CREATE TEXT SEARCH
DICTIONARY...' query), but everything I found is badly out of date and
uses mechanisms that are deprecated in the latest versions of Postgres (I'm
working on v9.2), like 'contrib/gendict' here:
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/custom-dict.html 

So now I have no idea what to do about my case sensitivity problem. Is
there any way to overcome it other than creating my own dictionary?
If not, how do I create one on Postgres 9.2?


Regards,
xaru





Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

