[HACKERS] dot to be considered as a word delimiter?

2010-07-16 Thread Paul Fariello
Hi all, I was reading a post from Sushant Sinha about the English parser, which does not consider the dot as a word delimiter. In a follow-up mail, a patch was proposed. Is there any news about that? I would enjoy this patch, too ;) Thanks -- Paul Fariello Étudiant ingénieur à l'Universit

Re: [HACKERS] dot to be considered as a word delimiter?

2009-06-02 Thread Kenneth Marshall
On Tue, Jun 02, 2009 at 04:40:51PM -0400, Sushant Sinha wrote: > Fair enough. I agree that there is a valid need for returning such tokens as > a host. But I think there is definitely a need to break it down into > individual words. This will help in cases when a document is missing a space > in be

Re: [HACKERS] dot to be considered as a word delimiter?

2009-06-02 Thread Kevin Grittner
Sushant Sinha wrote: > So what we can do is: return the entire compound word as Host and > also break it down into individual words. So, pretty much like we handle hyphenation? -Kevin

Re: [HACKERS] dot to be considered as a word delimiter?

2009-06-02 Thread Sushant Sinha
Fair enough. I agree that there is a valid need for returning such tokens as a host. But I think there is definitely a need to break it down into individual words. This will help in cases when a document is missing a space in between the words. So what we can do is: return the entire compound wor

Re: [HACKERS] dot to be considered as a word delimiter?

2009-06-02 Thread Kenneth Marshall
On Mon, Jun 01, 2009 at 08:22:23PM -0500, Kevin Grittner wrote: > Sushant Sinha wrote: > > > I think that dot should be considered as a word delimiter because > > when dot is not followed by a space, most of the time it is an error > > in typing. Besides, there are not many valid English words

Re: [HACKERS] dot to be considered as a word delimiter?

2009-06-01 Thread Kevin Grittner
Sushant Sinha wrote: > I think that dot should be considered as a word delimiter because > when dot is not followed by a space, most of the time it is an error > in typing. Besides, there are not many valid English words that have > dot in between. It's not treating it as an English word, but

[HACKERS] dot to be considered as a word delimiter?

2009-05-29 Thread Sushant Sinha
Currently it seems that the dot is not considered a word delimiter by the English parser.

    lawdb=# select to_tsvector('english', 'Mr.J.Sai Deepak');
           to_tsvector
    -------------------------
     'deepak':2 'mr.j.sai':1
    (1 row)

So the word obtained is "mr.j.sai" rather than three words "
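
To see how the parser classifies the dotted string before any dictionary is applied, the token stream can be inspected directly. The queries below are a minimal sketch, assuming a PostgreSQL release with built-in text search (8.3 or later) and the stock `english` configuration. They are not part of the original post, and the output may differ between versions.

```sql
-- Sketch: show each token the parser produces for the example string,
-- together with its token type alias and the lexemes it is reduced to.
SELECT alias, token, lexemes
FROM ts_debug('english', 'Mr.J.Sai Deepak');

-- Same input through the raw parser only (no dictionaries),
-- returning the numeric token type id for each token.
SELECT tokid, token
FROM ts_parse('default', 'Mr.J.Sai Deepak');
```

If the replies in this thread are right, the dotted string should appear as a single host-type token rather than as separate words, which would explain the `'mr.j.sai':1` entry in the `to_tsvector` result.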