Currently it seems that a dot is not treated as a word delimiter
by the english parser.

lawdb=# select to_tsvector('english', 'Mr.J.Sai Deepak');
       to_tsvector       
-------------------------
 'deepak':2 'mr.j.sai':1
(1 row)

So the word obtained is "mr.j.sai" rather than the three words "mr",
"j", and "sai".

It works correctly if there is a space in between, since a space is
definitely a word delimiter.

lawdb=# select to_tsvector('english', 'Mr. J. Sai Deepak');
           to_tsvector           
---------------------------------
 'j':2 'mr':1 'sai':3 'deepak':4
(1 row)


I think that a dot should be treated as a word delimiter, because
when a dot is not followed by a space, most of the time it is a typing
error. Besides, there are not many valid English words that have a dot
in the middle.
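
To check how the parser classifies the string, ts_debug can be run on
the same input (I am not pasting the result here):

lawdb=# select alias, token, lexemes from ts_debug('english', 'Mr.J.Sai Deepak');

As a stopgap, and assuming it is acceptable to also split host names,
decimal numbers and the like, the dots could be replaced by spaces
before the text is parsed. This is only a sketch of a workaround on the
query side, not a change to the parser:

lawdb=# select to_tsvector('english', replace('Mr.J.Sai Deepak', '.', ' '));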

-Sushant.

