Re: pg full text search very slow for Chinese characters

Andreas Joseph Krogh Tue, 10 Sep 2019 09:42:59 -0700

On Tuesday, 10 September 2019 at 18:21:45, Tom Lane <[email protected]> wrote:

 Jimmy Huang <[email protected]> writes:
 > I tried pg_trgm and my own customized token parser
 > https://github.com/huangjimmy/pg_cjk_parser


 pg_trgm is going to be fairly useless for indexing text that's mostly
 multibyte characters, since its unit of indexable data is just 3 bytes
 (not characters). I don't know of any comparable issue in the core
 tsvector logic, though. The numbers you're quoting do sound quite awful,
 but I share Cory's suspicion that it's something about your setup rather
 than an inherent Postgres issue.

 regards, tom lane

We experienced quite awful performance when we hosted the DB on virtual
servers (~5 years ago). It turned out we had hit the write-cache limit
(then 8GB), which resulted in ~1MB/s I/O throughput. Running iozone might
help track down I/O problems.

--
Andreas Joseph Krogh
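
As a quick first check before a full iozone run, sequential write throughput
can be estimated with a few lines of Python. This is only a rough sketch: the
function name, the default sizes, and the target directory are illustrative
assumptions, and a single fsync'ed write does not replace a proper benchmark.
A result in the low single-digit MB/s range would point at the kind of I/O
limit described above.

```python
import os
import tempfile
import time


def sequential_write_mb_per_s(directory, total_mb=32, block_kb=1024):
    """Write total_mb of zeros to a temp file in `directory`, fsync it,
    and return the observed throughput in MB/s (hypothetical helper)."""
    block = b"\0" * (block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            # Force the data to stable storage so the OS page cache
            # does not hide a slow disk.
            os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return total_mb / elapsed


if __name__ == "__main__":
    # Assumption: point this at the volume that holds the database files.
    target = tempfile.gettempdir()
    print(f"{sequential_write_mb_per_s(target):.1f} MB/s")
```

To test the database volume rather than the temp directory, pass a directory
on that volume as the first argument.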
