> On 7 Jun 2023, at 12:13 AM, Peter Eisentraut <pe...@eisentraut.org> wrote:
> 
> On 03.06.23 19:47, Florents Tselai wrote:
>> There’s another previous relevant patch [0] but was never merged. I’ve 
>> included these stop words and added some more (info in README.md).
>> For my personal projects looks like it yields much better results.
>> I’d like some feedback on the extension ; particularly on the installation 
>> infra (I’m not sure I’ve handled properly the permissions in the .sql files)
>> I’ll then try to make a .patch for this.
> 
> The open question at the previous attempt was that it wasn't clear what the 
> upstream source or long-term maintenance of the stop words list would be.  If 
> it's just a personally composed list, then it's okay if you use it yourself, 
> but for including it into PostgreSQL it ought to come from a reputable 
> non-individual source like snowball.

I’ve used the NLTK list [0] as the base for my stop words. Wouldn’t this be 
considered reputable enough?

[0] https://github.com/nltk/nltk_data/blob/gh-pages/packages/corpora/stopwords.zip 
(see the greek.stop file in the archive)
