Re: Question - Why stopwords.txt provided by smartcn contains blank lines?

2023-05-15 Thread Jerry Chin
Hi Michael, Thanks for clarifying. I have created an issue on GitHub to follow up. Much appreciated! On Monday, May 15, 2023, Michael McCandless wrote: > Hi Jerry, > > I agree, that makes no sense! Maybe the stopword loader should ignore > truly

Re: Question - Why stopwords.txt provided by smartcn contains blank lines?

2023-05-15 Thread Michael McCandless
Hi Jerry, I agree, that makes no sense! Maybe the stopword loader should ignore truly blank lines? Also, the comments on lines 57 and 59 are confusing -- there are no (default) English and Chinese stopwords in the file. I guess they are placeholders. Could you open an issue in Lucene's GitHub

Question - Why stopwords.txt provided by smartcn contains blank lines?

2023-05-15 Thread Jerry Chin
Hi all, The following file contains two blank lines, at lines 56 and 58: https://github.com/apache/lucene/blob/main/lucene/analysis/smartcn/src/resources/org/apache/lucene/analysis/cn/smart/stopwords.txt As a result, SmartChineseAnalyzer.getDefaultStopSet() will produce an empty string as st
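
For readers following the thread, here is a minimal sketch of the behaviour suggested in the reply: a stopword loader that skips blank lines so the resulting set never contains an empty string. This is not Lucene's actual loader. The class name StopwordLoaderSketch, the method loadStopwords, and the sample input are hypothetical, and the "//" comment prefix is an assumption about the file's comment marker. Treating whitespace-only lines as blank is also an assumption; the thread only mentions "truly blank" lines.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration only; not part of Lucene.
class StopwordLoaderSketch {

    // Reads one stopword per line, skipping comment lines and blank lines.
    // commentPrefix is supplied by the caller ("//" below is an assumption).
    public static Set<String> loadStopwords(Reader reader, String commentPrefix)
            throws IOException {
        Set<String> words = new LinkedHashSet<>();
        try (BufferedReader br = new BufferedReader(reader)) {
            String line;
            while ((line = br.readLine()) != null) {
                if (line.startsWith(commentPrefix)) {
                    continue; // comment line
                }
                String word = line.trim();
                if (word.isEmpty()) {
                    continue; // blank (or whitespace-only) line: do not add ""
                }
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample that mimics the reported layout: blank lines
        // sitting next to placeholder comments.
        String sample = "// punctuation\n,\n.\n\n// placeholder comment\n\nthe\n";
        Set<String> words = loadStopwords(new StringReader(sample), "//");
        System.out.println("contains empty string: " + words.contains(""));
    }
}
```

Without the isEmpty() check, each blank line would be trimmed to "" and added to the set, which is the symptom described in the original question.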