Thanks!

I created an issue (https://issues.apache.org/jira/browse/LUCENE-9567) and
PR (https://github.com/apache/lucene-solr/pull/1961), and followed your
suggestion of using the default stop tags and modifying MIGRATE.md.

Given that the "do nothing" behavior has been around for years, I don't see
much need to change it in 8.x (though I'm happy to do that if someone asks).
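
For anyone skimming the thread, here is a minimal Java sketch of the three behaviors under discussion: the existing "do nothing" behavior, option 1 (throw), and option 2 (fall back to defaults). It is not the Lucene source. The class name, the MissingTagsPolicy enum, the resolveStopTags helper, the comma-separated parsing, and the placeholder tag values are all assumptions made for illustration; only the "tags" argument name comes from the discussion.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class StopTagsFallbackSketch {

    /** Hypothetical switch for what happens when no "tags" argument is given. */
    enum MissingTagsPolicy { DO_NOTHING, THROW, USE_DEFAULTS }

    // Placeholder values only; the real defaults ship inside the Kuromoji analyzer JAR.
    static final Set<String> DEFAULT_STOP_TAGS =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList("tag-a", "tag-b")));

    static Set<String> resolveStopTags(Map<String, String> args, MissingTagsPolicy policy) {
        String tags = args.get("tags");
        if (tags != null) {
            // Assumption: parse a comma-separated list instead of loading a resource file.
            return new HashSet<>(Arrays.asList(tags.split(",")));
        }
        switch (policy) {
            case THROW:
                // Option 1: fail loudly so the misconfiguration is noticed.
                throw new IllegalArgumentException("Missing required argument: tags");
            case USE_DEFAULTS:
                // Option 2: fall back to the bundled defaults.
                return DEFAULT_STOP_TAGS;
            default:
                // Existing behavior: no stop tags, so the filter is effectively a no-op.
                return Collections.emptySet();
        }
    }

    public static void main(String[] unused) {
        Map<String, String> empty = Collections.emptyMap();
        System.out.println(resolveStopTags(empty, MissingTagsPolicy.DO_NOTHING));
        System.out.println(resolveStopTags(empty, MissingTagsPolicy.USE_DEFAULTS));
    }
}
```

The sketch makes the trade-off visible: with DO_NOTHING an empty argument map silently yields an empty set, which is the trap described in the original message, while the other two policies either surface the problem or apply the defaults.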

On Fri, Oct 2, 2020 at 9:49 AM Michael McCandless <luc...@mikemccandless.com>
wrote:

> +1 to make this less trappy.
>
> It looks like KoreanPartOfSpeechStopFilterFactory will fall back to default
> stop tags if no args were provided.  I think we should indeed make
> JapanesePartOfSpeechStopFilterFactory consistent.
>
> Maybe we fix this only in the next major release (9.0), add an entry to
> MIGRATE.txt explaining that, and go with option 2?  And possibly option 1
> for 8.x releases?  (Or maybe don't fix it in 8.x releases... not sure).
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Fri, Oct 2, 2020 at 12:10 PM Michael Froh <msf...@gmail.com> wrote:
>
>> I am currently working on migrating a project from an old version of Solr
>> to Elasticsearch, and came across a funny (to me at least) difference in
>> the "default" behavior of JapanesePartOfSpeechStopFilterFactory.
>>
>> If JapanesePartOfSpeechStopFilterFactory is given empty args, it does
>> nothing. It doesn't load any stop tags, and just passes along the
>> TokenStream passed to create(). (By comparison, the Elasticsearch filter
>> will default to loading the stop tags shipped in the Kuromoji analyzer
>> JAR.) So, for many years, my project was not using
>> JapanesePartOfSpeechStopFilter, when I thought that it was.
>>
>> I would like to create an issue and submit a patch, in case other users
>> out there are failing to use the filter factory correctly, but I'm not sure
>> what the best approach is, between:
>>
>> 1. If someone doesn't specify the tags argument, then throw an exception
>> (because the user probably doesn't know what they're doing).
>> 2. If someone doesn't specify the tags argument, then load the default
>> stop tags (like JapaneseAnalyzer does).
>>
>> I would lean more toward 1, to avoid a silent change in behavior.
>>
>
