Thanks! I created an issue (https://issues.apache.org/jira/browse/LUCENE-9567) and PR (https://github.com/apache/lucene-solr/pull/1961), and followed your suggestion of using the default stop tags and modifying MIGRATE.md.
Given that the "do nothing" behavior has been around for years, I don't see much need to change it in 8.x (though I'm happy to do that if someone asks).

On Fri, Oct 2, 2020 at 9:49 AM Michael McCandless <luc...@mikemccandless.com> wrote:

> +1 to make this less trappy.
>
> It looks like KoreanPartOfSpeechStopFilterFactory will fallback to default
> stop tags if no args were provided. I think we should indeed make
> JapanesePartOfSpeechStopFilterFactory consistent.
>
> Maybe, we fix this only in next major release (9.0), add an entry to
> MIGRATE.txt explaining that, and go with option 2? And possibly option 1
> for 8.x releases? (Or maybe don't fix it in 8.x releases... not sure).
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Fri, Oct 2, 2020 at 12:10 PM Michael Froh <msf...@gmail.com> wrote:
>
>> I am currently working on migrating a project from an old version of Solr
>> to Elasticsearch, and came across a funny (to me at least) difference in
>> the "default" behavior of JapanesePartOfSpeechStopFilterFactory.
>>
>> If JapanesePartOfSpeechStopFilterFactory is given empty args, it does
>> nothing. It doesn't load any stop tags, and just passes along the
>> TokenStream passed to create(). (By comparison, the Elasticsearch filter
>> will default to loading the stop tags shipped in the Kuromoji analyzer
>> JAR.) So, for many years, my project was not using
>> JapanesePartOfSpeechStopFilter, when I thought that it was.
>>
>> I would like to create an issue and submit a patch, in case other users
>> out there are failing to use the filter factory correctly, but I'm not sure
>> what the best approach is, between:
>>
>> 1. If someone doesn't specify the tags argument, then throw an exception
>> (because the user probably doesn't know what they're doing).
>> 2. If someone doesn't specify the tags argument, then load the default
>> stop tags (like JapaneseAnalyzer does).
>>
>> I would lean more toward 1, to avoid a silent change in behavior.
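
For anyone skimming the thread, here is a rough, self-contained sketch of the three behaviors being compared: the existing "do nothing" behavior, option 1 (throw), and option 2 (fall back to defaults). It is a toy model only, not the Lucene code or the patch in the PR. The class StopTagsSketch, its method names, the placeholder tag values, and the comma-separated format for the "tags" argument are all assumptions made for illustration; the real factory handles its arguments differently.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Toy model of the behaviors discussed in this thread. NOT the Lucene API.
final class StopTagsSketch {

  // Hypothetical stand-in for the default stop tags shipped with the analyzer.
  static final Set<String> DEFAULT_STOP_TAGS =
      Collections.unmodifiableSet(new LinkedHashSet<>(Arrays.asList("tag-a", "tag-b")));

  // Assumption for this sketch: "tags" is a comma-separated list of tag names.
  private static Set<String> parse(String tags) {
    Set<String> result = new LinkedHashSet<>();
    for (String t : tags.split(",")) {
      String trimmed = t.trim();
      if (!trimmed.isEmpty()) {
        result.add(trimmed);
      }
    }
    return result;
  }

  // Existing behavior: no "tags" argument means no stop tags, so nothing is filtered.
  static Set<String> resolveLegacy(Map<String, String> args) {
    String tags = args.get("tags");
    return tags == null ? Collections.<String>emptySet() : parse(tags);
  }

  // Option 1: a missing "tags" argument is treated as a configuration error.
  static Set<String> resolveStrict(Map<String, String> args) {
    String tags = args.get("tags");
    if (tags == null) {
      throw new IllegalArgumentException("Missing required argument: tags");
    }
    return parse(tags);
  }

  // Option 2: a missing "tags" argument falls back to the default stop tags.
  static Set<String> resolveWithDefaults(Map<String, String> args) {
    String tags = args.get("tags");
    return tags == null ? DEFAULT_STOP_TAGS : parse(tags);
  }

  public static void main(String[] argv) {
    Map<String, String> empty = new HashMap<>();
    System.out.println("legacy:   " + resolveLegacy(empty));
    System.out.println("defaults: " + resolveWithDefaults(empty));
    try {
      resolveStrict(empty);
    } catch (IllegalArgumentException e) {
      System.out.println("strict:   " + e.getMessage());
    }
  }
}
```

With empty args, the legacy variant silently yields an empty set (the trap described above), the strict variant fails loudly, and the defaults variant behaves the way the Korean factory and the Elasticsearch filter are described as behaving in this thread.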