You may also want to look at something like: https://docs.querqy.org/index.html
ApacheCon had (is having...) a presentation on it that seemed quite relevant to your needs. The videos should be live in a week or so.

Regards,
   Alex.

On Tue, 29 Sep 2020 at 22:56, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
>
> I am not sure why you think stop words are your first choice. Maybe I
> misunderstand the question. I read it as: you need to completely
> exclude a set of documents that include specific keywords when
> called from a specific module.
>
> If I wanted to differentiate the searches from a specific module, I
> would give that module a different end-point (Request Query Handler),
> instead of /select. So, /nocigs or whatever.
>
> Then, in that end-point, you could do all sorts of extra things, such
> as setting appends or even invariants parameters, which would include
> a filter query to exclude any documents matching specific keywords. I
> assume it is ok to return documents that are matching for other
> reasons.
>
> Ideally, you would mark the cigs documents during indexing with a
> binary or enumeration flag, and then during search you just need to
> check against that flag. In that case, you could copyField your text
> and run it against something like
> https://lucene.apache.org/solr/guide/8_6/filter-descriptions.html#keep-word-filter
> combined with Shingles for multiwords. Or similar. And just make
> it index-only, so that the result is basically a yes/no flag.
> A similar thing could be done with an UpdateRequestProcessor pipeline if
> you want to end up with a true boolean flag. The idea is the same:
> have an index-only flag that you lock in for any
> request from the specific module.
>
> Or even with something like ElevationSearchComponent. Same idea.
>
> Hope this helps.
>
> Regards,
>    Alex.
>
> On Tue, 29 Sep 2020 at 22:28, Derek Poh <d...@globalsources.com> wrote:
> >
> > Hi
> >
> > I have read on the mailing list that we should try to avoid using stop
> > words.
> >
> > I have a use case where I would like to know if there are
> > alternative solutions besides using stop words.
> >
> > There is a business requirement to return zero results when the search
> > contains cigarette-related words and the search is coming from a particular
> > module on our site. It does not apply to all searches from our site.
> > There is a list of these cigarette-related words. This list contains
> > single words, multiple words (Electronic cigar), and multiple words with
> > punctuation (e-cigarette case).
> > I am planning to copy a different set of search fields, which will
> > include the stopword filter at the index and query stages, for this
> > module to use.
> >
> > For this use case, other than using stop words to handle it, is there
> > any alternative solution?
> >
> > Derek
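
For illustration only, here is a minimal Python sketch of the filter-query idea from the reply above: build a negative fq from the keyword list, which a module-specific handler such as /nocigs could then carry in its appends or invariants section. The field name `text`, the sample terms, and the helper names are assumptions made for this example; nothing here comes from an actual schema or configuration.

```python
from urllib.parse import urlencode

# Hypothetical sample of the blocked keyword list (assumption for this sketch).
BLOCKED_TERMS = ["cigarette", "electronic cigar", "e-cigarette case"]


def build_exclusion_fq(terms, field="text"):
    """Return a negative filter query excluding documents that match any term.

    Every term is quoted as a phrase so that multi-word and punctuated
    entries are matched as a unit. `field` is a hypothetical field name.
    """
    quoted = []
    for term in terms:
        # Escape backslashes and double quotes inside the phrase.
        escaped = term.replace("\\", "\\\\").replace('"', '\\"')
        quoted.append('"{}"'.format(escaped))
    return "-{}:({})".format(field, " OR ".join(quoted))


def build_params(user_query, terms=BLOCKED_TERMS):
    """Parameters equivalent to what the dedicated end-point would enforce."""
    return {"q": user_query, "fq": build_exclusion_fq(terms)}


if __name__ == "__main__":
    params = build_params("lighter")
    print(params["fq"])
    # Query string as it would be sent to the module-specific end-point.
    print(urlencode(params))
```

In practice the fq would more likely live in the handler definition than in client code, so that the module cannot bypass it. With the index-time flag approach, the same filter shrinks to a single negative clause on the flag field.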