Hi Alex

The business requirement (for now) is not to return any result when the search keywords are cigarette related. The business user team will provide the list of the cigarette related keywords.

Will digest, explore and research on your suggestions. Thank you.

On 30/9/2020 10:56 am, Alexandre Rafalovitch wrote:
I am not sure why you think stop words are your first choice. Maybe I
misunderstand the question. I read it as that you need to exclude
completely a set of documents that include specific keywords when
called from specific module.

If I wanted to differentiate the searches from specific module, I
would give that module a different end-point (Request Query Handler),
instead of /select. So, /nocigs or whatever.

Then, in that end-point, you could do all sorts of extra things, such
as setting appends or even invariants parameters, which would include
filter query to exclude any documents matching specific keywords. I
assume it is ok to return documents that are matching for other
reasons.

Ideally, you would mark the cigs documents during indexing with a
binary or enumeration flag and then during search you just need to
check against that flag. In that case, you could copyField  your text
and run it against something like
https://lucene.apache.org/solr/guide/8_6/filter-descriptions.html#keep-word-filter
combined with Shingles for multiwords. Or similar. And just transform
it as index-only so that the result is basically a yes/no flag.
Similar thing could be done with UpdateRequestProcessor pipeline if
you want to end up with a true boolean flag. The idea is the same,
just to have an index-only flag that you force lock into for any
request from specific module.

Or even with something like ElevationSearchComponent. Same idea.

Hope this helps.

Regards,
    Alex.

On Tue, 29 Sep 2020 at 22:28, Derek Poh <d...@globalsources.com> wrote:
Hi

I have read in the mailings list that we should try to avoid using stop
words.

I have a use case where I would like to know if there is other
alternative solutions beside using stop words.

There is business requirement to return zero result when the search is
cigarette related words and the search is coming from a particular
module on our site. It does not apply to all searches from our site.
There is a list of these cigarette related words. This list contains
single word, multiple words (Electronic cigar), multiple words with
punctuation (e-cigarette case).
I am planning to copy a different set of search fields, that will
include the stopword filter in the index and query stage, for this
module to use.

For this use case, other than using stop words to handle it, is there
any alternative solution?

Derek

----------------------
CONFIDENTIALITY NOTICE

This e-mail (including any attachments) may contain confidential and/or 
privileged information. If you are not the intended recipient or have received 
this e-mail in error, please inform the sender immediately and delete this 
e-mail (including any attachments) from your computer, and you must not use, 
disclose to anyone else or copy this e-mail (including any attachments), 
whether in whole or in part.

This e-mail and any reply to it may be monitored for security, legal, 
regulatory compliance and/or other appropriate reasons.


----------------------
CONFIDENTIALITY NOTICE This e-mail (including any attachments) may contain confidential and/or privileged information. If you are not the intended recipient or have received this e-mail in error, please inform the sender immediately and delete this e-mail (including any attachments) from your computer, and you must not use, disclose to anyone else or copy this e-mail (including any attachments), whether in whole or in part.
This e-mail and any reply to it may be monitored for security, legal, 
regulatory compliance and/or other appropriate reasons.

Reply via email to