> data or it won’t.
>
> Thanks
>
> Sid
>
> Sent from my iPhone
>
> > On Jan 8, 2018, at 4:38 PM, John Blythe <johnbly...@gmail.com> wrote:
> >
> > you could use the keepwords functionality. have a field that only keeps
> > profanity and then y
ohnbly...@gmail.com> wrote:
>
> you could use the keepwords functionality. have a field that only keeps
> profanity and then you can query against that field having its default
> value vs. profane text
>
> --
> John Blythe
>
>> On Mon, Jan 8, 2018 at 3:12 PM, Sadiki Latt
be overkill
hence why I was thinking the list. The data being inserted is from sources that
we have “control” over. This requirement is simply for the worst case scenario
that we miss something. We might also want to allow this profanity which is why
we need to flag it rather than strip it all
,
Markus
-Original message-
> From:Davis, Daniel (NIH/NLM) [C] <daniel.da...@nih.gov>
> Sent: Monday 8th January 2018 23:12
> To: solr-user@lucene.apache.org
> Subject: RE: Profanity
>
> Fun topic. Same complicated issues as normal search:
>
> Multilingual su
Fun topic. Same complicated issues as normal search:
Multilingual support? Is "Merde" profanity too, or just in French?
Multi-word synonyms? Does "God Damn" become "goddamn", or do you treat
"Damn" and
text input field for 'profanity' and set another boolean field
if it matches or doesn't. Whether you use a list of words, an SVM, or another
machine learning algorithm to detect profanity is up to you.
Cheers,
Markus
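A minimal sketch of the flagging step described in the message above, written in plain Python rather than as an actual Solr component. The word list, the field names `body` and `has_profanity`, and the helper names are all assumptions made for illustration; a real setup would do this inside the indexing pipeline.

```python
import re

# Hypothetical word list; in practice this would be loaded from a file.
PROFANITY = {"darn", "heck"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def flag_document(doc, text_field="body", flag_field="has_profanity"):
    """Return a copy of doc with a boolean flag set by a word-list lookup.

    The original text is left untouched, so the content is flagged
    rather than stripped.
    """
    tokens = tokenize(doc.get(text_field, ""))
    flagged = dict(doc)
    flagged[flag_field] = any(tok in PROFANITY for tok in tokens)
    return flagged
```

The lookup could be swapped for a classifier without changing the shape of the document: only the expression that computes the boolean would differ.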
-Original message-
> From: Sadiki Latty <sla...@uottawa.ca>
you could use the keepwords functionality. have a field that only keeps
profanity and then you can query against that field having its default
value vs. profane text
--
John Blythe
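The keep-words idea above can be sketched in plain Python: a second field keeps only the tokens that appear on the list, so it stays empty for clean documents. This is an illustration of the behaviour, not Solr code; the list contents, the field name `profanity`, and the function names are assumptions.

```python
import re

# Hypothetical keep-words list; only these tokens survive analysis.
KEEP_WORDS = {"darn", "heck"}


def keep_word_tokens(text, keep=KEEP_WORDS):
    """Mimic a keep-words filter: drop every token that is not on the list."""
    return [tok for tok in re.findall(r"\w+", text.lower()) if tok in keep]


def index_document(doc_id, body):
    """Build a document whose 'profanity' field holds only kept tokens."""
    return {"id": doc_id, "body": body, "profanity": keep_word_tokens(body)}


docs = [
    index_document(1, "What the heck, darn it"),
    index_document(2, "A clean sentence"),
]

# Querying for a non-empty 'profanity' field finds the flagged documents.
flagged_ids = [d["id"] for d in docs if d["profanity"]]
```

A side effect of this design is that the field also records which words matched, which may help when reviewing flagged documents.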
On Mon, Jan 8, 2018 at 3:12 PM, Sadiki Latty <sla...@uottawa.ca> wrote:
> Hey
>
> I would like to
Hey
I would like to find a solution to flag (at index-time) profanity. Optimally,
it would be good if it functioned similarly to stopwords, in the sense that I can
have a predefined list that is read and, if a token is on the list, that document
is 'flagged' in a different field. Does anyone know
A problem is that your profanity list will not stop growing, and with
each new word you will want to rescrub the index.
We had a thousand-word NOT clause in every query (a filter query would
be true for 99% of the index) until we switched to another
arrangement.
Another small problem was that I
On Thu, Feb 11, 2010 at 10:49 AM, Grant Ingersoll gsing...@apache.org wrote:
Otherwise, I'd do it via copy fields. Your first field is your main field
and is analyzed as before. Your second field does the profanity detection
and simply outputs a single token at the end, safe/unsafe.
How
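As a rough illustration of the two-field arrangement described above, the sketch below indexes the main field as usual and gives the copy a separate analysis step that consumes the token stream and emits exactly one token. It is plain Python, not a Solr analyzer; the field names `body` and `safety`, the whitespace tokenizer, and the word list passed in are assumptions.

```python
def safety_token(tokens, profane_words):
    """Consume the whole token stream and emit one token: safe or unsafe."""
    for tok in tokens:
        if tok.lower() in profane_words:
            return ["unsafe"]
    return ["safe"]


def index_with_copy(body, profane_words):
    """Analyze the main field as before and derive the copy's single token."""
    tokens = body.split()  # stand-in for the existing analysis chain
    return {
        "body": tokens,
        "safety": safety_token(tokens, profane_words),
    }
```

A safe-search style query would then filter on the `safety` field instead of carrying the word list in every query.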
- A TokenFilter would allow me to tap into the existing analysis pipeline so
I get the tokens for free but I can't access the document.
https://issues.apache.org/jira/browse/SOLR-1536
On Fri, Jan 29, 2010 at 12:46 AM, Mike Perham mper...@onespot.com wrote:
We'd like to implement a profanity
On Jan 28, 2010, at 4:46 PM, Mike Perham wrote:
We'd like to implement a profanity detector for documents during indexing.
That is, given a file of profane words, we'd like to be able to mark a
document as safe or not safe if it contains any of those words so that we
can have something
To: solr-user@lucene.apache.org
Sent: Thu, January 28, 2010 4:46:54 PM
Subject: implementing profanity detector
We'd like to implement a profanity detector for documents during indexing.
That is, given a file of profane words, we'd like to be able to mark a
document as safe or not safe
We'd like to implement a profanity detector for documents during indexing.
That is, given a file of profane words, we'd like to be able to mark a
document as safe or not safe if it contains any of those words so that we
can have something similar to google's safe search.
I'm trying to figure out