Re: Profanity

2018-01-08 Thread John Blythe
> data or it won’t. > > Thanks > > Sid > > Sent from my iPhone > > > On Jan 8, 2018, at 4:38 PM, John Blythe <johnbly...@gmail.com> wrote: > > > > you could use the keepwords functionality. have a field that only keeps > > profanity and then y

Re: Profanity

2018-01-08 Thread Sadiki Latty
ohnbly...@gmail.com> wrote: > > you could use the keepwords functionality. have a field that only keeps > profanity and then you can query against that field having its default > value vs. profane text > > -- > John Blythe > >> On Mon, Jan 8, 2018 at 3:12 PM, Sadiki Latt

Re: Profanity

2018-01-08 Thread Sadiki Latty
be overkill hence why I was thinking the list. The data being inserted is from sources that we have “control” over. This requirement is simply for the worst case scenario that we miss something. We might also want to allow this profanity which is why we need to flag it rather than strip it all

RE: Profanity

2018-01-08 Thread Markus Jelsma
, Markus -Original message- > From:Davis, Daniel (NIH/NLM) [C] <daniel.da...@nih.gov> > Sent: Monday 8th January 2018 23:12 > To: solr-user@lucene.apache.org > Subject: RE: Profanity > > Fun topic. Same complicated issues as normal search: > > Multilingual su

RE: Profanity

2018-01-08 Thread Davis, Daniel (NIH/NLM) [C]
Fun topic. Same complicated issues as normal search: Multilingual support? Is "Merde" profanity too, or just in French? Multi-word synonyms? Does "God Damn" become "goddamn", or do you treat "Damn" and

RE: Profanity

2018-01-08 Thread Markus Jelsma
text input field for 'profanity' and set another boolean field if it matches or doesn't. Whether you use a list of words - or an SVM or another machine learning algorithm - to detect profanity is up to you. Cheers, Markus -Original message- > From:Sadiki Latty <sla...@uottawa.ca>

Re: Profanity

2018-01-08 Thread John Blythe
you could use the keepwords functionality. have a field that only keeps profanity and then you can query against that field having its default value vs. profane text -- John Blythe On Mon, Jan 8, 2018 at 3:12 PM, Sadiki Latty <sla...@uottawa.ca> wrote: > Hey > > I would like to

Profanity

2018-01-08 Thread Sadiki Latty
Hey I would like to find a solution to flag (at index-time) profanity. Optimally, it would be good if it functioned similarly to stopwords in the sense that I can have a predefined list that is read and if a token is on the list that document is 'flagged' in a different field. Does anyone know

Re: implementing profanity detector

2010-02-16 Thread Lance Norskog
A problem is that your profanity list will not stop growing, and with each new word you will want to rescrub the index. We had a thousand-word NOT clause in every query (a filter query would be true for 99% of the index) until we switched to another arrangement. Another small problem was that I

Re: implementing profanity detector

2010-02-12 Thread Mike Perham
On Thu, Feb 11, 2010 at 10:49 AM, Grant Ingersoll gsing...@apache.org wrote: Otherwise, I'd do it via copy fields.  Your first field is your main field and is analyzed as before.  Your second field does the profanity detection and simply outputs a single token at the end, safe/unsafe. How

Re: implementing profanity detector

2010-02-11 Thread Alexey Serba
- A TokenFilter would allow me to tap into the existing analysis pipeline so I get the tokens for free but I can't access the document. https://issues.apache.org/jira/browse/SOLR-1536 On Fri, Jan 29, 2010 at 12:46 AM, Mike Perham mper...@onespot.com wrote: We'd like to implement a profanity

Re: implementing profanity detector

2010-02-11 Thread Grant Ingersoll
On Jan 28, 2010, at 4:46 PM, Mike Perham wrote: We'd like to implement a profanity detector for documents during indexing. That is, given a file of profane words, we'd like to be able to mark a document as safe or not safe if it contains any of those words so that we can have something

implementing profanity detector

2010-02-10 Thread Mike Perham
To: solr-user@lucene.apache.org Sent: Thu, January 28, 2010 4:46:54 PM Subject: implementing profanity detector We'd like to implement a profanity detector for documents during indexing. That is, given a file of profane words, we'd like to be able to mark a document as safe or not safe

implementing profanity detector

2010-01-28 Thread Mike Perham
We'd like to implement a profanity detector for documents during indexing. That is, given a file of profane words, we'd like to be able to mark a document as safe or not safe if it contains any of those words so that we can have something similar to Google's safe search. I'm trying to figure out

Re: implementing profanity detector

2010-01-28 Thread Otis Gospodnetic
-user@lucene.apache.org Sent: Thu, January 28, 2010 4:46:54 PM Subject: implementing profanity detector We'd like to implement a profanity detector for documents during indexing. That is, given a file of profane words, we'd like to be able to mark a document as safe or not safe if it contains any

Re: implementing profanity detector

2010-01-28 Thread Lance Norskog
mper...@onespot.com To: solr-user@lucene.apache.org Sent: Thu, January 28, 2010 4:46:54 PM Subject: implementing profanity detector We'd like to implement a profanity detector for documents during indexing. That is, given a file of profane words, we'd like to be able to mark a document