Ah, sorry. My last post was a ProfanitySelector rather than a ProfanityFilter!
This fixes it anyway:

  <CachedFilter>
      <BooleanFilter>
          <Clause occurs="mustNot">
              <TermsFilter fieldName="content">
                  naughty1 naughty2 xxx
              </TermsFilter>
          </Clause>
      </BooleanFilter>
  </CachedFilter>





----- Original Message ----
From: mark harwood <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Wednesday, 7 March, 2007 4:05:56 PM
Subject: Re: Negative Filtering (such as for profanity)

Sounds like the sort of filter that could be usefully cached.
You can do all this in Java code, or the XML query parser (in contrib) might be 
a quick and simple way to externalize the profanity settings in a stylesheet 
that is applied at query time, e.g.:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/Document">
<FilteredQuery>
    <Query>
        <UserQuery><xsl:value-of select="content"/></UserQuery>
    </Query>
    <Filter>
        <CachedFilter>
            <TermsFilter fieldName="content">
                naughty1 naughty2 xxx
            </TermsFilter>
        </CachedFilter>
    </Filter>
</FilteredQuery>
</xsl:template>
</xsl:stylesheet>

The above example also automatically adds caching to the results of the 
profanity filter.
Your app code to use this would then look like this:
init()
        //parse and cache the stylesheet
        QueryTemplateManager qtm=new QueryTemplateManager(
                getClass().getResourceAsStream("query.xsl"));
....


runQuery()
            //get the user input
            Properties userInput=new Properties();
            userInput.setProperty("content",httpRequest.getParameter("queryCriteria"));

            //Transform the user input into a Lucene XML query
            org.w3c.dom.Document doc=qtm.getQueryAsDOM(userInput);

            //Parse the XML query using the XML parser
            Query q=xmlQueryBuilder.getQuery(doc.getDocumentElement());

            //run query as normal
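
For anyone curious what the template step amounts to, here is a minimal sketch 
using only the standard JAXP classes rather than QueryTemplateManager itself. 
It assumes the user input reaches the stylesheet as a <Document> element with 
one child element per property, which is what the match="/Document" and 
select="content" expressions above suggest. The class name TemplateSketch, the 
expand method and the cut-down stylesheet are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Properties;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class TemplateSketch {
    // Cut-down stylesheet (hypothetical): only the user query part is kept.
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:template match=\"/Document\">"
      + "<FilteredQuery><Query><UserQuery>"
      + "<xsl:value-of select=\"content\"/>"
      + "</UserQuery></Query></FilteredQuery>"
      + "</xsl:template></xsl:stylesheet>";

    // Turns the properties into <Document><name>value</name>...</Document>
    // and runs the stylesheet over it. Property names must be valid XML names.
    static String expand(Properties userInput) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = doc.createElement("Document");
        doc.appendChild(root);
        for (String name : userInput.stringPropertyNames()) {
            Element field = doc.createElement(name);
            field.setTextContent(userInput.getProperty(name));
            root.appendChild(field);
        }
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Properties userInput = new Properties();
        userInput.setProperty("content", "hello world");
        // Prints the expanded XML query for this input
        System.out.println(expand(userInput));
    }
}
```

The point of the design is that the word list lives in the stylesheet, so it 
can be changed without touching the application code.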

Cheers
Mark


----- Original Message ----
From: Greg Gershman <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Wednesday, 7 March, 2007 3:07:45 PM
Subject: Negative Filtering (such as for profanity)

I'm attempting to create a profanity filter.  I thought to use a QueryFilter 
created with a Query of (-$#!+ AND [EMAIL PROTECTED] AND etc).  The problem I 
have run into is that, as a pure negative query is not supported (a query for 
(-term) DOES NOT return the inverse of a query for (term)), I believe the bit 
set returned by a purely negative QueryFilter is empty, so no matter how many 
results returned by the initial query, the result after filtering is always 
zero documents.

I was wondering if anyone had suggestions as to how else to do this.  I've 
considered simply amending the query string submitted by the user to include a 
pre-generated String that would exclude the query terms, but I consider this a 
inelegant solution.  I had also thought about creating a new sub-class of 
QueryFilter, NegativeQueryFilter.  Basically, it would work just like a 
QueryFilter, taking a positive query (so, I would pass it an OR'ed list of 
profane words), then the resulting bits are simply flipped.  I think this would 
work, unless I'm missing something.  I'm going to experiment with it; I'd 
appreciate anyone's thoughts on this.
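
To make the bit-flipping idea concrete, here is a minimal sketch with plain 
java.util.BitSet standing in for the bits a filter would produce. The class and 
method names (NegativeFilterSketch, negate, applyFilter) and the document 
numbers are hypothetical; none of this is Lucene API.

```java
import java.util.BitSet;

public class NegativeFilterSketch {
    // Flips the bits of the positive (profanity) match over docs 0..maxDoc-1,
    // so a set bit means "this document contains none of the listed words".
    static BitSet negate(BitSet positive, int maxDoc) {
        BitSet flipped = (BitSet) positive.clone();
        flipped.flip(0, maxDoc);
        return flipped;
    }

    // Keeps only the query hits that the filter allows.
    static BitSet applyFilter(BitSet queryHits, BitSet filterBits) {
        BitSet result = (BitSet) queryHits.clone();
        result.and(filterBits);
        return result;
    }

    public static void main(String[] args) {
        int maxDoc = 6;

        BitSet profane = new BitSet(maxDoc); // docs matching the OR'ed word list
        profane.set(1);
        profane.set(4);

        BitSet hits = new BitSet(maxDoc);    // docs matching the user's query
        hits.set(0);
        hits.set(1);
        hits.set(3);
        hits.set(4);

        // An all-empty filter removes every hit, which is the symptom above.
        System.out.println(applyFilter(hits, new BitSet(maxDoc)));      // {}

        // Flipping the positive bits keeps only the clean documents.
        System.out.println(applyFilter(hits, negate(profane, maxDoc))); // {0, 3}
    }
}
```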

Thanks,

Greg




 

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]






                