Jack

Reading through the documentation for UpdateRequestProcessor, my
understanding is that it's good for processing documents before
analysis.
Is it true that processAdd (where we can have custom logic) is invoked once
per document, and that it is invoked before any of the analyzers get invoked?

I couldn't figure out how I can use UpdateRequestProcessor to access the
tokens stored in memory by CustomFilterFactory/CustomFilter.

Can you please provide more information on how I can use
UpdateRequestProcessor to handle any post-processing that needs to be done
after all documents are added to the index?

Also, do CustomFilterFactory/CustomFilter have any way to do
post-processing after all documents are added to the index?

Here is the code I have for CustomFilterFactory/CustomFilter. This might
help explain what I am trying to do, and maybe there is a better way to
do this.
The main problem I have with this approach is that I am forced to write
the results stored in memory (customMap) to the database per document, so if
I have 1 million documents, that's 1 million db calls. I am trying to reduce
the number of calls made to the database by storing results in memory and
writing them to the database once for every X documents (say, every 10000 docs).

public class CustomFilterFactory extends BaseTokenFilterFactory {
          public CustomFilter create(TokenStream input) {
                    // "paramname" is the attribute name used on the filter element
                    String databaseName = getArgs().get("paramname");
                    return new CustomFilter(input, databaseName);
          }
}

public class CustomFilter extends TokenFilter {
        private TermAttribute termAtt;
        // Keyed by the term text: the same TermAttribute instance is reused
        // for every token, so it cannot distinguish terms as a map key.
        Map<String, Integer> customMap = new HashMap<String, Integer>();
        String databasename = null;
        // Assumed flush threshold (number of map entries)
        int commitSize = 10000;

          protected CustomFilter(TokenStream input, String databasename) {
                  super(input);
                  termAtt = (TermAttribute) addAttribute(TermAttribute.class);
                  this.databasename = databasename;
          }

          public final boolean incrementToken() throws IOException {
                  if (!input.incrementToken()) {
                      // end of this document's token stream
                      writeResultsToDB();
                      return false;
                  }

                  if (addWordToCustomMap()) {
                        // do some analysis on term and then populate customMap
                        // customMap.put(termAtt.term(), somevalue);
                  }

                  if (customMap.size() > commitSize) {
                        writeResultsToDB();
                  }
                  return true;
          }

          boolean addWordToCustomMap() {
                // custom logic - some validation on term to determine if this
                // should be added to customMap
                return true; // placeholder
          }

          void writeResultsToDB() throws IOException {
                // custom logic that reads data from customMap, does some
                // analysis and writes them to database.
          }
}
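
To make the batching idea above concrete (write once every X documents
instead of once per document), here is a minimal sketch in plain Java with no
Solr or Lucene classes. Everything in it is hypothetical: BatchedTermCounter,
addTerm, endDocument, and flush are illustrative names, and the Consumer
"sink" merely stands in for the real database write. It also assumes a single
thread; a filter used from several indexing threads would need
synchronization or one buffer per thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Buffers per-term counts in memory and hands them to a sink once every batchSize documents. */
public class BatchedTermCounter {
    private final int batchSize;                        // the "X" from the description above
    private final Consumer<Map<String, Integer>> sink;  // stands in for the database write
    private final Map<String, Integer> counts = new HashMap<>();
    private int docsInBatch = 0;

    public BatchedTermCounter(int batchSize, Consumer<Map<String, Integer>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Record one occurrence of a term in the current document. */
    public void addTerm(String term) {
        counts.merge(term, 1, Integer::sum);
    }

    /** Call once per document, e.g. when its token stream is exhausted. */
    public void endDocument() {
        docsInBatch++;
        if (docsInBatch >= batchSize) {
            flush();
        }
    }

    /** Write whatever is buffered; must also be called once after the last document. */
    public void flush() {
        if (!counts.isEmpty()) {
            sink.accept(new HashMap<>(counts)); // copy, so the sink owns its snapshot
            counts.clear();
        }
        docsInBatch = 0;
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> writes = new ArrayList<>();
        BatchedTermCounter counter = new BatchedTermCounter(2, writes::add);

        String[][] docs = { {"solr", "filter"}, {"solr"}, {"lucene"} };
        for (String[] doc : docs) {
            for (String term : doc) {
                counter.addTerm(term);
            }
            counter.endDocument();
        }
        counter.flush(); // final, partial batch

        System.out.println(writes.size() + " writes for " + docs.length + " documents");
    }
}
```

With a batch size of 2, the three sample documents produce two sink calls
rather than three. The sketch leaves one piece unresolved: something still has
to call flush() after the last document, which is the hook the questions above
are asking about.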





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Custom-Filter-Factory-How-to-pass-parameters-tp4002217p4002531.html
Sent from the Solr - User mailing list archive at Nabble.com.
