Hi

I am trying to use NaiveBayesClassifier in my Solr project. I am currently
looking at its test case, ClassificationTestBase.java.

From the code below, it looks like the classifier reads the whole index to
train the model every time an input document is classified. Or am I
misunderstanding something here? If I had a large index, would this hurt
performance?

protected void checkCorrectClassification(Classifier<T> classifier, String inputDoc,
    T expectedResult, Analyzer analyzer, String textFieldName, String classFieldName,
    Query query) throws Exception {
  AtomicReader atomicReader = null;
  try {
    populateSampleIndex(analyzer);
    atomicReader = SlowCompositeReaderWrapper.wrap(indexWriter.getReader());
    classifier.train(atomicReader, textFieldName, classFieldName, analyzer, query);
    ClassificationResult<T> classificationResult = classifier.assignClass(inputDoc);
    assertNotNull(classificationResult.getAssignedClass());
    assertEquals("got an assigned class of " + classificationResult.getAssignedClass(),
        expectedResult, classificationResult.getAssignedClass());
    assertTrue("got a not positive score " + classificationResult.getScore(),
        classificationResult.getScore() > 0);
  } finally {
    if (atomicReader != null)
      atomicReader.close();
  }
}
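
To make the question concrete, this is the usage pattern I would expect
outside of a test: call train once, then call assignClass for every incoming
document. The sketch below is only a toy illustration of that pattern.
ToyClassifier, classifyAll and the sample data are hypothetical names I made
up; this is not the Lucene Classifier API, and it says nothing about what
NaiveBayesClassifier actually does internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TrainOnceSketch {

  /** Hypothetical stand-in for a classifier; NOT the Lucene Classifier API. */
  static class ToyClassifier {
    int trainCalls = 0; // counts how often the "expensive" step runs
    private final Map<String, String> wordToClass = new HashMap<>();

    // "Training" here only memorizes which class each word was seen with.
    void train(Map<String, String> labelledDocs) {
      trainCalls++;
      wordToClass.clear();
      for (Map.Entry<String, String> doc : labelledDocs.entrySet()) {
        for (String word : doc.getKey().toLowerCase().split("\\s+")) {
          wordToClass.put(word, doc.getValue());
        }
      }
    }

    // Each known word votes for its class; returns null if no word is known.
    String assignClass(String text) {
      Map<String, Integer> votes = new HashMap<>();
      for (String word : text.toLowerCase().split("\\s+")) {
        String cls = wordToClass.get(word);
        if (cls != null) {
          votes.merge(cls, 1, Integer::sum);
        }
      }
      String best = null;
      int bestVotes = 0;
      for (Map.Entry<String, Integer> vote : votes.entrySet()) {
        if (vote.getValue() > bestVotes) {
          best = vote.getKey();
          bestVotes = vote.getValue();
        }
      }
      return best;
    }
  }

  // Trains a single time, then reuses the trained classifier for all inputs.
  static List<String> classifyAll(ToyClassifier classifier,
      Map<String, String> labelledDocs, List<String> inputs) {
    classifier.train(labelledDocs);
    List<String> results = new ArrayList<>();
    for (String input : inputs) {
      results.add(classifier.assignClass(input));
    }
    return results;
  }

  public static void main(String[] args) {
    Map<String, String> docs = new LinkedHashMap<>();
    docs.put("lucene index search", "technology");
    docs.put("football goal match", "sports");
    ToyClassifier classifier = new ToyClassifier();
    List<String> results =
        classifyAll(classifier, docs, Arrays.asList("search the index", "a late goal"));
    System.out.println(results + " after " + classifier.trainCalls + " train call(s)");
  }
}
```

If the real classifier can be used this way, the cost of reading the index
would be paid once rather than per input document, which is what I am hoping
for.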
