The LuceneIterator has a built-in circuit breaker that trips if it encounters
too many errors (here, documents with no term vector for the field).
If you are using lucene.vector, you can pass in --maxPercentErrorDocs X, where
X is the percentage of docs in which you are willing to allow errors. The
default is to allow no errors.
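
To illustrate the idea, here is a rough sketch of how such a circuit breaker could work. It is not Mahout's actual LuceneIterator code: the class ErrorDocBreaker, its exception type, and its method names are all made up for this example, and it assumes the tolerance is given as a fraction between 0 and 1.

```java
/**
 * Hypothetical sketch of an error-tolerance circuit breaker.
 * None of these names come from Mahout; they are illustrative only.
 */
final class ErrorDocBreaker {

    /** Thrown when the share of bad documents exceeds the tolerance (hypothetical type). */
    static final class TooManyErrorDocsException extends RuntimeException {
        TooManyErrorDocsException(String message) {
            super(message);
        }
    }

    private final int totalDocs;
    private final double maxErrorFraction; // assumed scale: 0.0 = tolerate none, 1.0 = tolerate all
    private int errorDocs = 0;

    ErrorDocBreaker(int totalDocs, double maxErrorFraction) {
        if (totalDocs <= 0) {
            throw new IllegalArgumentException("totalDocs must be positive");
        }
        if (maxErrorFraction < 0.0 || maxErrorFraction > 1.0) {
            throw new IllegalArgumentException("maxErrorFraction must be in [0, 1]");
        }
        this.totalDocs = totalDocs;
        this.maxErrorFraction = maxErrorFraction;
    }

    /** Record one document that has no term vector; trip once the tolerance is exceeded. */
    void recordErrorDoc() {
        errorDocs++;
        double errorFraction = (double) errorDocs / totalDocs;
        if (errorFraction > maxErrorFraction) {
            throw new TooManyErrorDocsException(
                errorDocs + " of " + totalDocs + " documents have no term vector");
        }
    }

    int getErrorDocs() {
        return errorDocs;
    }
}
```

With a tolerance of 0.0 (matching the "no errors" default described above), the first document without a term vector trips the breaker. With a tolerance of 0.05 over 100 documents, five such documents are skipped and the sixth trips it.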
On Sep 18, 2011, at 10:48 AM, Philippe Adji
Hi,
I was trying to generate vectors from a Lucene index using the lucene.vector
driver. It worked fine with Mahout 0.4, but in Mahout 0.5 I get the
following exception:
SEVERE: There are too many documents that do not have a term vector for
description
Exception in thread "main" java.lang.Illega