Here is a test case:
string text = @"Califórnia";
Lucene.Net.Analysis.KeywordTokenizer tokenizer =
    new Lucene.Net.Analysis.KeywordTokenizer(new System.IO.StringReader(text));
Lucene.Net.Analysis.Snowball.SnowballFilter stemmer =
    new Lucene.Net.Analysis.Snowball.SnowballFilter(tokenizer, "Portuguese");
Lucene.Net.Analysis.Token token;
// The hang occurs inside stemmer.Next(), so nothing is ever printed.
while ((token = stemmer.Next()) != null)
{
    System.Console.WriteLine(token.Term());
}
This seems to go into an infinite loop: the call to stemmer.Next() never
returns. I am not sure whether this is the only stemmer I am having trouble
with. It happens to us on a near-daily basis.
Thanks,
Bob
On Sep 13, 2011, at 9:37 AM, Robert Stewart wrote:
> Are there any known issues with Snowball stemmers (Portuguese in particular)
> going into an infinite loop? I have a problem that happens on a recurring
> basis where IndexWriter locks up on AddDocument and never returns (it has
> taken up to 3 days before we noticed), requiring the process to be killed
> manually. It seems to happen only on Portuguese documents from what I can
> tell so far, and the stack trace when the thread is aborted is always as follows:
>
> System.Threading.ThreadAbortException: Thread was being aborted.
> at System.RuntimeMethodHandle._InvokeMethodFast(IRuntimeMethodInfo method,
> Object target, Object[] arguments, SignatureStruct& sig, MethodAttributes
> methodAttributes, RuntimeType typeOwner)
> at System.RuntimeMethodHandle.InvokeMethodFast(IRuntimeMethodInfo method,
> Object target, Object[] arguments, Signature sig, MethodAttributes
> methodAttributes, RuntimeType typeOwner)
> at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags
> invokeAttr, Binder binder, Object[] parameters, CultureInfo culture, Boolean
> skipVisibilityChecks)
> at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags
> invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
> at Lucene.Net.Analysis.Snowball.SnowballFilter.Next()
> System.SystemException: System.Threading.ThreadAbortException: Thread was
> being aborted.
> at System.RuntimeMethodHandle._InvokeMethodFast(IRuntimeMethodInfo method,
> Object target, Object[] arguments, SignatureStruct& sig, MethodAttributes
> methodAttributes, RuntimeType typeOwner)
> at System.RuntimeMethodHandle.InvokeMethodFast(IRuntimeMethodInfo method,
> Object target, Object[] arguments, Signature sig, MethodAttributes
> methodAttributes, RuntimeType typeOwner)
> at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags
> invokeAttr, Binder binder, Object[] parameters, CultureInfo culture, Boolean
> skipVisibilityChecks)
> at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags
> invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
> at Lucene.Net.Analysis.Snowball.SnowballFilter.Next()
> at Lucene.Net.Analysis.Snowball.SnowballFilter.Next()
> at Lucene.Net.Analysis.TokenStream.IncrementToken()
> at Lucene.Net.Index.DocInverterPerField.ProcessFields(Fieldable[] fields,
> Int32 count)
> at Lucene.Net.Index.DocFieldProcessorPerThread.ProcessDocument()
> at Lucene.Net.Index.DocumentsWriter.UpdateDocument(Document doc, Analyzer
> analyzer, Term delTerm)
> at Lucene.Net.Index.IndexWriter.AddDocument(Document doc, Analyzer analyzer)
>
>
> Is there another list of contrib/snowball issues? I have not been able to
> reproduce this in a small test case yet, however. Have there been any such
> issues with stemmers in the past?
>
> Thanks,
> Bob