Hi,
I think I may have found a bug in MultiPhraseQuery.ExtractTerms().
If the same term occurs twice in the query, a "System.ArgumentException: Item
has already been added." is thrown.
Original code:
public override void ExtractTerms(System.Collections.Hashtable terms)
{
    for (System.Collections.IEnumerator iter = termArrays.GetEnumerator();
         iter.MoveNext(); )
    {
        Term[] arr = (Term[]) iter.Current;
        for (int i = 0; i < arr.Length; i++)
        {
            terms.Add(arr[i], arr[i]);
        }
    }
}
Possible patch:
public override void ExtractTerms(System.Collections.Hashtable terms)
{
    for (System.Collections.IEnumerator iter = termArrays.GetEnumerator();
         iter.MoveNext(); )
    {
        Term[] arr = (Term[]) iter.Current;
        for (int i = 0; i < arr.Length; i++)
        {
            // Hashtable.Add throws on a duplicate key, so skip terms
            // that are already present.
            if (!terms.Contains(arr[i]))
            {
                terms.Add(arr[i], arr[i]);
            }
        }
    }
}
It looks like this is a bug in the Java version too. (Or does a Java
Hashtable behave differently?)
Perhaps we should notify them.
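
On the Java question, here is a quick sketch of how the standard Java
collections treat a repeated key. I have not checked which collection the
Java ExtractTerms actually uses, so both the Hashtable and the Set case below
are assumptions, and DuplicateKeyDemo, putTwice and addTwice are made-up names
for illustration only. Plain strings stand in for Term objects.

```java
import java.util.HashSet;
import java.util.Hashtable;
import java.util.Set;

public class DuplicateKeyDemo {
    // Puts the same key into a java.util.Hashtable twice and returns the size.
    static int putTwice(String key) {
        Hashtable<String, String> table = new Hashtable<>();
        table.put(key, key);
        table.put(key, key); // replaces the existing mapping, no exception
        return table.size();
    }

    // Adds the same element to a Set twice and returns the second result.
    static boolean addTwice(String key) {
        Set<String> set = new HashSet<>();
        set.add(key);
        return set.add(key); // false when already present, no exception
    }

    public static void main(String[] args) {
        System.out.println("Hashtable size after two puts: " + putTwice("word"));
        System.out.println("Second Set.add returned: " + addTwice("word"));
    }
}
```

If that is what the Java code relies on, the Java version would not throw,
and the exception would be specific to the .NET port, where Hashtable.Add
rejects a key that is already present.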
Jeroen