Oh, you want me to make the getSorted() method synchronized? If that
is what you are referring to, I'll put a lock in there and see what
happens.
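
For reference, here is a rough sketch of the change as I understand it.
This is not the actual NGramProfile code: the class name ProfileSketch,
the counts map and the add() method are made up for illustration, and I
am only assuming the real class keeps its n-grams in a plain HashMap
(which is what the stack trace suggests).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for NGramProfile; NOT the real Nutch class.
public class ProfileSketch {

    // n-gram text -> occurrence count. HashMap is not thread-safe on its own.
    private final Map<String, Integer> counts = new HashMap<String, Integer>();

    // Writers take the same lock (this) as the reader below.
    public synchronized void add(String ngram) {
        Integer old = counts.get(ngram);
        counts.put(ngram, old == null ? 1 : old + 1);
    }

    // Returns the n-grams ordered by descending count. The map is only
    // iterated while the lock is held, so a concurrent add() cannot
    // invalidate the iterator.
    public synchronized List<String> getSorted() {
        List<Map.Entry<String, Integer>> entries =
            new ArrayList<Map.Entry<String, Integer>>(counts.entrySet());
        Collections.sort(entries, new Comparator<Map.Entry<String, Integer>>() {
            public int compare(Map.Entry<String, Integer> a,
                               Map.Entry<String, Integer> b) {
                return b.getValue().compareTo(a.getValue());
            }
        });
        List<String> sorted = new ArrayList<String>(entries.size());
        for (Map.Entry<String, Integer> e : entries) {
            sorted.add(e.getKey());
        }
        return sorted;
    }
}
```

One thing I noticed while writing this: locking getSorted() alone only
helps if every other method that modifies or iterates the map takes the
same lock. The trace points at normalize(), so that one probably needs
it too, but I have not checked yet.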


On 6/1/07, Andrzej Bialecki <[EMAIL PROTECTED]> wrote:
Briggs wrote:
> What is the design contract on plugins when it comes to thread safety?
> I was under the assumption that plugins should be thread safe, but I
> have been running into concurrent modification exceptions from the
> language identifier plugin while indexing.  My application is a bit

They should be thread-safe. E.g. Fetcher runs many threads in parallel,
each thread using plugins to handle fetching, parsing, url filtering,
etc, etc.


> different from the normal Nutch way.  I have many crawls going on
> concurrently within an application.  So, that means I would also have
> many concurrent indexing tasks.  So, if I can't be guaranteed that
> plugins are thread-safe, I may need to do a nasty thing and synchronize
> my index() method (ouch).
>
>
> Here is the exception, just for info:
>
> java.util.ConcurrentModificationException
>        at java.util.HashMap$HashIterator.nextEntry(HashMap.java:787)
>        at java.util.HashMap$ValueIterator.next(HashMap.java:817)
>        at
> org.apache.nutch.analysis.lang.NGramProfile.normalize(NGramProfile.java:277)

This is a bug. My guess is that NGramProfile.getSorted() should be
synchronized. Could you please test if this works?

--
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com




--
"Conscious decisions by conscious minds are what make reality real"
