Oh, you want me to change the getSorted method to be synchronized? I'll put a lock in there and see what happens, if that is what you are referring to.
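
Roughly what I have in mind, as a minimal sketch. The class below is a hypothetical stand-in, not the actual NGramProfile source; I'm assuming the profile keeps its n-grams in a plain HashMap, which is what the HashMap iterator frames in the trace suggest. The names SharedProfileSketch, add() and total() are made up for illustration. As far as I can tell, locking getSorted() alone only helps if every other method that reads or writes the same map takes the same lock, so the sketch synchronizes all of them:

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a profile shared between indexing threads.
// This is NOT the real NGramProfile; it only illustrates the locking idea.
class SharedProfileSketch {
    private final Map<String, Integer> counts = new HashMap<>();

    // Writers take the same lock (this) as the readers below.
    public synchronized void add(String ngram) {
        counts.merge(ngram, 1, Integer::sum);
    }

    // Iterates the map's values, similar in spirit to the normalize() frame
    // in the stack trace; unsynchronized, this is where a
    // ConcurrentModificationException could be thrown.
    public synchronized int total() {
        int sum = 0;
        for (int c : counts.values()) {
            sum += c;
        }
        return sum;
    }

    // Returns a sorted copy, so callers never iterate the live map.
    public synchronized List<Map.Entry<String, Integer>> getSorted() {
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            sorted.add(new AbstractMap.SimpleImmutableEntry<>(e));
        }
        // Highest count first.
        sorted.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        return sorted;
    }

    public static void main(String[] args) throws InterruptedException {
        SharedProfileSketch profile = new SharedProfileSketch();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    profile.add("g" + (j % 10));
                    profile.total();      // concurrent iteration
                    profile.getSorted();  // concurrent iteration
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        System.out.println("total n-grams: " + profile.total());
    }
}
```

If the lock turns out to be a bottleneck with many concurrent indexing tasks, giving each task its own profile instance would be another option, but I haven't checked whether the plugin allows that.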
On 6/1/07, Andrzej Bialecki <[EMAIL PROTECTED]> wrote:
Briggs wrote:
> What is the design contract on plugins when it comes to thread safety?
> I was under the assumption that plugins should be thread safe, but I
> have been running into concurrent modification exceptions from the
> language identifier plugin while indexing. My application is a bit

They should be thread-safe. E.g. Fetcher runs many threads in parallel,
each thread using plugins to handle fetching, parsing, url filtering,
etc, etc.

> different from the normal nutch way. I have many crawls going on
> concurrently within an application. So, that means I would also have
> many concurrent indexing tasks. So, if I can't be guaranteed that
> plugins are threadsafe, I may need to do a nasty thing and synchronize
> my index() method (ouch).
>
> Here is the exception, just for info:
>
> java.util.ConcurrentModificationException
>    at java.util.HashMap$HashIterator.nextEntry(HashMap.java:787)
>    at java.util.HashMap$ValueIterator.next(HashMap.java:817)
>    at org.apache.nutch.analysis.lang.NGramProfile.normalize(NGramProfile.java:277)

This is a bug. My guess is that NGramProfile.getSorted() should be
synchronized. Could you please test if this works?

--
Best regards,
Andrzej Bialecki <><
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

-- "Conscious decisions by conscious minds are what make reality real"