Fuad Efendi wrote:
Hi Andrzej,

Yes, I measured and compared this (two years ago). I am actually using
simplified, rewritten code based on Nutch, with a non-synchronized
instance per thread.

This was probably based on the original Fetcher code (now OldFetcher.java) - the new Fetcher uses threads very differently.


Imagine 1024 threads, each having 100 Outlinks and trying to call a
synchronized method... a total of 102,400 concurrent calls to the
synchronized method (within, on average (network delays), a 3-second
frame)... I was even able to have 1024 concurrent threads without any
performance impact! Also, each synchronization requires additional CPU
cycles (500-1000) even when concurrency is small.

With non-synchronized, I can't have more than 128 threads - the CPU is
overloaded. It runs faster. -Fuad

Ok, sounds cool - could you prepare a patch for the RegexURLNormalizer that removes this problem?
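
To make the idea concrete, here is a minimal sketch of the "non-synchronized instance per thread" approach described above. It is not the actual RegexURLNormalizer code and not a patch: the class names, the two regex rules, and the use of ThreadLocal are all assumptions chosen for illustration. Each thread lazily gets its own normalizer, so no lock is taken on the normalize path.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not Nutch's RegexURLNormalizer.
final class PerThreadNormalizer {

    // A deliberately unsynchronized normalizer. It is only safe because
    // each thread owns its own instance (see NORMALIZER below).
    static final class SimpleRegexNormalizer {
        private final List<Pattern> patterns = new ArrayList<>();
        private final List<String> replacements = new ArrayList<>();

        void addRule(String regex, String replacement) {
            patterns.add(Pattern.compile(regex));
            replacements.add(replacement);
        }

        String normalize(String url) {
            String result = url;
            // Apply every rule in order; a new Matcher is created per call.
            for (int i = 0; i < patterns.size(); i++) {
                result = patterns.get(i).matcher(result)
                                 .replaceAll(replacements.get(i));
            }
            return result;
        }
    }

    // One instance per thread, created on first use by that thread.
    private static final ThreadLocal<SimpleRegexNormalizer> NORMALIZER =
        ThreadLocal.withInitial(() -> {
            SimpleRegexNormalizer n = new SimpleRegexNormalizer();
            n.addRule(";jsessionid=[^?#]*", ""); // assumed rule: drop session ids
            n.addRule("#.*$", "");               // assumed rule: drop fragments
            return n;
        });

    // No synchronized keyword: callers never contend on a shared lock.
    static String normalize(String url) {
        return NORMALIZER.get().normalize(url);
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> System.out.println(
            Thread.currentThread().getName() + " -> "
            + normalize("http://example.com/a;jsessionid=ABC123?x=1#top"));
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            Thread t = new Thread(task, "fetcher-" + i);
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            t.join();
        }
    }
}
```

The trade-off is memory for contention: every thread holds its own copy of the compiled rules. Since java.util.regex.Pattern is immutable and safe to share (only Matcher is not), another option would be to share the compiled patterns and create only the matchers per call.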


--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
