Well, at the moment it solves the problem I mentioned yesterday, where all
tasktrackers access the same site at once under Hadoop. It seems that
setting job.setBoolean("mapred.speculative.execution", false); didn't
help, and I'm not sure why.

However, though it is one more piece of software, it removes the need for
special treatment in the fetcher, i.e. special fetch lists built by the
generator. So now each fetcher/tasktracker is supposed to access hosts
politely, even though its list still contains various hosts.

Sometimes I noticed that the generator put all the URLs (there were only
2 hosts in the seed) into the same fetchlist, which made only one
tasktracker work instead of two.

I'm sorry if it sounds a little confusing :) or unreasonable... :)

Gal

On Thu, 2006-02-16 at 13:47 -0800, Doug Cutting wrote:
> Gal Nitzan wrote:
> > I have implemented a down and dirty Global Locking:
> > [ ... ]
> >
> > I changed FetcherThread constructor to create an instance of
> > SyncManager.
> >
> > And in also in the run method I try to get a lock on the host. If not
> > successful I add the url into a ListArray<key,datum> for a later
> > processing...
> >
> > I also changed generator to put each url into a separate array so all
> > fetchlists are even.
>
> What problem does this fix?
>
> Doug
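
For anyone following the thread, the lock-or-defer idea described above might look roughly like the sketch below. It is only an illustration under assumptions: HostLockSketch, tryLock, unlock and fetchOrDefer are hypothetical names, not the actual SyncManager or FetcherThread code, and the lock set here lives inside one JVM, whereas a real global lock would need a service shared by all tasktrackers.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; none of these names come from Nutch or from the patch.
class HostLockSketch {
    // Hosts that some fetcher is currently talking to.
    // Assumption: in-process only, standing in for a cluster-wide lock service.
    private final Set<String> lockedHosts = ConcurrentHashMap.newKeySet();

    // Returns true if the caller now holds the lock for this host.
    boolean tryLock(String host) {
        return lockedHosts.add(host);
    }

    void unlock(String host) {
        lockedHosts.remove(host);
    }

    // "Fetches" every URL whose host is free (here it only records the URL
    // in 'fetched') and returns the URLs that were deferred for a later pass.
    List<String> fetchOrDefer(List<String> urls, List<String> fetched) {
        List<String> deferred = new ArrayList<>();
        for (String url : urls) {
            String host = URI.create(url).getHost();
            if (tryLock(host)) {
                try {
                    fetched.add(url); // placeholder for the real fetch
                } finally {
                    unlock(host);
                }
            } else {
                deferred.add(url); // host busy elsewhere, retry later
            }
        }
        return deferred;
    }
}
```

The deferred list would be passed back through fetchOrDefer once the busy hosts are released, so a fetchlist can mix hosts freely without two fetchers hitting the same host at once.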
