Likely missing file:/. If I get rid of lines 617-622
of conf/nutch-default.xml
Oops, sorry.
I made this last change just after testing the whole patch,
and I didn't test it again since I was sure it was a minor change.
I'll correct this right now. Sorry.
Regards
Jérôme
--
Hi,
umbilical.done is called twice when a task finishes.
The map and the reduce task implementations call done in the last
line of their run methods (MapTask: 132, ReduceTask: 273).
But the tasktracker calls umbilical.done a second time in
line 585.
Is this a bug?
Fuad Efendi wrote:
I found this in J2SE API for setReuseAddress(default: false):
=
When a TCP connection is closed the connection may remain in a timeout
state for a period of time after the connection is closed (typically
known as the TIME_WAIT state or 2MSL wait state). For applications
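A minimal sketch of how the option quoted above is typically set, assuming a listening ServerSocket (the same wording appears in the Javadoc for both Socket and ServerSocket). The class name ReuseAddressExample and the method open are hypothetical and not taken from Nutch; the option has to be set before the socket is bound.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class ReuseAddressExample {

    /**
     * Opens a listening socket with SO_REUSEADDR enabled, so the port can be
     * rebound while old connections are still in TIME_WAIT.
     */
    public static ServerSocket open(int port) throws IOException {
        ServerSocket server = new ServerSocket();   // create unbound
        server.setReuseAddress(true);               // must happen before bind()
        server.bind(new InetSocketAddress(port));   // port 0 lets the OS pick one
        return server;
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = open(0)) {
            System.out.println("Listening on port " + server.getLocalPort());
        }
    }
}
```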
Stefan Groschupf wrote:
I noticed that it can happen that a task is still running when the job
has already been killed.
The web GUI says there is no running job, yet the processes keep the nodes busy.
I haven't found the source of the problem yet.
I have seen this too. I think the solution is that, when
+1
I have read the paper about OPIC and it seems very good. I think it is a
must for Nutch to have a good (and fast) webgraph-based ranking algorithm. I
have fetched about 250 million pages, and what I see is that the
inlink count alone is not good enough for big crawls and quality results.
Thanks,
Massimo
Piotr Kosiorowski wrote:
Should we have a version-independent site, always modified in trunk?
Or should we think about having a site (e.g. JavaDocs, tutorial etc.)
versioned and available for all versions at the same time?
The practice I've followed is to have the website reflect the latest
[
http://issues.apache.org/jira/browse/NUTCH-99?page=comments#action_12331224 ]
Stefan Groschupf commented on NUTCH-99:
---
OK, makes sense.
Do you prefer command line args for the ports for this 'let's search for a port'
code?
I personally would prefer
[
http://issues.apache.org/jira/browse/NUTCH-99?page=comments#action_12331225 ]
Doug Cutting commented on NUTCH-99:
---
What command line would you add this to? I think this should simply start at
the default port (e.g., 7030) and loop trying port+1 until
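The port search described in this comment could be sketched roughly as follows. This is only an illustration, not the NUTCH-99 patch: the class name PortProbe, the method bindFirstFree and the maxAttempts limit are hypothetical, and 7030 is simply the example default port mentioned above.

```java
import java.io.IOException;
import java.net.ServerSocket;

public class PortProbe {

    /**
     * Tries startPort, startPort + 1, ... and returns a ServerSocket bound to
     * the first port that can be opened. Gives up after maxAttempts ports.
     */
    public static ServerSocket bindFirstFree(int startPort, int maxAttempts)
            throws IOException {
        IOException last = null;
        for (int i = 0; i < maxAttempts; i++) {
            int port = startPort + i;
            try {
                return new ServerSocket(port);
            } catch (IOException e) {
                last = e; // port could not be bound (e.g. already in use), try the next one
            }
        }
        throw new IOException("No free port in range " + startPort + ".."
                + (startPort + maxAttempts - 1), last);
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = bindFirstFree(7030, 100)) {
            System.out.println("Bound to port " + server.getLocalPort());
        }
    }
}
```

Returning the bound socket, rather than just a free port number, avoids the race where another process grabs the port between the probe and the real bind.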
Doug,
Thanks for the reply,
I'll try to run specific tests against an in-house Apache during this
week(end) (I'm slightly limited in time... Sorry!). Anything is possible;
Apache httpd usually has a timeout setting for keep-alive, and the default
setting is (I don't remember exactly) probably 600 seconds. I performed
Another cause of another problem:
By default, Java 1.4 caches DNS-to-IP mappings forever...
java.security.Security.setProperty("networkaddress.cache.ttl",
"1");
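A small self-contained sketch of that setting, assuming it runs early in startup: the JVM may read the property only once, when its address cache is first initialized, so setting it after the first lookup might have no effect. The class name DnsCacheTtl and the method configure are hypothetical.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.security.Security;

public class DnsCacheTtl {

    /** Sets the TTL, in seconds, for successful DNS lookups cached by the JVM. */
    public static void configure(int ttlSeconds) {
        // Security properties are strings, so the value is passed as text.
        Security.setProperty("networkaddress.cache.ttl", Integer.toString(ttlSeconds));
    }

    public static void main(String[] args) throws UnknownHostException {
        configure(1); // do this before the first InetAddress lookup
        System.out.println("TTL = " + Security.getProperty("networkaddress.cache.ttl"));
        System.out.println(InetAddress.getByName("localhost"));
    }
}
```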