Hi Sami,

In case it helps (since I've experienced the same issue): I'm running on a multiple-node setup, and I run DFS and the nutch commands the same way as Otis.
However, with my "fix" of hard-wiring the path of the hadoop.log file in log4j.properties, I get multiple machines and threads trying to write simultaneously to this same file. I haven't looked at the code to see whether the logging function tries to lock the file before writing, but the logs certainly show time-stamps that are all over the place.

Using the nightly build from a month or so ago, I believe the fetcher logs were written to the individual tasktracker logs. Ought this not still to be the case?

-Ed

On 8/13/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
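For what it's worth, one workaround sketch for the multi-writer problem would be to give each machine its own log file. Note this is hypothetical and not from the original setup: log4j only substitutes JVM system properties, and `local.hostname` is not a standard one, so it would have to be passed to each JVM explicitly (e.g. `-Dlocal.hostname=$(hostname)`):

```properties
# Hypothetical log4j.properties fragment: each node writes its own file
# instead of all nodes appending to one shared hadoop.log.
# "local.hostname" is NOT a built-in property; it must be supplied per
# JVM, e.g. via -Dlocal.hostname=$(hostname) in the launch options.
log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFA.File=/var/log/nutch/hadoop-${local.hostname}.log
log4j.appender.DRFA.DatePattern=.yyyy-MM-dd
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} - %m%n
```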
Hi Sami,

This is a single-box setup. I start the dfs and mapred daemons with bin/start-all.sh. I'm using bin/nutch with various commands (inject, generate, crawl, updatedb, merge, invertlinks, index, dedup ...), as described in http://lucene.apache.org/nutch/tutorial8.html.

Otis

----- Original Message ----
From: Sami Siren <[EMAIL PROTECTED]>
To: nutch-user@lucene.apache.org
Sent: Saturday, August 12, 2006 1:48:40 AM
Subject: Re: [Nutch-general] log4j.properties bug(?)

[EMAIL PROTECTED] wrote:
> Hi Sami,
>
> ----- Original Message ----
[EMAIL PROTECTED] wrote:
>> I assume the idea is that the JVM knows about the hadoop.log.dir system
>> property, and then log4j knows about it, too. However, it doesn't
>> _always_ work.
>>
>> That is, when invoking various bin/nutch commands as described in
>> http://lucene.apache.org/nutch/tutorial8.html , this fails, and the
>> system attempts to write to "/" which, of course, is a directory,
>> not a file.
>>
> Can you be more precise on this one - what commands do fail? What
> kind of configuration are you running this on?
> I'll have to look at another server's logs tomorrow, but I can tell
> you that the error is much like the one in
> http://issues.apache.org/jira/browse/NUTCH-307 :
>
> java.io.FileNotFoundException: / (Is a directory)
> cr06:   at java.io.FileOutputStream.openAppend(Native Method)
> cr06:   at java.io.FileOutputStream.<init>(FileOutputStream.java:177)
> cr06:   at java.io.FileOutputStream.<init>(FileOutputStream.java:102)
> cr06:   at org.apache.log4j.FileAppender.setFile(FileAppender.java:289)
> cr06:   at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:163)
> cr06:   at org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:215)
> cr06:   at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:256)
>
> There is really not much to any kind of particular configuration; it
> is just that those properties are unset, so when log4j has to
> interpret this:
>
> log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
>
> it gets interpreted as:
>
> log4j.appender.DRFA.File=/
>
> because those two properties are undefined. And that will happen if
> you follow this tutorial: http://lucene.apache.org/nutch/tutorial8.html
> This tutorial uses things like inject, generate, fetch, etc., while
> the 0.8 tutorial on the Wiki does not. When you use the 0.8 tutorial
> from the Wiki, the properties do get set somehow, so everything
> works. So it's a matter of those properties not getting set.

What I meant by configuration was: are you running on one box and
executing your tasks with LocalJobRunner, or do you use one or more
boxes and use TaskTracker to run your jobs?

--
Sami Siren
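To make the failure mode above concrete, here is a small shell sketch of what happens to `${hadoop.log.dir}/${hadoop.log.file}` when neither property is defined. This is illustrative only, not how log4j itself performs substitution: each undefined key expands to the empty string, leaving the appender pointed at `/`.

```shell
#!/bin/sh
# Illustrative stand-in for log4j's property substitution: an unset key
# is replaced by "", which is how the DRFA.File value collapses to "/".
interpolate() {
  echo "${hadoop_log_dir:-}/${hadoop_log_file:-}"
}

interpolate    # prints "/": both properties unset, as in the failing runs

# With both defined (paths are illustrative), a real file path results:
hadoop_log_dir=/opt/nutch/logs hadoop_log_file=hadoop.log interpolate
```

This is why hard-wiring the path, or making sure the JVM is started with both `-Dhadoop.log.dir=...` and `-Dhadoop.log.file=...`, avoids the `FileNotFoundException: / (Is a directory)` error.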
_______________________________________________
Nutch-general mailing list
Nutch-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-general