Hi Berlin,

Nutch needs a file called urls.txt inside the directory that you are passing to the inject command. Try renaming the urls file to urls.txt.
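
A minimal sketch of that rename, assuming the dmoz/urls layout from the original message. The helper name rename_seed_file is hypothetical and not part of Nutch. The commented commands assume you are in the Nutch install directory and, for the DFS case, that bin/hadoop is available and a DFS is configured.

```shell
#!/bin/sh
# Sketch only. "dmoz" and "urls" come from the thread; the function name
# rename_seed_file is hypothetical and not part of Nutch.

# Rename <dir>/urls to <dir>/urls.txt, leaving an existing urls.txt untouched.
rename_seed_file() {
    seed_dir="$1"
    if [ -f "$seed_dir/urls" ] && [ ! -e "$seed_dir/urls.txt" ]; then
        mv "$seed_dir/urls" "$seed_dir/urls.txt"
    fi
}

# Intended use from the Nutch install directory (left commented out so the
# sketch has no side effects when sourced):
#
#   rename_seed_file dmoz
#   bin/nutch inject crawl/crawldb dmoz
#
# If the job runs against Hadoop DFS rather than the local filesystem, the
# seed directory would need copying there first, for example:
#
#   bin/hadoop dfs -put dmoz dmoz
```

The rename is guarded so that running it twice, or running it when urls.txt already exists, changes nothing.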
Also, are you using the local FS or Hadoop DFS? If it's the latter, you'll have to put your dmoz directory on the Hadoop FS.

-vishal

-----Original Message-----
From: Berlin Brown [mailto:[EMAIL PROTECTED]
Sent: Sunday, June 03, 2007 5:41 AM
To: [EMAIL PROTECTED]
Subject: Re: Error with the inject command

Anybody? Still can't figure it out. I even created the crawl/crawldb directory. Nothing. The URLs are just a set of URLs. I am using 0.9.1; is it a bug, maybe?

On 6/2/07, Berlin Brown <[EMAIL PROTECTED]> wrote:
> I am getting this error when I am trying to run the inject:
> I have done this:
>
> mkdir dmoz
> bin/nutch org.apache.nutch.tools.DmozParser content.rdf.u8 -subset 5000 > dmoz/urls
>
> And an error here:
>
> bin/nutch inject crawl/crawldb dmoz
>
>
> 2007-06-02 02:37:19,796 WARN plugin.PluginRepository - Plugins: not a file: url. Can't load plugins from: jar:file:/C:/Berlin/Downloads4/workspaceTrunk/BotListProjects/botcrawl/nutch/nutch-0.9.job!/plugins
> 2007-06-02 02:37:19,812 INFO plugin.PluginRepository - Plugin Auto-activation mode: [true]
> 2007-06-02 02:37:19,812 INFO plugin.PluginRepository - Registered Plugins:
> 2007-06-02 02:37:19,812 INFO plugin.PluginRepository - NONE
> 2007-06-02 02:37:19,812 INFO plugin.PluginRepository - Registered Extension-Points:
> 2007-06-02 02:37:19,812 INFO plugin.PluginRepository - NONE
> 2007-06-02 02:37:19,812 WARN mapred.LocalJobRunner - job_5ysi6h
> java.lang.RuntimeException: x point org.apache.nutch.net.URLNormalizer not found.
> at org.apache.nutch.net.URLNormalizers.<init>(URLNormalizers.java:120)
> at org.apache.nutch.crawl.Injector$InjectMapper.configure(Injecto
>
> --
> Berlin Brown
> http://www.newspiritcompany.com - newspirit technologies
>

--
Berlin Brown
http://www.newspiritcompany.com - newspirit technologies

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
