Bartosz,

That fixed the problem.  Thanks for your help and for updating the wiki.

Frank


2009/4/14 Bartosz Gadzimski <bartek...@o2.pl>:
> Hello Frank,
>
> Yes, it is a memory issue - you must increase the Java heap size.
>
> Just follow these instructions (another thing to add to the wiki ;)
>
> Eclipse -> Window -> Preferences -> Java -> Installed JREs -> edit ->
> Default VM arguments
>
> I've set mine to -Xms5m -Xmx150m because I have about 200 MB of RAM left after
> running all my apps.
>
> -Xms (minimum amount of memory the JVM starts with)
> -Xmx (maximum heap size it may grow to)
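>
> If you want to be sure Eclipse actually passed those arguments to the run
> configuration, a rough check of my own (nothing the wiki requires) is to
> print the limit from inside Nutch, e.g. at the top of Crawl.main:
>
>    // Prints the -Xmx value the launched JVM really got, in MB.
>    System.out.println("Max heap: "
>        + Runtime.getRuntime().maxMemory() / (1024 * 1024) + " MB");
>
> If it still reports the old small default, the VM arguments are not reaching
> the launched process.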
>
> It should help.
>
> Thanks,
> Bartosz
>
> Frank McCown pisze:
>>
>> Hello Bartosz,
>>
>> I'm running the default Nutch 1.0 version on Windows XP (2 GB RAM)
>> with Eclipse 3.3.0.  I followed the directions at
>>
>> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>
>> exactly as stated.  I'm able to run the default Nutch 0.9 release
>> without any problems in Eclipse.  But when I run 1.0, I always get the
>> java.io.IOException as stated in my last email.  I had assumed it was
>> due to the plugin issue, but maybe not.  I'm just running a very small
>> crawl with two seed URLs.
>>
>> Here's what hadoop.log says:
>>
>> 2009-04-13 13:41:03,010 INFO  crawl.Crawl - crawl started in: crawl
>> 2009-04-13 13:41:03,025 INFO  crawl.Crawl - rootUrlDir = urls
>> 2009-04-13 13:41:03,025 INFO  crawl.Crawl - threads = 10
>> 2009-04-13 13:41:03,025 INFO  crawl.Crawl - depth = 3
>> 2009-04-13 13:41:03,025 INFO  crawl.Crawl - topN = 5
>> 2009-04-13 13:41:03,479 INFO  crawl.Injector - Injector: starting
>> 2009-04-13 13:41:03,479 INFO  crawl.Injector - Injector: crawlDb:
>> crawl/crawldb
>> 2009-04-13 13:41:03,479 INFO  crawl.Injector - Injector: urlDir: urls
>> 2009-04-13 13:41:03,479 INFO  crawl.Injector - Injector: Converting
>> injected urls to crawl db entries.
>> 2009-04-13 13:41:03,588 WARN  mapred.JobClient - Use
>> GenericOptionsParser for parsing the arguments. Applications should
>> implement Tool for the same.
>> 2009-04-13 13:41:06,105 WARN  mapred.LocalJobRunner - job_local_0001
>> java.lang.OutOfMemoryError: Java heap space
>>        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:498)
>>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>>        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)
>>
>>
>> I have not tried Sanjoy's advice yet... it looks like this is a memory
>> issue.
>>
>> Any advice would be much appreciated,
>> Frank
>>
>>
>> 2009/4/10 Bartosz Gadzimski <bartek...@o2.pl>:
>>
>>>
>>> Hello Frank,
>>>
>>> Please look into hadoop.log - maybe there is something more in there.
>>>
>>> About your error - you need to give us more details about your Nutch
>>> configuration.
>>>
>>> The default Nutch installation works with no problems (I've never changed
>>> the src/plugin path).
>>>
>>> Please tell us: the version of Nutch,
>>> any changes you made,
>>> any configuration that differs from the defaults (other than adding your
>>> domain to crawl-urlfilter).
>>>
>>> Thanks,
>>> Bartosz
>>>
>>> Frank McCown pisze:
>>>
>>>>
>>>> Adding cygwin to my PATH solved my problem with whoami.  But now I'm
>>>> getting an exception when running the crawler:
>>>>
>>>> Injector: Converting injected urls to crawl db entries.
>>>> Exception in thread "main" java.io.IOException: Job failed!
>>>>       at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
>>>>       at org.apache.nutch.crawl.Injector.inject(Injector.java:160)
>>>>       at org.apache.nutch.crawl.Crawl.main(Crawl.java:114)
>>>>
>>>> I know from searching the mailing list that this is normally due to a
>>>> bad plugin.folders setting in the nutch-default.xml, but I used the
>>>> same value as the tutorial (./src/plugin) to no avail.
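>>>>
>>>> One extra check I might try (just my own idea, not something from the
>>>> tutorial) is printing where that relative path resolves to when launched
>>>> from Eclipse, e.g. by dropping this at the top of Crawl.main:
>>>>
>>>>    // Shows which working directory "./src/plugin" is resolved against.
>>>>    java.io.File pluginDir = new java.io.File("./src/plugin");
>>>>    System.out.println(pluginDir.getAbsolutePath()
>>>>        + " exists: " + pluginDir.exists());
>>>>
>>>> If the Eclipse working directory isn't the project root, plugin.folders
>>>> would silently point at a directory that doesn't exist.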
>>>>
>>>> (As an aside, seems like Hadoop should provide a better error message
>>>> if the plugin folder doesn't exist.)
>>>>
>>>> Anyway, thanks, Bartosz, for your help.
>>>>
>>>> Frank
>>>>
>>>>
>>>> 2009/4/10 Bartosz Gadzimski <bartek...@o2.pl>:
>>>>
>>>>
>>>>>
>>>>> Hello,
>>>>>
>>>>> So now you have to install cygwin and make sure you add it to your PATH.
>>>>>
>>>>> The steps are in http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>>>>
>>>>> After this you should be able to run the "bash" command from a command
>>>>> prompt (Menu Start > Run > cmd.exe).
>>>>>
>>>>> Then you're done - everything will be working.
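>>>>>
>>>>> If you want to double-check from inside Eclipse, a rough test of my own
>>>>> (nothing Nutch itself needs) is a throwaway class that runs the same
>>>>> command Hadoop is failing on; name it whatever you like:
>>>>>
>>>>>    // Succeeds only if cygwin's bin directory is on the Windows PATH.
>>>>>    public class BashCheck {
>>>>>        public static void main(String[] args) throws Exception {
>>>>>            Process p = new ProcessBuilder("bash", "-c", "whoami").start();
>>>>>            java.io.BufferedReader r = new java.io.BufferedReader(
>>>>>                new java.io.InputStreamReader(p.getInputStream()));
>>>>>            System.out.println("bash/whoami says: " + r.readLine());
>>>>>        }
>>>>>    }
>>>>>
>>>>> If that throws the same "Cannot run program" IOException, the PATH change
>>>>> hasn't taken effect yet (restart Eclipse after editing PATH).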
>>>>>
>>>>> I must add it to the wiki - I forgot about the whoami problem.
>>>>>
>>>>> Take care,
>>>>> Bartosz
>>>>>
>>>>> sanjoy.gh...@thomsonreuters.com pisze:
>>>>>
>>>>>
>>>>>>
>>>>>> Thanks for the suggestion, Bartosz.  I downloaded whoami, and it promptly
>>>>>> crashed on "bash".
>>>>>>
>>>>>> 09/04/10 12:02:28 WARN fs.FileSystem: uri=file:///
>>>>>> javax.security.auth.login.LoginException: Login failed: Cannot run
>>>>>> program "bash": CreateProcess error=2, The system cannot find the file
>>>>>> specified
>>>>>>      at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:250)
>>>>>>      at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:275)
>>>>>>      at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:257)
>>>>>>      at org.apache.hadoop.security.UserGroupInformation.login(UserGroupInformation.java:67)
>>>>>>      at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:1438)
>>>>>>      at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1376)
>>>>>>      at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:215)
>>>>>>      at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:120)
>>>>>>      at org.apache.nutch.crawl.Crawl.main(Crawl.java:84)
>>>>>>
>>>>>> Where am I going to find "bash" on Windows without running everything from
>>>>>> the cygwin command line?  Is there a way to turn off this security in Hadoop?
>>>>>>
>>>>>> Thanks,
>>>>>> Sanjoy
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Bartosz Gadzimski [mailto:bartek...@o2.pl]
>>>>>> Sent: Friday, April 10, 2009 5:06 AM
>>>>>> To: nutch-dev@lucene.apache.org
>>>>>> Subject: Re: login failed exception
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I am not sure if that's the case, but you should try adding whoami to your
>>>>>> Windows box.
>>>>>>
>>>>>> For example, for Windows XP SP2:
>>>>>>
>>>>>> http://www.microsoft.com/downloads/details.aspx?FamilyId=49AE8576-9BB9-4126-9761-BA8011FABF38&displaylang=en
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Bartosz
>>>>>>
>>>>>> Frank McCown pisze:
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> I've been running 0.9 in Eclipse on Windows for some time, and I was
>>>>>>> successful in running the NutchBean from version 1.0 in Eclipse, but
>>>>>>> the crawler gave me the same exception as it gave this individual.
>>>>>>> Maybe there's something else I'm overlooking, but I followed the
>>>>>>> Tutorial at
>>>>>>>
>>>>>>> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>>>>>>
>>>>>>> to a T.  I'll keep working on it though.
>>>>>>>
>>>>>>> Frank
>>>>>>>
>>>>>>>
>>>>>>> 2009/4/10 Bartosz Gadzimski <bartek...@o2.pl>:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> fmccown pisze:
>>>>>>>>
>>>>>>>>> You must run Nutch's crawler using cygwin on Windows since cygwin has the
>>>>>>>>> whoami program.  If you run it from Eclipse on Windows, it can't use
>>>>>>>>> cygwin's whoami program and will fail with the exceptions you saw.  This
>>>>>>>>> is an unfortunate design decision in Hadoop which makes anything after
>>>>>>>>> version 0.9 not work in Eclipse on Windows.
>>>>>>>>
>>>>>>>> That's not true - please look at
>>>>>>>> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>>>>>>>
>>>>>>>> I am using nutch 1.0 with eclipse on windows with no problems.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Bartosz
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
