Yes, I see the same behavior. Usually reducing the number of concurrent tasks, reducing memory usage, or breaking up I/O helps. It could be other things as well, but that's been my experience.
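As a rough sketch of the first two suggestions, a mapred-site.xml fragment for Hadoop 0.20.x might look like the one below. The property names are the standard 0.20 ones; the values are only assumptions for a small single-node setup, not tuned recommendations. The slot limits are read when the TaskTracker starts, so it needs a restart to pick them up.

```xml
<!-- mapred-site.xml (sketch; values are illustrative assumptions) -->
<configuration>
  <!-- Fewer concurrent map/reduce slots per TaskTracker (0.20 default: 2 each) -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>1</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>1</value>
  </property>
  <!-- Heap for each child task JVM (0.20 default: -Xmx200m) -->
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx512m</value>
  </property>
  <!-- Read by the JobTracker: time without a heartbeat before a tracker is
       declared lost. 600000 ms is the default, which matches the 600 seconds
       reported in this thread. Raising it only hides a stalled tracker. -->
  <property>
    <name>mapred.tasktracker.expiry.interval</name>
    <value>600000</value>
  </property>
</configuration>
```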
Take care,
-stu

-----Original Message-----
From: baran cakici <barancak...@gmail.com>
Date: Wed, 16 Mar 2011 21:15:58
To: <hdfs-user@hadoop.apache.org>; <stu24m...@yahoo.com>
Reply-To: hdfs-user@hadoop.apache.org
Subject: Re: Lost Task Tracker because of no heartbeat

Hi Stu,

Actually, I don't always lose the TaskTracker. For example, when I have just started the TaskTracker, I can get some results, but after the second or third job I lose it. I mean it is somehow not stable.

Take care too,
Baran

2011/3/16 <stu24m...@yahoo.com>

> I'm not sure, but I've seen TT failures from jobs that blocked on a
> computationally intense operation (GC?) or large I/O.
>
> Take care,
> -stu
> ------------------------------
> *From: *baran cakici <barancak...@gmail.com>
> *Date: *Wed, 16 Mar 2011 17:52:10 +0100
> *To: *<hdfs-user@hadoop.apache.org>
> *ReplyTo: *hdfs-user@hadoop.apache.org
> *Subject: *Lost Task Tracker because of no heartbeat
>
> Hi Everyone,
>
> I am doing a project with Hadoop MapReduce for my master's thesis. I have a
> strange problem on my system.
>
> First of all, I use Hadoop-0.20.2 on Windows XP Pro with the Eclipse Plug-In.
> When I start a job with a big input (4 GB; it may not be too big, but the
> algorithm requires some time), I lose my TaskTracker within several minutes
> or seconds. I mean, "Seconds since heartbeat" increases,
> and then after 600 seconds I lose the TaskTracker.
>
> I read somewhere that this can occur because of a small limit on open files
> (ulimit -n). I tried to increase this value, but the maximum I can set in
> Cygwin is 3200 (ulimit -n 3200); the default value is 256. Actually, I don't
> know whether it helps or not.
>
> My JobTracker and TaskTracker logs contain some errors; I have posted them too.
>
> Jobtracker.log
>
> -Call to localhost/127.0.0.1:9000 failed on local exception:
> java.io.IOException: An existing connection was forcibly closed by the
> remote host
>
> another:
> -
> 2011-03-15 12:13:30,718 INFO org.apache.hadoop.mapred.JobTracker:
> attempt_201103151143_0002_m_000091_0 is 97125 ms debug.
> 2011-03-15 12:16:50,718 INFO org.apache.hadoop.mapred.JobTracker:
> attempt_201103151143_0002_m_000091_0 is 297125 ms debug.
> 2011-03-15 12:20:10,718 INFO org.apache.hadoop.mapred.JobTracker:
> attempt_201103151143_0002_m_000091_0 is 497125 ms debug.
> 2011-03-15 12:23:30,718 INFO org.apache.hadoop.mapred.JobTracker:
> attempt_201103151143_0002_m_000091_0 is 697125 ms debug.
>
> Error launching task
> Lost tracker 'tracker_apple:localhost/127.0.0.1:2654'
>
> My logs (jobtracker.log, tasktracker.log ...) are attached.
>
> I really need help; I don't have much time for my thesis.
>
> Thanks a lot for your help,
>
> Baran