You may want to check why System time is high. Check your system call stats. This should give you some clue.
-Bharath

________________________________
From: Robert Dyer <rd...@iastate.edu>
To: user@hadoop.apache.org; Bharath Mundlapudi <bharathw...@yahoo.com>
Sent: Monday, December 10, 2012 7:32 PM
Subject: Re: Strange machine behavior

Yes, there is a performance impact. It should be visible from the graph I attached. Basically, the CPU is spending much more time on System and the User time is lowered.

When this happens (if I don't do a drop_caches in time) the MR job winds up taking significantly longer than usual.

On Mon, Dec 10, 2012 at 8:06 PM, Bharath Mundlapudi <bharathw...@yahoo.com> wrote:

>Are you seeing any performance impact with this cache increase? It is
>normal for a Linux system to hold a large amount of memory in cache.
>
>-Bharath
>
>________________________________
>From: Andy Isaacson <a...@cloudera.com>
>To: user@hadoop.apache.org
>Sent: Monday, December 10, 2012 11:23 AM
>Subject: Re: Strange machine behavior
>
>What kernel did you see this on? Was there significant swap traffic
>(si/so in vmstat output) during the high-system-time period?
>
>BTW, you don't need to nor do you want to run sync(1) when
>manipulating drop_caches, it just causes additional noise and
>slowdown. drop_caches doesn't have any impact on correctness; it won't
>cause data loss (by dropping a dirty page or whatever). I've had sync
>calls take 10 minutes to complete, so the unnecessary impact can be
>significant.
>
>-andy
>
>On Sat, Dec 8, 2012 at 4:09 PM, Robert Dyer <rd...@iastate.edu> wrote:
>> Has anyone experienced a TaskTracker/DataNode behaving like the attached
>> image?
>>
>> This was during a MR job (which runs often). Note the extremely high System
>> CPU time. Upon investigating I saw that out of 64GB ram the system had
>> allocated almost 45GB to cache!
>>
>> I did a sudo sh -c "sync ; echo 3 > /proc/sys/vm/drop_caches ; sync" which is
>> roughly where the graph goes back to normal (much lower System, much higher
>> User).
>>
>> This has happened a few times.
>>
>> I have tried playing with the sysctl vm.swappiness value (default of 60) by
>> setting it to 30 (which it was at when the graph was collected) and now to
>> 10. I am not sure that helps.
>>
>> Any ideas? Anyone else run into this before?
>>
>> 24 cores
>> 64GB ram
>> 4x2TB sata3 hdd
>>
>> Running Hadoop 1.0.4, with a DataNode (2gb heap), TaskTracker (2gb heap) on
>> this machine.
>>
>> 24 map slots (1gb heap each), no reducers.
>>
>> Also running HBase 0.94.2 with a RS (8gb ram) on this machine.

--
Robert Dyer
rd...@iastate.edu
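
For anyone following up on the System vs. User CPU question in this thread, here is a minimal sketch of one way to put numbers on it, assuming a Linux host where /proc/stat is readable. The function name cpu_split is hypothetical and not part of Hadoop or any standard tool. It takes two samples of the aggregate "cpu" line from /proc/stat and prints what share of the elapsed CPU time went to user and to system. It does not explain why system time is high; it only makes the split easy to log before and after something like a drop_caches.

```shell
#!/bin/sh
# cpu_split (hypothetical helper): print the user and system share of CPU
# time between two samples of the aggregate "cpu" line from /proc/stat.
# Columns after the "cpu" label are:
#   user nice system idle iowait irq softirq steal ...
# irq and softirq are counted in the total but reported separately by the
# kernel, so "system" below is the plain system column only.
cpu_split() {
    # $1 = earlier "cpu ..." line, $2 = later "cpu ..." line
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= 9; i++) a[i] = $i }
        NR == 2 {
            total = 0
            for (i = 2; i <= 9; i++) { d[i] = $i - a[i]; total += d[i] }
            if (total <= 0) { print "no elapsed CPU time"; exit 1 }
            printf "user=%.1f%% system=%.1f%%\n", 100 * d[2] / total, 100 * d[4] / total
        }'
}

# Example use on a live machine (commented out so sourcing has no side effects):
#   before=$(grep '^cpu ' /proc/stat)
#   sleep 5
#   after=$(grep '^cpu ' /proc/stat)
#   cpu_split "$before" "$after"
```

Sampling this at intervals during the MR job, alongside the si/so columns of vmstat that Andy asks about, should show whether the rise in system time lines up with swap traffic or with cache growth.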