I'm guessing your OutOfMemoryError then is due to an "Unable to create native thread" message? Do you mind sharing your error logs with us? Because if it's that, then it's a ulimit/system-limits issue and not a real memory issue.
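If it helps, here is roughly what I'd check first on the DataNode host. This is only a sketch: the log path is an example (adjust it to your own HADOOP_LOG_DIR), and the -Xmx-via-HADOOP_DATANODE_OPTS line assumes a 1.x-style hadoop-env.sh where the daemon-specific opts land last on the java command line, so the later -Xmx wins.

    # Per-user limits for the account running the DataNode; a low
    # "max user processes" (-u) or "open files" (-n) is the usual
    # cause of the "unable to create new native thread" OOME.
    ulimit -u
    ulimit -n

    # Find the exact OutOfMemoryError text in the DataNode log
    # (example path only - use your own HADOOP_LOG_DIR).
    grep -i "OutOfMemoryError" $HADOOP_LOG_DIR/hadoop-*-datanode-*.log

    # If it does turn out to be genuine heap exhaustion instead,
    # raise the DN heap in conf/hadoop-env.sh, e.g.:
    export HADOOP_DATANODE_OPTS="-Xmx2048m $HADOOP_DATANODE_OPTS"

If the ulimit numbers look like stock defaults (often 1024), raising nproc/nofile for the user running the daemons is usually enough, and no heap change is needed in that case.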
On Sat, Mar 23, 2013 at 2:30 PM, Ted <r6squee...@gmail.com> wrote:
> I just checked, and after running my tests I generate only 670mb of
> data, on 89 blocks.
>
> What's more, when I ran the test this time I had increased my memory
> to 2048mb, so it completed fine - but I decided to run jconsole through
> the test so I could see what's happening. The data node never
> exceeded 200mb of memory usage. It mostly stayed under 100mb.
>
> I'm not sure why it would complain about out of memory and shut itself
> down when the heap was only 1024mb. It was fairly consistently doing
> that the last few days, including this morning right before I switched
> it to 2048.
>
> I'm going to run the test again with 1024mb and jconsole running; none
> of this makes any sense to me.
>
> On 3/23/13, Harsh J <ha...@cloudera.com> wrote:
>> I run a 128 MB heap size DN for my simple purposes on my Mac and it
>> runs well for what load I apply on it.
>>
>> A DN's primary, growing memory consumption comes from the # of blocks
>> it carries. All of these blocks' file paths are mapped and kept in
>> RAM during its lifetime. If your DN has acquired a lot of blocks by
>> now, say close to a million or more, then 1 GB may not suffice
>> anymore to hold them, and you'd need to scale up (add more RAM, or
>> increase the heap size if you already have more RAM) or scale out
>> (add another node and run the balancer).
>>
>> On Sat, Mar 23, 2013 at 10:03 AM, Ted <r6squee...@gmail.com> wrote:
>>> Hi, I'm new to hadoop/hdfs and I'm just running some tests on my
>>> local machine in a single node setup. I'm encountering out of memory
>>> errors on the JVM running my data node.
>>>
>>> I'm pretty sure I can just increase the heap size to fix the errors,
>>> but my question is about how memory is actually used.
>>>
>>> As an example, with other things like an OS's disk cache or, say,
>>> databases, if you give it 1gb of ram, it will "work" with what it
>>> has available; if the data is more than 1gb, it just swaps in and
>>> out of memory/disk more often, i.e. the cached portion is smaller.
>>> If you give it 8gb of ram it still functions the same, just with
>>> better performance.
>>>
>>> With my hdfs setup, this does not appear to be true: if I allocate
>>> it 1gb of heap, it doesn't just perform worse / swap data to disk
>>> more. It outright fails with out of memory and shuts the data node
>>> down.
>>>
>>> So my question is... how do I really tune the memory / decide how
>>> much memory I need to prevent shutdowns? Is 1gb just too small even
>>> on a single-machine test environment with almost no data at all, or
>>> is it supposed to work like OS disk caches, where it always works
>>> but just performs better or worse, and I just have something
>>> configured wrong? Basically my objective isn't performance; it's
>>> that the server must not shut itself down - it can slow down but
>>> not shut off.
>>>
>>> --
>>> Ted.
>>
>> --
>> Harsh J
>
> --
> Ted.

--
Harsh J