So it turns out the issue was just the size of the filesystem.
2012-12-27 16:37:22,390 WARN org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Checkpoint done. New Image Size: 4,354,340,042
Basically, if the NN image size hits ~5,000,000,000 you get f'ed. So you
need about 3x the RAM of your
You did free up a lot of old generation by reducing the young generation,
right? The extra 5G of RAM for the old generation should have helped.
Based on my calculation, for the current number of objects you have, you
need roughly:
12G of total heap with a young generation size of 1G. This assumes the
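For a rough sense of where a number like 12G comes from, NameNode heap is often estimated with the commonly cited rule of thumb of roughly 150 bytes of heap per file, directory, and block object. The 150-byte figure and the object counts below are assumptions for illustration, not numbers taken from this thread:

```java
public class NameNodeHeapEstimate {
    // Commonly cited rule of thumb: each file, directory, and block
    // object in the NameNode costs on the order of 150 bytes of heap.
    static final long BYTES_PER_OBJECT = 150;

    static long estimateHeapBytes(long files, long dirs, long blocks) {
        return (files + dirs + blocks) * BYTES_PER_OBJECT;
    }

    public static void main(String[] args) {
        // Hypothetical namespace: 30M files, 2M directories, 40M blocks.
        long bytes = estimateHeapBytes(30_000_000L, 2_000_000L, 40_000_000L);
        System.out.println(bytes / (1024L * 1024 * 1024) + " GB"); // prints "10 GB"
    }
}
```

On top of such an estimate you still need headroom for the young generation and transient allocations, which is why the total heap recommendation ends up larger than the raw namespace size.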
I am not sure GC was a factor. Even when I forced a GC it cleared 0% of
the memory. One would think that since the entire NameNode image is stored
in memory, the heap would not need to grow beyond that, but that sure does
not seem to be the case. A 5GB image starts off using 10GB of memory and
after
I do not follow what you mean here.
Even when I forced a GC it cleared 0% memory.
Is this with the new younggen setting? Because earlier, based on the
calculation I posted, you need ~11G in the old generation. With 6G as the
default younggen size, you actually had just enough memory to fit the
namespace
I tried your suggested setting and forced GC from JConsole, and once it
crept up nothing was freeing up.
So just food for thought:
You said the average file name size is 32 bytes. Well, most of my data sits
in /user/hive/warehouse/.
Then I have tables with partitions.
Does it make sense to just
I tried your suggested setting and forced GC from JConsole, and once it
crept up nothing was freeing up.
That is very surprising. If possible, take a live heap dump when the
namenode starts up (when memory used is low) and again when namenode memory
consumption has gone up considerably, closer to the heap
Hi,
is there a way to find out, in the setup function of a mapper, on which
node of the cluster the current mapper is running?
thank you very much,
Eduard
You don't need Hadoop to do this. Just use an InetAddress.
http://docs.oracle.com/javase/6/docs/api/java/net/InetAddress.html
--Bobby
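As a minimal sketch of Bobby's suggestion (the class and method names here are hypothetical; in practice you would call this from your Mapper's setup() method and log or store the result):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class NodeInfo {
    // Returns the hostname of the machine this JVM is running on.
    // Call this from a Mapper's setup() to see which cluster node
    // the task landed on.
    static String currentNodeName() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println("Running on node: " + currentNodeName());
    }
}
```

Since each map task runs in its own JVM on the node it was scheduled to, the plain JDK call is enough; no Hadoop-specific API is needed.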
On 12/27/12 8:51 AM, Eduard Skaley e.v.ska...@gmail.com wrote:
Hi,
Firstly, I am talking about Hadoop 1.0. Please note that in Hadoop 2.x and
trunk, the MapReduce framework has been completely revamped into YARN (
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html)
and you may need to look at different interfaces for building your own
What are those libraries, and how are they reading data from HDFS? You were
trying with MR jobs, if I'm not wrong? In order to perform reads/writes on
HDFS we need the HDFS API with a Configuration object. How are you doing it
here?
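For reference, a minimal sketch of reading a file through the HDFS API with a Configuration object, assuming core-site.xml/hdfs-site.xml are on the classpath and a cluster is running; the path used here is a hypothetical example:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadSketch {
    public static void main(String[] args) throws Exception {
        // Configuration picks up core-site.xml / hdfs-site.xml from the
        // classpath, including the default filesystem URI.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical path; replace with a file that exists in your cluster.
        Path path = new Path("/user/hive/warehouse/example.txt");
        try (FSDataInputStream in = fs.open(path);
             BufferedReader reader = new BufferedReader(new InputStreamReader(in))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```

If a library bypasses FileSystem/Configuration and reads local paths directly, it will not see HDFS data at all, which is why the question above matters.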
Best Regards,
Tariq
+91-9741563634
https://mtariq.jux.com/
On Fri, Dec
--
Ray Bagby
Weatherford, OK