Re: Name node heap space problem

2008-07-28 Thread Gert Pfeifer

Bull's eye. I am using 0.17.1.

Taeho Kang wrote:

Gert,
What version of Hadoop are you using?

One of the people at my work who is using 0.17.1 is reporting a similar
problem: the namenode's heap space filling up too fast.

This is the status of his cluster (17-node cluster with version 0.17.1):
- 174541 files and directories, 121000 blocks = 295541 total. Heap Size is
898.38 MB / 1.74 GB (50%)

Here is the status of one of my clusters (70-node cluster with version
0.16.3):
- 265241 files and directories, 1155060 blocks = 1420301 total. Heap Size
is 797.94 MB / 1.39 GB (56%)
Notice that the second cluster has about 9 times more blocks than the first
one (and more files and directories, too), yet its heap usage is similar
(actually smaller).

Has anyone else noticed problems or inefficiencies in the namenode's memory
utilization in the 0.17.x versions?




On Mon, Jul 28, 2008 at 2:13 AM, Gert Pfeifer
<[EMAIL PROTECTED]> wrote:


There I have:
  export HADOOP_HEAPSIZE=8000
which should be enough (actually in this case I don't know).

Running fsck on the directory, it turned out that there are 1785959
files in this dir... I have no clue how I can get the data out of there.
Can I somehow calculate how much heap a namenode would need to do an ls on
this dir?

Gert


Taeho Kang wrote:

Check how much memory is allocated for the JVM running the namenode.

In the file HADOOP_INSTALL/conf/hadoop-env.sh
you should change the line that starts with "export HADOOP_HEAPSIZE=1000"

It's set to 1GB by default.


On Fri, Jul 25, 2008 at 2:51 AM, Gert Pfeifer <
[EMAIL PROTECTED]>
wrote:

Update on this one...

I put some more memory in the machine running the name node. Now fsck is
running. Unfortunately ls fails with a time-out.

I identified one directory that causes the trouble. I can run fsck on it
but not ls.

What could be the problem?

Gert

Gert Pfeifer wrote:

Hi,


I am running a Hadoop DFS on a cluster of 5 data nodes with a name node
and one secondary name node.

I have 1788874 files and directories, 1465394 blocks = 3254268 total.
Heap Size max is 3.47 GB.

My problem is that I produce many small files. Therefore I have a cron
job which just runs daily across the new files and copies them into
bigger files and deletes the small files.

Apart from this program, even a fsck kills the cluster.

The problem is that, as soon as I start this program, the heap space of
the name node reaches 100 %.

What could be the problem? There are not many small files right now and
still it doesn't work. I guess we have this problem since the upgrade to
0.17.

Here is some additional data about the DFS:
Capacity :   2 TB
DFS Remaining   :   1.19 TB
DFS Used:   719.35 GB
DFS Used%   :   35.16 %

Thanks for hints,
Gert








Re: Name node heap space problem

2008-07-28 Thread Taeho Kang
Gert,
What version of Hadoop are you using?

One of the people at my work who is using 0.17.1 is reporting a similar
problem: the namenode's heap space filling up too fast.

This is the status of his cluster (17-node cluster with version 0.17.1):
- 174541 files and directories, 121000 blocks = 295541 total. Heap Size is
898.38 MB / 1.74 GB (50%)

Here is the status of one of my clusters (70-node cluster with version
0.16.3):
- 265241 files and directories, 1155060 blocks = 1420301 total. Heap Size
is 797.94 MB / 1.39 GB (56%)
Notice that the second cluster has about 9 times more blocks than the first
one (and more files and directories, too), yet its heap usage is similar
(actually smaller).

Has anyone else noticed problems or inefficiencies in the namenode's memory
utilization in the 0.17.x versions?




On Mon, Jul 28, 2008 at 2:13 AM, Gert Pfeifer
<[EMAIL PROTECTED]> wrote:

> There I have:
>   export HADOOP_HEAPSIZE=8000
> which should be enough (actually in this case I don't know).
>
> Running fsck on the directory, it turned out that there are 1785959
> files in this dir... I have no clue how I can get the data out of there.
> Can I somehow calculate how much heap a namenode would need to do an ls on
> this dir?
>
> Gert
>
>
> Taeho Kang wrote:
>
>> Check how much memory is allocated for the JVM running the namenode.
>>
>> In the file HADOOP_INSTALL/conf/hadoop-env.sh
>> you should change the line that starts with "export HADOOP_HEAPSIZE=1000"
>>
>> It's set to 1GB by default.
>>
>>
>> On Fri, Jul 25, 2008 at 2:51 AM, Gert Pfeifer <
>> [EMAIL PROTECTED]>
>> wrote:
>>
>>> Update on this one...
>>>
>>> I put some more memory in the machine running the name node. Now fsck is
>>> running. Unfortunately ls fails with a time-out.
>>>
>>> I identified one directory that causes the trouble. I can run fsck on it
>>> but not ls.
>>>
>>> What could be the problem?
>>>
>>> Gert
>>>
>>> Gert Pfeifer wrote:
>>>
>>>> Hi,
>>>>
>>>> I am running a Hadoop DFS on a cluster of 5 data nodes with a name node
>>>> and one secondary name node.
>>>>
>>>> I have 1788874 files and directories, 1465394 blocks = 3254268 total.
>>>> Heap Size max is 3.47 GB.
>>>>
>>>> My problem is that I produce many small files. Therefore I have a cron
>>>> job which runs daily over the new files, copies them into
>>>> bigger files, and deletes the small files.
>>>>
>>>> Apart from this program, even an fsck kills the cluster.
>>>>
>>>> The problem is that, as soon as I start this program, the heap space of
>>>> the name node reaches 100%.
>>>>
>>>> What could be the problem? There are not many small files right now and
>>>> still it doesn't work. I guess we have had this problem since the upgrade to
>>>> 0.17.
>>>>
>>>> Here is some additional data about the DFS:
>>>> Capacity        :  2 TB
>>>> DFS Remaining   :  1.19 TB
>>>> DFS Used        :  719.35 GB
>>>> DFS Used%       :  35.16 %
>>>>
>>>> Thanks for hints,
>>>> Gert
>>>>
>>>
>>
>


Re: Name node heap space problem

2008-07-28 Thread Konstantin Shvachko

It looks like you have the whole file system flattened into one directory.
Both fsck and ls call the same method on the name-node, getListing(), which
returns an array of FileStatus objects, one for each file in the directory.
I think that fsck works in this case because it does not use RPC and therefore
does not create an additional copy of the array of FileStatus-es, as opposed
to ls, which gets the array and sends it back as an RPC reply. The RPC system
serializes the reply, and this is where you get the second copy of the array.
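
As a very rough estimate: if each FileStatus (path string, length, block
size, modification times, owner, and permissions) takes a few hundred bytes
on the heap, 1785959 entries come to several hundred MB for the array alone,
and the serialized RPC reply adds a copy of comparable size while the call is
in flight. Even when that still fits in the heap, building and serializing a
reply that large can easily exceed the client's RPC timeout, which would
match the ls time-outs you are seeing.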

You can try to add more memory to the node, or you can try to break the
directory up into smaller directories, say by moving files starting with
'a', 'b', 'c', etc. into separate new directories.
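
A minimal sketch of that split using only FsShell commands (the /data/flat
and /data/split paths are placeholders, and it assumes the shell's glob
expansion can still cope with a directory of this size; if it cannot, the
file list printed by fsck -files could drive the same loop):

  for p in a b c d e f g h i j k l m n o p q r s t u v w x y z 0 1 2 3 4 5 6 7 8 9; do
    hadoop dfs -mkdir /data/split/$p                  # one new directory per prefix
    hadoop dfs -mv "/data/flat/$p*" /data/split/$p/   # move all files starting with that prefix
  done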

--Konstantin


Gert Pfeifer wrote:

There I have:
   export HADOOP_HEAPSIZE=8000
which should be enough (actually in this case I don't know).

Running fsck on the directory, it turned out that there are 1785959
files in this dir... I have no clue how I can get the data out of there.
Can I somehow calculate how much heap a namenode would need to do an ls
on this dir?


Gert


Taeho Kang wrote:

Check how much memory is allocated for the JVM running the namenode.

In the file HADOOP_INSTALL/conf/hadoop-env.sh
you should change the line that starts with "export HADOOP_HEAPSIZE=1000"

It's set to 1GB by default.


On Fri, Jul 25, 2008 at 2:51 AM, Gert Pfeifer
<[EMAIL PROTECTED]>
wrote:


Update on this one...

I put some more memory in the machine running the name node. Now fsck is
running. Unfortunately ls fails with a time-out.

I identified one directory that causes the trouble. I can run fsck on it
but not ls.

What could be the problem?

Gert

Gert Pfeifer wrote:

Hi,

I am running a Hadoop DFS on a cluster of 5 data nodes with a name node
and one secondary name node.

I have 1788874 files and directories, 1465394 blocks = 3254268 total.
Heap Size max is 3.47 GB.

My problem is that I produce many small files. Therefore I have a cron
job which just runs daily across the new files and copies them into
bigger files and deletes the small files.

Apart from this program, even a fsck kills the cluster.

The problem is that, as soon as I start this program, the heap space of
the name node reaches 100 %.

What could be the problem? There are not many small files right now and
still it doesn't work. I guess we have had this problem since the upgrade to
0.17.

Here is some additional data about the DFS:
Capacity :   2 TB
DFS Remaining   :   1.19 TB
DFS Used:   719.35 GB
DFS Used%   :   35.16 %

Thanks for hints,
Gert










Re: Name node heap space problem

2008-07-27 Thread Gert Pfeifer

There I have:
   export HADOOP_HEAPSIZE=8000
which should be enough (actually in this case I don't know).

Running fsck on the directory, it turned out that there are 1785959
files in this dir... I have no clue how I can get the data out of there.
Can I somehow calculate how much heap a namenode would need to do an ls
on this dir?


Gert


Taeho Kang wrote:

Check how much memory is allocated for the JVM running the namenode.

In the file HADOOP_INSTALL/conf/hadoop-env.sh
you should change the line that starts with "export HADOOP_HEAPSIZE=1000"

It's set to 1GB by default.


On Fri, Jul 25, 2008 at 2:51 AM, Gert Pfeifer <[EMAIL PROTECTED]>
wrote:


Update on this one...

I put some more memory in the machine running the name node. Now fsck is
running. Unfortunately ls fails with a time-out.

I identified one directory that causes the trouble. I can run fsck on it
but not ls.

What could be the problem?

Gert

Gert Pfeifer wrote:

Hi,

I am running a Hadoop DFS on a cluster of 5 data nodes with a name node
and one secondary name node.

I have 1788874 files and directories, 1465394 blocks = 3254268 total.
Heap Size max is 3.47 GB.

My problem is that I produce many small files. Therefore I have a cron
job which just runs daily across the new files and copies them into
bigger files and deletes the small files.

Apart from this program, even a fsck kills the cluster.

The problem is that, as soon as I start this program, the heap space of
the name node reaches 100 %.

What could be the problem? There are not many small files right now and
still it doesn't work. I guess we have this problem since the upgrade to
0.17.

Here is some additional data about the DFS:
Capacity :   2 TB
DFS Remaining   :   1.19 TB
DFS Used:   719.35 GB
DFS Used%   :   35.16 %

Thanks for hints,
Gert









Re: Name node heap space problem

2008-07-24 Thread Taeho Kang
Check how much memory is allocated for the JVM running the namenode.

In the file HADOOP_INSTALL/conf/hadoop-env.sh
you should change the line that starts with "export HADOOP_HEAPSIZE=1000"

It's set to 1GB by default.
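
For reference, the relevant lines in conf/hadoop-env.sh look roughly like
this (8000 is only an example value, in MB; size it to the RAM of the
namenode machine, and note that every daemon started through the bin/
scripts picks up the same value):

  # The maximum amount of heap to use, in MB. Default is 1000.
  export HADOOP_HEAPSIZE=8000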


On Fri, Jul 25, 2008 at 2:51 AM, Gert Pfeifer <[EMAIL PROTECTED]>
wrote:

> Update on this one...
>
> I put some more memory in the machine running the name node. Now fsck is
> running. Unfortunately ls fails with a time-out.
>
> I identified one directory that causes the trouble. I can run fsck on it
> but not ls.
>
> What could be the problem?
>
> Gert
>
> Gert Pfeifer wrote:
>
>> Hi,
>> I am running a Hadoop DFS on a cluster of 5 data nodes with a name node
>> and one secondary name node.
>>
>> I have 1788874 files and directories, 1465394 blocks = 3254268 total.
>> Heap Size max is 3.47 GB.
>>
>> My problem is that I produce many small files. Therefore I have a cron
>> job which just runs daily across the new files and copies them into
>> bigger files and deletes the small files.
>>
>> Apart from this program, even a fsck kills the cluster.
>>
>> The problem is that, as soon as I start this program, the heap space of
>> the name node reaches 100 %.
>>
>> What could be the problem? There are not many small files right now and
>> still it doesn't work. I guess we have this problem since the upgrade to
>> 0.17.
>>
>> Here is some additional data about the DFS:
>> Capacity :   2 TB
>> DFS Remaining   :   1.19 TB
>> DFS Used:   719.35 GB
>> DFS Used%   :   35.16 %
>>
>> Thanks for hints,
>> Gert
>>
>
>


Re: Name node heap space problem

2008-07-24 Thread Gert Pfeifer

Update on this one...

I put some more memory in the machine running the name node. Now fsck is 
running. Unfortunately ls fails with a time-out.


I identified one directory that causes the trouble. I can run fsck on it 
but not ls.


What could be the problem?

Gert

Gert Pfeifer wrote:

Hi,
I am running a Hadoop DFS on a cluster of 5 data nodes with a name node
and one secondary name node.

I have 1788874 files and directories, 1465394 blocks = 3254268 total.
Heap Size max is 3.47 GB.

My problem is that I produce many small files. Therefore I have a cron
job which just runs daily across the new files and copies them into
bigger files and deletes the small files.

Apart from this program, even a fsck kills the cluster.

The problem is that, as soon as I start this program, the heap space of
the name node reaches 100 %.

What could be the problem? There are not many small files right now and
still it doesn't work. I guess we have this problem since the upgrade to
0.17.

Here is some additional data about the DFS:
Capacity :   2 TB
DFS Remaining   :   1.19 TB
DFS Used:   719.35 GB
DFS Used%   :   35.16 %

Thanks for hints,
Gert




Name node heap space problem

2008-07-16 Thread Gert Pfeifer
Hi,
I am running a Hadoop DFS on a cluster of 5 data nodes with a name node
and one secondary name node.

I have 1788874 files and directories, 1465394 blocks = 3254268 total.
Heap Size max is 3.47 GB.
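(For scale: the rule of thumb usually quoted for the namenode is on the
order of 150-200 bytes of heap per file, directory, or block, so roughly 3.25
million objects should account for only about 0.5 GB of the 3.47 GB heap; the
namespace alone should not be the problem.)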

My problem is that I produce many small files. Therefore I have a cron
job which runs daily over the new files, copies them into
bigger files, and deletes the small files.
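
A sketch of what such a consolidation pass could look like with plain
FsShell commands (the paths are placeholders; it assumes the small files are
plain text and that the local disk of the machine running the cron job can
hold one day's worth of data):

  DAY=/data/incoming/2008-07-15                    # hypothetical directory of one day's small files
  hadoop dfs -getmerge $DAY /tmp/day.merged        # concatenate them into one local file
  hadoop dfs -put /tmp/day.merged /data/merged/2008-07-15.dat   # write back as a single big file
  hadoop dfs -rmr $DAY                             # delete the small files (and their namenode objects)
  rm /tmp/day.merged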

Apart from this program, even an fsck kills the cluster.

The problem is that, as soon as I start this program, the heap space of
the name node reaches 100%.

What could be the problem? There are not many small files right now and
still it doesn't work. I guess we have had this problem since the upgrade to
0.17.

Here is some additional data about the DFS:
Capacity        :  2 TB
DFS Remaining   :  1.19 TB
DFS Used        :  719.35 GB
DFS Used%       :  35.16 %

Thanks for hints,
Gert