Re: Name node heap space problem
Bull's eye. I am using 0.17.1.

Taeho Kang wrote:
> Gert,
> What version of Hadoop are you using?
Re: Name node heap space problem
Gert,
What version of Hadoop are you using?

One of the people at my work who is using 0.17.1 is reporting a similar problem: the namenode's heap space filling up too fast.

This is the status of his cluster (17-node cluster with version 0.17.1):

- 174541 files and directories, 121000 blocks = 295541 total. Heap Size is 898.38 MB / 1.74 GB (50%)

Here is the status of one of my clusters (70-node cluster with version 0.16.3):

- 265241 files and directories, 1155060 blocks = 1420301 total. Heap Size is 797.94 MB / 1.39 GB (56%)

Notice that the second cluster has about 9 times more blocks than the first one (and more files and directories, too), but heap usage is similar (actually smaller...).

Has anyone else noticed any problems or inefficiencies in the namenode's memory utilization in the 0.17.x versions?
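A quick way to put the two status lines on the same scale is to divide used heap by the number of namespace objects (files, directories, and blocks). The sketch below only redoes that arithmetic with the figures quoted in this message. It assumes the reported megabytes are binary (1024 * 1024 bytes), and because "heap used" is a snapshot that may include uncollected garbage, the result is a rough indicator rather than a measurement of per-object cost. The helper name is made up for illustration.

```python
# Figures copied from the two cluster status lines in this message.
MB = 1024 * 1024  # assumption: the status page reports binary megabytes

clusters = {
    "0.17.1 (17 nodes)": {"objects": 174541 + 121000, "heap_used_mb": 898.38},
    "0.16.3 (70 nodes)": {"objects": 265241 + 1155060, "heap_used_mb": 797.94},
}


def heap_bytes_per_object(objects, heap_used_mb):
    """Used heap divided by the number of files, directories, and blocks."""
    return heap_used_mb * MB / objects


per_object = {name: heap_bytes_per_object(**c) for name, c in clusters.items()}

for name, value in per_object.items():
    print(f"{name}: {value:.0f} bytes of used heap per namespace object")
```

On these numbers the 0.17.1 cluster uses several times more heap per object than the 0.16.3 cluster, which is the discrepancy the message points out.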
Re: Name node heap space problem
It looks like you have the whole file system flattened in one directory.

Both fsck and ls call the same method on the name-node, getListing(), which returns an array of FileStatus, one for each file in the directory.

I think that fsck works in this case because it does not use rpc and therefore does not create an additional copy of the array of FileStatus-es, as opposed to ls, which gets the array and sends it back as an rpc reply. The rpc system serializes the reply, and this is where you get the second copy of the array.

You can try to add more memory on the node, or you can also try to break the directory into smaller directories, say by moving files starting with 'a', 'b', 'c', etc. into new separate directories.

--Konstantin
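The suggestion to split the directory by first letter can be sketched as a planning step that maps each file name to a target subdirectory. This is only an illustration: `plan_split`, the `/data/flat` path, and the `_other` bucket are hypothetical names, and the sketch assumes the file names are available from somewhere other than a full listing of the directory (for example from the job that wrote them), since that listing is exactly what fails here. Performing the actual moves is left out.

```python
from collections import defaultdict


def plan_split(parent, names):
    """Map each file name to a destination directory parent/<first character>."""
    plan = defaultdict(list)
    for name in names:
        # Names that are empty or start with a non-alphanumeric character
        # go to a catch-all bucket (a hypothetical convention).
        if name and name[0].isalnum():
            bucket = name[0].lower()
        else:
            bucket = "_other"
        plan[f"{parent}/{bucket}"].append(name)
    return dict(plan)


example = plan_split("/data/flat", ["alpha.log", "Apple.log", "beta.log", "_tmp"])
for target, members in sorted(example.items()):
    print(target, members)
```

If the names are unevenly distributed, a longer prefix or a hash of the name would give more even buckets. That is a design choice the thread does not discuss.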
Re: Name node heap space problem
There I have:

export HADOOP_HEAPSIZE=8000

which should be enough (actually, in this case I don't know).

Running fsck on the directory, it turned out that there are 1785959 files in this dir... I have no clue how I can get the data out of there. Can I somehow calculate how much heap a namenode would need to do an ls on this dir?

Gert
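One way to frame the heap question is as a back-of-envelope product: number of entries, times bytes per entry, times the number of copies held at once. The factor of two comes from Konstantin's reply in this thread, which says the rpc reply creates a second copy of the FileStatus array. The per-entry size below is a placeholder assumption and not a measured value. The real size depends on path lengths and the Hadoop version, so the output only shows how the estimate scales.

```python
def listing_heap_estimate(num_entries, bytes_per_entry, copies=2):
    """Rough transient heap, in bytes, needed to list one directory."""
    return num_entries * bytes_per_entry * copies


FILES_IN_DIR = 1785959         # from the fsck result in this message
ASSUMED_BYTES_PER_ENTRY = 200  # hypothetical placeholder, not measured

estimate = listing_heap_estimate(FILES_IN_DIR, ASSUMED_BYTES_PER_ENTRY)
print(f"about {estimate / 1024**2:.0f} MiB on top of the steady-state heap")
```

Whatever the true per-entry figure is, the estimate grows linearly with the number of files in the directory, which is why splitting the directory reduces the peak.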
Re: Name node heap space problem
Check how much memory is allocated for the JVM running the namenode.

In the file HADOOP_INSTALL/conf/hadoop-env.sh you should change the line that starts with "export HADOOP_HEAPSIZE=1000".

It's set to 1 GB by default.
Re: Name node heap space problem
Update on this one...

I put some more memory in the machine running the name node. Now fsck is running. Unfortunately ls fails with a time-out.

I identified one directory that causes the trouble. I can run fsck on it but not ls.

What could be the problem?

Gert
Name node heap space problem
Hi,
I am running a Hadoop DFS on a cluster of 5 data nodes with a name node and one secondary name node.

I have 1788874 files and directories, 1465394 blocks = 3254268 total. Heap Size max is 3.47 GB.

My problem is that I produce many small files. Therefore I have a cron job which runs daily across the new files, copies them into bigger files, and deletes the small files.

Apart from this program, even an fsck kills the cluster.

The problem is that, as soon as I start this program, the heap space of the name node reaches 100 %.

What could be the problem? There are not many small files right now and still it doesn't work. I guess we have had this problem since the upgrade to 0.17.

Here is some additional data about the DFS:
Capacity : 2 TB
DFS Remaining : 1.19 TB
DFS Used : 719.35 GB
DFS Used% : 35.16 %

Thanks for hints,
Gert
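The message does not say how the daily job decides which small files go into which bigger file. One simple possibility, shown here purely as an assumption, is to walk the new files in order and start a new output file whenever the next one would push the current batch over a target size. `plan_batches` and the sample sizes are hypothetical, and the sketch only plans the grouping without touching any file system.

```python
def plan_batches(files, target_bytes):
    """Group (name, size) pairs, in order, into batches of at most
    target_bytes. A single file larger than the target gets its own batch."""
    batches = []
    current, current_size = [], 0
    for name, size in files:
        # Close the current batch if adding this file would exceed the target.
        if current and current_size + size > target_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches


small_files = [("a", 40), ("b", 50), ("c", 30), ("d", 90), ("e", 10)]
batches = plan_batches(small_files, target_bytes=100)
print(batches)
```

Each batch becomes one file instead of several, which is what reduces the number of entries the name node has to keep in memory.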