
Re: HDFS file system size issue

2014-04-15 Thread Saumitra Shahapure

Hi Rahman,

These are a few lines from hadoop fsck / -blocks -files -locations:

/mnt/hadoop/hive/warehouse/user.db/table1/000255_0 44323326 bytes, 1 block(s):  OK
0. blk_-7919979022650423857_446500 len=44323326 repl=3 [ip1:50010, ip2:50010, ip3:50010]

/mnt/hadoop/hive/warehouse/user.db/table1/000256_0 44566965 bytes, 1 block(s):  OK
0. blk_-576894812882540_446288 len=44566965 repl=3 [ip1:50010, ip2:50010, ip4:50010]
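
For anyone who wants to tally replication from the full listing rather than read it by eye, here is a minimal sketch. It assumes block lines follow the "blk_<id>_<genstamp> len=<bytes> repl=<n>" shape seen in the two entries above; the helper name summarize_blocks is hypothetical, and the sample text is just those two entries with the wrapped lines rejoined.

```python
import re

# Two block lines copied from the fsck output above (wrapped lines rejoined).
FSCK_LINES = """\
0. blk_-7919979022650423857_446500 len=44323326 repl=3 [ip1:50010, ip2:50010, ip3:50010]
0. blk_-576894812882540_446288 len=44566965 repl=3 [ip1:50010, ip2:50010, ip4:50010]
"""

# Assumed line shape: blk_<id>_<genstamp> len=<bytes> repl=<replica count>
BLOCK_RE = re.compile(r"blk_-?\d+_\d+\s+len=(\d+)\s+repl=(\d+)")


def summarize_blocks(text):
    """Return (logical_bytes, bytes_across_replicas) for block lines in text."""
    logical = 0
    raw = 0
    for match in BLOCK_RE.finditer(text):
        length = int(match.group(1))
        repl = int(match.group(2))
        logical += length          # size as seen by hadoop dfs -dus
        raw += length * repl       # size once every replica is counted
    return logical, raw


logical, raw = summarize_blocks(FSCK_LINES)
print(f"logical bytes: {logical}, bytes across replicas: {raw}")
```

Feeding the complete fsck output through the same function would give the replica-weighted total to compare against the NameNode's DFS Used figure.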


Biswa may have guessed the replication factor from the fsck summary that I
posted earlier. I am posting it again for today's run:

Status: HEALTHY
 Total size:                    58143055251 B
 Total dirs:                    307
 Total files:                   5093
 Total blocks (validated):      3903 (avg. block size 14897016 B)
 Minimally replicated blocks:   3903 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       92 (2.357161 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    2
 Average block replication:     3.1401486
 Corrupt blocks:                0
 Missing replicas:              92 (0.75065273 %)
 Number of data-nodes:          9
 Number of racks:               1
FSCK ended at Tue Apr 15 13:20:25 UTC 2014 in 655 milliseconds


The filesystem under path '/' is HEALTHY

I have not overridden dfs.datanode.du.reserved. It defaults to 0.

$ less $HADOOP_HOME/conf/hdfs-site.xml | grep -A3 'dfs.datanode.du.reserved'
$ less $HADOOP_HOME/src/hdfs/hdfs-default.xml | grep -A3 'dfs.datanode.du.reserved'
  <name>dfs.datanode.du.reserved</name>
  <value>0</value>
  <description>Reserved space in bytes per volume. Always leave this much space free for non dfs use.
  </description>

Below is du -h on every node. FYI, my dfs.data.dir is /mnt/hadoop/dfs/data
and all hadoop/hive logs are dumped in /mnt/logs in various directories.
All machines have 400GB for /mnt.

$ for i in `echo $dfs_slaves`; do ssh $i 'du -sh /mnt/hadoop; du -sh /mnt/hadoop/dfs/data; du -sh /mnt/logs;'; done

225G    /mnt/hadoop
224G    /mnt/hadoop/dfs/data
61M     /mnt/logs

281G    /mnt/hadoop
281G    /mnt/hadoop/dfs/data
63M     /mnt/logs

139G    /mnt/hadoop
139G    /mnt/hadoop/dfs/data
68M     /mnt/logs

135G    /mnt/hadoop
134G    /mnt/hadoop/dfs/data
92M     /mnt/logs

165G    /mnt/hadoop
164G    /mnt/hadoop/dfs/data
75M     /mnt/logs

137G    /mnt/hadoop
137G    /mnt/hadoop/dfs/data
95M     /mnt/logs

160G    /mnt/hadoop
160G    /mnt/hadoop/dfs/data
74M     /mnt/logs

180G    /mnt/hadoop
122G    /mnt/hadoop/dfs/data
23M     /mnt/logs

139G    /mnt/hadoop
138G    /mnt/hadoop/dfs/data
76M     /mnt/logs



All these numbers are from today and may differ a bit from yesterday's.

Today hadoop dfs -dus reports 58GB, while the NameNode reports DFS Used as
1.46TB.

Pardon me for cluttering the mail with so many copy-pastes; I hope it's still
readable.

-- Saumitra S. Shahapure
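
The figures in this message can be cross-checked with a few lines of arithmetic. The sketch below sums the nine per-node du values for /mnt/hadoop/dfs/data and compares them with what the fsck summary would predict (total size times average block replication). It assumes du's "G" suffix means GiB, which is the GNU du default; the variable and function names are made up for illustration.

```python
# Per-DataNode `du -sh /mnt/hadoop/dfs/data` values from the listing above.
# Assumption: du's "G" suffix means GiB (2**30 bytes), the GNU du default.
DU_DATA_GIB = [224, 281, 139, 134, 164, 137, 160, 122, 138]

TOTAL_SIZE_BYTES = 58143055251   # "Total size" in today's fsck summary
AVG_REPLICATION = 3.1401486      # "Average block replication" in the same summary
GIB = 1024 ** 3


def expected_raw_gib(logical_bytes, replication):
    """Space the live files should occupy once every replica is counted."""
    return logical_bytes * replication / GIB


on_disk_gib = sum(DU_DATA_GIB)
expected_gib = expected_raw_gib(TOTAL_SIZE_BYTES, AVG_REPLICATION)

print(f"on disk under dfs.data.dir: {on_disk_gib} GiB ({on_disk_gib / 1024:.2f} TiB)")
print(f"expected from fsck:         {expected_gib:.0f} GiB")
print(f"not accounted for:          {on_disk_gib - expected_gib:.0f} GiB")
```

The du values add up to 1499 GiB, about 1.46 TiB, which agrees with the NameNode's DFS Used figure, while the files that fsck can see account for only about 170 GiB of it. So the extra space sits inside dfs.data.dir itself and not in logs or other non-DFS data.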



Re: HDFS file system size issue

2014-04-14 Thread Biswajit Nayak

What's the replication factor you have? I believe it should be 3. hadoop dfs
-dus shows disk usage without replication, while the name node UI page shows
usage with replication.

38gb * 3 =114gb ~ 1TB

~Biswa
-oThe important thing is not to stop questioning o-
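
A quick check of the estimate above, as a sketch that uses the fsck figures from the original post (total size 38086441332 B, average block replication 3.0439699). The constant and function names are only illustrative.

```python
TOTAL_SIZE_BYTES = 38086441332   # "Total size" from the fsck summary in the original post
AVG_REPLICATION = 3.0439699      # "Average block replication" from the same summary

GB = 10 ** 9    # decimal gigabyte
GIB = 2 ** 30   # binary gigabyte
TB = 10 ** 12   # decimal terabyte


def replicated_bytes(logical_bytes, replication):
    """Bytes consumed on disk once every replica is counted."""
    return logical_bytes * replication


raw = replicated_bytes(TOTAL_SIZE_BYTES, AVG_REPLICATION)

print(f"logical size: {TOTAL_SIZE_BYTES / GB:.1f} GB = {TOTAL_SIZE_BYTES / GIB:.1f} GiB")
print(f"with replication: {raw / GB:.1f} GB = {raw / TB:.2f} TB")
```

Replication brings 38 GB to roughly 116 GB, which is about a tenth of a terabyte, so replication alone does not explain a DFS Used value of ~1TB. The same total size is about 35.5 GiB, which is probably why both 35GB and 38GB appear in this thread.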


-- 
_
The information contained in this communication is intended solely for the 
use of the individual or entity to whom it is addressed and others 
authorized to receive it. It may contain confidential or legally privileged 
information. If you are not the intended recipient you are hereby notified 
that any disclosure, copying, distribution or taking any action in reliance 
on the contents of this information is strictly prohibited and may be 
unlawful. If you have received this communication in error, please notify 
us immediately by responding to this email and then delete it from your 
system. The firm is neither liable for the proper and complete transmission 
of the information contained in this communication nor for any delay in its 
receipt.

Re: HDFS file system size issue

2014-04-14 Thread Sandeep Nemuri

Please check your logs directory usage.



-- 
--Regards
  Sandeep Nemuri

Re: HDFS file system size issue

2014-04-14 Thread Abdelrahman Shettia

Hi Biswa, 

Are you sure that the replication factor of the files is three? Please run
‘hadoop fsck / -blocks -files -locations’ and check the replication factor for
each file. Also, please post the configuration of
<name>dfs.datanode.du.reserved</name> and check the real space used on
each DataNode by running ‘du -h’.

Thanks,
Rahman

On Apr 14, 2014, at 2:07 PM, Saumitra saumitra.offic...@gmail.com wrote:

 Hello,
 
 Biswanath, looks like we have confusion in calculation, 1TB would be equal to 
 1024GB, not 114GB.
 
 
 Sandeep, I checked log directory size as well. Log directories are hardly in 
 few GBs, I have configured log4j properties so that logs won’t be too large.
 
 In our slave machines, we have 450GB disk partition for hadoop logs and DFS. 
 Over there logs directory is  10GBs and rest space is occupied by DFS. 10GB 
 partition is for /.
 
 Let me quote my confusion point once again:
 
 Basically I wanted to point out discrepancy in name node status page and 
 hadoop dfs -dus. In my case, earlier one reports DFS usage as 1TB and later 
 one reports it to be 35GB. What are the factors that can cause this 
 difference? And why is just 35GB data causing DFS to hit its limits?
 
 
 
 I am talking about name node status page on 50070 port. Here is the 
 screenshot of my name node status page
 
 Screen Shot 2014-04-15 at 2.07.19 am.png
 
 As I understand, 'DFS used' is the space taken by DFS, and 'non-DFS used' is
 the space taken by non-DFS data such as logs or other local files from users.
 The NameNode shows that DFS used is ~1TB, but hadoop dfs -dus shows it to be ~38GB.
 
 
 

HDFS file system size issue

2014-04-13 Thread Saumitra

Hello,

We are running HDFS on a 9-node Hadoop cluster (Hadoop version 1.2.1) with the
default HDFS block size.

We have noticed that the disks of the slaves are almost full. The name node's
status page (namenode:50070) shows that the disks of the live nodes are 90%
full and that DFS Used in the cluster summary is ~1TB.

However, hadoop dfs -dus / shows that the file system size is merely 38GB. The
38GB number looks correct because we keep only a few Hive tables and Hadoop's
/tmp (distributed cache and job outputs) in HDFS. All other data is cleaned up.
I cross-checked this with hadoop dfs -ls. I also think there is no internal
fragmentation, because the files in our Hive tables are well chopped into
~50MB chunks. Here are the last few lines of hadoop fsck / -files -blocks:

Status: HEALTHY
 Total size:                    38086441332 B
 Total dirs:                    232
 Total files:                   802
 Total blocks (validated):      796 (avg. block size 47847288 B)
 Minimally replicated blocks:   796 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       6 (0.75376886 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    2
 Average block replication:     3.0439699
 Corrupt blocks:                0
 Missing replicas:              6 (0.24762692 %)
 Number of data-nodes:          9
 Number of racks:               1
FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds


My question is: why are the disks of the slaves getting full even though there
are only a few files in DFS?

Re: HDFS file system size issue

2014-04-13 Thread Biswajit Nayak

Hi Saumitra,

Could you please check the non-DFS usage? It also contributes to filling up
the disk space.



~Biswa
-oThe important thing is not to stop questioning o-



Re: HDFS file system size issue

2014-04-13 Thread Saumitra

Hi Biswajeet,

Non-DFS usage is ~100GB across the cluster, but that number is still nowhere
near 1TB.

Basically, I wanted to point out the discrepancy between the name node status
page and hadoop dfs -dus. In my case, the former reports DFS usage as 1TB and
the latter reports it as 35GB. What factors can cause this difference? And
why is just 35GB of data causing DFS to hit its limits?




