HDFS disk space requirement

2013-01-10 Thread Panshul Whisper
Hello, I have a Hadoop cluster of 5 nodes with a total of 130 GB of available HDFS space and replication set to 5. I have a 115 GB file that needs to be copied to HDFS and processed. Do I need any more HDFS space to perform all the processing without running into problems? or is

Re: HDFS disk space requirement

2013-01-10 Thread பாலாஜி நாராயணன்
If the replication factor is 5, you will need at least 5x the space of the file. So this is not going to be enough. On Thursday, January 10, 2013, Panshul Whisper wrote: Hello, I have a hadoop cluster of 5 nodes with a total of available HDFS space 130 GB with replication set to 5. I have a
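
As a rough illustration of the rule above, here is a minimal Python sketch. The function name `raw_space_needed_gb` is hypothetical, and the sketch assumes that the 130 GB figure is raw capacity summed over all DataNodes. It ignores block metadata, non-DFS usage, and temporary job output, so treat the result as a lower bound only.

```python
def raw_space_needed_gb(logical_gb, replication):
    """Raw HDFS space consumed by data of the given logical size.

    Every block is stored `replication` times, so the raw footprint is
    simply the logical size multiplied by the replication factor.
    """
    if replication < 1:
        raise ValueError("replication must be at least 1")
    return logical_gb * replication


# Figures from the thread; `available` is assumed to be raw capacity.
needed = raw_space_needed_gb(115, 5)
available = 130
print(f"needed={needed} GB, available={available} GB, fits={needed <= available}")
```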

Re: HDFS disk space requirement

2013-01-10 Thread Ravi Mutyala
If the file is a text file, you could get a good compression ratio. Change the replication to 3 and the file will fit. But I am not sure what your use case is or what you want to achieve by putting this data there. Any transformation on this data and you would need more space to save the transformed data.
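
To see how compression and a lower replication factor interact, the following Python sketch may help. It assumes the 130 GB is raw capacity across the cluster. The helper names and the 0.25 compressed-size fraction are hypothetical, chosen only for illustration; real compression ratios depend on the data and the codec. Under that assumption, replication 3 alone is not enough for 115 GB, and the data fits only once it is compressed to roughly 38% of its original size or less.

```python
def max_compressed_fraction(capacity_gb, logical_gb, replication):
    """Largest compressed_size / original_size fraction that still fits."""
    return capacity_gb / (logical_gb * replication)


def fits(capacity_gb, logical_gb, replication, compressed_fraction=1.0):
    """True if the (optionally compressed) data fits at this replication."""
    return logical_gb * compressed_fraction * replication <= capacity_gb


capacity, logical = 130, 115  # figures from the thread, assumed raw capacity

print(fits(capacity, logical, 3))        # uncompressed, replication 3
print(fits(capacity, logical, 3, 0.25))  # hypothetical 4:1 compression
print(round(max_compressed_fraction(capacity, logical, 3), 3))
```

None of this accounts for the extra space that transformed output would need, which the reply above also warns about.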

Re: HDFS disk space requirement

2013-01-10 Thread Panshul Whisper
Thank you for the response. Actually, it is not a single file; I have JSON files that amount to 115 GB. These JSON files need to be processed and loaded into HBase tables on the same cluster for later processing. Not considering the disk space required for the HBase storage, if I reduce the

Re: HDFS disk space requirement

2013-01-10 Thread Alexander Pivovarov
finish elementary school first. (plus, minus operations at least) On Thu, Jan 10, 2013 at 7:23 PM, Panshul Whisper ouchwhis...@gmail.com wrote: Thank you for the response. Actually it is not a single file, I have JSON files that amount to 115 GB, these JSON files need to be processed and

Re: HDFS disk space requirement

2013-01-10 Thread shashwat shriparv
115 * 5 = 575 GB is the minimum you need. Keep in mind that this is only the minimum, and you will have other disk space needs too... ∞ Shashwat Shriparv On Fri, Jan 11, 2013 at 11:19 AM, Alexander Pivovarov apivova...@gmail.com wrote: finish elementary school first. (plus, minus operations at least) On Thu, Jan