into a lot of other limits first: hard drive space, regionserver memory, the
infamous ulimit/xcievers pair :), etc...
Take care,
-stu
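For anyone who hits the xcievers limit Stuart mentions: it is the datanode's cap on concurrent transfer threads, set via the (genuinely misspelled) dfs.datanode.max.xcievers property in hdfs-site.xml. A hedged sketch of the usual tuning; the value 4096 is only an illustrative assumption, not a recommendation:

```xml
<!-- hdfs-site.xml fragment: raise the datanode's concurrent
     transfer-thread ceiling. The property name really is spelled
     "xcievers" in Hadoop of this era; 4096 is illustrative only. -->
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>
```

The OS open-file limit (ulimit -n) for the user running the datanode usually needs raising in step with this.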
--- On Wed, 2/2/11, Dhruba Borthakur dhr...@gmail.com wrote:
From: Dhruba Borthakur dhr...@gmail.com
Subject: Re: HDFS without Hadoop: Why?
To: hdfs-user@hadoop.apache.org
- Large block size wastes space for small files. The minimum file size
is 1 block.
That's incorrect. If a file is smaller than the block size, it will only
consume as much space as there is data in the file.
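Dhruba's correction is easy to check with arithmetic. A minimal sketch, where the block size, file size, and replication factor are illustrative assumptions:

```python
# Disk usage for a small file in HDFS: a file only occupies its actual
# length on datanode disks (times replication), not a full block.
block_size = 64 * 1024 * 1024   # a common HDFS default of this era
file_size = 1 * 1024 * 1024     # a 1 MB file
replication = 3

disk_used = file_size * replication     # bytes actually consumed on disk
naive_guess = block_size * replication  # what "minimum file size is 1 block" would imply

print(disk_used)    # 3145728 (3 MB)
print(naive_guess)  # 201326592 (192 MB)
```

The real cost of small files is namenode metadata, not datanode disk, as the per-file byte figures later in the thread show.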
- There are no hardlinks, softlinks, or quotas.
That's incorrect; HDFS does support quotas (per-directory name and space quotas).
track…?
Thanks Ian for the haystack link…very informative indeed.
-Chinmay
From: Stuart Smith [mailto:stu24m...@yahoo.com]
Sent: Wednesday, February 02, 2011 4:41 PM
To: hdfs-user@hadoop.apache.org
Subject: RE: HDFS without Hadoop: Why?
Hello,
I'm actually using hbase/hadoop/hdfs for lots of small files (with a
long tail of larger files). Well, millions
From: Dhruba Borthakur dhr...@gmail.com
Subject: Re: HDFS without Hadoop: Why?
To: hdfs-user@hadoop.apache.org
Date: Wednesday, February 2, 2011, 9:00 PM
The Namenode uses around 160 bytes/file and 150 bytes/block in HDFS. This
is a very rough calculation.
dhruba
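Dhruba's rule of thumb makes namenode heap sizing a one-liner. A rough sketch; the 20-million-file workload below is an illustrative assumption:

```python
# Rough namenode heap estimate from the rule of thumb in this thread:
# ~160 bytes per file plus ~150 bytes per block of in-memory metadata.
BYTES_PER_FILE = 160
BYTES_PER_BLOCK = 150

def namenode_heap_bytes(num_files, avg_blocks_per_file=1):
    """Approximate namenode memory needed for file + block metadata."""
    num_blocks = num_files * avg_blocks_per_file
    return num_files * BYTES_PER_FILE + num_blocks * BYTES_PER_BLOCK

# e.g. 20 million small files, one block each:
print(namenode_heap_bytes(20_000_000) / 2**30)  # ≈ 5.77 GiB
```

This is why a small-file workload stresses the namenode long before it stresses datanode disks.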
On Wed, Feb 2, 2011 at 5:11 PM, Dhodapkar, Chinmay wrote:
I think it's 8 GB of RAM used...
Take care,
-stu
--- On Wed, 2/2/11, Gaurav Sharma gaurav.gs.sha...@gmail.com wrote:
From: Gaurav Sharma gaurav.gs.sha...@gmail.com
Subject: Re: HDFS without Hadoop: Why?
To: hdfs-user@hadoop.apache.org
Date: Wednesday, February 2, 2011, 9:31 PM
Stuart - if Dhruba
On Mon, Jan 31, 2011 at 6:34 PM, Sean Bigdatafun
sean.bigdata...@gmail.com wrote:
I feel this is a great discussion, so let's think of HDFS' customers.
(1) MapReduce --- definitely a perfect fit as Nathan has pointed out
I would add the caveat that this depends on your particular weighting
HBase is a database that runs on top of HDFS. So that's another one. It has an
append-only usage pattern, which makes it a good fit.
I don't see how not-so-commodity hardware could go without replication to
achieve the same as HDFS. It's not only about data safety, but also about
availability.
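The availability point can be made with a back-of-the-envelope calculation. A sketch, where the per-node unavailability p and independence of failures are pure assumptions:

```python
# Why replication buys availability, not just durability: with n
# replicas on distinct datanodes and independent per-node
# unavailability p, a read fails only if every replica holder is down.
p = 0.01  # assumed chance a given datanode is down or unreachable
for n in (1, 2, 3):
    print(n, 1 - p**n)  # probability at least one replica is reachable
```

With three replicas the data also stays readable through two simultaneous node failures, which single-copy storage on "better" hardware cannot offer.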
Best,
-stu
-Original Message-
From: Nathan Rutman nrut...@gmail.com
Date: Tue, 25 Jan 2011 17:31:25
To: hdfs-user@hadoop.apache.org
Reply-To: hdfs-user@hadoop.apache.org
Subject: Re: HDFS without Hadoop: Why?
On Jan 25, 2011, at 5:08 PM, stu24m...@yahoo.com wrote:
I don't think, as a recovery strategy, RAID scales
I believe for most people, the answer is Yes
-Original Message-
From: Nathan Rutman nrut...@gmail.com
Date: Wed, 26 Jan 2011 09:41:37
To: hdfs-user@hadoop.apache.org
Reply-To: hdfs-user@hadoop.apache.org
Subject: Re: HDFS without Hadoop: Why?
Ok. Is your statement, I use HDFS
I have a very general question on the usefulness of HDFS for purposes other
than running distributed compute jobs for Hadoop. Hadoop and HDFS seem very
popular these days, but the use of HDFS for other purposes (database backend,
records archiving, etc) confuses me, since there are other free
Hi Nathan,
On Jan 25, 2011, at 3:56 PM, Gerrit Jansen van Vuuren wrote:
Hi,
Why would 3x data seem wasteful?
This is exactly what you want. I would never store any serious business data
without some form of replication.
I agree that you want data backup, but 3x replication is the least efficient
way to get it.
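The efficiency comparison behind this is simple to quantify. A sketch of raw-to-usable storage ratios; the 10+2 parity layout is an illustrative RAID-6-style assumption, not a claim about any particular product:

```python
# Raw bytes stored per byte of user data under two redundancy schemes.
def replication_overhead(replicas):
    """n-way replication stores n raw bytes per usable byte."""
    return replicas / 1.0

def parity_overhead(data_disks, parity_disks):
    """Parity striping stores (d + p) / d raw bytes per usable byte."""
    return (data_disks + parity_disks) / data_disks

print(replication_overhead(3))  # 3.0x for HDFS default replication
print(parity_overhead(10, 2))   # 1.2x for a 10+2 parity stripe
```

The trade-off is that parity schemes pay in rebuild cost and lose the data-locality and read-parallelism benefits that replication gives MapReduce.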
Best,
-stu
-Original Message-
From: Nathan Rutman nrut...@gmail.com
Date: Tue, 25 Jan 2011 16:32:07
To: hdfs-user@hadoop.apache.org
Reply-To: hdfs-user@hadoop.apache.org
Subject: Re: HDFS without Hadoop: Why?