Re: HDFS without Hadoop: Why?

2011-02-03 Thread Nathan Rutman
into a lot of other limits first: hard drive space, regionserver memory, the infamous ulimit/xciever :), etc... Take care, -stu --- On Wed, 2/2/11, Dhruba Borthakur dhr...@gmail.com wrote: From: Dhruba Borthakur dhr...@gmail.com Subject: Re: HDFS without Hadoop: Why? To: hdfs-user

Re: HDFS without Hadoop: Why?

2011-02-02 Thread Jeff Hammerbacher
- Large block size wastes space for small file. The minimum file size is 1 block. That's incorrect. If a file is smaller than the block size, it will only consume as much space as there is data in the file. - There are no hardlinks, softlinks, or quotas. That's incorrect;
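To make the correction about small files concrete, here is a minimal sketch comparing the space a file's data occupies with what it would occupy if every file were rounded up to a whole block. The 64 MiB block size and the function names are assumptions chosen for illustration. Replication and per-block checksum metadata are deliberately ignored.

```python
import math

MiB = 1024 * 1024
BLOCK_SIZE = 64 * MiB  # assumed block size, for illustration only


def data_space(file_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Space taken by the file's data when the last block is only as
    large as the data it holds (the behaviour described in the reply)."""
    full_blocks, remainder = divmod(file_size, block_size)
    return full_blocks * block_size + remainder


def rounded_up_space(file_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Space the file would take if every block were padded to full size
    (the misconception being corrected)."""
    return math.ceil(file_size / block_size) * block_size


small_file = 1 * MiB
print(data_space(small_file) // MiB, "MiB actually used")
print(rounded_up_space(small_file) // MiB, "MiB if blocks were padded")
```

The point of the comparison is that a small file costs little disk space. What it does cost is a file entry and a block entry in Namenode memory.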

Re: HDFS without Hadoop: Why?

2011-02-02 Thread Ian Holsman
PM To: hdfs-user@hadoop.apache.org Subject: Re: HDFS without Hadoop: Why? Large block size wastes space for small file. The minimum file size is 1 block. That's incorrect. If a file is smaller than the block size, it will only consume as much space as there is data in the file

RE: HDFS without Hadoop: Why?

2011-02-02 Thread Dhodapkar, Chinmay
track…? Thanks Ian for the haystack link…very informative indeed. -Chinmay From: Stuart Smith [mailto:stu24m...@yahoo.com] Sent: Wednesday, February 02, 2011 4:41 PM To: hdfs-user@hadoop.apache.org Subject: RE: HDFS without Hadoop: Why? Hello, I'm actually using hbase/hadoop/hdfs for lots

Re: HDFS without Hadoop: Why?

2011-02-02 Thread Dhruba Borthakur
:* Stuart Smith [mailto:stu24m...@yahoo.com] *Sent:* Wednesday, February 02, 2011 4:41 PM *To:* hdfs-user@hadoop.apache.org *Subject:* RE: HDFS without Hadoop: Why? Hello, I'm actually using hbase/hadoop/hdfs for lots of small files (with a long tail of larger files). Well, millions

Re: HDFS without Hadoop: Why?

2011-02-02 Thread Gaurav Sharma
...@gmail.com Subject: Re: HDFS without Hadoop: Why? To: hdfs-user@hadoop.apache.org Date: Wednesday, February 2, 2011, 9:00 PM The Namenode uses around 160 bytes/file and 150 bytes/block in HDFS. This is a very rough calculation. dhruba On Wed, Feb 2, 2011 at 5:11 PM, Dhodapkar, Chinmay
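As a rough illustration of the figures quoted above (about 160 bytes per file and 150 bytes per block), the sketch below estimates Namenode memory for a given file count. The function name, the one-block-per-file default, and the 10 million file example are assumptions. The result inherits the "very rough calculation" caveat from the message itself.

```python
BYTES_PER_FILE = 160   # rough per-file figure quoted in the thread
BYTES_PER_BLOCK = 150  # rough per-block figure quoted in the thread


def estimate_namenode_bytes(num_files: int, blocks_per_file: int = 1) -> int:
    """Very rough Namenode memory estimate, in bytes.

    Assumes every file has the same number of blocks; small files
    typically occupy a single block each.
    """
    num_blocks = num_files * blocks_per_file
    return num_files * BYTES_PER_FILE + num_blocks * BYTES_PER_BLOCK


# Hypothetical workload: 10 million small files, one block each.
estimate = estimate_namenode_bytes(10_000_000)
print(f"{estimate / 1e9:.1f} GB")
```

Under these assumptions the cost is 310 bytes per single-block file, so the estimate grows linearly with the number of files rather than with the amount of data stored.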

Re: HDFS without Hadoop: Why?

2011-02-02 Thread Stuart Smith
think it's 8 GB of RAM used... Take care, -stu --- On Wed, 2/2/11, Gaurav Sharma gaurav.gs.sha...@gmail.com wrote: From: Gaurav Sharma gaurav.gs.sha...@gmail.com Subject: Re: HDFS without Hadoop: Why? To: hdfs-user@hadoop.apache.org Date: Wednesday, February 2, 2011, 9:31 PM Stuart - if Dhruba

Re: HDFS without Hadoop: Why?

2011-01-31 Thread Nathan Rutman
On Mon, Jan 31, 2011 at 6:34 PM, Sean Bigdatafun sean.bigdata...@gmail.com wrote: I feel this is a great discussion, so let's think of HDFS' customers. (1) MapReduce --- definitely a perfect fit as Nathan has pointed out I would add the caveat that this depends on your particular weighting

Re: HDFS without Hadoop: Why?

2011-01-26 Thread Friso van Vollenhoven
HBase is a database that runs on top of HDFS. So that's another one. It has an append-only usage pattern, which makes it a good fit. I don't see how not-so-commodity hardware could go without replication to achieve the same as HDFS. It's not only about data safety, but also about availability.

Re: HDFS without Hadoop: Why?

2011-01-26 Thread Gerrit Jansen van Vuuren
, -stu -Original Message- From: Nathan Rutman nrut...@gmail.com Date: Tue, 25 Jan 2011 17:31:25 To: hdfs-user@hadoop.apache.org Reply-To: hdfs-user@hadoop.apache.org Subject: Re: HDFS without Hadoop: Why? On Jan 25, 2011, at 5:08 PM, stu24m...@yahoo.com wrote: I don't think

Re: HDFS without Hadoop: Why?

2011-01-26 Thread Nathan Rutman
- From: Nathan Rutman nrut...@gmail.com Date: Tue, 25 Jan 2011 17:31:25 To: hdfs-user@hadoop.apache.org Reply-To: hdfs-user@hadoop.apache.org Subject: Re: HDFS without Hadoop: Why? On Jan 25, 2011, at 5:08 PM, stu24m...@yahoo.com wrote: I don't think, as a recovery strategy, RAID scales

Re: HDFS without Hadoop: Why?

2011-01-26 Thread stu24mail
I believe for most people, the answer is Yes -Original Message- From: Nathan Rutman nrut...@gmail.com Date: Wed, 26 Jan 2011 09:41:37 To: hdfs-user@hadoop.apache.org Reply-To: hdfs-user@hadoop.apache.org Subject: Re: HDFS without Hadoop: Why? Ok. Is your statement, I use HDFS

HDFS without Hadoop: Why?

2011-01-25 Thread Nathan Rutman
I have a very general question on the usefulness of HDFS for purposes other than running distributed compute jobs for Hadoop. Hadoop and HDFS seem very popular these days, but the use of HDFS for other purposes (database backend, records archiving, etc) confuses me, since there are other free

RE: HDFS without Hadoop: Why?

2011-01-25 Thread Scott Golby
Hi Nathan, I have a very general question on the usefulness of HDFS for purposes other than running distributed compute jobs for Hadoop. Hadoop and HDFS seem very popular these days, but the use of HDFS for other purposes (database backend, records archiving, etc) confuses me, since

Re: HDFS without Hadoop: Why?

2011-01-25 Thread Nathan Rutman
On Jan 25, 2011, at 3:56 PM, Gerrit Jansen van Vuuren wrote: Hi, Why would 3x data seem wasteful? This is exactly what you want. I would never store any serious business data without some form of replication. I agree that you want data backup, but 3x replication is the least efficient

Re: HDFS without Hadoop: Why?

2011-01-25 Thread Nathan Rutman
. Best, -stu -Original Message- From: Nathan Rutman nrut...@gmail.com Date: Tue, 25 Jan 2011 16:32:07 To: hdfs-user@hadoop.apache.org Reply-To: hdfs-user@hadoop.apache.org Subject: Re: HDFS without Hadoop: Why? On Jan 25, 2011, at 3:56 PM, Gerrit Jansen van Vuuren wrote: Hi