Re: Using a hard drive instead of

2012-10-17 Thread Michael Segel
Meh. If you are worried about the memory constraints of a Linux system, I'd say go with MapR and their CLDB. I just did a quick look at Supermicro servers and found that on a 2U server 768GB was the max. So how many blocks can you store in that much memory? I only have 10 fingers and toes so
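
A rough back-of-the-envelope answer to the "how many blocks" question can be sketched as below. It assumes the commonly quoted rule of thumb of about 150 bytes of NameNode heap per namespace object (file, directory, or block); that figure is not from this thread, and the function name is purely illustrative. It also ignores JVM and OS overhead, so treat the result as an upper bound, not a sizing guide.

```python
# Assumption (not from the thread): ~150 bytes of NameNode heap per
# namespace object (file, directory, or block) is a rule of thumb only.
BYTES_PER_OBJECT = 150

# 768 GB as quoted in the message, interpreted here as GiB.
RAM_BYTES = 768 * 1024**3


def max_objects(ram_bytes, bytes_per_object=BYTES_PER_OBJECT):
    """Upper bound on namespace objects that fit if all RAM held metadata."""
    return ram_bytes // bytes_per_object


if __name__ == "__main__":
    objects = max_objects(RAM_BYTES)
    print(f"~{objects / 1e9:.1f} billion objects")  # ~5.5 billion objects
```

Under these assumptions the ceiling is on the order of a few billion objects; real deployments would see less, since the whole 768 GB would never be available as heap.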

Re: Using a hard drive instead of

2012-10-17 Thread Colin McCabe
Hi Mark, HDFS contains a write-ahead log which will protect you from power failure. It's called the edit log. If you want warm failover, you can use HDFS HA, which is available in recent versions of HDFS. Hope this helps. Colin On Wed, Oct 17, 2012 at 3:44 PM, Mark Kerzner wrote: > Colin, >

Re: Using a hard drive instead of

2012-10-17 Thread Mark Kerzner
Colin, swap space would give me very high memory, up to 1 TB, say, but it won't protect me from power failure. Very large clusters are only one application of this idea; warm failover is another. The drive is not theoretical, just look at the price tag :) Mark On Wed, Oct 17, 2012 at 5:37 PM, Col

Re: Using a hard drive instead of

2012-10-17 Thread Colin Patrick McCabe
The direct answer to your question is to use this theoretical super-fast hard drive as Linux swap space. The better answer is to use federation or another solution if your needs exceed those servable by a single NameNode. Cheers. Colin On Oct 11, 2012 9:00 PM, "Mark Kerzner" wrote: > Hi, > > Im

Re: Using a hard drive instead of

2012-10-13 Thread Mark Kerzner
ber 12, 2012 12:01 AM > > *Subject:* Re: Using a hard drive instead of > > This is why memory-mapped files were invented. > > On Thu, Oct 11, 2012 at 9:34 PM, Gaurav Sharma > wrote: > > If you don't mind sharing, what hard drive do you have with these > > propert

Re: Using a hard drive instead of

2012-10-12 Thread Ravi Prakash
use special hardware, thinking of which, really it's not so special, so maybe worth trying). Has anyone tried this before? From: Lance Norskog To: user@hadoop.apache.org Sent: Friday, October 12, 2012 12:01 AM Subject: Re: Using a hard drive instead of

Re: Using a hard drive instead of

2012-10-11 Thread Lance Norskog
This is why memory-mapped files were invented. On Thu, Oct 11, 2012 at 9:34 PM, Gaurav Sharma wrote: > If you don't mind sharing, what hard drive do you have with these > properties: > -"performance of RAM" > -"can accommodate very many threads" > > > On Oct 11, 2012, at 21:27, Mark Kerzner wrot
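
For readers unfamiliar with the memory-mapped files mentioned in this reply, here is a minimal sketch of the general technique using Python's standard `mmap` module. It is not how the NameNode stores its metadata; the fixed-size record layout and the helper names are hypothetical and chosen only for illustration.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-size record: a single unsigned 64-bit value.
RECORD = struct.Struct("<Q")


def write_record(mm, index, value):
    """Store a value at a record slot, addressing the file like an array."""
    RECORD.pack_into(mm, index * RECORD.size, value)


def read_record(mm, index):
    """Read a value back from a record slot by random access."""
    return RECORD.unpack_from(mm, index * RECORD.size)[0]


def demo(num_records=1024):
    fd, path = tempfile.mkstemp()
    try:
        # Size the backing file; the new bytes are zero-filled.
        os.ftruncate(fd, num_records * RECORD.size)
        with mmap.mmap(fd, num_records * RECORD.size) as mm:
            write_record(mm, 7, 123456789)
            write_record(mm, 1000, 42)
            mm.flush()  # ask the OS to write dirty pages to the file
            return read_record(mm, 7), read_record(mm, 1000), read_record(mm, 0)
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

The operating system pages the file in and out on demand, so the program reads and writes it with ordinary random access while the data lives on disk. Access is only as fast as RAM for pages that are currently resident.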

Re: Using a hard drive instead of

2012-10-11 Thread Gaurav Sharma
If you don't mind sharing, what hard drive do you have with these properties: -"performance of RAM" -"can accommodate very many threads" On Oct 11, 2012, at 21:27, Mark Kerzner wrote: > Harsh, > > I agree with you about many small files, and I was giving this only in way of > example. However

Re: Using a hard drive instead of

2012-10-11 Thread Mark Kerzner
Harsh, I agree with you about many small files, and I was giving this only by way of example. However, the hard drive I am talking about can be 1-2 TB in size, and that's pretty good; you can't easily get that much memory. In addition, it would be more resistant to power failures than RAM. And yes

Re: Using a hard drive instead of

2012-10-11 Thread Harsh J
Hi Mark, Note that the NameNode does random memory access to serve back any information or mutate request you send to it, and that there can be a number of concurrent clients. So do you mean a 'very fast hard drive' that's faster than the RAM for random access itself? The NameNode does persis

Using a hard drive instead of

2012-10-11 Thread Mark Kerzner
Hi, Imagine I have a very fast hard drive that I want to use for the NameNode. That is, I want the NameNode to store its blocks information on this hard drive instead of in memory. Why would I do it? Scalability (no federation needed), many files are not a problem, and warm fail-over is automatic