Re: Cluster Machines

2009-11-04 Thread John Martyniak
Allen, That is kind of what I was thinking, to put both of them in the same partition, but I noticed that there was a configuration to change the directory of HBASE. So I thought that I would throw it out there. It would be good to meet up at ApacheCon, unfortunately I am at an East Coast

Re: Cluster Machines

2009-11-04 Thread Allen Wittenauer
On 11/4/09 8:15 AM, "John Martyniak" wrote:
> Kind of off topic, maybe more appropriately cross topic.
>
> So since we are using Hadoop, HDFS, and HBASE, should there be 3
> partitions? One for each system? Or can HBase and HDFS share a
> partition. There will be a lot of HBASE data, and also

Re: Cluster Machines

2009-11-04 Thread John Martyniak
Allen, Kind of off topic, maybe more appropriately cross topic. So since we are using Hadoop, HDFS, and HBASE, should there be 3 partitions? One for each system? Or can HBase and HDFS share a partition? There will be a lot of HBASE data, and also a lot of HDFS data. -John On Nov 3, 2

Re: Cluster Machines

2009-11-04 Thread John Martyniak
Allen, Those are some nice "toys" to play with. I agree on the VM IO issues, but I am going to try it and see; my cluster won't be that big to start, so any IO issues might not manifest themselves until it is bigger. But we will see. So I will probably also do 100 or 200 GB for MR, and the re

Re: Cluster Machines

2009-11-03 Thread Allen Wittenauer
On 11/3/09 2:29 PM, "John Martyniak" wrote:
> Would you mind telling me the kinds of configured servers that you are
> running?
Our 'real' grid is comprised of shiny Sun 4275s. But our 'non-real' grid is composed of two types of machines with radically different disk configurations (size *and* n

Re: Cluster Machines

2009-11-03 Thread John Martyniak
Allen, Thanks for the info. Very helpful. Would you mind telling me the kinds of configured servers that you are running? Also, have you had any experience running namenodes or zookeeper on a VM? I have a couple of much larger boxes that are being used to run VMs, and was thinking of pu

Re: Cluster Machines

2009-11-03 Thread Allen Wittenauer
On 11/3/09 5:25 AM, "John Martyniak" wrote:
> 1) Should each node have RAID 1, or is it sufficient to have HDFS take
> care of that? Because for each node I could put a 80 GB drive for the
> boot drive and leave one of the 1.5 TBs for the data drive that Hadoop
> uses.
HDFS will take care of

Cluster Machines

2009-11-03 Thread John Martyniak
Hi, I am in the process of setting up a Hadoop cluster, starting small at first but rapidly growing. I plan on running the following: Hadoop, HDFS, HBase, Nutch and Mahout. I am starting with 2 machines to get all of the mechanics worked out, and then growing the cluster. The first two