Has anyone had any experience building high-performance datanodes using SSDs
for storage? Any gotchas or notes to contribute? We were kicking around a crazy
idea last night for some stupid-fast Hadoop nodes using SSD storage, and I
wanted to see if anyone had done something like this yet.
Thanks!
On May 18, 2011, at 4:54 PM, Aaron Eng wrote:
> Case in point, I noted a while ago that when you run the namenode -format
> command, it only accepts a capital Y (or lower case, can't remember), and it
> fails silently if you give the wrong case. I didn't particularly care enough
> to fix it, ha
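For reference, the exchange Aaron describes looks roughly like this on a
0.20-era install (the prompt wording here is from memory and the path is made
up, so treat this as a sketch):

    $ hadoop namenode -format
    Re-format filesystem in /data/dfs/name ? (Y or N) y
    Format aborted in /data/dfs/name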
ed OS drive. 48 of
those makes a nice speedy cluster.
-j
On May 10, 2011, at 1:57 PM, Allen Wittenauer wrote:
>
> On May 9, 2011, at 11:46 PM, Jonathan Disher wrote:
>
>> I can't speak for Will, but I'm actually going against recommendations; my
>> systems have thr
Will Maier wrote:
>
> Hi Jonathan-
>
> On Tue, May 10, 2011 at 05:50:03AM -0700, Jonathan Disher wrote:
>> I will preface this with a couple statements: a) it's almost 6am, and I've
>> been up all night b) I'm drugged up from an allergic reaction, so I may not
>> be firing on all 64 bits.
I will preface this with a couple statements: a) it's almost 6am, and I've been
up all night b) I'm drugged up from an allergic reaction, so I may not be
firing on all 64 bits.
Do I correctly understand the HDFS architecture in that the namenode is a
network bottleneck into the system? I.e., i
In a previous life I had extreme problems with XFS, including kernel panics
and data loss under high load. Those were database servers, not Hadoop nodes,
and it was a few years ago. But ext3/ext4 seem stable enough and are more
widely supported, so they're my preference.
-j
On May
Will Maier wrote:
> On Mon, May 09, 2011 at 05:07:29PM -0700, Jonathan Disher wrote:
> > Speak for yourself, I just built a bunch of 36 disk datanodes :)
>
> And I just unboxed 10 more 36 disk systems to join the two already in our
> cluster. We also have 20 systems with 24
In theory you could run multiple copies of the DataNode process, bound to
different IPs with different sets of disks in their config files. It's
exceedingly hackish.
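Roughly, each extra DataNode instance would need its own hdfs-site.xml
overriding the bind addresses and the disk list, plus its own
HADOOP_PID_DIR/HADOOP_LOG_DIR so the daemons don't stomp on each other. A
sketch with 0.20-era property names (the IPs, ports beyond the stock defaults,
and paths are made up):

    <!-- hdfs-site.xml for the second DataNode instance -->
    <property>
      <name>dfs.datanode.address</name>
      <value>10.0.0.2:50010</value>
    </property>
    <property>
      <name>dfs.datanode.http.address</name>
      <value>10.0.0.2:50075</value>
    </property>
    <property>
      <name>dfs.datanode.ipc.address</name>
      <value>10.0.0.2:50020</value>
    </property>
    <property>
      <name>dfs.data.dir</name>
      <value>/data/5,/data/6,/data/7,/data/8,/data/9</value>
    </property>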
-j
On May 9, 2011, at 4:24 AM, Ferdy Galema wrote:
> Is it possible to enforce a replication of 2 for a single node, so that
>
Speak for yourself, I just built a bunch of 36 disk datanodes :)
-j
On May 9, 2011, at 2:33 AM, Eric wrote:
> Just a small warning: I've seen kernel panics with the XFS kernel module once
> you have many disks (in my case: > 20 disks). This is an exotic number of
> disks to put in one server s
e from the East coast to the West coast, and
sooner would be better than later :)
-j
On Apr 22, 2011, at 10:38 PM, Jean-Daniel Cryans wrote:
> See "Copying between versions of HDFS":
> http://hadoop.apache.org/common/docs/r0.20.2/distcp.html#cpver
>
> J-D
>
> On F
I have an existing cluster running hadoop-0.20.1, and I am migrating most of
the data to a new cluster running -0.20.2. I am seeing this in the namenode
logs when I try to run a distcp:
@40004db263bf29c77134 WARN ipc.Server: Incorrect header or version mismatch
from newNN:46111 got version
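If I'm reading the cpver note J-D posted above correctly, that mismatch is
expected when the RPC versions differ between 0.20.1 and 0.20.2; the
workaround described there is to run distcp on the destination cluster and
read the source over HFTP, which is version-independent. Something like this
(hostnames made up, ports assumed to be stock defaults):

    hadoop distcp hftp://oldnn:50070/src/path hdfs://newnn:8020/dst/path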
I am embarking on an archiving project, and just wondered if anyone had any
decent scripts/etc for syncing a lot of data between two HDFS instances. I
have my production hadoop cluster in VA, where we store a lot of data, and we
are bringing up our archive cluster here in CA, where we will keep
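The obvious sketch, absent something smarter, is a cron-driven distcp with
-update so that reruns only copy files that are new or changed (cluster names
and paths made up):

    #!/bin/sh
    # Incremental sync from the VA production cluster to the CA archive;
    # -update skips files already present on the target that match the source.
    hadoop distcp -update \
      hdfs://prod-nn.va:8020/data/production \
      hdfs://archive-nn.ca:8020/archive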
I've never seen an implementation of concat volumes that tolerates a disk
failure; just like RAID0, losing one disk loses the whole volume.
Currently I have a 48 node cluster using Dell R710's with 12 disks - two 250GB
SATA drives in RAID1 for OS, and ten 1TB SATA disks as a JBOD (mounted on
/data/0 through /data/9) and listed separately in dfs.data.dir.
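That is, each mount point is its own entry in dfs.data.dir, something like
this in a 0.20-era hdfs-site.xml:

    <property>
      <name>dfs.data.dir</name>
      <value>/data/0,/data/1,/data/2,/data/3,/data/4,/data/5,/data/6,/data/7,/data/8,/data/9</value>
    </property>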
In my existing 48-node cluster, the engineer who originally designed it (no
longer here) did not specify logical racks in the HDFS configuration, instead
leaving everything in "default-rack". Now I have 4 physical racks of machines,
and I am becoming concerned about failure and near/far replica placement.
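My understanding is that fixing it means pointing core-site.xml at a rack
topology script and restarting the namenode; a sketch with 0.20-era property
names (the script path and subnet-to-rack mapping are made up):

    <property>
      <name>topology.script.file.name</name>
      <value>/etc/hadoop/topology.sh</value>
    </property>

    #!/bin/sh
    # /etc/hadoop/topology.sh: given one or more datanode IPs/hostnames as
    # arguments, print one rack path per line.
    for host in "$@"; do
      case "$host" in
        10.0.1.*) echo /rack1 ;;
        10.0.2.*) echo /rack2 ;;
        *)        echo /default-rack ;;
      esac
    done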
ing syslog
> messages (Syslog4j, nagios, ganglia, etc.) and if you are lucky enough and
> the node doesn't hang after the disk failure, you could shut it down
> gracefully.
>
> esteban.
>
> On Mon, Jan 3, 2011 at 13:55, Jonathan Disher wrote:
> The problem is, what d
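For what it's worth, the only graceful route I know of is decommissioning via
the exclude file before pulling the node, so the namenode re-replicates its
blocks first (the exclude file path is assumed to be whatever dfs.hosts.exclude
points at in your config):

    # Add the failing node to the exclude file, then poke the namenode.
    echo "badnode.example.com" >> /etc/hadoop/conf/excludes
    hadoop dfsadmin -refreshNodes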
startup as well.
>
> Thanks,
> Eli
>
> On Sun, Jan 2, 2011 at 7:20 PM, Jonathan Disher wrote:
>> I see that there was a thread on this in December, but I can't retrieve it
>> to reply properly, oh well.
>>
>> So, I have a 30 node cluster (plus separate namenode, jobtracker, etc).
I see that there was a thread on this in December, but I can't retrieve it to
reply properly, oh well.
So, I have a 30 node cluster (plus separate namenode, jobtracker, etc). Each
is a 12 disk machine - two mirrored 250GB OS disks, ten 1TB data disks in JBOD.
Original system config was six 1T