From: Matei Zaharia [mailto:ma...@cloudera.com]
Sent: Friday, April 03, 2009 11:21 AM
To: core-user@hadoop.apache.org
Subject: Re: Hadoop/HDFS for scientific simulation output data analysis
Hi Tiankai,
The one strange thing I see in your configuration as described is the IO buffer size and IO bytes setting.
, 6400 for the 256MB file dataset, and so forth.
Tiankai
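For reference, the buffer size Matei mentions is configured in hadoop-site.xml via io.file.buffer.size (the value shown here is illustrative, not a recommendation from the thread):

```xml
<property>
  <name>io.file.buffer.size</name>
  <!-- Size in bytes of the read/write buffer used for sequence files
       and stream copies. The Hadoop default is 4096; larger values
       such as 64KB are often used for sequential IO workloads. -->
  <value>65536</value>
</property>
```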
-Original Message-
From: Matei Zaharia [mailto:ma...@cloudera.com]
Sent: Friday, April 03, 2009 1:18 PM
To: core-user@hadoop.apache.org
Subject: Re: Hadoop/HDFS for scientific simulation output data analysis
Hadoop does checksums for each small chunk of the file (512 bytes by default, controlled by io.bytes.per.checksum) and stores them alongside the data, verifying them whenever the file is read so that corruption is detected.
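The per-chunk checksumming scheme can be sketched as follows. This is a minimal illustration of the idea, not Hadoop's actual implementation (HDFS uses CRC-32 per chunk in Java; the helper names and chunk layout here are mine):

```python
import zlib

BYTES_PER_CHECKSUM = 512  # mirrors the HDFS default (io.bytes.per.checksum)

def chunk_checksums(data: bytes, chunk_size: int = BYTES_PER_CHECKSUM):
    """Compute a CRC-32 checksum for each fixed-size chunk of the data."""
    return [zlib.crc32(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]

def verify(data: bytes, checksums, chunk_size: int = BYTES_PER_CHECKSUM):
    """Recompute per-chunk checksums on read; any mismatch means corruption."""
    return chunk_checksums(data, chunk_size) == checksums

# 1300 bytes -> three chunks (512 + 512 + 276)
data = b"x" * 1300
sums = chunk_checksums(data)
assert verify(data, sums)

# Flip one byte inside the second chunk: only that chunk's checksum changes,
# so corruption is localized to a 512-byte region rather than the whole file.
corrupted = data[:600] + b"y" + data[601:]
assert not verify(corrupted, sums)
```

Checksumming small chunks rather than whole files is what lets HDFS pinpoint and re-read only the damaged portion of a block from another replica.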
On Apr 3, 2009, at 1:41 PM, Tu, Tiankai wrote:
By the way, what is the largest size---in terms of total bytes, number
of files, and number of nodes---in your applications? Thanks.
The largest Hadoop application that has been documented is the Yahoo Webmap:

- 10,000 cores
- 500 TB shuffle
- 300 TB compressed output