TestDFSIO -write -nrFiles n -fileSize 2048 -bufferSize 65536 -resFile resfile
*TestDFSIO* (/src/test/org/apache/hadoop/fs/TestDFSIO.java)
Hi all,
I ran the *TestDFSIO* benchmark on a simple cluster of 2 nodes.
The file size is the same in all cases: 2 GB.
The numbers of files tried are 1, 2, 4, and 8 (write only).
The bufferSize is 65536 bytes.
The file replication is 1.
The results are below:
files                       1      2      4      8
write -- Throughput (MB/s)  52.89  52.
Hi all,
I am reading the Hadoop source code to study the design of the Hadoop
distributed filesystem, and I have some questions about
file replication in HDFS.
I know the degree of replication in HDFS is configurable via a
configuration file such as "hadoop-default.xml". The default degr
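As a sketch of the setting being discussed: the replication degree is controlled by the `dfs.replication` property, which can be overridden per site (the value 3 shown here is HDFS's usual default, given only as an illustration):

```xml
<!-- In hadoop-site.xml, overriding the shipped hadoop-default.xml -->
<property>
  <name>dfs.replication</name>
  <value>3</value>
  <description>Default block replication for new files in HDFS.</description>
</property>
```

Individual files can also request a different replication factor at creation time, so this value is only the default, not a hard limit.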
Ted Dunning wrote:
Check out the bailey and katta projects on sourceforge.
I get nothing when checking out the katta project on sourceforge :(
Also take a look at Nutch.
Hadoop is certainly good for indexing, and it isn't that hard to put
distributed search alongside Hadoop with indexes being p
Map/reduce will be a suitable approach for indexing large document
collections, but I don't know whether it is suitable for retrieval. You can
see *Nutch* for distributed searching.
Under the hadoop/contrib directory there is an *Index* package. It may
be helpful :)
Matt Wood wrote:
Hello all,
I was
me know :) Thanks in advance!
Best Wishes
Samuel Guo
Hi all,
Can anyone tell me: is there any API I can use to get metadata,
such as the last modified time, of a file in HDFS?
Thanks a lot :)
Best Wishes:)
Samuel Guo
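A minimal sketch of one way to read such metadata, using Hadoop's `FileSystem` and `FileStatus` classes (the path `/user/samuel/data.txt` is just a placeholder for the file you care about):

```java
import java.util.Date;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsFileMetadata {
    public static void main(String[] args) throws Exception {
        // Picks up the filesystem settings (e.g. fs.default.name)
        // from the config files on the classpath.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Placeholder path; substitute the file you are interested in.
        Path file = new Path("/user/samuel/data.txt");
        FileStatus status = fs.getFileStatus(file);

        System.out.println("Path:          " + status.getPath());
        System.out.println("Length:        " + status.getLen() + " bytes");
        System.out.println("Replication:   " + status.getReplication());
        System.out.println("Block size:    " + status.getBlockSize());
        System.out.println("Last modified: " + new Date(status.getModificationTime()));
    }
}
```

`getFileStatus` works for local paths too when the default filesystem is `file://`, which makes it easy to try without a running cluster.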