Hi,

The throughput of HDFS is good because each read is essentially a stream from several hard drives: each drive holds a different block of the file, and those blocks are distributed across many machines. That said, HDFS does not have very good latency, at least compared to local file systems.
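The write/read round trip described below can be sketched with the Hadoop FileSystem API. This is an illustrative sketch, not a tested program: it assumes a reachable cluster configured in core-site.xml, and the path /tmp/example.txt is made up for the example.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsRoundTrip {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(); // picks up core-site.xml
        FileSystem fs = FileSystem.get(conf);     // connects via the name node

        Path p = new Path("/tmp/example.txt");    // hypothetical path

        // Write: the client asks the name node where to place the blocks,
        // then streams the data out to the chosen data nodes.
        FSDataOutputStream out = fs.create(p);
        out.writeUTF("hello hdfs");
        out.close();

        // Read: the client asks the name node for the block locations,
        // then fetches the blocks directly from the data nodes.
        FSDataInputStream in = fs.open(p);
        System.out.println(in.readUTF());
        in.close();

        fs.close();
    }
}
```

Note that the client never needs to know which data nodes hold which blocks; the name node lookup and the per-block fetches all happen behind FSDataInputStream.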
When you write a file using the HDFS client (whether through Java or bin/hadoop fs), the client and the name node coordinate to place your file's blocks on various nodes in the cluster. When you use that same client to read data, it asks the name node for the block locations of the file and then fetches those blocks directly from the data nodes that store them. You could in theory pull data off the local file system of your data nodes, but there is no reason to, because the client already does all of this for you.

Hope this clears things up.

Alex

On Fri, Jun 5, 2009 at 12:53 AM, Sugandha Naolekar <sugandha....@gmail.com> wrote:

> Hello!
>
> Placing any kind of data into HDFS and then getting it back, can this
> activity be fast? Also, the node from which I have to place the data into
> HDFS is a remote node. So will I have to use an RPC mechanism, or can I
> simply get the local filesystem of that node and do the things?
>
> --
> Regards!
> Sugandha