Are you asking "why is data read from and written to HDFS blocks via the MapReduce framework in a streaming manner?"

On Wed, Mar 5, 2014 at 2:05 PM, Radhe Radhe <radhe.krishna.ra...@live.com> wrote:

> Hi Shashwat,
>
> This is an excerpt from "Hadoop: The Definitive Guide" by Tom White:
>
> Hadoop Streaming
> Hadoop provides an API to MapReduce that allows you to write your map and
> reduce functions in languages *other than Java*. Hadoop Streaming uses Unix
> standard streams as the interface between Hadoop and your program, *so you
> can use any language that can read standard input and write to standard
> output to write your MapReduce program*.
> Streaming is naturally suited for text processing (although, as of version
> 0.21.0, it can handle binary streams, too), and when used in text mode, it
> has a line-oriented view of data. Map input data is passed over standard
> input to your map function, which processes it line by line and writes
> lines to standard output. A map output key-value pair is written as a
> single tab-delimited line. Input to the reduce function is in the same
> format, a tab-separated key-value pair, passed over standard input. The
> reduce function reads lines from standard input, which the framework
> guarantees are sorted by key, and writes its results to standard output.
>
> I think this is not what I am asking for.
>
> Thanks.
> -RR
>
> ------------------------------
> From: dwivedishash...@gmail.com
> Date: Wed, 5 Mar 2014 13:47:09 +0530
> Subject: Re: Streaming data access in HDFS: Design Feature
> To: user@hadoop.apache.org
> CC: radhe.krishna.ra...@live.com
>
> Streaming means processing the data as it comes into HDFS. Hadoop
> Streaming, for example, enables Hadoop to receive data using executables
> of different types.
>
> I hope you have already read this:
> http://hadoop.apache.org/docs/r0.18.1/streaming.html#Hadoop+Streaming
>
> Warm Regards,
> Shashwat Shriparv
>
> On Wed, Mar 5, 2014 at 1:38 PM, Radhe Radhe <radhe.krishna.ra...@live.com> wrote:
>
> Hello All,
>
> Can anyone please explain what we mean by *Streaming data access in HDFS*?
>
> Data is usually copied to HDFS, and in HDFS the data is split across
> DataNodes in blocks.
> Say, for example, I have an input file of 10240 MB (10 GB) in size and a
> block size of 64 MB. Then there will be 160 blocks.
> These blocks will be distributed across DataNodes.
> Now the Mappers will read data from these DataNodes keeping the *data
> locality feature* in mind (i.e. blocks local to a DataNode will be read by
> the map tasks running on that DataNode).
>
> Can you please point out where "Streaming data access in HDFS" comes into
> the picture here?
>
> Thanks,
> RR

-- 
Nitin Pawar
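
For reference, the block arithmetic from the original question, and what a front-to-back read of one block looks like, can be sketched roughly as below. This is only an illustration in plain Python, not HDFS client code: `block_count` and `stream_lines` are made-up helper names, and an in-memory buffer stands in for a block. It assumes "streaming data access" is meant in the sense of the HDFS design notes, i.e. data is written once and then read sequentially, favouring throughput over low-latency random access.

```python
import io


def block_count(file_size_mb: int, block_size_mb: int) -> int:
    """Number of blocks needed for a file; the last block may be partial."""
    return (file_size_mb + block_size_mb - 1) // block_size_mb


def stream_lines(source, chunk_size: int = 64 * 1024):
    """Yield lines from a file-like object by reading it front to back.

    The source is consumed in fixed-size chunks and never seeks backwards,
    which is the access pattern a map task uses when it scans its block.
    """
    pending = b""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        # Everything before the last newline is complete; keep the tail.
        *complete, pending = pending.split(b"\n")
        for line in complete:
            yield line
    if pending:
        yield pending


if __name__ == "__main__":
    # 10240 MB file with 64 MB blocks, as in the question.
    print(block_count(10240, 64))  # 160

    # Hypothetical stand-in for one block of tab-separated records.
    block = io.BytesIO(b"key1\tvalue1\nkey2\tvalue2\n")
    for line in stream_lines(block, chunk_size=8):
        print(line.decode())
```

The point of the sketch is only that each block is scanned once from start to end; nothing in it depends on where the block physically lives, which is what data locality decides separately.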