Would writing computationally intensive algorithms in C/C++ and running them in streaming mode improve performance? Just curious.

Xiance

On Aug 13, 2008, at 10:56 AM, Gaurav Veda wrote:

Thank you all for the replies. They do clarify things!

Cheers,
Gaurav

On Tue, Aug 12, 2008 at 8:01 PM, Arun C Murthy <[EMAIL PROTECTED]> wrote:

On Aug 12, 2008, at 3:15 PM, Ashish Venugopal wrote:

There is definitely functionality in "normal" mode that is not available
in streaming, like the ability to write counters to instrument jobs. I
personally just use streaming, so I am interested to see if there are
further key differences...

With hadoop-0.18 (under vote now) you get counters for streaming too:
http://issues.apache.org/jira/browse/HADOOP-1328
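
For illustration, here is a minimal Python 3 sketch of how a streaming
task might update a counter. It assumes the convention described in the
Hadoop Streaming documentation, where the task writes a line of the form
reporter:counter:<group>,<counter>,<amount> to standard error. The helper
names and the "MyJob" / "BadRecords" labels are hypothetical, and the
exact behaviour should be checked against the Hadoop version in use.

```python
import sys


def counter_line(group, counter, amount=1):
    """Format a counter update in the form streaming is assumed to expect."""
    return f"reporter:counter:{group},{counter},{amount}"


def increment_counter(group, counter, amount=1, stream=sys.stderr):
    """Write the counter update to stderr, where the framework looks for it."""
    stream.write(counter_line(group, counter, amount) + "\n")
    stream.flush()


# Hypothetical usage inside a mapper: count records that fail to parse.
# increment_counter("MyJob", "BadRecords")
```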

As others have pointed out, the fact that your input/output has to be
'textual' is a major difference - lots of applications need binary data.
This 'stringification' has serious performance implications too; some
benchmarks I did a while ago for Pig put the slowdown at nearly 3x.
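
To give a rough feel for where the 'stringification' cost comes from,
the Python 3 sketch below round-trips a list of floats once through
tab-separated text and once through a packed binary buffer. This is not
the Pig benchmark mentioned above; the data, function names, and
iteration counts are arbitrary assumptions, and timings will vary by
machine and runtime.

```python
import struct
import timeit

# Arbitrary sample data: 1000 double-precision values.
values = [i * 0.1 for i in range(1000)]


def text_round_trip(vals):
    """Serialize floats as a tab-separated line, then parse them back."""
    line = "\t".join(repr(v) for v in vals)
    return [float(tok) for tok in line.split("\t")]


def binary_round_trip(vals):
    """Pack floats as little-endian doubles, then unpack them."""
    fmt = f"<{len(vals)}d"
    packed = struct.pack(fmt, *vals)
    return list(struct.unpack(fmt, packed))


if __name__ == "__main__":
    # Compare the two encodings; only the relative difference is of interest.
    print("text  :", timeit.timeit(lambda: text_round_trip(values), number=200))
    print("binary:", timeit.timeit(lambda: binary_round_trip(values), number=200))
```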

Arun


Ashish

On Tue, Aug 12, 2008 at 3:09 PM, Gaurav Veda <[EMAIL PROTECTED]> wrote:

Hi All,

This might seem silly, but I haven't found a satisfactory answer to
this yet. What are the advantages / disadvantages of using Hadoop
Streaming over the normal mode (wherein you write your own mapper and
reducer in Java)? From what I gather, the real advantage of Hadoop
Streaming is that you can use any executable (in C, Perl, Python,
etc.) as a mapper / reducer.
A slight disadvantage is that the default is to read from standard
input and write to standard output ... though one can specify one's
own input and output format (and package it with the default Hadoop
Streaming jar file).
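
As a concrete illustration of "any executable as a mapper", here is a
minimal Python 3 sketch of a word-count style mapper. It assumes the
default streaming behaviour described above: input records arrive as
lines on standard input, and each output record is a tab-separated
key/value line on standard output. The function name map_lines is
hypothetical.

```python
import sys


def map_lines(lines):
    """Yield one tab-separated "word<TAB>1" record per word in the input."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


if __name__ == "__main__":
    # Streaming feeds input on stdin and collects output records from stdout.
    for record in map_lines(sys.stdin):
        print(record)
```

A script like this would be handed to the streaming jar through its
-mapper option, with a matching reducer that sums the counts per word.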

My point is, why should I ever use the normal mode? Streaming seems
just as good. Is there a performance problem, do I have only limited
control over my job if I use streaming mode, or is there some other
issue?

Thanks!
Gaurav
--
Share what you know, learn what you don't !





