I/O is on the DFS. In the case of HBase, this is not taken into account.
Instead, we just start a scanner for each map, one map per region.
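A rough sketch of that per-region scanning, in plain Ruby (toy data, all names hypothetical; in HBase itself this splitting lives in TableInputFormat):

```ruby
# Conceptual sketch only: HBase hands each map task one region, and the
# map opens a scanner over that region's key range. DFS block locality
# is not considered when the splits are created.

Region = Struct.new(:start_key, :end_key)

# A toy "table": sorted rows keyed by string.
TABLE = { "a1" => 1, "b2" => 2, "c3" => 3, "d4" => 4 }

# Pretend the table is split into two regions.
REGIONS = [Region.new("a", "c"), Region.new("c", "{")]

# One "scanner" per region: yields only rows in [start_key, end_key).
def scan(region)
  TABLE.select { |k, _| k >= region.start_key && k < region.end_key }
end

# One map task per region, each driven by its own scanner.
def run_maps
  REGIONS.map { |r| scan(r).values.sum } # e.g. sum values per region
end

puts run_maps.inspect # => [3, 7]
```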

J-D

On Fri, Sep 19, 2008 at 1:22 PM, Ding, Hui <[EMAIL PROTECTED]> wrote:

> Thanks for this suggestion on the shell, I will take a look into that.
> But I still don't understand why streaming won't work very well. It is
> able to do M/R jobs using the supplied executable, right? So all the
> map/reduce programs take input/output from their own local filesystem
> or from HDFS?
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
> Jean-Daniel Cryans
> Sent: Thursday, September 18, 2008 6:30 PM
> To: [email protected]
> Subject: Re: Running map/reduce written in Ruby on Hbase
>
> Hui Ding,
>
> This wouldn't work very well. Streaming is defined so that you pass
> programs (any) that take their input and output from the filesystem,
> not HBase tables. You should instead try to use JRuby, like we do for
> the shell. It requires some more setup, but since it all runs inside
> the JVM, it eventually works.
>
> I see that more and more users are interested in using JRuby/Jython
> for MR jobs, and I know that some companies already use a wrapper for
> that ("Happy", anyone?). I'm sure many would be interested in seeing
> this kind of work.
>
> J-D
>
> On Thu, Sep 18, 2008 at 7:57 PM, Ding, Hui <[EMAIL PROTECTED]> wrote:
>
> > Hi all,
> >
> > I wanted to run some map/reduce jobs, but I'd like to do that in
> > Ruby; is this possible with Hadoop Streaming?
> > My understanding is that I will provide a mapper/reducer in Ruby and
> > supply that to Hadoop Streaming, and since HBase can be a
> > source/sink of map/reduce, I should be able to access the tables,
> > right?
> >
> > And as far as setup is concerned, I just need to have a Ruby
> > interpreter set up on each of the machines in the cluster?
> >
> > Thanks a lot!
> >
>
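
To make the streaming contract above concrete: a streaming mapper is just a program that reads records as lines on STDIN and writes tab-separated key/value pairs to STDOUT. A minimal word-count mapper sketch in Ruby (the sample input is my own, for illustration):

```ruby
# Hadoop Streaming mapper sketch (word count). Streaming feeds records
# to the program as lines on STDIN and collects "key\tvalue" lines from
# STDOUT -- it knows nothing about HBase tables, only line streams.

def map_line(line)
  line.split.map { |word| "#{word.downcase}\t1" }
end

# In a real job the driver loop would be:
#   $stdin.each_line { |l| puts map_line(l) }
# Demo with an in-memory sample instead:
sample = ["Hello HBase", "hello Hadoop"]
sample.each { |l| puts map_line(l) }
```

Such a script would be shipped to the cluster with `-file` and named as `-mapper` on the hadoop-streaming command line; a reducer works the same way, reading the sorted key/value lines back on STDIN.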
