You might also check out Dumbo, which is a Hadoop Python module. <http://www.audioscrobbler.net/development/dumbo/>
Alex

On Tue, May 19, 2009 at 10:35 AM, s d <s.d.sau...@gmail.com> wrote:

> Thanks.
> So in the overall scheme of things, what is the general feeling about using
> python for this? I like the ease of deploying and reading python compared
> with Java but want to make sure using python over hadoop is scalable & is
> standard practice and not something done only for prototyping and small
> scale tests.
>
> On Tue, May 19, 2009 at 9:48 AM, Alex Loddengaard <a...@cloudera.com>
> wrote:
>
> > Streaming is slightly slower than native Java jobs. Otherwise Python
> > works great in streaming.
> >
> > Alex
> >
> > On Tue, May 19, 2009 at 8:36 AM, s d <s.d.sau...@gmail.com> wrote:
> >
> > > Hi,
> > > How robust is using hadoop with python over the streaming protocol? Any
> > > disadvantages (performance? flexibility?) ? It just strikes me that
> > > python is so much more convenient when it comes to deploying and
> > > crunching text files.
> > > Thanks,
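
For anyone reading the thread who has not seen a streaming job, here is a minimal sketch of what "Python over streaming" tends to look like: a word count where the mapper and reducer are plain functions reading lines and writing tab-separated records. The names map_lines and reduce_lines, and the convention of picking the role from the first command-line argument, are assumptions made for illustration; nothing here comes from Dumbo or from the posters' own jobs.

```python
#!/usr/bin/env python
"""Word count for Hadoop Streaming: one script, run as mapper or reducer."""
import sys
from itertools import groupby


def map_lines(lines):
    """Emit one 'word<TAB>1' record per whitespace-separated token."""
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word


def reduce_lines(lines):
    """Sum the counts for each word.

    Assumes the input is already sorted by key, which is how the
    framework hands mapper output to a streaming reducer.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if line.strip())
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield "%s\t%d" % (word, sum(int(count) for _, count in group))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical convention: the first argument selects the role.
    step = {"map": map_lines, "reduce": reduce_lines}.get(sys.argv[1])
    if step is not None:
        for record in step(sys.stdin):
            print(record)
```

Saved under a hypothetical name such as wordcount.py, the script can be checked without a cluster by piping a text file through it as the mapper, then through sort, then through it again as the reducer. On a cluster the same two commands would be passed to the streaming jar's -mapper and -reducer options.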