I have used streaming with PHP before to process a data set of about 1 TB
without any problems at all.
Billy
"s d" <s.d.sau...@gmail.com> wrote in message
news:24b53fa00905191035w41b115c1q94502ee82be43...@mail.gmail.com...
Thanks.
So in the overall scheme of things, what is the general feeling about using
Python for this? I like the ease of deploying and reading Python compared
with Java, but I want to make sure that using Python over Hadoop is scalable
and is standard practice, not something done only for prototyping and
small-scale tests.
On Tue, May 19, 2009 at 9:48 AM, Alex Loddengaard
<a...@cloudera.com> wrote:
Streaming is slightly slower than native Java jobs. Otherwise Python works
great in streaming.
Alex
On Tue, May 19, 2009 at 8:36 AM, s d
<s.d.sau...@gmail.com> wrote:
> Hi,
> How robust is using Hadoop with Python over the streaming protocol? Any
> disadvantages (performance? flexibility?)? It just strikes me that Python
> is so much more convenient when it comes to deploying and crunching text
> files.
> Thanks,
>
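
For anyone reading this thread later who has not seen what a Python streaming job looks like, here is a minimal word-count sketch. It is not taken from any of the posters' jobs. It only assumes the usual streaming contract: the mapper and reducer read lines from stdin and write tab-separated key/value lines to stdout, and the reducer sees its input sorted by key. The file name wordcount.py, the function names, and the "map"/"reduce" argument are made up for illustration.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' record for every word in the input."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts for each word.

    Assumes the records arrive sorted by key, as they do between the
    map and reduce phases of a streaming job.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


# Hypothetical convention: run as 'wordcount.py map' or 'wordcount.py reduce'.
if __name__ == "__main__" and sys.argv[1:] in (["map"], ["reduce"]):
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

The same script can be checked locally without a cluster by letting sort stand in for the shuffle: cat input.txt | python wordcount.py map | sort | python wordcount.py reduce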