[How to optimize MapReduce performance using Pig]

Florencia Satwika Fri, 06 Sep 2013 08:27:41 -0700

Thank you for taking the time to read my mail.

I have been doing some simple research on Big Data for the past three months.
I am currently using Pig to run MapReduce computations over my data.
I have about 50 million records in HBase, and I want to process all of them
with Pig.
Unfortunately, Pig takes a very long time to process that much data: almost
two and a half hours for just a simple query (data retrieval).
Compared with an RDBMS, the difference is significant.
I have to finish my research within a week (about 5 days remain).


The problem is that I want to show that Hadoop and its companions (HBase
and Pig) are a good solution for processing huge amounts of structured data
(since the data that I want to process is structured).
How can I tune the overall performance?

Thank you very much for your attention and for any feedback.
:)

Regards,

Florencia.
(Student at the High School of Statistics, Department of Computational
Statistics).
