See answers.

J-D

On Thu, Sep 3, 2009 at 11:07 AM, Xine Jar<[email protected]> wrote:
> Hello,
> I have a cluster of 6 nodes (Namenode, Jobtracker, an hbase master, and
> three regionservers) running hadoop-0.19.1 and hbase-0.19.3. I have created
> an hbase table "mytable" and have written a program to read the value in
> each line of the table and get the overall average of the values.
>
> I have a few quick clarification questions to pose.
>
> Q1- "MyTable" has one column family and has a size of 400MB. According to
> the default value of hbase.hregion.max.filesize I
>      have EXPECTED that it should be split into two regions 256MB and
> 144MB. But the UI on port 60010 showed that the
>      "mytable" has 3 regions (107MB+89MB+223MB). Why 3 not 2?

A region doesn't split at exactly 256MB, and the split doesn't cut it
into exact halves of 128MB either. You could easily have a daughter
region sitting at around 180MB that later goes through another split,
giving you an 89MB region and another one at 223MB.
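
To make the arithmetic concrete, here is a rough toy model in Python. It
is not HBase code and not how the region server actually decides; it only
shows why "256MB + 144MB" is the wrong expectation. It assumes that data
arrives in 64MB flushes, that new rows always land in the last region, and
that a region reaching the threshold is cut into two daughters at
split_fraction of its size. The names simulate_splits, flush_mb and
split_fraction are made up for this sketch.

```python
def simulate_splits(total_mb, max_filesize_mb=256, flush_mb=64, split_fraction=0.5):
    """Toy model: return the region sizes (in MB) after loading total_mb of data."""
    regions = [0.0]          # start with a single empty region
    written = 0.0
    while written < total_mb:
        chunk = min(flush_mb, total_mb - written)
        regions[-1] += chunk  # assumption: new rows always land in the last region
        written += chunk
        # The size check only happens after a flush, so a region can
        # overshoot the threshold before it is split.
        if regions[-1] >= max_filesize_mb:
            size = regions.pop()
            # A split replaces the region with two daughters; it does not
            # peel off a full max_filesize_mb piece.
            regions.append(size * split_fraction)
            regions.append(size * (1 - split_fraction))
    return regions


print(simulate_splits(400))                       # even split point
print(simulate_splits(400, split_fraction=0.45))  # split point off-center
```

With the defaults, a 400MB load ends up as three regions of 128MB, 128MB
and 144MB. Moving split_fraction away from 0.5 makes the sizes uneven,
which is closer to what the UI shows.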

>
> Q2- The UI of the hbase master on port 60010, showed the three regions of
> "mytable" each with a start key and end key. I
>      noticed as well that the three regions are stored on the same
> regionserver. The other regionservers stored the ROOT and
>      the META. Shouldn't the regions of "mytable" be equally distributed
> and stored on all region servers?

Nope, if we did that we would be constantly moving regions around at
the slightest change. We included some "sloppiness" in the balancing,
so with a very small region count the imbalance is obvious, but once
you get to real numbers (more than 100 regions) you will see them
rather well distributed. It's not perfect though; we can still improve
a lot of stuff.

>
> Q3- The job is taking around 1 minute to finish. I can see that the reduce
> function is very slow, could you give me some hints
>      how can I make it faster? In which case should I think about splitting
> the Job into 2? Something else I have to try to
>      enhance the performance?

What is your reduce even doing? How many reduce tasks do you set? What
would be in your opinion a fast enough reduce phase?

>
> Regards,
> CJ
>
