On Wed, Sep 22, 2010 at 10:39 AM, Sujee Maniyam <[email protected]> wrote:
> Jack,
> sounds like a cool project indeed.  Few questions for you...
>
>
> 1) how do you setup 50+ servers.  What I mean is, installing OS,
> installing all software.  Setting up user accounts.  Setting up SSH
> keys ..etc
>     DO you use any software for this?

Just SSH keys, cfengine, and a few other tools.


>
> 2) Is Hbase going to be the 'primary' storage for images?  Meaning,
> your front-end reads/writes to Hbase?
> Do you also maintain a 'file storage' as a backup / alternative?

There will be front-end servers that cache and serve files off their
disks, while HBase keeps the images safe and, for some products,
highly available.
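
To make the front-end caching idea concrete, here is a minimal sketch of a
read-through disk cache. It is not our actual code: the class name
DiskReadThroughCache and the fetch callable are hypothetical, and fetch stands
in for whatever reads an image out of HBase.

```python
import hashlib
import os
import tempfile
from typing import Callable


class DiskReadThroughCache:
    """Serve image bytes from local disk, falling back to a backing store."""

    def __init__(self, cache_dir: str, fetch: Callable[[str], bytes]) -> None:
        self.cache_dir = cache_dir
        # Hypothetical hook: any function that returns the image bytes for a
        # key, for example one that reads a row from HBase.
        self.fetch = fetch
        os.makedirs(cache_dir, exist_ok=True)

    def _path(self, key: str) -> str:
        # Hash the key so arbitrary image ids map to safe file names.
        digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
        return os.path.join(self.cache_dir, digest)

    def get(self, key: str) -> bytes:
        path = self._path(key)
        try:
            with open(path, "rb") as f:
                return f.read()  # cache hit: served from local disk
        except FileNotFoundError:
            pass
        data = self.fetch(key)  # cache miss: go to the backing store
        # Write to a temp file and rename it into place, so a concurrent
        # reader never sees a partially written image.
        fd, tmp = tempfile.mkstemp(dir=self.cache_dir)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp, path)
        return data
```

With a small fraction of images taking most of the hits, most requests should
be answered from the first branch and never reach the backing store. The
sketch does no eviction; a real front end would need to bound disk usage.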


> 3) Do you only store the 'main image' in Hbase?  How about
> thumbnails, medium size, large size cousins?

Those will be generated dynamically.

>
> 4) Is this a dedicated Hbase cluster, or you are building this on top
> of your existing Hadoop cluster.  Will this be sharing  resources with
> MR jobs that you already run?

It's dedicated.

>
> 5) I notice you have 8G RAM for region servers.  Hbase is very memory
> hungry and specially dealing with large data sizes, I'd imagine you'd
> need 24G-32G (as it was previously mentioned)

This is not going to happen for two reasons: a) it's too expensive and
the motherboard does not support it; b) we are aiming for a large
dataset with 5 to 10% concentrated hits.

> 6) how long does it take for all regions to be available after a 'cold
> start' of Hbase?

6-7 minutes with dual-core boxes, 1 minute with 8-core boxes.

> 7) I'd be interested to know how do you do 'standby servers' for
> HMaster and Hadoop Namenode

Just two 8-core boxes running the namenode, secondary namenode, and masters.

-Jack


> have fun
>
> regards
> Sujee
>
> http://sujee.net
>
