Hi,

I am currently researching distributed databases that can be scaled
easily in terms of storage capacity.

The goal is to use one in a Brazilian federal project called
"Portal do Aluno", which will have around 10 million kids accessing it
monthly. The idea is to build a portal similar to Facebook/Orkut, with
the main objective of spreading knowledge among kids (6-13 years old).

Well, now the problem:

Those kids will generate a lot of data, including photos, videos,
presentations, and school tasks, among other things. To have a 100%
available system and also to scale to this amount of data (the initial
estimate is 10 TB at full use of the portal), a distributed
storage engine seems to be the solution.

Of the available solutions, I liked Voldemort because, unlike HBase, it
does not seem to have a SPOF (single point of failure). However, HBase
seems to integrate with more tools and sub-projects.

My question concerns storing such large items (a 2 MB photo, for
example) in HBase. I have read on blogs that HBase has high latency,
which makes it inappropriate for serving dynamic pages. Will HBase's
performance degrade even further if large binary objects are stored in
it?

Another question I have concerns modelling data with the key/value
pattern. With a relational database you just follow the standard recipe
and it's done. Is there such a recipe for key/value stores? A lot of
our code has already been written against a relational database
(PostgreSQL), using Hibernate to map the objects.
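
To make the modelling question concrete, here is a minimal sketch of
the kind of mapping I imagine, under several assumptions: the key
layout, the field names, and the "blob_ref" idea (storing a reference
to the photo rather than the bytes) are all hypothetical, and a plain
in-memory dict stands in for the store. It does not use the Voldemort
or HBase client APIs.

```python
import json


def student_key(student_id):
    # Hypothetical key layout: one entry per student.
    return f"student:{student_id}"


def photo_key(student_id, photo_id):
    # Hypothetical composite key: the parent id is embedded in the child key,
    # playing the role a foreign key would play in a relational schema.
    return f"student:{student_id}:photo:{photo_id}"


def to_key_values(student, photos):
    """Flatten one student row and its photo rows into key/value pairs."""
    sid = student["id"]
    entries = {
        # The student entry lists its photo ids, since a plain key/value
        # store offers no join to find them.
        student_key(sid): json.dumps(
            {"name": student["name"], "photo_ids": [p["id"] for p in photos]},
            sort_keys=True,
        )
    }
    for p in photos:
        entries[photo_key(sid, p["id"])] = json.dumps(
            {"caption": p["caption"], "blob_ref": p["blob_ref"]},
            sort_keys=True,
        )
    return entries


# An in-memory dict standing in for the key/value store (assumption).
store = {}
store.update(
    to_key_values(
        {"id": 42, "name": "Ana"},
        [{"id": 1, "caption": "Science fair", "blob_ref": "photos/42/1.jpg"}],
    )
)

# Reading back: one get for the student, then one get per photo id.
student = json.loads(store[student_key(42)])
first_photo = json.loads(store[photo_key(42, student["photo_ids"][0])])
```

What I cannot tell is whether this style of denormalisation is the
usual approach, or whether there is a better-established pattern.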


I would appreciate any comments.




-- 
"The reality of each place and each era is a collective hallucination."


Bloom, Howard
