Hi,

We use HBase 0.19 RC2, and our data (~800 GB) resides in a single table (is
that bad?). The table schema is pretty simple: two column families, one for
keys and one for values, where each key can have one or more values (~100).
To query values we use a file of keys (for instance, about 10M keys); the
goal is to read all values for each of those keys, with a target running
time of about one hour. By the way, the data output is not too big, ~2 GB.
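For context, the split-the-input approach from my earlier message (below) can
be sketched roughly as follows. This is only a sketch: `fetch_values` is a
hypothetical stand-in for the real HBase client get (e.g. HTable.get() or a
Thrift gateway call), not an actual API, and the in-memory TABLE dict exists
only so the example is self-contained:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for an HBase lookup of the "value" column family;
# a real client would issue a get against the table here.
TABLE = {"k1": ["v1", "v2"], "k2": ["v3"], "k3": ["v4", "v5", "v6"]}

def fetch_values(key):
    return TABLE.get(key, [])

def chunked(keys, n):
    # Split the key list into n roughly equal groups, one per client.
    return [keys[i::n] for i in range(n)]

def scan_chunk(keys):
    # One "client" reads all values for its group of keys.
    return {k: fetch_values(k) for k in keys}

def parallel_read(keys, clients=4):
    # Run the groups concurrently, mirroring "a different client per group".
    results = {}
    with ThreadPoolExecutor(max_workers=clients) as pool:
        for part in pool.map(scan_chunk, chunked(keys, clients)):
            results.update(part)
    return results
```

The same partitioning is what a MapReduce job would do implicitly, with each
map task taking one slice of the key file.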

Thanks,
Gennady



On Thu, Jan 22, 2009 at 7:46 PM, stack <[email protected]> wrote:

> Genady wrote:
>
>> Hi,
>>
>>
>> Just wondering if somebody could recommend a random-read strategy for
>> searching a big group of keys (100M) in a Hadoop/HBase cluster. Using one
>> client is very slow; splitting the input into smaller groups and running
>> each one with a different client certainly improves performance, but the
>> maximum speed I'm getting is ~3300 reads/sec. I've also tried running the
>> search as a MapReduce task, doing the HBase reads from map or reduce, but
>> HBase starts to fail. So are a hardware upgrade and HBase in-memory
>> tables the only direction here?
>>
>>
>>
> Tell us more about your table schema, data sizes, and the types of query.
> What performance do you need from HBase? Do your rows have many columns
> that you are trying to fetch all at once when you query, for example? Are
> you on 0.19.0, Genady (sorry if you've answered this question in the near
> past)?
> St.Ack
>
