Hi,

At this time, truly massive rows such as the one you described may
behave non-optimally in HBase. In previous versions of HBase, reading
an entire row required reading and sending the whole row in one go,
but there is now a newer API that lets you effectively stream rows
back in chunks. There are still some read paths that may read more
data than necessary, so your performance mileage may vary.
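To illustrate the batching idea (this is not HBase's actual client code), here is a minimal Python sketch of how a wide row can be consumed in fixed-size chunks, in the spirit of what Scan.setBatch does on the client side; the row data and batch size below are made up:

```python
# Pure-Python sketch of chunked ("streamed") row reads: the client sees
# the row as a series of small chunks instead of one huge in-memory row.
def stream_row(cells, batch_size):
    """Yield the row's cells in chunks of at most batch_size cells."""
    for i in range(0, len(cells), batch_size):
        yield cells[i:i + batch_size]

# A toy "row" with 10 cells, read back in batches of 4.
row = [("col:%d" % i, "value-%d" % i) for i in range(10)]
chunks = list(stream_row(row, 4))
print([len(c) for c in chunks])  # chunk sizes: [4, 4, 2]
```

With batching, peak client memory is bounded by the chunk size rather than the full row size, which is the whole point for a 1.5GB row on a 1GB-heap regionserver.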



On Sun, Mar 7, 2010 at 3:56 AM, Ahmed Suhail Manzoor
<[email protected]> wrote:
> Hi,
>
> This might prove to be a blatantly obvious question, but wouldn't it make
> sense to store large files directly in HDFS and keep the metadata about the
> file in HBase? One could, for instance, serialize the details of the HDFS
> file in a Java object and store that in HBase. This object could expose the
> reading of the HDFS file, for instance, so that one is left with clean code.
> Is there anything wrong with implementing things this way?
>
> Cheers
> su./hail
>
> On 07/03/2010 09:21, tsuna wrote:
>>
>> On Sat, Mar 6, 2010 at 9:14 PM, steven zhuang
>> <[email protected]>  wrote:
>>
>>>
>>>          I have a table which may contain super big rows, e.g. with
>>> millions of cells in one row, 1.5GB in size.
>>>
>>>          now I have a problem emitting data into the table, probably
>>> because these super big rows are too large for my regionserver (with
>>> only
>>> 1GB heap)
>>>
>>
>> A row can't be split across regions, and whatever you do that needs
>> that row (like reading it) requires HBase to load the entire row in
>> memory.  If the row is 1.5GB and your regionserver has only 1GB of
>> heap, it won't be able to use that row.
>>
>> I'm not 100% sure about that because I'm still a HBase n00b too, but
>> that's my understanding.
>>
>>
>
>
