Hello,
I'm new to HBase, so excuse me if I ask odd questions.
I'm evaluating HBase from its documentation, and am attracted by its
broad functionality such as transaction support, secondary index, REST
API, MapReduce integration, etc. When I recommended HBase to my
colleagues for the internal pr
Have a look at the Bigtable paper; it should help you understand
somewhat why things are the way they are.
The versioning of HBase is integral to the storage mechanism behind it
(and also Cassandra and all Bigtable-like systems). HBase stores its
data on HDFS, which has immutable files. Thus "ove
In addition to what Ryan said, the fact that the default maximum number
of versions for a cell is 3 doesn't mean that you end up wasting space.
If you only ever write one version, that's what you end up paying for.
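The point above can be illustrated with a small toy model. This is only a sketch and not the HBase API: the class ToyVersionedStore, its methods, and the idea of trimming versions in a single compact() pass are assumptions made for illustration. It shows that a maximum of 3 versions is a cap applied during cleanup, not space reserved up front.

```python
from collections import defaultdict


class ToyVersionedStore:
    """Toy model of Bigtable-style cell versioning (hypothetical, not HBase)."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        # (row, column) -> list of (timestamp, value); writes only append.
        self._cells = defaultdict(list)

    def put(self, row, column, value, timestamp):
        # An "overwrite" never edits old data in place; it adds a new version.
        self._cells[(row, column)].append((timestamp, value))

    def get(self, row, column):
        # A read returns the value with the newest timestamp.
        versions = self._cells.get((row, column))
        if not versions:
            return None
        return max(versions, key=lambda v: v[0])[1]

    def compact(self):
        # Cleanup pass: keep at most max_versions newest versions per cell.
        for versions in self._cells.values():
            versions.sort(key=lambda v: v[0], reverse=True)
            del versions[self.max_versions:]

    def stored_versions(self, row, column):
        return len(self._cells.get((row, column), []))


store = ToyVersionedStore(max_versions=3)
store.put("row1", "cf:a", "only value", timestamp=1)
for ts in range(1, 6):
    store.put("row2", "cf:a", f"v{ts}", timestamp=ts)
store.compact()

print(store.stored_versions("row1", "cf:a"))  # 1: written once, stored once
print(store.stored_versions("row2", "cf:a"))  # 3: capped at max_versions
print(store.get("row2", "cf:a"))              # v5: newest version wins
```

In this model a cell that is written once costs one stored version regardless of the cap; only cells that are overwritten repeatedly ever reach the maximum.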
--
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com
d disk storage space by storing only one
version.
Any opinions and information are appreciated.
Regards
Takayuki
- Original Message -
From: "Ryan Rawson"
To:
Sent: Friday, May 07, 2010 11:42 AM
Subject: Re: How is column timestamp useful?
> Have a look at the bigtable paper, it
The Hadoop philosophy is to deploy on low-cost disks and keep 3 copies of
data for redundancy. This keeps costs very low, perhaps 5 to 10 times
lower than what large enterprises are paying for expensive SAN
configurations.
This does not mean one needs to waste storage. If you store fil
Hello, Sigoure-san
> In addition to what Ryan said, the fact that the default maximum number
> of versions for a cell is 3 doesn't mean that you end up wasting space.
> If you only ever write one version, that's what you end up paying for.
I expect so if the data is inserted and never updated. What if
o not provide versioning. So I felt that many people do not
have to use versioning and the default maximum versions of HBase had
better be 1.
Regards
Takayuki
- Original Message -
From: "Kevin Apte"
To:
Sent: Friday, May 07, 2010 1:51 PM
Subject: Re: How is column timestamp use
Hello :)
On Thu, May 6, 2010 at 9:56 PM, Takayuki Tsunakawa
wrote:
> In use case 1, I don't understand why three versions of each web page
> need to be saved, so this is not a helpful example.
Because they want to be able to access multiple versions of the same
web page. It's useful for various
to use versioning and the default maximum versions of HBase had
> better be 1.
>
> Regards
> Takayuki
>
>
> - Original Message -
> From: "Kevin Apte"
> To:
> Sent: Friday, May 07, 2010 1:51 PM
> Subject: Re: How is column timestamp useful?
>
> > Hadoop philosophy is to deploy on low cost disks and keep 3 cop
SimpleDB, Microsoft Azure Table, and Google App Engine
>> Datastore do not provide versioning. So I felt that many people do not
>> have to use versioning and the default maximum versions of HBase had
>> better be
>>
>> Regards
>> Takayuki
>>
>>
>
All,
Thank you for giving lots of opinions and information. I'll try to
persuade my colleagues as follows:
I couldn't find any good examples where versioning definitely has to
be used. However, HBase community members gave me ideas on how
versioning is useful.
1. Recover data lost by accid
On Fri, May 7, 2010 at 12:03 AM, Takayuki Tsunakawa
wrote:
> If versioning is not necessary from your requirement, you can ignore
> timestamps (do not have to specify timestamp in API call).
Yes, it's actually recommended to not manually specify timestamps in
API calls, particularly when insertin
2:04 AM
> To: hbase-user@hadoop.apache.org
> Subject: Re: How is column timestamp useful?
>
> All,
>
> Thank you for giving lots of opinions and information. I'll try to
> persuade my colleagues as follows:
>
> I couldn't find any good examples where ve
From: "tsuna"
> > If saving memory (=keep memtable as small as possible) is important,
> > you can set the maximum number of versions to 1.
> I don't think you'll be saving much memory anyway. As Ryan already
> pointed out, when you overwrite a cell, a new version is created,
> regardless of the
Hello, Gray-san
Thank you. Your explanation was helpful.
Regards
Takayuki
- Original Message -
From: "Jonathan Gray"
To:
Sent: Saturday, May 08, 2010 1:54 AM
Subject: RE: How is column timestamp useful?
I would argue that the primary reasons for versioning have nothing