How is column timestamp useful?

2010-05-06 Thread Takayuki Tsunakawa
Hello, I'm new to HBase, so excuse me if I ask odd questions. I'm evaluating HBase from its documentation, and am attracted by its broad functionality, such as transaction support, secondary indexes, REST API, MapReduce integration, etc. When I recommended HBase to my colleagues for the internal pr

Re: How is column timestamp useful?

2010-05-06 Thread Ryan Rawson
Have a look at the Bigtable paper; it should help you understand somewhat why things are the way they are. The versioning of HBase is integral to the storage mechanism behind it (and also Cassandra and all Bigtable-like systems). HBase stores its data on HDFS, which has immutable files. Thus "ove

Re: How is column timestamp useful?

2010-05-06 Thread tsuna
In addition to what Ryan said, the fact that the default maximum number of versions for a cell is 3 doesn't mean that you end up wasting space. If you only ever write one version, that's all you end up paying for. -- Benoit "tsuna" Sigoure Software Engineer @ www.StumbleUpon.com
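
To make the point above concrete, here is a toy Python sketch of a cell store with a maximum-versions limit. It is not the HBase API: `ToyVersionedStore`, `put`, `get`, and `stored_versions` are hypothetical names invented for illustration, and the sketch trims old versions on every write purely for brevity. It only shows that a limit of 3 is a cap, not a fixed cost: a cell written once holds one version.

```python
class ToyVersionedStore:
    """Toy in-memory model of timestamped cell versions (NOT the HBase API)."""

    def __init__(self, max_versions=3):
        # Assumption for illustration: one limit shared by every cell.
        self.max_versions = max_versions
        # Maps (row, column) -> list of (timestamp, value), newest first.
        self._cells = {}

    def put(self, row, column, value, timestamp):
        """Add a new version; an 'overwrite' never modifies an older one."""
        versions = self._cells.setdefault((row, column), [])
        versions.append((timestamp, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)
        # Keep only the newest max_versions entries (simplified retention).
        del versions[self.max_versions:]

    def get(self, row, column, versions=1):
        """Return up to `versions` newest (timestamp, value) pairs."""
        return self._cells.get((row, column), [])[:versions]

    def stored_versions(self, row, column):
        """Number of versions currently kept for a cell."""
        return len(self._cells.get((row, column), []))


store = ToyVersionedStore(max_versions=3)

# Written once: only one version is stored, even though the limit is 3.
store.put("row1", "cf:page", "v1", timestamp=1)
once = store.stored_versions("row1", "cf:page")

# Written five times: storage is capped at the three newest versions.
for ts in range(2, 6):
    store.put("row1", "cf:page", f"v{ts}", timestamp=ts)
capped = store.stored_versions("row1", "cf:page")
latest = store.get("row1", "cf:page")
```

A plain read returns only the newest version, so a caller who never asks for older versions does not need to think about timestamps at all.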

Re: How is column timestamp useful?

2010-05-06 Thread Takayuki Tsunakawa
d disk storage space by storing only one version. Any opinion and information is appreciated. Regards Takayuki - Original Message - From: "Ryan Rawson" To: Sent: Friday, May 07, 2010 11:42 AM Subject: Re: How is column timestamp useful? > Have a look at the bigtable paper, it

Re: How is column timestamp useful?

2010-05-06 Thread Kevin Apte
The Hadoop philosophy is to deploy on low-cost disks and keep 3 copies of data for redundancy. This keeps costs very low, perhaps 5 to 10 times lower than what large enterprises are paying for expensive SAN configurations. This does not mean one needs to waste storage. If you store fil

Re: How is column timestamp useful?

2010-05-06 Thread Takayuki Tsunakawa
Hello, Sigoure-san > In addition to what Ryan said, even if the default maximum number of > versions for a cell is 3 doesn't mean that you end up wasting space. > If you only ever write one version, that's what you end up paying for. I expect so if the data is inserted and never updated. What if

Re: How is column timestamp useful?

2010-05-06 Thread Takayuki Tsunakawa
o not provide versioning. So I felt that many people do not have to use versioning and the default maximum versions of HBase had better be 1. Regards Takayuki - Original Message - From: "Kevin Apte" To: Sent: Friday, May 07, 2010 1:51 PM Subject: Re: How is column timestamp use

Re: How is column timestamp useful?

2010-05-06 Thread tsuna
こんにちは (Hello) :) On Thu, May 6, 2010 at 9:56 PM, Takayuki Tsunakawa wrote: > In use case 1, I don't understand why three versions of each web page > need to be saved, so this is not a helpful example. Because they want to be able to access multiple versions of the same web page. It's useful for various

Re: How is column timestamp useful?

2010-05-06 Thread Ryan Rawson
to use versioning and the default maximum versions of HBase had > better be 1. > > Regards > Takayuki > > > - Original Message - > From: "Kevin Apte" > To: > Sent: Friday, May 07, 2010 1:51 PM > Subject: Re: How is column timestamp useful? >

Re: How is column timestamp useful?

2010-05-06 Thread Kevin Apte
> better be > > Regards > Takayuki > > > - Original Message - > From: "Kevin Apte" > To: > Sent: Friday, May 07, 2010 1:51 PM > Subject: Re: How is column timestamp useful? > > > > Hadoop philosophy is to deploy on low cost disks and keep 3 cop

Re: How is column timestamp useful?

2010-05-06 Thread Ryan Rawson
SimpleDB, Microsoft Azure Table, and Google App Engine >> Datastore do not provide versioning. So I felt that many people do not >> have to use versioning and the default maximum versions of HBase had >> better be >> >> Regards >> Takayuki >> >> >

Re: How is column timestamp useful?

2010-05-06 Thread Takayuki Tsunakawa
All, Thank you for giving lots of opinions and information. I'll try to persuade my colleagues as follows: I couldn't find any good examples where versioning is definitely required. However, HBase community members gave me ideas on how versioning is useful. 1. Recover data lost by accid

Re: How is column timestamp useful?

2010-05-07 Thread tsuna
On Fri, May 7, 2010 at 12:03 AM, Takayuki Tsunakawa wrote: > If versioning is not necessary from your requirement, you can ignore > timestamps (do not have to specify timestamp in API call). Yes, it's actually recommended to not manually specify timestamps in API calls, particularly when insertin

RE: How is column timestamp useful?

2010-05-07 Thread Jonathan Gray
2:04 AM > To: hbase-user@hadoop.apache.org > Subject: Re: How is column timestamp useful? > > All, > > Thank you for giving lots of opinions and information. I'll try to > persuade my colleagues as follows: > > I couldn't find any good examples where ve

Re: How is column timestamp useful?

2010-05-09 Thread Takayuki Tsunakawa
From: "tsuna" > > If saving memory (=keep memtable as small as possible) is important, > > you can set the maximum number of versions to 1. > I don't think you'll be saving much memory anyway. As Ryan already > pointed out, when you overwrite a cell, a new version is created, > regardless of the

Re: How is column timestamp useful?

2010-05-09 Thread Takayuki Tsunakawa
Hello, Gray-san Thank you. Your explanation was helpful. Regards Takayuki - Original Message - From: "Jonathan Gray" To: Sent: Saturday, May 08, 2010 1:54 AM Subject: RE: How is column timestamp useful? I would argue that the primary reasons for versioning has nothing