Bigtable has versions.  Does the Bigtable paper actually describe
what happens when two identical keys are inserted at different times
and the table is configured to keep two versions?  If those keys ended
up in two separate map files/sstables, then something would have to
decide whether to suppress one of them.  I am not sure the Bigtable
paper got that specific.  You could suppress one of the keys, or just
consider them to be two versions.  We have been considering them to be
versions.
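
To make the two options concrete, here is a rough Python sketch. It is
not Accumulo or Bigtable code: the (row, column, timestamp) key layout,
the merge_files name, and the suppress_identical flag are all made up
for illustration, and real keys also carry colfam/colqual/colvis. It
merges already-sorted files and keeps at most max_versions entries per
cell, either dropping exact duplicate keys or counting them as versions.

```python
import heapq
from typing import Iterable, List, Tuple

# Hypothetical, simplified key: (row, column, timestamp).
Key = Tuple[str, str, int]
Entry = Tuple[Key, bytes]


def sort_key(entry: Entry) -> Tuple[str, str, int]:
    """Order by row, then column, then newest timestamp first."""
    (row, col, ts), _ = entry
    return (row, col, -ts)


def merge_files(files: Iterable[List[Entry]], max_versions: int,
                suppress_identical: bool) -> List[Entry]:
    """Merge sorted files, keeping up to max_versions entries per cell.

    Each input file must already be sorted by sort_key. heapq.merge is
    stable, so among identical keys the entry from the earlier file in
    `files` comes first.
    """
    out: List[Entry] = []
    prev_cell = None   # (row, column) of the previous entry
    prev_key = None    # full key of the previous entry
    kept = 0           # entries emitted so far for the current cell
    for key, value in heapq.merge(*files, key=sort_key):
        cell = key[:2]
        if cell != prev_cell:
            prev_cell, kept = cell, 0
        if suppress_identical and key == prev_key:
            continue   # option 1: an identical key is a duplicate, drop it
        prev_key = key
        if kept < max_versions:
            out.append((key, value))   # option 2: it counts as a version
            kept += 1
    return out


# The same key written twice, landing in two different files.
newer_file = [(("r1", "c", 5), b"second write")]
older_file = [(("r1", "c", 5), b"first write")]

as_versions = merge_files([newer_file, older_file], 2, False)
suppressed = merge_files([newer_file, older_file], 2, True)
```

With suppress_identical=False both writes come back as two versions,
which is the behavior we have been going with. With it set to True only
the entry from the first file listed survives, so the caller has to
decide which file wins.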

On Thu, Dec 22, 2011 at 4:20 PM, Aaron Cordova <[email protected]> wrote:
> _You_ can think of it that way, cause you're Adam Fuchs, distributed database 
> expert extraordinaire, but that's not how the BigTable data model was 
> described by the original authors - "BigTable is a sparse, sorted, 
> distributed, multidimensional map", and most users do understand Accumulo to 
> be a map of keys to values where the keys are made up of a row, colfam, 
> colqual, colvis, and timestamp and the values are arbitrary byte arrays.
>
> To start explaining to people that Accumulo is a multi-map, or to actually 
> make it into a multi-map (i.e. allowing identical keys, where a key includes 
> the timestamp), would be a mistake, in my opinion.
>
>
> On Dec 22, 2011, at 4:09 PM, Adam Fuchs wrote:
>
>> Sorry, I thought we were talking about users' perceptions of semantics.
>> Bigtable also supports holding multiple versions of key/value pairs, so it
>> can be thought of as having an underlying multi-map as well.
>>
>> Adam
>>
>>
>> On Thu, Dec 22, 2011 at 4:04 PM, Aaron Cordova <[email protected]> wrote:
>>
>>>
>>> On Dec 22, 2011, at 4:00 PM, Adam Fuchs wrote:
>>>
>>>> Timestamp doesn't usually make
>>>> it into the uniqueness concept, from a user's perspective, even though
>>>> that affects the sort order of Keys. In fact, most users let Accumulo
>>>> set the
>>>> timestamp for them. I think your definition of uniqueness takes timestamp
>>>> into account, and from that perspective what we're doing is sort of like
>>>> providing a finer grained timestamp instead of using one timestamp for an
>>>> entire Mutation (or for all Mutations that show up within a millisecond).
>>>
>>> Timestamps do define separate keys. This is not just my definition - this
>>> is in the BigTable design as well as HBase's, and likely every other
>>> BigTable clone.
>>>
>>>
>>>
>
