Interesting example for cell-level TTL, Michael. But one thing I want to say: in the above example, the versions for the corresponding CF should have been > 1. In such a case there won't be an issue with major compaction, right? When versions = 1, yes, it will give nondeterministic results.
-Anoop-

On Sun, Apr 19, 2015 at 6:59 PM, Michael Segel <michael_se...@hotmail.com> wrote:

> Actually I just thought of a better example…
>
> Credit Card Fraud detection.
> Imagine you’re being sent to work on a project out of the country.
> So suppose I head over across the pond and invade Europe. ;-P
>
> I would want the credit card companies to not weigh a foreign transaction
> heavily when determining fraud, so that if they know my location is in
> London, then spending $$ on a dinner in London is not fraud.
>
> So I call ahead and tell my bank I’m going to be in Europe for XXX months..
>
> > As to why you would want to TTL on a column that doesn’t always use a TTL?
> >
> > I used this example in a different post…
> >
> > Imagine you have a road link which has an attribute of speed.
> >
> > You could have construction, or variable speed limits.
> > So you would want to change the speed limit with a TTL.
> >
> > Or you’re a retailer and you’re offering a 20% discount on a product for
> > a limited time only?
> >
> > Sure, these are bad examples because in reality the database is a sink
> > and the application would manage these types of issues.
> >
> >> On Apr 18, 2015, at 12:23 AM, lars hofhansl <la...@apache.org> wrote:
> >>
> >> The formatting did not come out right. Lemme try again...
> >>
> >> Just came here to say that. From our (maybe not clearly enough) defined
> >> semantics this is how it should behave.
> >>
> >> It _is_ confusing, though, since compactions are - in a sense - just
> >> optimizations that run in the background to prevent the number of HFiles
> >> from being unbounded.
> >> In this case the schedule of the compactions influences the outcome.
> >>
> >> Note that even tombstone markers can be confusing. Here's another
> >> confusing example:
> >> 1. delete (r1, f1, q1, T2)
> >> 2. put (r1, f1, q1, v1, T1)
> >>
> >> If a compaction happens after #1 but before #2 the put will remain:
> >> delete
> >> compaction
> >> put (remains visible)
> >>
> >> If the compaction happens after #2 the put will be affected by the
> >> delete and hence removed:
> >> delete
> >> put
> >> compaction (will remove the put)
> >>
> >> Notice though that both of these examples _are_ a bit weird.
> >> Why would only a newer version of the cell have a TTL?
> >> Why would you date a delete into the future?
> >>
> >> -- Lars
> >>
> >> ________________________________
> >> From: lars hofhansl <la...@apache.org>
> >> To: "dev@hbase.apache.org" <dev@hbase.apache.org>
> >> Sent: Friday, April 17, 2015 10:18 PM
> >> Subject: Re: Nondeterministic outcome based on cell TTL and major compaction event order
> >>
> >> From: Sean Busbey <bus...@cloudera.com>
> >> To: dev <dev@hbase.apache.org>
> >> Sent: Friday, April 17, 2015 4:52 PM
> >> Subject: Re: Nondeterministic outcome based on cell TTL and major compaction event order
> >>
> >> If you have max versions set to 1 (the default), then c1 should be removed
> >> at compaction time if c2 still exists then.
> >>
> >> --
> >> Sean
> >>
> >> On Apr 17, 2015 6:41 PM, "Michael Segel" <michael_se...@hotmail.com> wrote:
> >>
> >>> Ok,
> >>> So then if you have a previous cell (c1) and you insert a new cell c2 that
> >>> has a TTL of let’s say 5 mins, then c1 should always exist?
> >>> That is my understanding but from Cosmin’s post, he’s saying it’s
> >>> different. And that’s why I don’t understand. You couldn’t lose the cell
> >>> c1 at all.
> >>> Compaction or no compaction.
> >>>
> >>> That’s why I’m confused. Current behavior doesn’t match the expected
> >>> contract.
> >>>
> >>> -Mike
> >>>
> >>>> On Apr 17, 2015, at 4:37 PM, Andrew Purtell <apurt...@apache.org> wrote:
> >>>>
> >>>> The way TTLs work today is they define the interval of time a cell
> >>>> exists - exactly as that. There is no tombstone laid like a normal
> >>>> delete. Once the TTL elapses the cell just ceases to exist to normal
> >>>> scanners. The interaction of expired cells, multiple versions, minimum
> >>>> versions, raw scanners, etc. can be confusing. We can absolutely
> >>>> revisit this.
> >>>>
> >>>> A cell with an expired TTL could be treated as the combination of
> >>>> tombstone and the most recent value it lays over. This is not how the
> >>>> implementation works today, but could be changed for an upcoming major
> >>>> version like 2.0 if there's consensus to do it.
> >>>>
> >>>>> On Apr 10, 2015, at 7:26 AM, Cosmin Lehene <cleh...@adobe.com> wrote:
> >>>>>
> >>>>> I've been initially puzzled by this, although I realize how it's likely
> >>>>> as designed.
> >>>>>
> >>>>> The cell TTL expiration and compaction events can lead to either some
> >>>>> (the older) data left or no data at all for a particular (row, family,
> >>>>> qualifier, ts) coordinate.
> >>>>>
> >>>>> Write (r1, f1, q1, v1, 1)
> >>>>> Write (r1, f1, q1, v1, 2) - TTL=1 minute
> >>>>>
> >>>>> Scenario 1:
> >>>>>
> >>>>> If a major compaction happens within a minute
> >>>>> it will remove (r1, f1, q1, v1, 1)
> >>>>> then after a minute (r1, f1, q1, v1, 2) will expire
> >>>>> no data left
> >>>>>
> >>>>> Scenario 2:
> >>>>>
> >>>>> A minute passes
> >>>>> (r1, f1, q1, v1, 2) expires
> >>>>> Compaction runs..
> >>>>> (r1, f1, q1, v1, 1) remains
> >>>>>
> >>>>> This seems, by and large, expected behavior, but it still seems
> >>>>> "uncomfortable" that the (overall) outcome is not decided by me, but by a
> >>>>> chance of event ordering.
> >>>>>
> >>>>> I wonder if we'd want this to behave differently (perhaps it has been
> >>>>> discussed already), but if not, it's worth more detailed documentation in
> >>>>> the book.
> >>>>>
> >>>>> What do you think?
> >>>>>
> >>>>> Cosmin
> >>>>
> >>>> --
> >>>> Best regards,
> >>>>
> >>>> - Andy
> >>>>
> >>>> Problems worthy of attack prove their worth by hitting back. - Piet
> >>>> Hein (via Tom White)
> >>>
> >>> The opinions expressed here are mine, while they may reflect a cognitive
> >>> thought, that is purely accidental.
> >>> Use at your own risk.
> >>> Michael Segel
> >>> michael_segel (AT) hotmail.com
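
For anyone who wants to poke at the two orderings from Cosmin's mail, here is a toy Python model of the behavior described in this thread. It is not HBase code: `Cell`, `unexpired`, `major_compact`, `read_latest` and `run` are names made up for this sketch, the values "v1"/"v2" are only there to tell the two cells apart, times are plain seconds, and the rule "a major compaction drops expired cells and then keeps the newest max_versions of what is left" is an assumption taken from the descriptions above, not from the actual compaction code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    ts: int                           # version timestamp of the cell
    value: str
    expires_at: Optional[int] = None  # second at which the TTL elapses; None = no TTL


def unexpired(cells: List[Cell], now: int) -> List[Cell]:
    # A cell whose TTL has elapsed simply stops existing; no tombstone is written.
    return [c for c in cells if c.expires_at is None or now < c.expires_at]


def major_compact(cells: List[Cell], now: int, max_versions: int) -> List[Cell]:
    # ASSUMPTION: drop expired cells, then keep only the newest max_versions.
    newest_first = sorted(unexpired(cells, now), key=lambda c: c.ts, reverse=True)
    return newest_first[:max_versions]


def read_latest(cells: List[Cell], now: int) -> Optional[str]:
    # What a normal (non-raw) read would see: the newest unexpired value, if any.
    visible = unexpired(cells, now)
    return max(visible, key=lambda c: c.ts).value if visible else None


def run(compact_at: int, max_versions: int) -> Optional[str]:
    # Two writes at t=0: ts=1 without TTL, ts=2 with a 60 second TTL.
    store = [Cell(1, "v1"), Cell(2, "v2", expires_at=60)]
    store = major_compact(store, now=compact_at, max_versions=max_versions)
    return read_latest(store, now=120)  # read well after the TTL has elapsed


print(run(compact_at=30, max_versions=1))  # Scenario 1: None (no data left)
print(run(compact_at=61, max_versions=1))  # Scenario 2: v1
print(run(compact_at=30, max_versions=2))  # v1
print(run(compact_at=61, max_versions=2))  # v1
```

Under these assumptions, max versions = 1 gives a different answer depending on whether the compaction runs before or after the TTL elapses, while with two versions both orderings leave v1 visible, which is the point about versions > 1 at the top of this mail.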