Lars, But I see you're looking at 0.98. There indeed we are called > refreshAgeOfLastAppliedOp every time we call getStats(), which would > increase that metric even when there is nothing to ship. That looks like a > bug. >
yeah. that is what I am thinking. Let's have a quick fix on it. I will provide a patch later today. (was working on HBASE-9531 when the problem surfaces). Demai On Wed, Jul 30, 2014 at 10:41 AM, lars hofhansl <[email protected]> wrote: > I was also confusing ageOfLastShippedEdit with ageOfLastAppliedEdit. I had > fixed ageOfLastShippedEdit on the source in 0.94. > Looks like 0.94 is doing the right thing with ageOfLastAppliedEdit, but > 0.98+ is not. > > -- Lars > > ------------------------------ > *From:* lars hofhansl <[email protected]> > *To:* Demai Ni <[email protected]>; "[email protected]" < > [email protected]> > *Sent:* Wednesday, July 30, 2014 10:30 AM > > *Subject:* Re: meaning for AgeOfLastAppliedOp in Replication MetricsSink > > > When the 'the time an edit entered the system' doesn't change (in the > case > of no Sink Op enter for a period of time), the age will keep growing > since current time moving forward, which gives a false impression that > an edit sitting in the queue for very long time. isn't it? > > > I think until the item is shipped it should be counted as waiting. I.e. > the time this reports is the time between an edit entered the system and > the time it finally gets shipped to the replication > sink.refreshAgeOfLastAppliedOp() should only be called when something is > actually being shipped, not periodically (I just fixed that in 0.94 > HBASE-11143). > > > > But I see you're looking at 0.98. There indeed we are called > refreshAgeOfLastAppliedOp every time we call getStats(), which would > increase that metric even when there is nothing to ship. That looks like a > bug. > > -- Lars > > > > ________________________________ > > From: Demai Ni <[email protected]> > To: "[email protected]" <[email protected]>; lars hofhansl < > [email protected]> > Sent: Wednesday, July 30, 2014 8:49 AM > Subject: Re: meaning for AgeOfLastAppliedOp in Replication MetricsSink > > > > Lars, thanks for your input. > > > This metric indicates the time an edit sat in the "replication queue" > before it got replicated. > yeah, I am with you on this. > > > With that definition it is doing the right thing: Reporting current time - > the time an edit entered the system (it's WAL time) > > > When the 'the time an edit entered the system' doesn't change (in the > case of no Sink Op enter for a period of time), the age will keep growing > since current time moving forward, which gives a false impression that an > edit sitting in the queue for very long time. isn't it? > > > > > On Tue, Jul 29, 2014 at 10:54 PM, lars hofhansl <[email protected]> wrote: > > This metric indicates the time an edit sat in the "replication queue" > before it got replicated. > >With that definition it is doing the right thing: Reporting current time > - the time an edit entered the system (it's WAL time) > > > > > >-- Lars > > > > > > > >________________________________ > > From: Demai Ni <[email protected]> > >To: "[email protected]" <[email protected]> > >Sent: Tuesday, July 29, 2014 3:48 PM > >Subject: meaning for AgeOfLastAppliedOp in Replication MetricsSink > > > > > > > >hi, > > > >A quick question to clarify this AgeOfLastAppliedOp in MetricsSink.java. > I > >assume it is used as an indicator about how long for a Sink OP to be > >applied; but instead, it is more like to show how long since the last Sink > >OP applied > > > > /** > > * Set the age of the last applied operation > > * > > * @param timestamp The timestamp of the last operation applied. > > * @return the age that was set > > */ > > public long setAgeOfLastAppliedOp(long timestamp) { > > lastTimestampForAge = timestamp; > > long age = System.currentTimeMillis() - lastTimestampForAge; > > rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); > > return age; > > } > > > >In the following scenario: > >1) at 7:00am a sink op is applied, and the SINK_AGE_OF_LAST_APPLIED_OP is > >set for example 100ms; > >2) and then NO new Sink op occur. > >3) when a refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of > >return the 100ms, the AgeOfLastAppliedOp become 1hour + 100ms, which > >doesn't make sense, right? > > > >should we put a check for (lastTimestampForAge != timestamp) before > refresh > >the age? > > > >Thanks > > > >Demai > > >
