Hi Guojun,

Thanks for the clarification.

Best Regards
Jing


On Tue, Jun 27, 2023 at 9:19 PM Guojun Li <[email protected]> wrote:

> Hi Jing,
>
> > 1.  What is the relationship between Metrics and ***Metrics?
>
> Yes, you're right. Metrics is used to store the reporters and metric
> groups, and is responsible for reporting metrics to the external backend
> periodically. ***Metrics is a grouping of corresponding metric set, we'll
> define and init the metrics in it.
>
> > 2. There is no group hierarchy, i.e. sub-groups. Will it be supported
> later or will we stick to the flat group design in the future?
>
> In my opinion, the flat metric group is enough for Paimon. I don't see the
> need for current designed Paimon metrics, like CommitMetrics / ScanMetrics
> / CompactionMetrics are flat, which is not complicated as Flink metrics
> system, so we simplified the design of MetricGroup. How do you think?
>
> Best,
> Guojun
>
> On Mon, Jun 26, 2023 at 11:24 PM jing ge <[email protected]> wrote:
>
> > Hi Guojun,
> >
> > Sorry for the late reply and thanks for the proposal. The PIP looks good
> to
> > me. I just have two questions to make sure we are on the same page.
> >
> > 1. What is the relationship between Metrics and ***Metrics? It seems that
> > e.g. CommitMetrics is only responsible for built-in metrics write/init
> > while Metrics is used for read and update access centrally for all
> > metricGroups and metrics, i.e. getter methods. Did I understand
> correctly?
> > 2. Current group design is flat. There is no group hierarchy, i.e.
> > sub-groups. Will it be supported later or will we stick to the flat group
> > design in the future?
> >
> > Best Regards
> > Jing
> >
> >
> > On Sun, Jun 25, 2023 at 10:33 AM Guojun Li <[email protected]>
> > wrote:
> >
> > > Hi, devs,
> > >
> > > Thank you for your feedback on PIP-3[1], I'd like to start a vote on it
> > > since the discussion on this thread[2] came to an agreement.
> > >
> > > The vote will last for at least 72 hours (Jun 29th, 2:00 GMT, excluding
> > > weekend days) unless there is an objection or insufficient vote.
> > >
> > > Best,
> > > Guojun
> > >
> > > [1]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/PAIMON/PIP-3%3A+Introduce+metrics+for+Paimon
> > > [2]https://lists.apache.org/thread/h2lgqgrvcptzj3c8q4cjzv5jopgtwx9o
> > >
> > > On Wed, Jun 21, 2023 at 5:53 PM Caizhi Weng <[email protected]>
> > wrote:
> > >
> > > > Hi Guojun!
> > > >
> > > > I've checked the latest PIP-3 and it now looks fine to me overall.
> > > Although
> > > > there are still some minor issues, for example it is unclear to me
> how
> > > > MetricGroup is created and is tagged. But this detail can be further
> > > > discussed during the implementation. From my point of view this PIP
> is
> > > now
> > > > fine enough to go on a vote.
> > > >
> > > > Guojun Li <[email protected]> 于2023年6月21日周三 15:23写道:
> > > >
> > > > > Hi caizhi,
> > > > >
> > > > > I've updated APIs in PIP3:
> > > > > 1. Instance of Metrics class is a singleton which maintains the
> > > reporters
> > > > > and Metric groups.
> > > > > 2. Instantiation of MetricGroup will add the current group to
> Metrics
> > > > > instance.
> > > > > 3. The metrics list instance, like CommitMetrics instance, will
> > > maintain
> > > > > the corresponding metricGroup, and can register metrics by the
> > gauge(),
> > > > > counter() ... method.
> > > > >
> > > > > Best,
> > > > > Guojun
> > > > >
> > > > > On Mon, Jun 19, 2023 at 11:30 AM Caizhi Weng <[email protected]
> >
> > > > wrote:
> > > > >
> > > > > > Hi!
> > > > > >
> > > > > > 1. Could you add some demonstration code about how to create and
> > use
> > > a
> > > > > > metric? You CommitMetrics code example uses Metrics#gauge method,
> > but
> > > > > > Metrics class does not have such method.
> > > > > >
> > > > > > 2. You've added Metrics#registerMetrics(String name, Metric
> metric,
> > > int
> > > > > > groupIndex), which seems really weird to me. When registering a
> > > metric
> > > > we
> > > > > > have to remember the index of the group? Why not providing the
> > group
> > > > > > directly, or just register the metric in the group?
> > > > > >
> > > > > >
> > > > > > Caizhi Weng <[email protected]> 于2023年6月16日周五 16:01写道:
> > > > > >
> > > > > > > Hi Guojun!
> > > > > > >
> > > > > > > I see that you've submitted the 3rd edition of PIP-3. Most
> > problems
> > > > > have
> > > > > > > been resolved, however I still have these doubts:
> > > > > > >
> > > > > > > 1. Is `Metrics` class a singleton? If yes, it should contain
> > > multiple
> > > > > > > `MetricGroup` as its member. Currently each `Metrics` class has
> > > only
> > > > > one
> > > > > > > `MetricGroup`. If it is not a singleton, how many instances
> will
> > > > there
> > > > > > be?
> > > > > > > When and where are the instances created?
> > > > > > > 2. In the `MetricGroup` class you support adding tags to one
> > single
> > > > > > > metric. In my opinion, tags should be added to groups, not
> > metrics.
> > > > > > Metrics
> > > > > > > created from the same group should have the same set of tags,
> so
> > > the
> > > > > > > creator of metrics do not need to concern about what tags to
> add.
> > > > > > >
> > > > > > > Guojun Li <[email protected]> 于2023年6月14日周三 07:15写道:
> > > > > > >
> > > > > > >> Yes, thanks Caizhi, it looks very nice.
> > > > > > >>
> > > > > > >> I'm going to make the metrics name more reasonable. Metrics
> > group
> > > is
> > > > > > also
> > > > > > >> a
> > > > > > >> good idea to take into account.
> > > > > > >>
> > > > > > >> On Thu, Jun 8, 2023 at 4:50 PM Caizhi Weng <
> > [email protected]>
> > > > > > wrote:
> > > > > > >>
> > > > > > >> > Hi Guojun!
> > > > > > >> >
> > > > > > >> > I've read the modified document. I still have the following
> > > > > concerns:
> > > > > > >> >
> > > > > > >> > 1. Some metric names are still misleading. For example,
> > > > > > >> "totalTablesFiles"
> > > > > > >> > is the "number of total data files currently maintained on
> > > > storage",
> > > > > > >> > however "numTotalManifests" is the "number of scanned
> > manifests
> > > > > files
> > > > > > in
> > > > > > >> > the last scan planning", not the total number of manifest
> > files
> > > on
> > > > > > >> storage.
> > > > > > >> > Please keep metric names consistent, even across different
> > > metric
> > > > > > types.
> > > > > > >> >
> > > > > > >> > 2. There is no grouping in the current design. Let's
> consider
> > > all
> > > > > > >> metrics
> > > > > > >> > related to compaction. As one parallelism can hold multiple
> > > > writers,
> > > > > > it
> > > > > > >> is
> > > > > > >> > very likely that compactions for different buckets may
> happen
> > in
> > > > the
> > > > > > >> same
> > > > > > >> > parallelism. How are you going to distinguish the same
> > > compaction
> > > > > > metric
> > > > > > >> > for different buckets? Flink has metric groups which
> separate
> > > > > > different
> > > > > > >> > subtasks of the same operator.
> > > > > > >> >
> > > > > > >> > Guojun Li <[email protected]> 于2023年6月2日周五 18:25写道:
> > > > > > >> >
> > > > > > >> > > Thanks caizhi for the detailed review.
> > > > > > >> > >
> > > > > > >> > > I've updated the pip-3 based on your comments, they're
> > useful
> > > to
> > > > > > >> improve
> > > > > > >> > > this proposal and make it clear.
> > > > > > >> > > 1. I added the Histogram and HistogramStatistic
> interfaces.
> > > > > > >> > > 2. Added histogram metrics to measure the distribution of
> > > > commit /
> > > > > > >> scan /
> > > > > > >> > > compaction duration.
> > > > > > >> > > 3. Update the descriptions for some metrics, make them
> more
> > > > clear.
> > > > > > >> > >
> > > > > > >> > > And any other ideas from devs? Looking forward to your
> > > feedback.
> > > > > > >> > >
> > > > > > >> > > Best,
> > > > > > >> > > Guojun
> > > > > > >> > >
> > > > > > >> > > On Mon, May 29, 2023 at 1:51 PM Guojun Li <
> > > > > [email protected]>
> > > > > > >> > wrote:
> > > > > > >> > >
> > > > > > >> > > > Thanks Caizhi for your comments. I'll think about
> > > introducing
> > > > > > >> Histogram
> > > > > > >> > > > metric type and making the metrics description more
> clear.
> > > > > > >> > > >
> > > > > > >> > > > Best,
> > > > > > >> > > > Guojun
> > > > > > >> > > >
> > > > > > >> > > > On Thu, May 25, 2023 at 7:13 PM Caizhi Weng <
> > > > > [email protected]
> > > > > > >
> > > > > > >> > > wrote:
> > > > > > >> > > >
> > > > > > >> > > >> I also forgot to mention that Flink metrics [1] is a
> good
> > > > > > >> reference.
> > > > > > >> > It
> > > > > > >> > > >> clearly describes what Gauge, Counter and Histogram
> > should
> > > do
> > > > > and
> > > > > > >> has
> > > > > > >> > a
> > > > > > >> > > >> list of metrics with clear definition. For example they
> > > have
> > > > a
> > > > > > >> metric
> > > > > > >> > > >> named
> > > > > > >> > > >> "lastCheckpointDuration" (which is a Gauge) and not
> just
> > > > > > >> > > >> "checkpointDuration".
> > > > > > >> > > >>
> > > > > > >> > > >> [1]
> > > > > > >> > > >>
> > > > > > >> >
> > > > > >
> > > https://nightlies.apache.org/flink/flink-docs-master/docs/ops/metrics/
> > > > > > >> > > >>
> > > > > > >> > > >> Caizhi Weng <[email protected]> 于2023年5月25日周四
> > 19:10写道:
> > > > > > >> > > >>
> > > > > > >> > > >> > Hi Guojun!
> > > > > > >> > > >> >
> > > > > > >> > > >> > Thanks for raising this discussion. After reading the
> > > > > document,
> > > > > > >> I'd
> > > > > > >> > > like
> > > > > > >> > > >> > to raise a few opinions.
> > > > > > >> > > >> >
> > > > > > >> > > >> >
> > > > > > >> > > >> >
> > > > > > >> > > >>
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
> --------------------------------------------------------------------------------
> > > > > > >> > > >> >
> > > > > > >> > > >> > 1. About the overall types of metrics.
> > > > > > >> > > >> >
> > > > > > >> > > >> > You suggested two types of metrics, namely Gauge and
> > > > > Counter. A
> > > > > > >> > gauge
> > > > > > >> > > >> > indicates the value of a metric at that instance, for
> > > > example
> > > > > > >> "the
> > > > > > >> > > >> duration
> > > > > > >> > > >> > of the last compaction". A counter indicates an
> > > accumulated
> > > > > > value
> > > > > > >> > of a
> > > > > > >> > > >> > metric overtime, for example "the total number of
> > records
> > > > > > >> written".
> > > > > > >> > > >> >
> > > > > > >> > > >> > However there are times when we want to study the
> > history
> > > > and
> > > > > > see
> > > > > > >> > the
> > > > > > >> > > >> > trend. For example if we notice that the last
> > compaction
> > > is
> > > > > > >> slower
> > > > > > >> > > than
> > > > > > >> > > >> > normal, we might want to know "the average duration
> of
> > > the
> > > > > last
> > > > > > >> few
> > > > > > >> > > >> > compactions" to make sure that this slow down is just
> > by
> > > > > chance
> > > > > > >> or
> > > > > > >> > is
> > > > > > >> > > a
> > > > > > >> > > >> > common case. Both gauge and counter cannot meet this
> > > need.
> > > > > What
> > > > > > >> we
> > > > > > >> > > need
> > > > > > >> > > >> is
> > > > > > >> > > >> > another type of metrics called Histogram. This
> metrics
> > > type
> > > > > > >> should
> > > > > > >> > be
> > > > > > >> > > >> able
> > > > > > >> > > >> > to record the last few values of a metric and
> calculate
> > > > their
> > > > > > >> > > statistics
> > > > > > >> > > >> > such as average, min/max and more.
> > > > > > >> > > >> >
> > > > > > >> > > >> >
> > > > > > >> > > >> >
> > > > > > >> > > >>
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
> --------------------------------------------------------------------------------
> > > > > > >> > > >> >
> > > > > > >> > > >> > 2. About the types of some specific metrics.
> > > > > > >> > > >> >
> > > > > > >> > > >> > It seems to me that in your proposal, you cannot
> > clearly
> > > > tell
> > > > > > the
> > > > > > >> > > >> > difference between Gauges and Counters, and sometimes
> > it
> > > is
> > > > > not
> > > > > > >> > clear
> > > > > > >> > > >> that
> > > > > > >> > > >> > what a specific metric means. Let me give you some
> > > example.
> > > > > > >> > > >> >
> > > > > > >> > > >> > Metric Name: commitDuration
> > > > > > >> > > >> >
> > > > > > >> > > >> > Description:  Commit
> > > > > > >> > > >> >
> > > > > > >> > > >> > Type: Gauge
> > > > > > >> > > >> >
> > > > > > >> > > >> > Update at: Timer starts before commit starting,
> update
> > > > commit
> > > > > > >> > duration
> > > > > > >> > > >> >> after commit finished
> > > > > > >> > > >> >
> > > > > > >> > > >> >
> > > > > > >> > > >> > At first glance, "Commit" is not a valid description.
> > > From
> > > > > the
> > > > > > >> > metric
> > > > > > >> > > >> name
> > > > > > >> > > >> > I guess you'd like to record the length of duration
> for
> > > > > > >> committing
> > > > > > >> > > >> > snapshots. As its type is a Gauge, I suppose you want
> > to
> > > > > record
> > > > > > >> the
> > > > > > >> > > >> *last*
> > > > > > >> > > >> > commit. Please make things clear what exactly you
> want
> > to
> > > > > > record.
> > > > > > >> > And
> > > > > > >> > > >> as I
> > > > > > >> > > >> > said in the last section, it might be better to
> > > introduce a
> > > > > > >> > Histogram
> > > > > > >> > > >> type
> > > > > > >> > > >> > so that we can study the average duration of the last
> > few
> > > > > > >> commits.
> > > > > > >> > > >> >
> > > > > > >> > > >> > Metric Name: numTableFilesAdded
> > > > > > >> > > >> >
> > > > > > >> > > >> > Description: Number of added table files in this
> commit
> > > > > > >> > > >> >
> > > > > > >> > > >> > Type: Counter
> > > > > > >> > > >> >
> > > > > > >> > > >> > Update at: Collecting changes from committables
> > > > > > >> > > >> >
> > > > > > >> > > >> >
> > > > > > >> > > >> > By "this commit" I suppose you mean *last* commit. If
> > you
> > > > the
> > > > > > >> type
> > > > > > >> > > >> should
> > > > > > >> > > >> > be a Gauge, not a Counter. By using Counter you may
> > want
> > > to
> > > > > > >> record
> > > > > > >> > the
> > > > > > >> > > >> *total
> > > > > > >> > > >> > number* of files added during all commits.
> > > > > > >> > > >> >
> > > > > > >> > > >> > Metric Name: numFilesCompactedBefore
> > > > > > >> > > >> >
> > > > > > >> > > >> > Description: Number of deleted files in compaction
> > > > > > >> > > >> >
> > > > > > >> > > >> > Type: Counter
> > > > > > >> > > >> >
> > > > > > >> > > >> > Update at: Triggering compaction
> > > > > > >> > > >> >
> > > > > > >> > > >> >
> > > > > > >> > > >> > It is also unclear to me what you're going to record.
> > Are
> > > > you
> > > > > > >> going
> > > > > > >> > > the
> > > > > > >> > > >> > record the number of deleted files during the *last*
> > > > > > compaction?
> > > > > > >> Or
> > > > > > >> > > >> you'd
> > > > > > >> > > >> > like to record the *total number* of deleted files
> > during
> > > > all
> > > > > > >> > > >> compaction?
> > > > > > >> > > >> >
> > > > > > >> > > >> > I'm afraid most of your proposed metrics seem unclear
> > to
> > > > me.
> > > > > > >> Please
> > > > > > >> > > >> > rewrite the proposal to make them clear.
> > > > > > >> > > >> >
> > > > > > >> > > >> > Guojun Li <[email protected]> 于2023年5月25日周四
> > > 16:59写道:
> > > > > > >> > > >> >
> > > > > > >> > > >> >> Hi, Paimon Devs,
> > > > > > >> > > >> >>      I’d like to start a discussion about PIP-3[1].
> > > > > > >> > > >> >>      In this PIP, I'm talking about how to support
> > > metrics
> > > > > for
> > > > > > >> > > paimon,
> > > > > > >> > > >> >> what
> > > > > > >> > > >> >> metrics paimon needs. Look forward to your question
> > and
> > > > > > >> > suggestions.
> > > > > > >> > > >> >>
> > > > > > >> > > >> >> Best,
> > > > > > >> > > >> >> Guojun
> > > > > > >> > > >> >>
> > > > > > >> > > >> >> [1]
> > > > > > >> > > >> >>
> > > > > > >> > > >> >>
> > > > > > >> > > >>
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/PAIMON/PIP-3%3A+Introduce+metrics+for+Paimon
> > > > > > >> > > >> >>
> > > > > > >> > > >> >
> > > > > > >> > > >>
> > > > > > >> > > >
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to