Hello everyone and happy holidays,

The changes below are ready for review!
Benchmarks are also inside.

Expose all table metrics in virtual tables
https://issues.apache.org/jira/browse/CASSANDRA-14572
https://github.com/apache/cassandra/pull/2958/files
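
For anyone new to the thread, the iterative approach from point 3 of the
message quoted below can be sketched roughly as follows. This is only an
illustration, not the code from the PR: LazyRowsSketch, materialize and
iterate are hypothetical names, and rows are plain Java objects rather
than real virtual table rows.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical names throughout; this is not the API from the PR.
final class LazyRowsSketch {

    // Eager variant: every source element is converted to a row and kept
    // on the heap before the filter is applied.
    static <T, R> List<R> materialize(Iterable<T> source, Function<T, R> toRow, Predicate<R> filter) {
        List<R> all = new ArrayList<>();
        for (T item : source) {
            all.add(toRow.apply(item));
        }
        List<R> result = new ArrayList<>();
        for (R row : all) {
            if (filter.test(row)) {
                result.add(row);
            }
        }
        return result;
    }

    // Lazy variant: elements are converted one at a time, only when the
    // consumer asks for the next row, so at most one row is buffered.
    static <T, R> Iterator<R> iterate(Iterable<T> source, Function<T, R> toRow, Predicate<R> filter) {
        Iterator<T> it = source.iterator();
        return new Iterator<R>() {
            private R pending;
            private boolean ready;

            @Override
            public boolean hasNext() {
                while (!ready && it.hasNext()) {
                    R candidate = toRow.apply(it.next());
                    if (filter.test(candidate)) {
                        pending = candidate;
                        ready = true;
                    }
                }
                return ready;
            }

            @Override
            public R next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                ready = false;
                R row = pending;
                pending = null;
                return row;
            }
        };
    }

    public static void main(String[] args) {
        List<String> metrics = List.of("ks1.ReadLatency", "ks1.WriteLatency", "ks2.ReadLatency");
        Iterator<String> rows = iterate(metrics, String::toLowerCase, r -> r.startsWith("ks1."));
        while (rows.hasNext()) {
            System.out.println(rows.next());
        }
    }
}
```

The lazy variant only does work for the rows that are actually pulled,
which is the property that matters for a large table such as
metrics_keyspace.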

On Tue, 12 Dec 2023 at 22:05, Maxim Muzafarov <mmu...@apache.org> wrote:
>
> Hello everyone,
>
>
> I still think Cassandra will benefit from having this idea implemented
> and used through the source code, so I've done another round of
> rethinking this concept and it seems I've found a solution. As a
> result, we can significantly reduce the cost of implementing and
> maintaining both new and existing virtual tables and make our users
> happier by seeing everything they need through virtual tables.
>
> So, I think we should limit the scope of the original proposal to the 
> following:
> ## A framework for exposing any internal data collection to virtual
> tables ONLY. ##
>
> As a proof of concept, I took the CASSANDRA-14572 "Expose all table
> metrics in virtual table" JIRA ticket, which provides a good
> opportunity to demonstrate how to export all metrics to VTs at once
> without having boilerplate implementations. Currently, we already have
> CQLMetricsTable, BatchMetricsTable, etc. that expose metrics to VTs in
> a pretty similar way, yet most of the metric groups located under
> the org.apache.cassandra.metrics package still lack a representation
> as VTs. I've used the MetricRegistry collection as a view of
> registered metrics to export them to VTs using the prototype.
>
> The prototype is complete. You can run a node locally and check the
> available virtual tables with cqlsh, or you can check the changes
> using the following link to the PR:
> https://github.com/apache/cassandra/pull/2958/files
>
>
> Below are some key points about the design itself:
>
> 1. All new virtual tables with metrics have "metric" as a prefix so
> that they are fairly easy to find using TAB on the cqlsh command line.
> Metrics are split into virtual tables as they are listed in the
> org.apache.cassandra.metrics package, e.g. metrics_cql, metrics_tcm etc. In
> addition, they are also grouped by metric type e.g.
> metric_type_histogram, metric_type_counter etc. There is a table
> called "metric_all_metric_groups" with all available metric groups.
>
> 2. To create a new virtual table representation of an internal
> collection a developer needs to do two things: create a virtual table
> row representation, and register it using
> CollectionVirtualTableAdapter, which acts as an adapter between
> internal data and a virtual table. Here's how I did it for the thread
> pools VT; this is a fully backward-compatible change:
> https://github.com/apache/cassandra/pull/2958/files#diff-5fda13a633723cdf232bba465e6fb7ab74cdc02f7ba55dae4d1cf494ffb71abaR61
>
> 3. The "metrics_keyspace" virtual table ended up being quite large
> since it contains all the metrics for all available keyspaces on a
> local node, so the default implementation provided by
> AbstractVirtualTable is not suitable for the proposal. The
> AbstractVirtualTable materializes a full data collection on the heap
> using SimpleDataSet, regardless of the portion of data that is being
> queried. In this case, we have to use an iterative approach, as the
> CollectionVirtualTableAdapter does (the problem was discussed in
> CASSANDRA-14629 and is now a part of the solution). This also helps to
> keep the memory footprint low.
>
> 4. Another valuable change is the CassandraMetricsRegistry itself. The
> problem here is that the metrics and their aliases are currently
> exported to JMX, but the implemented virtual tables export the metrics
> in their own way and in most cases don't respect the metric aliases
> that are registered in the MetricsRegistry. This should be fixed as
> part of CASSANDRA-14572 to avoid ambiguity for all known metrics
> once and for all.
>
> Here are the links to the issue and the PR:
> https://issues.apache.org/jira/browse/CASSANDRA-14572
> https://github.com/apache/cassandra/pull/2958/files
>
>
> I'm excited about how these changes look right now, so please share
> your feedback and thoughts.
> The PR still lacks good test coverage; I'll fix that as soon as we have a
> clear vision of the design (or much sooner) :-)
>
> On Mon, 30 Jan 2023 at 17:43, David Capwell <dcapw...@apple.com> wrote:
> >
> > I *think* this is arguably true for a vtable / CQL-based solution as well 
> > from the "you don't know how people are using your API" perspective.
> >
> >
> > Very fair point, and I think that justifies a different thread to talk about
> > backwards compatibility for our tables (virtual and not); maybe we can lump 
> > together the JMX backwards compatibility conversation as well in that new 
> > thread.
> >
> > On Jan 28, 2023, at 4:09 PM, Josh McKenzie <jmcken...@apache.org> wrote:
> >
> > First off - thanks so much for putting in this effort Maxim! This is 
> > excellent work.
> >
> > Some thoughts on the CEP and responses in thread:
> >
> > Considering that JMX is usually not used and disabled in production 
> > environments for various performance and security reasons, the operator may 
> > not see the same picture from various of Dropwizard's metrics exporters and 
> > integrations as Cassandra's JMX metrics provide [1][2].
> >
> > I don't think this assertion is true. Cassandra is running in a lot of 
> > places in the world, and JMX has been in this ecosystem for a long time; we 
> > need data that is basically impossible to get to claim "JMX is usually not 
> > used in C* environments in prod".
> >
> > I also wonder if we should care about JMX?  I know many wish to
> > migrate (it's going to be a very long time) away from JMX, so do we need a
> > wrapper to make JMX and vtables consistent?
> >
> > If we can move away from a bespoke vtable or JMX based implementation and 
> > instead have a templatized solution each of these is generated from, that 
> > to me is the superior option. There's little harm in adding new JMX 
> > endpoints (or hell, other metrics framework integration?) as a byproduct of 
> > adding new vtable exposed metrics; we have the same maintenance obligation 
> > to them as we have to the vtables and if it generates from the same base 
> > data, we shouldn't have any further maintenance burden due to its presence 
> > right?
> >
> > we wish to move away from JMX
> >
> > I do, and you do, and many people do, but I don't believe all people on the 
> > project do. The last time this came up in slack the conclusion was "Josh 
> > should go draft a CEP to chart out a path to moving off JMX while 
> > maintaining backwards-compat w/existing JMX metrics for environments that 
> > are using them" (so I'm excited to see this CEP pop up before I got to it! 
> > ;)). Moving to a system that gives us a 0-cost way to keep JMX and vtable 
> > in sync over time on new metrics seems like a nice compromise for folks 
> > that have built out JMX-based maintenance infra on top of C*. Plus removing 
> > the boilerplate toil on vtables. win-win.
> >
> > If we add a column to the end of the JMX row did we just break users?
> >
> > I *think* this is arguably true for a vtable / CQL-based solution as well 
> > from the "you don't know how people are using your API" perspective. Unless 
> > we have clear guidelines about discretely selecting the columns you want 
> > from a vtable and trust users to follow them, if people have brittle greedy 
> > parsers pulling in all data from vtables we could very well break them as 
> > well by adding a new column right? Could be wrong here; I haven't written 
> > anything that consumes vtable metric data and maybe the obvious idiom in 
> > the face of that is robust in the presence of column addition. /shrug
> >
> > It's certainly more flexible and simpler to write to w/out detonating 
> > compared to JMX, but it's still an API we'd be revving.
> >
> > On Sat, Jan 28, 2023, at 4:24 PM, Ekaterina Dimitrova wrote:
> >
> > Overall I have similar thoughts and questions as David.
> >
> > I just wanted to add a reminder about this thread from last summer[1]. We 
> > already have issues with the alignment of JMX and Settings Virtual Table. I 
> > guess this is how Maxim got inspired to suggest this framework proposal 
> > which I want to thank him for! (I noticed he assigned CASSANDRA-15254)
> >
> > Not to open Pandora's box, but to me the most important thing here is to
> > come to an agreement about the future of JMX and what we will do or not as a
> > community. Also, how much time people are able to invest. I guess this will 
> > influence any directions to be taken here.
> >
> > [1]
> > https://lists.apache.org/thread/8mjcwdyqoobpvw2262bqmskkhs76pp69
> >
> >
> > On Thu, 26 Jan 2023 at 12:41, David Capwell <dcapw...@apple.com> wrote:
> >
> > I took a look and I see the result is an interface that looks like the 
> > vtable interface, that is then used by vtables and JMX?  My first thought 
> > is why not just use the vtable logic?
> >
> > I also wonder if we should care about JMX?  I know many wish to
> > migrate (it's going to be a very long time) away from JMX, so do we need a
> > wrapper to make JMX and vtables consistent?  I am cool with something like 
> > the following
> >
> > registerWithJMX(jmxName, query("SELECT * FROM system_views.streaming"));
> >
> >
> > So if we want to have a JMX view that matches the table then that’s cool by 
> > me, but one thing that has been brought up in reviews is backwards 
> > compatibility with regard to adding columns… If we add a column to the end 
> > of the JMX row did we just break users?
> >
> > Considering that JMX is usually not used and disabled in production 
> > environments for various performance and security reasons, the operator may 
> > not see the same picture from various of Dropwizard's metrics exporters
> >
> > If this is a real problem people are hitting, we can always add the ability
> > to push metrics to common systems with a pluggable way to add non-standard
> > solutions.  Dropwizard already supports this, so it would be low-hanging
> > fruit to address this.
> >
> > To make the proposed changes backwards compatible with the previous version 
> > of Cassandra, all MBeans and Virtual Tables we already have will remain 
> > unchanged
> >
> >
> > If this is for new JMX endpoints moving forward, I am not sure of the 
> > benefit for the same reason listed above; we wish to move away from JMX
> >
> > On Jan 25, 2023, at 10:51 AM, Maxim Muzafarov <mmu...@apache.org> wrote:
> >
> > Hello Cassandra Community,
> >
> >
> > I've been faced with a number of inconsistencies in the user APIs of
> > the internal data collections representation exposed through the
> > Cassandra monitoring interfaces that need to be fully aligned from an
> > operator perspective. First of all, I'm highlighting JMX, Dropwizard
> > Metrics, and Virtual Tables user interfaces. In order to address all
> > these inconsistencies, I have created a draft enhancement proposal
> > that describes everything I have found and how we can fix it once and
> > for all.
> >
> > I'd like to hear your opinion and thoughts on it. Please take a look:
> > https://docs.google.com/document/d/1j4J3bPWjQkAU9x4G-zxKObxPrKg36jLRT6xpUoNJa8Q
> >
> >
> > --
> > Maxim Muzafarov
> >
> >
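
On the backwards-compatibility question raised in the thread (whether
adding a column breaks consumers), here is a small sketch of the
difference between reading a value by position and reading it by column
name. It is purely illustrative: rows are modelled as ordered maps
rather than real driver rows, and ColumnAccessSketch, the "scope"
column and the sample values are all made up for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; rows are ordered column -> value maps,
// not real driver Row objects.
final class ColumnAccessSketch {

    // Brittle consumer: assumes the metric value is always the second column.
    static String valueByPosition(Map<String, String> row) {
        List<String> values = new ArrayList<>(row.values());
        return values.get(1);
    }

    // Robust consumer: asks for the column by name, so extra columns are ignored.
    static String valueByName(Map<String, String> row) {
        return row.get("value");
    }

    // Builds an ordered row from alternating column names and values.
    static Map<String, String> row(String... columnsAndValues) {
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i + 1 < columnsAndValues.length; i += 2) {
            row.put(columnsAndValues[i], columnsAndValues[i + 1]);
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, String> before = row("name", "ReadLatency", "value", "42");
        // A later release adds a made-up "scope" column that lands before "value".
        Map<String, String> after = row("name", "ReadLatency", "scope", "ks1", "value", "42");

        System.out.println(valueByPosition(before) + " / " + valueByName(before));
        System.out.println(valueByPosition(after) + " / " + valueByName(after));
    }
}
```

A consumer that selects or reads columns by name keeps working when a
column is added, while a positional parser silently picks up the wrong
value as soon as the layout shifts.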
