[ https://issues.apache.org/jira/browse/CASSANDRA-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17487391#comment-17487391 ]
David Capwell edited comment on CASSANDRA-17292 at 2/5/22, 1:30 AM:
--------------------------------------------------------------------

bq. streaming is equally as much compaction as it is network, as it also controls the disk

Most things we do involve the disk... At the moment streaming and compaction are configured separately, so the fact that they touch the disk doesn't mean they should be together; I don't follow your argument.

bq. If we control this under query why not also row_cache and key_cache

I can buy arguments for "query" or "storage"; does this mean that this type of grouping is broken? I don't see why. Most configs clearly belong to a group, and the minority of cases are blurry (can be argued for 2 groups) or have no clear group (such as cluster_name); these are outliers where we can debate on a case-by-case basis. I just don't follow the argument that they invalidate this style of grouping as a whole.

To me, I would expect storage.row_cache, as I normally see caches implemented at the storage layer, but in Cassandra we do this in CQL (SinglePartitionReadCommand). But if we do actually implement pluggable storage, where will this be? Do we even want these caches if RocksDB is the storage backend? If the answer is no (I would think not, as RocksDB provides its own caches) then it's clearly tied to storage, so storage.row_cache is the most ideal place.

bq. back_pressure

{code}
$ grep -r back_pressure src/
src//java/org/apache/cassandra/config/Config.java:    public volatile boolean back_pressure_enabled = false;
src//java/org/apache/cassandra/config/Config.java:    public volatile ParameterizedClass back_pressure_strategy;
{code}

heh... dead code... We do have a network based back pressure, and different features may be able to inform/work with it to maintain stability, so I always saw our current one as a network feature, but I could see different arguments.
If we want to have a discussion on where that makes the most sense or if it should be its own top level thing, I feel that's productive.

bq. or other query execution topics?

I believe that's my point, group the query related topics together...

bq. Much IMO better to have e.g. [enable: {user_defined_functions: true, materialized_views: true}

I find discoverability is much harder in this model. If you are asking how to configure something, do you say "I want to walk through all limits in isolation and provide values, then move to enable flags, then rate limiters" or do you say "I want to configure compaction"? I have never worked on a project where I didn't ask how to configure a feature or a subsystem and instead wanted to look at all rate limiters together... If I want to configure the rate limiters in compaction I would look at the compaction configs; looking at the rate limiter configs can be confusing, as you don't know if the property you see is actually related to compaction

{code}
rate_limit:
  compaction_throughput: 10mb/s
  validation_throughput: 10mb/s
{code}

If you are looking at that and new to Cassandra, will you think validation is related to compaction? What about repair? What is a "validation" and why would I put a rate limiter on it? Grouping based off limits/flags/etc. loses context of what a property relates to, so I personally find this more confusing than things are today.


> Move cassandra.yaml toward a nested structure around major database concepts
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-17292
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17292
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Config
>            Reporter: Caleb Rackliffe
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: 5.x
>
>
> Recent mailing list conversation (see "[DISCUSS] Nested YAML configs for new
> features") has made it clear we will gravitate toward appropriately nested
> structures for new parameters in {{cassandra.yaml}}, but from the scattered
> conversation across a few Guardrails tickets (see CASSANDRA-17212 and
> CASSANDRA-17148) and CASSANDRA-15234, there is also a general desire to
> eventually extend this to the rest of {{cassandra.yaml}}. The benefits of
> this change include those we gain by doing it for new features (single point
> of interest for feature documentation, typed configuration objects, logical
> grouping for additional parameters added over time, discoverability, etc.),
> but on a larger scale.
> This may overlap with ongoing work, including the Guardrails epic. Ideally,
> even a rough cut of a design here would allow that to move forward in a
> timely and coherent manner (with less long-term refactoring pain).
> While these would have to be adjusted to CASSANDRA-15234 (probably after it
> merges), there have been two proposals floated already for what this might
> look like:
> From [~maedhroz] -
> https://github.com/maedhroz/cassandra/commit/49e83c70eba3357978d1081ecf500bbbdee960d8
> From [~benedict] -
> https://github.com/belliottsmith/cassandra/commits/CASSANDRA-15234-grouping-ideas



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org