[jira] [Updated] (CASSANDRA-17424) Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in StorageProxy
[ https://issues.apache.org/jira/browse/CASSANDRA-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Caleb Rackliffe updated CASSANDRA-17424:
          Fix Version/s: 4.1
                             (was: 4.x)
          Since Version: 4.1
    Source Control Link: https://github.com/apache/cassandra/commit/57ab3afcf16970047d3df4656241cf0705e94bee
             Resolution: Fixed
                 Status: Resolved  (was: Ready to Commit)

Committed as https://github.com/apache/cassandra/commit/57ab3afcf16970047d3df4656241cf0705e94bee

> Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in
> StorageProxy
> ------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-17424
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17424
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Observability/Metrics
>            Reporter: Caleb Rackliffe
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: 4.1
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> In CASSANDRA-10023, we added two new metrics to both {{ClientRequestMetrics}}
> and {{ClientWriteRequestMetrics}} to represent requests where the driver
> either does or does not make a correct token-aware choice of coordinator.
> (Auditing driver behavior is listed as the primary goal of that Jira.)
> There are, however, a few concerns we should address before this releases in
> 4.1:
> 1.) With paging enabled and a LIMIT < fetch size, {{IN}} queries can hit
> {{fetchRows()}} multiple times, so the number of local + remote requests
> isn’t the same as the number of queries marked in {{ClientRequestMetrics}} in
> {{readRegular()}}.
> 2.) {{IN}} queries will potentially mark a bunch of “remote” requests even if
> one key in the {{IN}} set is “local”.
> 3.) Something similar happens with mutations. If {{StorageProxy#mutate()}}
> receives multiple mutations, we’ll mark against one of these new metrics in
> {{ClientWriteRequestMetrics}} for each mutation, while
> {{ClientWriteRequestMetrics}} will only register the actual client request
> once.
> For cases 2 and 3, we may mark both local and remote requests for the same
> overall client request, which introduces ambiguity if these are intended to
> help audit driver coordinator selection behavior. There are a few options:
> a.) We can accept the ambiguity, but then we haven’t really accomplished the
> goal of CASSANDRA-10023 for some request types.
> b.) We can simply not record any of these metrics for requests where multiple
> partitions/tokens are involved.
> c.) We can be lenient, marking requests as “local” if any of the
> partitions/tokens involved in the client request are, in fact, local.
> “c” feels like the option that preserves as much functionality as possible
> without being ambiguous, but problem #2 above is still tricky, given the way
> IN and GROUP BY queries behave w/ paging. (Perhaps ambiguity in that case is
> acceptable?)
> In addition to the general ambiguity around the above…
> 4.) There is excessive object creation involved (on a hot path) in our
> determination of whether a request is local or remote. We should be able to
> mitigate this by getting rid of
> {{AbstractReadExecutor#getContactedReplicas()}} and relying on
> {{ReplicaPlan#lookup()}} rather than creating strings. (Even for writes, we
> should be able to push down marking into performWrite(), where the write
> ReplicaPlan is already available.)

--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
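
Option (c) from the description in this message can be sketched roughly as below. This is only an illustration of the "mark once per client request, local if any partition is local" idea, not the committed patch: `LenientLocalityMarking`, `Counters`, and `markOnce` are hypothetical names, endpoints are modeled as plain values instead of Cassandra's replica types, and the sketch does not attempt to address concern 4 (object creation on the hot path).

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical stand-in for the local/remote marking logic; not Cassandra code. */
public final class LenientLocalityMarking {

    /** Hypothetical pair of counters standing in for the local/remote request metrics. */
    public static final class Counters {
        public final LongAdder local = new LongAdder();
        public final LongAdder remote = new LongAdder();
    }

    /**
     * Marks exactly one counter per client request.
     * The request counts as "local" if the coordinator is a replica for at least
     * one of the partitions it touches, and as "remote" otherwise.
     *
     * @param replicasPerPartition one replica set per partition/token in the request
     * @return true if the request was marked local
     */
    public static <E> boolean markOnce(Counters counters,
                                       E coordinator,
                                       List<Set<E>> replicasPerPartition) {
        boolean anyLocal = false;
        for (Set<E> replicas : replicasPerPartition) {
            if (replicas.contains(coordinator)) {
                anyLocal = true;
                break; // one local partition is enough under the lenient rule
            }
        }
        (anyLocal ? counters.local : counters.remote).increment();
        return anyLocal;
    }

    public static void main(String[] args) {
        Counters counters = new Counters();

        // An IN query over three keys, one of which the coordinator "n1" replicates.
        markOnce(counters, "n1",
                 List.of(Set.of("n2", "n3"), Set.of("n1", "n2"), Set.of("n3", "n4")));

        // A single-partition request sent to a coordinator that is not a replica.
        markOnce(counters, "n1", List.of(Set.of("n2", "n3")));

        System.out.println("local=" + counters.local.sum()
                           + " remote=" + counters.remote.sum());
    }
}
```

With this rule the local and remote counts sum to the number of client requests, which is the property cases 2 and 3 break when every partition or mutation is marked separately. Paged {{IN}} and GROUP BY reads that re-enter the read path would still be counted once per page unless the marking is hoisted above paging.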
[jira] [Updated] (CASSANDRA-17424) Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in StorageProxy
[ https://issues.apache.org/jira/browse/CASSANDRA-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-17424:
    Status: Ready to Commit  (was: Review In Progress)

+1

>             Fix For: 4.x
>          Time Spent: 1h
[jira] [Updated] (CASSANDRA-17424) Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in StorageProxy
[ https://issues.apache.org/jira/browse/CASSANDRA-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jon Meredith updated CASSANDRA-17424:
    Reviewers: Jon Meredith, Marcus Eriksson  (was: Jon Meredith, Jon Meredith, Marcus Eriksson)

>             Fix For: 4.x
>          Time Spent: 1h
[jira] [Updated] (CASSANDRA-17424) Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in StorageProxy
[ https://issues.apache.org/jira/browse/CASSANDRA-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-17424:
    Attachment:     (was: bugreport-blackjack-QODS30.163-7-27-2022-03-13-13-48-43.png)

>             Fix For: 4.x
>          Time Spent: 1h
[jira] [Updated] (CASSANDRA-17424) Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in StorageProxy
[ https://issues.apache.org/jira/browse/CASSANDRA-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-17424:
    Reviewers: Jon Meredith, Jon Meredith, Marcus Eriksson  (was: Jon Meredith, Jon Meredith)

>             Fix For: 4.x
>         Attachments: bugreport-blackjack-QODS30.163-7-27-2022-03-13-13-48-43.png
>          Time Spent: 1h
[jira] [Updated] (CASSANDRA-17424) Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in StorageProxy
[ https://issues.apache.org/jira/browse/CASSANDRA-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jesus antonio lopez lopez updated CASSANDRA-17424:
    Attachment: bugreport-blackjack-QODS30.163-7-27-2022-03-13-13-48-43.png

>             Fix For: 4.x
>         Attachments: bugreport-blackjack-QODS30.163-7-27-2022-03-13-13-48-43.png
>          Time Spent: 1h
[jira] [Updated] (CASSANDRA-17424) Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in StorageProxy
[ https://issues.apache.org/jira/browse/CASSANDRA-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jon Meredith updated CASSANDRA-17424:
    Reviewers: Jon Meredith, Jon Meredith
       Status: Review In Progress  (was: Patch Available)

>             Fix For: 4.x
>          Time Spent: 0.5h
[jira] [Updated] (CASSANDRA-17424) Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in StorageProxy
[ https://issues.apache.org/jira/browse/CASSANDRA-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Caleb Rackliffe updated CASSANDRA-17424:
    Test and Documentation Plan: n/a
                         Status: Patch Available  (was: In Progress)

|trunk|
|[patch|https://github.com/apache/cassandra/pull/1501]|
|[CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra?branch=CASSANDRA-17424&filter=all]|

I pushed up a first attempt at this. I've left the ambiguity around IN/GROUP BY alone for now, although I'm more than willing to revisit that if there's enough feedback. (See the PR for some inline notes...)

CC [~marcuse] [~brandon.williams] [~stefan.miklosovic]

>             Fix For: 4.x
>          Time Spent: 10m
[jira] [Updated] (CASSANDRA-17424) Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in StorageProxy
[ https://issues.apache.org/jira/browse/CASSANDRA-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Caleb Rackliffe updated CASSANDRA-17424:
     Bug Category: Parent values: Correctness(12982)Level 1 values: API / Semantic Implementation(12988)
       Complexity: Normal
    Discovered By: Code Inspection
    Fix Version/s: 4.x
         Severity: Normal
           Status: Open  (was: Triage Needed)

>             Fix For: 4.x