[jira] [Updated] (CASSANDRA-17424) Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in StorageProxy

2022-03-29 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-17424:

      Fix Version/s: 4.1  (was: 4.x)
      Since Version: 4.1
Source Control Link: https://github.com/apache/cassandra/commit/57ab3afcf16970047d3df4656241cf0705e94bee
         Resolution: Fixed
             Status: Resolved  (was: Ready to Commit)

Committed as 
https://github.com/apache/cassandra/commit/57ab3afcf16970047d3df4656241cf0705e94bee

> Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in 
> StorageProxy
> --
>
> Key: CASSANDRA-17424
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17424
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.1
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> In CASSANDRA-10023, we added two new metrics to both {{ClientRequestMetrics}} 
> and {{ClientWriteRequestMetrics}} to represent requests where the driver 
> either does or does not make a correct token-aware choice of coordinator. 
> (Auditing driver behavior is listed as the primary goal of that Jira.)
> There are, however, a few concerns we should address before this releases in 
> 4.1:
> 1.) With paging enabled and a LIMIT < fetch size, {{IN}} queries can hit 
> {{fetchRows()}} multiple times, so the number of local + remote requests 
> isn’t the same as the number of queries marked in {{ClientRequestMetrics}} in 
> {{readRegular()}}.
> 2.) {{IN}} queries will potentially mark a bunch of “remote” requests even if 
> one key in the {{IN}} set is “local”.
> 3.) Something similar happens with mutations. If {{StorageProxy#mutate()}} 
> receives multiple mutations, we’ll mark against one of these new metrics in 
> {{ClientWriteRequestMetrics}} for each mutation, while 
> {{ClientWriteRequestMetrics}} will only register the actual client request 
> once.
> For cases 2 and 3, we may mark both local and remote requests for the same 
> overall client request, which introduces ambiguity if these are intended to 
> help audit driver coordinator selection behavior. There are a few options:
> a.) We can accept the ambiguity, but then we haven’t really accomplished the 
> goal of CASSANDRA-10023 for some request types.
> b.) We can simply not record any of these metrics for requests where multiple 
> partitions/tokens are involved.
> c.) We can be lenient, marking requests as “local” if any of the 
> partitions/tokens involved in the client request are, in fact, local.
> “c” feels like the option that preserves as much functionality as possible 
> without being ambiguous, but problem #2 above is still tricky, given the way 
> IN and GROUP BY queries behave w/ paging. (Perhaps ambiguity in that case is 
> acceptable?)
> In addition to the general ambiguity around the above…
> 4.) There is excessive object creation involved (on a hot path) in our 
> determination of whether a request is local or remote. We should be able to 
> mitigate this by getting rid of 
> {{AbstractReadExecutor#getContactedReplicas()}} and relying on 
> {{ReplicaPlan#lookup()}} rather than creating strings. (Even for writes, we 
> should be able to push down marking into {{performWrite()}}, where the write 
> {{ReplicaPlan}} is already available.)
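
Option (c) in the description above can be sketched roughly as follows. This is a minimal illustration only, not the committed patch: the class, method, and field names are hypothetical, plain counters stand in for the real metrics, and node identifiers are modelled as strings purely for readability. The idea is to decide locality once per client request, treating the request as "local" if the coordinator is a replica for any partition involved, so that exactly one metric is marked per request.

```java
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: none of these names exist in Cassandra.
final class LenientLocalityMarker {
    // Stand-ins for the "local" and "remote" request metrics.
    final LongAdder localRequests = new LongAdder();
    final LongAdder remoteRequests = new LongAdder();

    // Marks exactly one counter per client request.
    // replicasPerPartition: for each partition the request touches, the ids of its replicas.
    // coordinator: id of the node coordinating the request.
    // Returns true if the request was counted as local.
    boolean markRequest(Collection<Set<String>> replicasPerPartition, String coordinator) {
        boolean local = false;
        for (Set<String> replicas : replicasPerPartition) {
            if (replicas.contains(coordinator)) {
                local = true; // lenient rule: one local partition is enough
                break;
            }
        }
        (local ? localRequests : remoteRequests).increment();
        return local;
    }

    public static void main(String[] args) {
        LenientLocalityMarker marker = new LenientLocalityMarker();
        // Multi-partition request where the coordinator owns one of the partitions.
        marker.markRequest(List.of(Set.of("n1", "n2", "n3"), Set.of("n4", "n5", "n6")), "n1");
        // Single-partition request where the coordinator is not a replica.
        marker.markRequest(List.of(Set.of("n4", "n5", "n6")), "n1");
        System.out.println("local=" + marker.localRequests.sum()
                           + " remote=" + marker.remoteRequests.sum());
    }
}
```

Because the decision is made once per request, the sum of local and remote marks matches the number of client requests, which is the property concerns 2 and 3 are about. It does not address concern 1, where a paged query reaches the read path more than once.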



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-17424) Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in StorageProxy

2022-03-28 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-17424:

Status: Ready to Commit  (was: Review In Progress)

+1

> Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in 
> StorageProxy
> --
>
> Key: CASSANDRA-17424
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17424
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Updated] (CASSANDRA-17424) Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in StorageProxy

2022-03-21 Thread Jon Meredith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Meredith updated CASSANDRA-17424:
-
Reviewers: Jon Meredith, Marcus Eriksson  (was: Jon Meredith, Jon Meredith, 
Marcus Eriksson)

> Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in 
> StorageProxy
> --
>
> Key: CASSANDRA-17424
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17424
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Updated] (CASSANDRA-17424) Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in StorageProxy

2022-03-21 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-17424:

Attachment: (was: 
bugreport-blackjack-QODS30.163-7-27-2022-03-13-13-48-43.png)

> Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in 
> StorageProxy
> --
>
> Key: CASSANDRA-17424
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17424
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Updated] (CASSANDRA-17424) Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in StorageProxy

2022-03-21 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-17424:

Reviewers: Jon Meredith, Jon Meredith, Marcus Eriksson  (was: Jon Meredith, 
Jon Meredith)

> Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in 
> StorageProxy
> --
>
> Key: CASSANDRA-17424
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17424
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.x
>
> Attachments: 
> bugreport-blackjack-QODS30.163-7-27-2022-03-13-13-48-43.png
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Updated] (CASSANDRA-17424) Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in StorageProxy

2022-03-21 Thread jesus antonio lopez lopez (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jesus antonio lopez lopez updated CASSANDRA-17424:
--
Attachment: bugreport-blackjack-QODS30.163-7-27-2022-03-13-13-48-43.png

> Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in 
> StorageProxy
> --
>
> Key: CASSANDRA-17424
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17424
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.x
>
> Attachments: 
> bugreport-blackjack-QODS30.163-7-27-2022-03-13-13-48-43.png
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Updated] (CASSANDRA-17424) Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in StorageProxy

2022-03-21 Thread Jon Meredith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Meredith updated CASSANDRA-17424:
-
Reviewers: Jon Meredith, Jon Meredith
   Status: Review In Progress  (was: Patch Available)

> Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in 
> StorageProxy
> --
>
> Key: CASSANDRA-17424
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17424
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Updated] (CASSANDRA-17424) Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in StorageProxy

2022-03-14 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-17424:

Test and Documentation Plan: n/a
 Status: Patch Available  (was: In Progress)

|trunk|
|[patch|https://github.com/apache/cassandra/pull/1501]|
|[CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra?branch=CASSANDRA-17424&filter=all]|

I pushed up a first attempt at this. I've left the ambiguity around IN/GROUP BY 
alone for now, although I'm more than willing to revisit that if there's enough 
feedback. (See the PR for some inline notes...)

CC [~marcuse] [~brandon.williams] [~stefan.miklosovic]

> Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in 
> StorageProxy
> --
>
> Key: CASSANDRA-17424
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17424
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (CASSANDRA-17424) Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in StorageProxy

2022-03-08 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-17424:

 Bug Category: Parent values: Correctness(12982); Level 1 values: API / Semantic Implementation(12988)
   Complexity: Normal
Discovered By: Code Inspection
Fix Version/s: 4.x
 Severity: Normal
   Status: Open  (was: Triage Needed)

> Performance and Semantic Concerns w/ Metrics for Local vs. Remote Requests in 
> StorageProxy
> --
>
> Key: CASSANDRA-17424
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17424
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.x
>
>


