[jira] [Updated] (CASSANDRA-8831) Create a system table to expose prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Stupp updated CASSANDRA-8831:
    Resolution: Fixed
    Fix Version/s: 3.10 (was: 3.x)
    Status: Resolved (was: Patch Available)

Thanks!
Committed with addition to NEWS.txt as [997cb663e8c8f164873515f81bb779e435aead6d|https://github.com/apache/cassandra/commit/997cb663e8c8f164873515f81bb779e435aead6d] to [trunk|https://github.com/apache/cassandra/tree/trunk]

> Create a system table to expose prepared statements
> ---
>
> Key: CASSANDRA-8831
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8831
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sylvain Lebresne
> Assignee: Robert Stupp
> Labels: client-impacting, docs-impacting
> Fix For: 3.10
>
>
> Because drivers abstract from users the handling of up/down nodes, they have
> to deal with the fact that when a node is restarted (or joins), it won't know
> any prepared statements. Drivers could somewhat ignore that problem and wait
> for a query to return an error (that the statement is unknown by the node) to
> re-prepare the query on that node, but it's relatively inefficient because
> every time a node comes back up, you'll get bad latency spikes due to some
> queries first failing, then being re-prepared and only then being executed.
> So instead, drivers (at least the Java driver, but I believe others do as
> well) proactively re-prepare statements when a node comes up. It solves the
> latency problem, but currently every driver instance blindly re-prepares all
> statements, meaning that in a large cluster with many clients there is a lot
> of duplication of work (it would be enough for a single client to prepare the
> statements) and a bigger than necessary load on the node that started.
> An idea to solve this is to have a (cheap) way for clients to check if some
> statements are prepared on the node. There are different options to provide
> that, but what I'd suggest is to add a system table to expose the (cached)
> prepared statements because:
> # it's reasonably straightforward to implement: we just add a line to the
> table when a statement is prepared and remove it when it's evicted (we
> already have eviction listeners; we'd also truncate the table on startup, but
> that's easy enough). We can even switch it to a "virtual table" if/when
> CASSANDRA-7622 lands, but it's trivial to do with a normal table in the
> meantime.
> # it doesn't require a change to the protocol or something like that. It
> could even be done in 2.1 if we wish to.
> # exposing prepared statements feels like genuinely useful information to
> have (outside of the problem exposed here, that is), if only for
> debugging/educational purposes.
> The exposed table could look something like:
> {noformat}
> CREATE TABLE system.prepared_statements (
>    keyspace_name text,
>    table_name text,
>    prepared_id blob,
>    query_string text,
>    PRIMARY KEY (keyspace_name, table_name, prepared_id)
> );
> {noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
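For readers following the proposal, the client-side check it describes could be sketched roughly as below. This is only an illustration under stated assumptions, not what any driver actually does: the table and column names are taken from the sketch in the issue description (the committed schema may differ), the function name `statements_to_reprepare` and the constant `LIST_PREPARED_CQL` are hypothetical, and the rows are passed in as plain tuples rather than fetched through a real driver.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical query a client could run against the proposed table.
# Table and column names follow the sketch in the issue description;
# the schema that was eventually committed may differ.
LIST_PREPARED_CQL = "SELECT prepared_id FROM system.prepared_statements"


def statements_to_reprepare(
    local_cache: Dict[bytes, str],
    node_rows: Iterable[Tuple[bytes]],
) -> List[str]:
    """Return the query strings that the node does not have prepared.

    local_cache maps prepared_id -> query_string as remembered by the client.
    node_rows stands in for the result of LIST_PREPARED_CQL on the node that
    just came up, one (prepared_id,) tuple per row.
    """
    known_on_node = {row[0] for row in node_rows}
    return [
        query
        for prepared_id, query in local_cache.items()
        if prepared_id not in known_on_node
    ]


if __name__ == "__main__":
    cache = {
        b"\x01": "SELECT * FROM ks.users WHERE id = ?",
        b"\x02": "INSERT INTO ks.users (id, name) VALUES (?, ?)",
    }
    # Pretend another client already re-prepared statement 0x01 on the node.
    rows = [(b"\x01",)]
    for query in statements_to_reprepare(cache, rows):
        print("re-prepare:", query)
```

The point of the sketch is that one cheap read replaces a blind re-prepare of every cached statement; only the ids missing from the node's table need a PREPARE round trip.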
[jira] [Updated] (CASSANDRA-8831) Create a system table to expose prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Stupp updated CASSANDRA-8831:
    Status: Patch Available (was: Open)

> Create a system table to expose prepared statements
> ---
>
> Key: CASSANDRA-8831
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8831
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sylvain Lebresne
> Assignee: Robert Stupp
> Labels: client-impacting, docs-impacting
> Fix For: 3.x
[jira] [Updated] (CASSANDRA-8831) Create a system table to expose prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-8831:
    Status: Open (was: Patch Available)

> Create a system table to expose prepared statements
> ---
>
> Key: CASSANDRA-8831
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8831
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sylvain Lebresne
> Assignee: Robert Stupp
> Labels: client-impacting, docs-impacting
> Fix For: 3.x
[jira] [Updated] (CASSANDRA-8831) Create a system table to expose prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Stupp updated CASSANDRA-8831:
    Status: Patch Available (was: Open)

Rebased the branch and addressed the review comments in a separate commit.
Also triggered new CI runs.

||trunk|[branch|https://github.com/apache/cassandra/compare/trunk...snazy:8831-pstmts]|[testall|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-8831-pstmts-testall/lastSuccessfulBuild/]|[dtest|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-8831-pstmts-dtest/lastSuccessfulBuild/]

> Create a system table to expose prepared statements
> ---
>
> Key: CASSANDRA-8831
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8831
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sylvain Lebresne
> Assignee: Robert Stupp
> Labels: client-impacting, docs-impacting
> Fix For: 3.x
[jira] [Updated] (CASSANDRA-8831) Create a system table to expose prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-8831:
    Status: Open (was: Patch Available)

> Create a system table to expose prepared statements
> ---
>
> Key: CASSANDRA-8831
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8831
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sylvain Lebresne
> Assignee: Robert Stupp
> Labels: client-impacting, docs-impacting
> Fix For: 3.x
[jira] [Updated] (CASSANDRA-8831) Create a system table to expose prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Stupp updated CASSANDRA-8831:
    Attachment: (was: 8831-v2.txt)

> Create a system table to expose prepared statements
> ---
>
> Key: CASSANDRA-8831
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8831
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sylvain Lebresne
> Assignee: Robert Stupp
> Labels: client-impacting, docs-impacting
> Fix For: 3.x
[jira] [Updated] (CASSANDRA-8831) Create a system table to expose prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Stupp updated CASSANDRA-8831:
    Attachment: (was: 8831-v1.txt)

> Create a system table to expose prepared statements
> ---
>
> Key: CASSANDRA-8831
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8831
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sylvain Lebresne
> Assignee: Robert Stupp
> Labels: client-impacting, docs-impacting
> Fix For: 3.x
>
> Attachments: 8831-v2.txt
[jira] [Updated] (CASSANDRA-8831) Create a system table to expose prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Stupp updated CASSANDRA-8831:
    Attachment: (was: 8831-3.0-v1.txt)

> Create a system table to expose prepared statements
> ---
>
> Key: CASSANDRA-8831
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8831
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sylvain Lebresne
> Assignee: Robert Stupp
> Labels: client-impacting, docs-impacting
> Fix For: 3.x
>
> Attachments: 8831-v2.txt
[jira] [Updated] (CASSANDRA-8831) Create a system table to expose prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-8831:
    Labels: client-impacting docs-impacting (was: client-impacting doc-impacting)

> Create a system table to expose prepared statements
> ---
>
> Key: CASSANDRA-8831
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8831
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sylvain Lebresne
> Assignee: Robert Stupp
> Labels: client-impacting, docs-impacting
> Fix For: 3.x
>
> Attachments: 8831-3.0-v1.txt, 8831-v1.txt, 8831-v2.txt
[jira] [Updated] (CASSANDRA-8831) Create a system table to expose prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tyler Hobbs updated CASSANDRA-8831:
    Labels: client-impacting doc-impacting (was: )

> Create a system table to expose prepared statements
> ---
>
> Key: CASSANDRA-8831
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8831
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sylvain Lebresne
> Assignee: Robert Stupp
> Labels: client-impacting, doc-impacting
> Fix For: 3.0
>
> Attachments: 8831-3.0-v1.txt, 8831-v1.txt, 8831-v2.txt
[jira] [Updated] (CASSANDRA-8831) Create a system table to expose prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Yeschenko updated CASSANDRA-8831:
    Reviewer: Sylvain Lebresne

> Create a system table to expose prepared statements
> ---
>
> Key: CASSANDRA-8831
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8831
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sylvain Lebresne
> Assignee: Robert Stupp
> Fix For: 3.0
>
> Attachments: 8831-3.0-v1.txt, 8831-v1.txt, 8831-v2.txt
[jira] [Updated] (CASSANDRA-8831) Create a system table to expose prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Stupp updated CASSANDRA-8831:
    Attachment: 8831-3.0-v1.txt

Attached patch against trunk (3.0)
[jira] [Updated] (CASSANDRA-8831) Create a system table to expose prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Stupp updated CASSANDRA-8831:
    Fix Version/s: 3.0
[jira] [Updated] (CASSANDRA-8831) Create a system table to expose prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Stupp updated CASSANDRA-8831:
    Attachment: 8831-v2.txt

bq. {{use_keyspace_name}} vs. {{keyspace_name}}

For 2.1 the {{USE}}d keyspace might not have any effect, but for 3.0 (UDFs) it does. Additionally, the {{prepared_id}} is calculated from the current keyspace and the statement. Your other comments make sense, so I'll adopt them.

bq. writing them all every minute

It just synchronizes against the table every minute, so it's basically a _delayed/async write_. But you're right: only the first prepare would really hit the (mem)table, meaning that the actual write is cheap, so I'll remove the scheduled thing. The attached v2 of the patch incorporates the changes.

bq. security concerns

We could add a configuration field to turn this feature on and off. WDYT?
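To make the point about {{prepared_id}} concrete: if the id is derived from both the current keyspace and the statement text, the same query string prepared under two different {{USE}}d keyspaces yields two distinct ids. The sketch below shows that property only. The hash choice (MD5 over the keyspace concatenated with the query text) and the function name prepared_id are assumptions for illustration, not something this thread confirms.

```python
import hashlib
from typing import Optional


def prepared_id(query_string: str, use_keyspace: Optional[str]) -> bytes:
    """Derive an id from the current keyspace (if any) and the statement.

    Assumption: MD5 over keyspace + query text; the real scheme may differ.
    """
    to_hash = query_string if use_keyspace is None else use_keyspace + query_string
    return hashlib.md5(to_hash.encode("utf-8")).digest()


query = "SELECT * FROM users WHERE id = ?"
id_a = prepared_id(query, "ks_a")  # prepared after USE ks_a
id_b = prepared_id(query, "ks_b")  # prepared after USE ks_b
```

Under this assumption an unqualified table name resolves differently per keyspace, so a table keyed only by the statement text could not tell the two entries apart.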
[jira] [Updated] (CASSANDRA-8831) Create a system table to expose prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Stupp updated CASSANDRA-8831:
    Attachment: 8831-v1.txt

Alright - it's really not a big thing to code. [Git branch|https://github.com/snazy/cassandra/tree/8831-sys-pstmt] (based on 2.1) + attached patch 8831-v1.txt include a unit test. The table has the layout you've proposed, except that I had to add the column {{use_keyspace_name}} (since the {{USE}}d keyspace influences the CQL). I implemented it as "persists changes to the table every minute".
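For comparison with the once-a-minute sync described in this comment, here is a minimal sketch of the write-through variant from the issue description: add a row when a statement is prepared, remove it when the cache evicts it, and clear the table on startup. It is not taken from the attached patches. The class PreparedStatementCache, the LRU policy, and the plain dict standing in for system.prepared_statements are all assumptions for illustration.

```python
from collections import OrderedDict
from typing import Dict


class PreparedStatementCache:
    """Tiny LRU cache that mirrors its contents into a table-like dict."""

    def __init__(self, capacity: int, table: Dict[bytes, str]) -> None:
        self._capacity = capacity
        self._entries: "OrderedDict[bytes, str]" = OrderedDict()
        self._table = table  # stands in for system.prepared_statements
        self._table.clear()  # "truncate the table on startup"

    def prepare(self, prepared_id: bytes, query_string: str) -> None:
        if prepared_id in self._entries:
            # Already cached: refresh recency, no new write to the table.
            self._entries.move_to_end(prepared_id)
            return
        self._entries[prepared_id] = query_string
        self._table[prepared_id] = query_string  # add a row on prepare
        while len(self._entries) > self._capacity:
            # Evict the least recently used entry and drop its row,
            # playing the role of an eviction listener.
            evicted_id, _ = self._entries.popitem(last=False)
            self._table.pop(evicted_id, None)


table: Dict[bytes, str] = {b"stale": "left over from a previous run"}
cache = PreparedStatementCache(capacity=2, table=table)
cache.prepare(b"a", "SELECT * FROM ks.t WHERE k = ?")
cache.prepare(b"b", "INSERT INTO ks.t (k, v) VALUES (?, ?)")
cache.prepare(b"a", "SELECT * FROM ks.t WHERE k = ?")  # cache hit, no write
cache.prepare(b"c", "DELETE FROM ks.t WHERE k = ?")    # evicts b
```

Only the first prepare of a statement writes a row, so repeated prepares of a cached statement cost nothing extra on the table side.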