[ 
https://issues.apache.org/jira/browse/CASSANDRA-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14327249#comment-14327249
 ] 

Sylvain Lebresne commented on CASSANDRA-8831:
---------------------------------------------

As I said, there could be many ways to solve this. So yes, we could technically 
do that, but the interesting question is what the advantages and drawbacks of 
doing it would be compared to letting clients do it. Perhaps more importantly, 
saying that nodes ask their neighbors doesn't say *how* they do it. And while 
we could use gossip or a custom verb handler for that, it feels more complex 
than it's worth to me, and in both cases it strongly implies we'd get this in a 
major release (which means both that 2.1 is out of the question and that, if we 
don't want to push this far into the future, it has to make 3.0, and I'm not 
sure adding more stuff that absolutely needs to make 3.0 is a very popular 
idea). So I continue to think that exposing this through a table is the best 
way to go: it's easy, it's something users may be interested in anyway and, as 
a bonus, once we have CASSANDRA-7622, it could be relatively straightforward to 
expose more stuff in that table, like the number of times a given statement has 
been executed.

Once we've done that, we can ask ourselves whether we want to continue letting 
clients deal with prepared statements all by themselves or if we prefer making 
nodes query their neighbors' tables on startup, but imo that question comes 
next. And my personal hunch is that it's not worth bothering: it's true that it 
would avoid the problem of multiple clients concurrently checking if statements 
are prepared on a node, but I'm not sure that problem is such a big deal in 
practice, and on the other hand, it feels to me that which statement is 
prepared on which node is intrinsically something clients decide (for instance, 
some statements may only be needed in some DCs, but C* has no way to know that) 
and so I'm not sure blindly replicating all statements from neighbors is 
necessarily what we want.


> Create a system table to expose prepared statements
> ---------------------------------------------------
>
>                 Key: CASSANDRA-8831
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8831
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sylvain Lebresne
>
> Because drivers abstract from users the handling of up/down nodes, they have 
> to deal with the fact that when a node is restarted (or joins), it won't know 
> any prepared statements. Drivers could somewhat ignore that problem and wait 
> for a query to return an error (that the statement is unknown by the node) to 
> re-prepare the query on that node, but it's relatively inefficient because 
> every time a node comes back up, you'll get bad latency spikes due to some 
> queries first failing, then being re-prepared and only then being executed. 
> So instead, drivers (at least the java driver but I believe others do as 
> well) pro-actively re-prepare statements when a node comes up. It solves the 
> latency problem, but currently every driver instance blindly re-prepares all 
> statements, meaning that in a large cluster with many clients there is a lot 
> of duplication of work (it would be enough for a single client to prepare the 
> statements) and a bigger than necessary load on the node that started.
> An idea to solve this is to have a (cheap) way for clients to check if some 
> statements are prepared on the node. There are different options to provide 
> that but what I'd suggest is to add a system table to expose the (cached) 
> prepared statements because:
> # it's reasonably straightforward to implement: we just add a line to the 
> table when a statement is prepared and remove it when it's evicted (we 
> already have eviction listeners). We'd also truncate the table on startup, 
> but that's easy enough. We can even switch it to a "virtual table" if/when 
> CASSANDRA-7622 lands but it's trivial to do with a normal table in the 
> meantime.
> # it doesn't require a change to the protocol or something like that. It 
> could even be done in 2.1 if we wish to.
> # exposing prepared statements feels like genuinely useful information to 
> have (outside of the problem exposed here, that is), if only for 
> debugging/educational purposes.
> The exposed table could look something like:
> {noformat}
> CREATE TABLE system.prepared_statements (
>    keyspace_name text,
>    table_name text,
>    prepared_id blob,
>    query_string text,
>    PRIMARY KEY (keyspace_name, table_name, prepared_id)
> )
> {noformat}
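
To illustrate the client-side check that such a table would enable, here is a minimal Python sketch. It is an assumption-laden illustration, not driver code: the helper names (`prepared_id`, `statements_to_reprepare`) are hypothetical, the rows are plain dicts standing in for the result of `SELECT prepared_id FROM system.prepared_statements`, and the id is assumed to be an MD5 digest of the keyspace plus the query string, which may not match what the server actually computes.

```python
import hashlib


def prepared_id(keyspace, query):
    # ASSUMPTION: the statement id is the MD5 digest of keyspace + query string.
    # The real server-side derivation may differ; this only has to be consistent
    # between the "table" rows and the client for the sketch to work.
    return hashlib.md5(((keyspace or "") + query).encode("utf-8")).digest()


def statements_to_reprepare(known_statements, rows_from_node):
    """Return the (keyspace, query) pairs the node does not have prepared yet.

    known_statements: list of (keyspace, query) pairs this client has prepared.
    rows_from_node:   iterable of dicts with a 'prepared_id' blob, standing in
                      for rows read from the proposed system table.
    """
    already_prepared = {bytes(row["prepared_id"]) for row in rows_from_node}
    return [
        (ks, q)
        for ks, q in known_statements
        if prepared_id(ks, q) not in already_prepared
    ]


# Usage: the node already knows the first statement, so only the second
# one needs to be re-prepared by this client.
known = [
    ("ks1", "SELECT * FROM t WHERE k = ?"),
    ("ks1", "INSERT INTO t (k, v) VALUES (?, ?)"),
]
rows = [{"prepared_id": prepared_id(*known[0])}]
missing = statements_to_reprepare(known, rows)
```

With this kind of check, a client that finds every id already present would skip re-preparation entirely, which is the duplicated work the description is trying to avoid.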



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
