[jira] [Commented] (CASSANDRA-9136) Improve error handling when table is queried before the schema has fully propagated

2015-05-07 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14532183#comment-14532183
 ] 

Sylvain Lebresne commented on CASSANDRA-9136:
-

+1

 Improve error handling when table is queried before the schema has fully 
 propagated
 ---

 Key: CASSANDRA-9136
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9136
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 3 Nodes GCE, N1-Standard-2, Ubuntu 12, 1 Node on 2.1.4, 
 2 on 2.0.14
Reporter: Russell Alexander Spitzer
Assignee: Tyler Hobbs
 Fix For: 2.1.x, 2.0.x

 Attachments: 9136-2.0-v2.txt, 9136-2.0.txt, 9136-2.1-v2.txt, 
 9136-2.1.txt


 This error occurs during a rolling upgrade between 2.0.14 and 2.1.4.
 h3. Repro
 With all the nodes on 2.0.14 make the following tables
 {code}
 CREATE KEYSPACE test WITH replication = {
   'class': 'SimpleStrategy',
   'replication_factor': '2'
 };
 USE test;
 CREATE TABLE compact (
   k int,
   c int,
   d int,
   PRIMARY KEY ((k), c)
 ) WITH COMPACT STORAGE;
 CREATE TABLE norm (
   k int,
   c int,
   d int,
   PRIMARY KEY ((k), c)
 ) ;
 {code}
 Then load some data into these tables. I used the python driver
 {code}
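 from cassandra.cluster import Cluster
 # Connect to the local node using the driver's default contact point.
 s = Cluster().connect()
 for x in range(1000):
     for y in range(1000):
         # Asynchronous writes: 1,000,000 rows into each table.
         s.execute_async("INSERT INTO test.compact (k,c,d) VALUES (%d,%d,%d)" % (x, y, y))
         s.execute_async("INSERT INTO test.norm (k,c,d) VALUES (%d,%d,%d)" % (x, y, y))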
 {code}
 Upgrade one node from 2.0.14 to 2.1.4.
 From the 2.1.4 node, create a new table.
 Query that table.
 On the 2.0.14 nodes you get these exceptions because the schema hasn't 
 propagated there yet.  This exception kills the TCP connection between the nodes.
 {code}
 ERROR [Thread-19] 2015-04-08 18:48:45,337 CassandraDaemon.java (line 258) Exception in thread Thread[Thread-19,5,main]
 java.lang.NullPointerException
   at org.apache.cassandra.db.RangeSliceCommandSerializer.deserialize(RangeSliceCommand.java:247)
   at org.apache.cassandra.db.RangeSliceCommandSerializer.deserialize(RangeSliceCommand.java:156)
   at org.apache.cassandra.net.MessageIn.read(MessageIn.java:99)
   at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:149)
   at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:131)
   at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:74)
 {code}
 Run cqlsh on the upgraded node; queries will fail until the TCP connection 
 is re-established. This is easiest to reproduce with CL = ALL.
 {code}
 cqlsh> SELECT count(*) FROM test.norm where k = 22 ;
 ReadTimeout: code=1200 [Coordinator node timed out waiting for replica nodes' responses] message=Operation timed out - received only 1 responses. info={'received_responses': 1, 'required_responses': 2, 'consistency': 'ALL'}
 cqlsh> SELECT count(*) FROM test.norm where k = 21 ;
 ReadTimeout: code=1200 [Coordinator node timed out waiting for replica nodes' responses] message=Operation timed out - received only 1 responses. info={'received_responses': 1, 'required_responses': 2, 'consistency': 'ALL'}
 {code}
 So connection made:
 {code}
 DEBUG [Thread-227] 2015-04-09 05:09:02,718 IncomingTcpConnection.java (line 107) Set version for /10.240.14.115 to 8 (will use 7)
 {code}
 Connection broken by query of table before schema propagated:
 {code}
 ERROR [Thread-227] 2015-04-09 05:10:24,015 CassandraDaemon.java (line 258) Exception in thread Thread[Thread-227,5,main]
 java.lang.NullPointerException
   at org.apache.cassandra.db.RangeSliceCommandSerializer.deserialize(RangeSliceCommand.java:247)
   at org.apache.cassandra.db.RangeSliceCommandSerializer.deserialize(RangeSliceCommand.java:156)
   at org.apache.cassandra.net.MessageIn.read(MessageIn.java:99)
   at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:149)
   at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:131)
   at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:74)
 {code}
 All queries to that node will now fail with timeouts until...
 Connection re-established:
 {code}
 DEBUG [Thread-228] 2015-04-09 05:11:00,323 IncomingTcpConnection.java (line 107) Set version for /10.240.14.115 to 8 (will use 7)
 {code}
 Now queries work again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9136) Improve error handling when table is queried before the schema has fully propagated

2015-05-06 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14530074#comment-14530074
 ] 

Sylvain Lebresne commented on CASSANDRA-9136:
-

Patches look good, though I would make the error messages clearer about what 
the likely issue is, adding something along the lines of "If the table was 
just created, this is likely due to its schema not having fully propagated yet; 
please wait for schema agreement on table creation."
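
As a rough illustration of that suggestion (a sketch in Python rather than 
Cassandra's Java; the names UnknownTableError, KNOWN_TABLES and lookup_table 
are hypothetical and do not come from the patches), a failed lookup could 
raise an error that carries the hint instead of letting a null propagate:

```python
class UnknownTableError(Exception):
    """Hypothetical error for a table id that has no local schema entry."""

    HINT = (
        "If the table was just created, this is likely due to its schema not "
        "having fully propagated yet; please wait for schema agreement on "
        "table creation."
    )

    def __init__(self, table_id):
        super().__init__("Couldn't find table for id %s. %s" % (table_id, self.HINT))
        self.table_id = table_id


# Stand-in for a node's local schema: table id -> table name (assumed data).
KNOWN_TABLES = {"id-compact": "test.compact", "id-norm": "test.norm"}


def lookup_table(table_id):
    """Return the table name, or raise a descriptive error if it is unknown."""
    try:
        return KNOWN_TABLES[table_id]
    except KeyError:
        # Replace the bare KeyError with an error that explains the likely cause.
        raise UnknownTableError(table_id) from None
```

An operator reading the log would then see the probable cause next to the 
failing table id, rather than a bare NullPointerException.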

 Improve error handling when table is queried before the schema has fully 
 propagated
 ---

 Key: CASSANDRA-9136
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9136
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 3 Nodes GCE, N1-Standard-2, Ubuntu 12, 1 Node on 2.1.4, 
 2 on 2.0.14
Reporter: Russell Alexander Spitzer
Assignee: Tyler Hobbs
 Fix For: 2.1.x, 2.0.x

 Attachments: 9136-2.0.txt, 9136-2.1.txt



[jira] [Commented] (CASSANDRA-9136) Improve error handling when table is queried before the schema has fully propagated

2015-05-01 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524258#comment-14524258
 ] 

Tyler Hobbs commented on CASSANDRA-9136:


Since we haven't decided what the best solution for recovery is and we all 
agree on the error message part, I've opened CASSANDRA-9289 to deal with 
recovery separately.  I'll have a patch and test for the error message shortly.

 Improve error handling when table is queried before the schema has fully 
 propagated
 ---

 Key: CASSANDRA-9136
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9136
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 3 Nodes GCE, N1-Standard-2, Ubuntu 12, 1 Node on 2.1.4, 
 2 on 2.0.14
Reporter: Russell Alexander Spitzer
Assignee: Tyler Hobbs
 Fix For: 2.1.x




[jira] [Commented] (CASSANDRA-9136) Improve error handling when table is queried before the schema has fully propagated

2015-04-30 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521761#comment-14521761
 ] 

Jeremiah Jordan commented on CASSANDRA-9136:


I think it would be good to be able to recover.  That would be my preference.  
Yes, you shouldn't query before the schema has settled, but if you do, I don't 
think it should break all your other queries.  But a better error message would 
at least give people a clue as to what broke them.
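
On the client side, "waiting until the schema has settled" can be sketched 
as a polling loop. This is only an illustration under stated assumptions: 
wait_for_schema_agreement and its fetch_versions callable are hypothetical 
names, and the caller is expected to supply the per-node schema versions.

```python
import time


def wait_for_schema_agreement(fetch_versions, timeout=10.0, interval=0.2,
                              sleep=time.sleep, clock=time.monotonic):
    """Poll until every node reports the same schema version.

    fetch_versions is a caller-supplied callable returning one schema
    version per reachable node. Returns True on agreement, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        versions = set(fetch_versions())
        if len(versions) == 1:
            # All nodes report the same version: the schema has settled.
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

In a real cluster the versions would typically come from the schema_version 
column of system.local and system.peers; a client would run this check after 
CREATE TABLE and before issuing the first query against the new table.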

 Improve error handling when table is queried before the schema has fully 
 propagated
 ---

 Key: CASSANDRA-9136
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9136
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 3 Nodes GCE, N1-Standard-2, Ubuntu 12, 1 Node on 2.1.4, 
 2 on 2.0.14
Reporter: Russell Alexander Spitzer
Assignee: Tyler Hobbs
 Fix For: 2.1.x




[jira] [Commented] (CASSANDRA-9136) Improve error handling when table is queried before the schema has fully propagated

2015-04-20 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502902#comment-14502902
 ] 

Sylvain Lebresne commented on CASSANDRA-9136:
-

It's not unreasonable per se, but the fact that you have to manually pass how 
many bytes you've deserialized when throwing the exception makes this a bit 
error-prone in general imo, even though it's arguably easy enough to check 
in this particular case (it would also make it slightly more annoying to 
add support for {{EncodedDataInputStream}} if we wanted to, for instance, 
though that's a minor point).

The initial idea I had was to use something like {{BytesReadTracker}} to make 
the counting automatic, but I'm not married to that idea either, since it 
adds a small overhead in general which I don't like.

Overall, I respect wanting to improve this, but I'm of the opinion that 
simply making the error message a lot clearer should be good enough and that 
it's not worth trying to be too smart in recovering. Not a strong opinion 
though, just a data point.
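
To make the trade-off concrete, here is a minimal sketch of the automatic 
counting idea, in Python rather than Cassandra's Java. It assumes a toy wire 
format (a 4-byte length prefix, then a table id, key length and key); 
CountingReader, UnknownTableError, read_command, read_message and frame are 
hypothetical names, not Cassandra classes. The wrapper counts consumed bytes 
so the error path can skip the rest of the message without the deserializer 
reporting a count by hand:

```python
import io
import struct


class UnknownTableError(Exception):
    """Hypothetical error: the message names a table this node does not know."""


class CountingReader:
    """Wraps a stream and counts bytes consumed, so callers need not."""

    def __init__(self, stream):
        self._stream = stream
        self.bytes_read = 0

    def read(self, n):
        data = self._stream.read(n)
        self.bytes_read += len(data)
        return data


def read_command(reader, known_tables):
    """Toy payload: 4-byte table id, 2-byte key length, then the key."""
    (table_id,) = struct.unpack(">i", reader.read(4))
    if table_id not in known_tables:
        raise UnknownTableError(table_id)
    (key_len,) = struct.unpack(">h", reader.read(2))
    return known_tables[table_id], reader.read(key_len)


def read_message(stream, known_tables):
    """Read one length-prefixed message; on an unknown table, skip the rest."""
    (size,) = struct.unpack(">i", stream.read(4))
    reader = CountingReader(stream)
    try:
        return read_command(reader, known_tables)
    except UnknownTableError:
        # Discard the unread remainder so the next message starts cleanly.
        reader.read(size - reader.bytes_read)
        return None


def frame(table_id, key):
    """Build one length-prefixed message for the toy format above."""
    payload = struct.pack(">ih", table_id, len(key)) + key
    return struct.pack(">i", len(payload)) + payload


stream = io.BytesIO(frame(99, b"new") + frame(2, b"k1"))
tables = {1: "test.compact", 2: "test.norm"}
first = read_message(stream, tables)   # table id 99 is unknown, so it is skipped
second = read_message(stream, tables)  # the following message still decodes
```

The extra bookkeeping on every read is the small general overhead mentioned 
above; the benefit is that the stream stays aligned after a failed message.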


 Improve error handling when table is queried before the schema has fully 
 propagated
 ---

 Key: CASSANDRA-9136
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9136
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 3 Nodes GCE, N1-Standard-2, Ubuntu 12, 1 Node on 2.1.4, 
 2 on 2.0.14
Reporter: Russell Alexander Spitzer
Assignee: Tyler Hobbs
 Fix For: 2.1.5



[jira] [Commented] (CASSANDRA-9136) Improve error handling when table is queried before the schema has fully propagated

2015-04-15 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497108#comment-14497108
 ] 

Tyler Hobbs commented on CASSANDRA-9136:


[~slebresne] I've pushed a couple of commits to a 
[branch|https://github.com/thobbs/cassandra/tree/CASSANDRA-9136] as a basic 
example of what I would do (not complete or tested).  The first just throws 
{{UnknownColumnFamilyException}}, the second recovers from that error during 
deserialization.  Does that seem reasonable?

 Improve error handling when table is queried before the schema has fully 
 propagated
 ---

 Key: CASSANDRA-9136
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9136
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 3 Nodes GCE, N1-Standard-2, Ubuntu 12, 1 Node on 2.1.4, 
 2 on 2.0.14
Reporter: Russell Alexander Spitzer
Assignee: Tyler Hobbs
 Fix For: 2.1.5



[jira] [Commented] (CASSANDRA-9136) Improve error handling when table is queried before the schema has fully propagated

2015-04-15 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497030#comment-14497030
 ] 

Tyler Hobbs commented on CASSANDRA-9136:


Linking to CASSANDRA-8996 because this occasionally causes dtest failures when 
the default role setup triggers the NPE.

 Improve error handling when table is queried before the schema has fully 
 propagated
 ---

 Key: CASSANDRA-9136
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9136
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 3 Nodes GCE, N1-Standard-2, Ubuntu 12, 1 Node on 2.1.4, 
 2 on 2.0.14
Reporter: Russell Alexander Spitzer
Assignee: Tyler Hobbs
 Fix For: 2.1.5

