Good morning,

Sorry for the slow reply here. I finally had some time to test cqlsh tracing on a ccm cluster with 2 of 3 nodes down, to see if the unavailable error was due to cqlsh or my query. Reply inline below.

On 15/01/2015 12:46, "Tyler Hobbs" <ty...@datastax.com> wrote:

> On Thu, Jan 15, 2015 at 6:30 AM, Richard Dawe <rich.d...@messagesystems.com> wrote:
>
>> I thought it might be quorum consistency level, because of the error I was seeing with cqlsh.
>>
>> I was testing with ccm with C* 2.0.8, 3 nodes, vnodes enabled (“ccm create test -v 2.0.8 -n 3 --vnodes -s”). With all three nodes up, my schema operations were working fine. When I took down two nodes using “ccm node2 stop”, “ccm node3 stop”, I found that schema operations through “ccm node1 cqlsh” were failing like this:
>>
>> cqlsh> ALTER TABLE test.test3 ADD fred text;
>> Unable to complete request: one or more nodes were unavailable.
>>
>> That’s the full output; I had enabled tracing, but only that error came back.
>>
>> After reading your reply, I went back and re-ran my tests with cqlsh, and it seems like the “one or more nodes were unavailable” may be due to cqlsh’s error handling. If I wait a bit, and re-run my schema operations, they work fine with only one node up. I can see in the tracing that it’s only talking to node1 (127.0.0.1) to make the schema modifications.
>>
>> Is this a known issue in cqlsh? If it helps I can send the full command-line session log.
>
> That Unavailable error may actually be from the tracing-related queries failing (that's what I suspect, at least). Starting cqlsh with --debug might show you a stacktrace in that case, but I'm not 100% sure.

Yes, it does seem to be cqlsh tracing. The debug output below was generated with:

* A 3 node ccm cluster, running Cassandra 2.0.8 on Ubuntu 14.10 x86_64.
* I took down 2 of the 3 nodes.
* Table test5 has a replication factor of 3, primary key is “id text”.
* cqlsh session was started after 2 of the 3 nodes had been shut down.

Debug output:

rdawe@cstar:~$ ccm node1 cqlsh --debug
Using CQL driver: <module 'cql' from '/home/rdawe/.ccm/repository/2.0.8/bin/../lib/cql-internal-only-1.4.1.zip/cql-1.4.1/cql/__init__.py'>
Using thrift lib: <module 'thrift' from '/home/rdawe/.ccm/repository/2.0.8/bin/../lib/thrift-python-internal-only-0.9.1.zip/thrift/__init__.py'>
Connected to test at 127.0.0.1:9160.
[cqlsh 4.1.1 | Cassandra 2.0.8-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.39.0]
Use HELP for help.
cqlsh> USE test;
cqlsh:test> TRACING ON
Now tracing requests.
cqlsh:test> SELECT * FROM test5;

 id    | foo
-------+-------
 blarg |  ness
 hello | world

(2 rows)

Traceback (most recent call last):
  File "/home/rdawe/.ccm/repository/2.0.8/bin/cqlsh", line 827, in onecmd
    self.handle_statement(st, statementtext)
  File "/home/rdawe/.ccm/repository/2.0.8/bin/cqlsh", line 865, in handle_statement
    return custom_handler(parsed)
  File "/home/rdawe/.ccm/repository/2.0.8/bin/cqlsh", line 901, in do_select
    with_default_limit=with_default_limit)
  File "/home/rdawe/.ccm/repository/2.0.8/bin/cqlsh", line 910, in perform_statement
    print_trace_session(self, self.cursor, session_id)
  File "/home/rdawe/.ccm/repository/2.0.8/bin/../pylib/cqlshlib/tracing.py", line 26, in print_trace_session
    rows = fetch_trace_session(cursor, session_id)
  File "/home/rdawe/.ccm/repository/2.0.8/bin/../pylib/cqlshlib/tracing.py", line 47, in fetch_trace_session
    consistency_level='ONE')
  File "/home/rdawe/.ccm/repository/2.0.8/bin/../lib/cql-internal-only-1.4.1.zip/cql-1.4.1/cql/cursor.py", line 80, in execute
    response = self.get_response(prepared_q, cl)
  File "/home/rdawe/.ccm/repository/2.0.8/bin/../lib/cql-internal-only-1.4.1.zip/cql-1.4.1/cql/thrifteries.py", line 77, in get_response
    return self.handle_cql_execution_errors(doquery, compressed_q, compress, cl)
  File "/home/rdawe/.ccm/repository/2.0.8/bin/../lib/cql-internal-only-1.4.1.zip/cql-1.4.1/cql/thrifteries.py", line 102, in handle_cql_execution_errors
    raise cql.OperationalError("Unable to complete request: one or "
OperationalError: Unable to complete request: one or more nodes were unavailable.

Sometimes I get a different error:

rdawe@cstar:~$ echo -e 'TRACING ON\nSELECT * FROM test.test5;\n' | ccm node1 cqlsh --debug
Using CQL driver: <module 'cql' from '/home/rdawe/.ccm/repository/2.0.8/bin/../lib/cql-internal-only-1.4.1.zip/cql-1.4.1/cql/__init__.py'>
Using thrift lib: <module 'thrift' from '/home/rdawe/.ccm/repository/2.0.8/bin/../lib/thrift-python-internal-only-0.9.1.zip/thrift/__init__.py'>
Now tracing requests.

 id    | foo
-------+-------
 blarg |  ness
 hello | world

(2 rows)

<stdin>:3:Session edc8c010-bcd5-11e4-a008-1dd7f4de70a1 wasn't found.

I notice that the system_traces keyspace has replication factor 2. Since 2 nodes are down, perhaps sometimes the tracing session would be stored on nodes that are down. And other times one of the two replicas for system_traces would be on the node that’s up, but for some reason storing the data in system_traces.sessions fails?

Thanks, best regards,
Rich
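
P.S. A rough way to sanity-check the replication-factor idea above is a toy model of replica placement. The sketch below is only an illustration under stated assumptions: it assumes SimpleStrategy-style placement (walk the ring clockwise and take the first RF distinct nodes), uses random floats in place of real tokens, and every function name in it is made up for this example; it does not talk to Cassandra at all. It estimates how often a read at consistency level ONE would find at least one live replica when only node1 is up.

```python
import bisect
import random


def build_ring(nodes, vnodes_per_node, rng):
    """Give every node `vnodes_per_node` random tokens; return (tokens, owners) sorted by token."""
    ring = sorted((rng.random(), node) for node in nodes for _ in range(vnodes_per_node))
    tokens = [token for token, _ in ring]
    owners = [node for _, node in ring]
    return tokens, owners


def replicas_for(tokens, owners, key_token, rf):
    """Assumed SimpleStrategy-like placement: walk clockwise from the key's
    token and collect the first `rf` distinct nodes."""
    start = bisect.bisect_left(tokens, key_token) % len(tokens)
    replicas = []
    for step in range(len(owners)):
        node = owners[(start + step) % len(owners)]
        if node not in replicas:
            replicas.append(node)
            if len(replicas) == rf:
                break
    return replicas


def fraction_with_live_replica(nodes, live, rf, vnodes_per_node=256, samples=20000, seed=1):
    """Estimate the share of random keys that have at least one replica in `live`."""
    rng = random.Random(seed)
    tokens, owners = build_ring(nodes, vnodes_per_node, rng)
    hits = 0
    for _ in range(samples):
        if any(node in live for node in replicas_for(tokens, owners, rng.random(), rf)):
            hits += 1
    return hits / samples


if __name__ == "__main__":
    nodes = ["node1", "node2", "node3"]
    live = {"node1"}  # node2 and node3 stopped, as in the test above
    print("RF=2 (like system_traces):", fraction_with_live_replica(nodes, live, rf=2))
    print("RF=3 (like test5):        ", fraction_with_live_replica(nodes, live, rf=3))
```

In this toy model the RF=3 case always has node1 as a replica, while at RF=2 only roughly two thirds of keys do, so about a third of trace lookups at ONE would be expected to come back Unavailable. That would fit the intermittent first error; it says nothing about the "Session ... wasn't found" case.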