[jira] [Updated] (CASSANDRA-13196) test failure in snitch_test.TestGossipingPropertyFileSnitch.test_prefer_local_reconnect_on_listen_address
[ https://issues.apache.org/jira/browse/CASSANDRA-13196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksandr Sorokoumov updated CASSANDRA-13196:
---------------------------------------------
    Reviewer: Alex Petrov
      Status: Patch Available  (was: Open)

The failure in the test ("keyspace keyspace1 does not exist") happened because, during the pre-bootstrap schema migration, all of the migration tasks failed to complete and the node was bootstrapped with its schema out of sync. {{MigrationManager.waitUntilReadyForBootstrap}} (which is invoked by {{StorageService.waitForSchema}}) just waits for the in-flight tasks to finish and discards those that take longer than {{MIGRATION_TASK_WAIT_IN_SECONDS}} to complete.

Schema migration tasks are scheduled when there is a significant change in an endpoint's state: it joins the cluster, becomes alive, or its schema version changes.

The idea is that it is safe to restart a migration task that has timed out, because either the task will succeed on one of the next retries or it will eventually be killed by {{FailureDetector}} if the endpoint is marked as unreachable. AFAIU there will be at least one migration task per endpoint. With the retry mechanism, {{MigrationManager.waitUntilReadyForBootstrap}} will run until the migration tasks to all reachable nodes succeed. This means that either we receive the migration data from at least one of the nodes, or all the nodes are unreachable, in which case the bootstrap is supposed to fail anyway.

*Steps to reproduce*

To test the retry, I commented out sending the reply in {{org.apache.cassandra.schema.SchemaPullVerbHandler.doVerb}} and ran the original {{snitch_test.TestGossipingPropertyFileSnitch.test_prefer_local_reconnect_on_listen_address}} test.

_NB:_ the test will run forever because, without a response, the migration requests time out and are then restarted.
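The retry behaviour described above can be sketched roughly as follows. This is an illustrative Python model, not the patch itself: {{pull_schema}} and {{is_reachable}} are hypothetical callables standing in for one migration task and for the failure detector's verdict, and the real code is Java and asynchronous.

```python
from typing import Callable, Iterable, Set


def wait_until_ready_for_bootstrap(
    endpoints: Iterable[str],
    pull_schema: Callable[[str], bool],
    is_reachable: Callable[[str], bool],
) -> Set[str]:
    """Retry schema pulls until every reachable endpoint has answered once.

    pull_schema(endpoint) is a hypothetical stand-in for one migration task:
    it returns True on success and False when the task timed out.
    is_reachable(endpoint) stands in for the failure detector's verdict.
    """
    pending = set(endpoints)
    succeeded: Set[str] = set()
    while pending:
        # sorted() copies the set, so discarding while looping is safe.
        for endpoint in sorted(pending):
            if not is_reachable(endpoint):
                # The endpoint was marked unreachable: stop retrying its task.
                pending.discard(endpoint)
            elif pull_schema(endpoint):
                pending.discard(endpoint)
                succeeded.add(endpoint)
            # Otherwise the task timed out; it stays pending and is retried.
    return succeeded
```

An empty result models the case where every node was unreachable and the bootstrap is expected to fail. The loop has no upper bound on retries, so a reachable endpoint that never replies keeps it running forever, which matches the NB above.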
*Code* https://github.com/Ge/cassandra/tree/13196-3.11 *CI builds*: * https://cassci.datastax.com/job/ifesdjeen-13196-trunk-dtest/ * https://cassci.datastax.com/job/ifesdjeen-13196-trunk-testall/ > test failure in > snitch_test.TestGossipingPropertyFileSnitch.test_prefer_local_reconnect_on_listen_address > - > > Key: CASSANDRA-13196 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13196 > Project: Cassandra > Issue Type: Bug >Reporter: Michael Shuler >Assignee: Aleksandr Sorokoumov > Labels: dtest, test-failure > Attachments: node1_debug.log, node1_gc.log, node1.log, > node2_debug.log, node2_gc.log, node2.log > > > example failure: > http://cassci.datastax.com/job/trunk_dtest/1487/testReport/snitch_test/TestGossipingPropertyFileSnitch/test_prefer_local_reconnect_on_listen_address > {code} > {novnode} > Error Message > Error from server: code=2200 [Invalid query] message="keyspace keyspace1 does > not exist" > >> begin captured logging << > dtest: DEBUG: cluster ccm directory: /tmp/dtest-k6b0iF > dtest: DEBUG: Done setting configuration options: > { 'initial_token': None, > 'num_tokens': '32', > 'phi_convict_threshold': 5, > 'range_request_timeout_in_ms': 1, > 'read_request_timeout_in_ms': 1, > 'request_timeout_in_ms': 1, > 'truncate_request_timeout_in_ms': 1, > 'write_request_timeout_in_ms': 1} > cassandra.policies: INFO: Using datacenter 'dc1' for DCAwareRoundRobinPolicy > (via host '127.0.0.1'); if incorrect, please specify a local_dc to the > constructor, or limit contact points to local cluster nodes > cassandra.cluster: INFO: New Cassandra host discovered > - >> end captured logging << - > Stacktrace > File "/usr/lib/python2.7/unittest/case.py", line 329, in run > testMethod() > File "/home/automaton/cassandra-dtest/snitch_test.py", line 87, in > test_prefer_local_reconnect_on_listen_address > new_rows = list(session.execute("SELECT * FROM {}".format(stress_table))) > File "/home/automaton/src/cassandra-driver/cassandra/cluster.py", line > 1998, in 
execute > return self.execute_async(query, parameters, trace, custom_payload, > timeout, execution_profile, paging_state).result() > File "/home/automaton/src/cassandra-driver/cassandra/cluster.py", line > 3784, in result > raise self._final_exception > 'Error from server: code=2200 [Invalid query] message="keyspace keyspace1 > does not exist"\n >> begin captured logging << > \ndtest: DEBUG: cluster ccm directory: > /tmp/dtest-k6b0iF\ndtest: DEBUG: Done setting configuration options:\n{ > \'initial_token\': None,\n\'num_tokens\': \'32\',\n > \'phi_convict_threshold\': 5,\n\'range_request_timeout_in_ms\': 1,\n > \'read_request_timeout_in
[jira] [Commented] (CASSANDRA-13268) Allow to create custom secondary index on static columns
[ https://issues.apache.org/jira/browse/CASSANDRA-13268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906700#comment-15906700 ]

DOAN DuyHai commented on CASSANDRA-13268:
-----------------------------------------

It is *already* possible to create a custom secondary index on static columns, for example with SASI.

> Allow to create custom secondary index on static columns
> --------------------------------------------------------
>
> Key: CASSANDRA-13268
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13268
> Project: Cassandra
> Issue Type: Improvement
> Components: Core, CQL
> Reporter: vincent royer
> Priority: Trivial
> Labels: features
> Fix For: 3.11.x, 4.x
> Attachments: 0001-CASSANDRA-13268-custom-index-on-static-columns.patch
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> Custom secondary index implementations (like elassandra) could benefit from indexing static columns, even if those columns are not searchable with CQL. Here is a proposal to allow index creation on static columns.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (CASSANDRA-13268) Allow to create custom secondary index on static columns
[ https://issues.apache.org/jira/browse/CASSANDRA-13268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906701#comment-15906701 ]

DOAN DuyHai commented on CASSANDRA-13268:
-----------------------------------------

https://issues.apache.org/jira/browse/CASSANDRA-11183

> Allow to create custom secondary index on static columns
>
> Key: CASSANDRA-13268
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13268
[jira] [Commented] (CASSANDRA-13304) Add checksumming to the native protocol
[ https://issues.apache.org/jira/browse/CASSANDRA-13304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906708#comment-15906708 ]

Tom van der Woerdt commented on CASSANDRA-13304:
------------------------------------------------

Do you have any data indicating this is an actual problem in Cassandra deployments? Would TLS not solve the same problem (and more)?

> Add checksumming to the native protocol
> ---------------------------------------
>
> Key: CASSANDRA-13304
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13304
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Michael Kjellman
> Assignee: Michael Kjellman
> Attachments: 13304_v1.diff
>
> The native binary transport implementation doesn't include checksums. This makes it highly susceptible to silently inserting corrupted data, either due to hardware issues causing bit flips on the sender/client side, on the C*/receiver side, or in the network in between.
> Attaching an implementation that makes checksum'ing mandatory (assuming both client and server know about a protocol version that supports checksums) -- and also adds checksumming to clients that request compression.
> The serialized format looks something like this:
> {noformat}
> *                      1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
> *  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |  Number of Compressed Chunks  |     Compressed Length (e1)    /
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * /  Compressed Length cont. (e1) |    Uncompressed Length (e1)   /
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * | Uncompressed Length cont. (e1)| CRC32 Checksum of Lengths (e1)|
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * | Checksum of Lengths cont. (e1)|    Compressed Bytes (e1)    +//
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |                      CRC32 Checksum (e1)                     ||
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |                    Compressed Length (e2)                     |
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |                   Uncompressed Length (e2)                    |
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |                CRC32 Checksum of Lengths (e2)                 |
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |                    Compressed Bytes (e2)                    +//
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |                      CRC32 Checksum (e2)                     ||
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |                    Compressed Length (en)                     |
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |                   Uncompressed Length (en)                    |
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |                CRC32 Checksum of Lengths (en)                 |
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |                    Compressed Bytes (en)                    +//
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |                      CRC32 Checksum (en)                     ||
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> {noformat}
> The first pass here adds checksums only to the actual contents of the frame body itself (and doesn't actually checksum lengths and headers). While it would be great to fully add checksumming across the entire protocol, the proposed implementation will ensure we at least catch corrupted data and likely protect ourselves pretty well anyways.
> I didn't go to the trouble of implementing a Snappy Checksum'ed Compressor implementation as it's been deprecated for a while -- is really slow and crappy compared to LZ4 -- and we should do everything in our power to make sure no one in the community is still using it. I left it in (for obvious backwards compatibility aspects) for old clients that don't know about the new protocol.
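To make the layout in the diagram above concrete, here is a minimal Python sketch of writing and reading such a chunked body. Several details are assumptions read off the diagram rather than taken from 13304_v1.diff: a 2-byte chunk count, 4-byte big-endian lengths, 4-byte CRC32 values, and a chunk checksum computed over the compressed bytes. {{zlib}} is used only as a stand-in compressor for LZ4, and the function names are hypothetical.

```python
import struct
import zlib
from typing import List


def encode_chunks(chunks: List[bytes]) -> bytes:
    """Serialize chunks as: count, then per chunk lengths, CRC, bytes, CRC."""
    out = [struct.pack(">H", len(chunks))]  # assumed 16-bit chunk count
    for raw in chunks:
        compressed = zlib.compress(raw)  # stand-in for an LZ4 compressor
        lengths = struct.pack(">ii", len(compressed), len(raw))
        out.append(lengths)
        out.append(struct.pack(">I", zlib.crc32(lengths)))
        out.append(compressed)
        out.append(struct.pack(">I", zlib.crc32(compressed)))
    return b"".join(out)


def decode_chunks(frame: bytes) -> List[bytes]:
    """Parse a body written by encode_chunks, verifying both checksums."""
    (count,) = struct.unpack_from(">H", frame, 0)
    offset = 2
    chunks = []
    for _ in range(count):
        lengths = frame[offset:offset + 8]
        compressed_len, raw_len = struct.unpack(">ii", lengths)
        offset += 8
        (lengths_crc,) = struct.unpack_from(">I", frame, offset)
        offset += 4
        # Check the lengths before trusting them to slice the body.
        if zlib.crc32(lengths) != lengths_crc:
            raise ValueError("corrupted chunk lengths")
        compressed = frame[offset:offset + compressed_len]
        offset += compressed_len
        (body_crc,) = struct.unpack_from(">I", frame, offset)
        offset += 4
        if zlib.crc32(compressed) != body_crc:
            raise ValueError("corrupted chunk body")
        raw = zlib.decompress(compressed)
        if len(raw) != raw_len:
            raise ValueError("unexpected uncompressed length")
        chunks.append(raw)
    return chunks
```

Checking the lengths checksum before slicing means a flipped bit in a length field is reported as corruption instead of causing a misaligned read of the following chunks.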
> The current protocol has a 256MB (max) frame body -- where the serialized > contents are simply written in to the frame body. > If the client sends a compression option in the startup, we will install a > FrameCompressor inline. Unfortunately, we went with a decision to treat the > frame body separately from the header bits etc in a given message. So, > instead we put a compressor implementation in the options and then if it's > not null, we push the serialized bytes for the frame body
[jira] [Commented] (CASSANDRA-13196) test failure in snitch_test.TestGossipingPropertyFileSnitch.test_prefer_local_reconnect_on_listen_address
[ https://issues.apache.org/jira/browse/CASSANDRA-13196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906715#comment-15906715 ]

Jeff Jirsa commented on CASSANDRA-13196:
----------------------------------------

There's a real risk (in large clusters, in clusters with large schemas, or when upgrading versions and running in a mixed-version state) that we can have so many migration tasks in flight that we actually kill nodes (see CASSANDRA-11748 for an example). Re-queueing more migration tasks when one times out is a good way to make the problem worse, not better.

I'm very concerned with the approach [here|https://github.com/Ge/cassandra/commit/463f3fecd9348ea0a4ce6eeeb30141527b8b10eb#diff-f484a759f797776d9cc5d8af92b29e5eR156], where we just blindly schedule another poll. Do we even know why this failed in the first place? Isn't the right fix understanding why all 3 migration tasks failed, not just making more and more and more migration tasks?

> test failure in snitch_test.TestGossipingPropertyFileSnitch.test_prefer_local_reconnect_on_listen_address
> ---------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-13196
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13196
> Project: Cassandra
> Issue Type: Bug
> Reporter: Michael Shuler
> Assignee: Aleksandr Sorokoumov
> Labels: dtest, test-failure
> Attachments: node1_debug.log, node1_gc.log, node1.log, node2_debug.log, node2_gc.log, node2.log
>
> example failure:
> http://cassci.datastax.com/job/trunk_dtest/1487/testReport/snitch_test/TestGossipingPropertyFileSnitch/test_prefer_local_reconnect_on_listen_address
> {code}
> {novnode}
> Error Message
> Error from server: code=2200 [Invalid query] message="keyspace keyspace1 does not exist"
> >> begin captured logging <<
> dtest: DEBUG: cluster ccm directory: /tmp/dtest-k6b0iF
> dtest: DEBUG: Done setting configuration options:
> { 'initial_token': None,
>   'num_tokens': '32',
>   'phi_convict_threshold': 5,
>   'range_request_timeout_in_ms': 1,
>   'read_request_timeout_in_ms': 1,
>   'request_timeout_in_ms': 1,
>   'truncate_request_timeout_in_ms': 1,
>   'write_request_timeout_in_ms': 1}
> cassandra.policies: INFO: Using datacenter 'dc1' for DCAwareRoundRobinPolicy (via host '127.0.0.1'); if incorrect, please specify a local_dc to the constructor, or limit contact points to local cluster nodes
> cassandra.cluster: INFO: New Cassandra host discovered
> - >> end captured logging << -
> Stacktrace
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
>     testMethod()
>   File "/home/automaton/cassandra-dtest/snitch_test.py", line 87, in test_prefer_local_reconnect_on_listen_address
>     new_rows = list(session.execute("SELECT * FROM {}".format(stress_table)))
>   File "/home/automaton/src/cassandra-driver/cassandra/cluster.py", line 1998, in execute
>     return self.execute_async(query, parameters, trace, custom_payload, timeout, execution_profile, paging_state).result()
>   File "/home/automaton/src/cassandra-driver/cassandra/cluster.py", line 3784, in result
>     raise self._final_exception
> 'Error from server: code=2200 [Invalid query] message="keyspace keyspace1 does not exist"\n >> begin captured logging << \ndtest: DEBUG: cluster ccm directory: /tmp/dtest-k6b0iF\ndtest: DEBUG: Done setting configuration options:\n{ \'initial_token\': None,\n\'num_tokens\': \'32\',\n \'phi_convict_threshold\': 5,\n\'range_request_timeout_in_ms\': 1,\n \'read_request_timeout_in_ms\': 1,\n\'request_timeout_in_ms\': 1,\n\'truncate_request_timeout_in_ms\': 1,\n \'write_request_timeout_in_ms\': 1}\ncassandra.policies: INFO: Using datacenter \'dc1\' for DCAwareRoundRobinPolicy (via host \'127.0.0.1\'); if incorrect, please specify a local_dc to the constructor, or limit contact points to local cluster nodes\ncassandra.cluster: INFO: New Cassandra host discovered\n- >> end captured logging << -'
> {novnode}
> {code}
[jira] [Updated] (CASSANDRA-13308) Gossip breaks, Hint files not being deleted on nodetool decommission
[ https://issues.apache.org/jira/browse/CASSANDRA-13308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Jirsa updated CASSANDRA-13308:
-----------------------------------
    Summary: Gossip breaks, Hint files not being deleted on nodetool decommission  (was: Hint files not being deleted on nodetool decommission)

> Gossip breaks, Hint files not being deleted on nodetool decommission
> --------------------------------------------------------------------
>
> Key: CASSANDRA-13308
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13308
> Project: Cassandra
> Issue Type: Bug
> Components: Streaming and Messaging
> Environment: Using Cassandra version 3.0.9
> Reporter: Arijit
> Attachments: 28207.stack, logs, logs_decommissioned_node
>
> How to reproduce the issue I'm seeing:
> Shut down Cassandra on one node of the cluster and wait until we accumulate a ton of hints. Start Cassandra on the node and immediately run "nodetool decommission" on it.
> The node streams its replicas and marks itself as DECOMMISSIONED, but other nodes do not seem to see this message. "nodetool status" shows the decommissioned node in state "UL" on all other nodes (it is also present in system.peers), and Cassandra logs show that gossip tasks on nodes are not proceeding (the number of pending tasks keeps increasing). Jstack suggests that a gossip task is blocked on hints dispatch (I can provide traces if this is not obvious). Because the cluster is large and there are a lot of hints, this is taking a while.
> On inspecting "/var/lib/cassandra/hints" on the nodes, I see a bunch of hint files for the decommissioned node. Documentation seems to suggest that these hints should be deleted during "nodetool decommission", but it does not seem to be the case here. This is the bug being reported.
> To recover from this scenario, if I manually delete hint files on the nodes, the hints dispatcher threads throw a bunch of exceptions and the decommissioned node is now in state "DL" (perhaps it missed some gossip messages?). The node is still in my "system.peers" table.
> Restarting Cassandra on all nodes after this step does not fix the issue (the node remains in the peers table). In fact, after this point the decommissioned node is in state "DN".
[jira] [Commented] (CASSANDRA-12719) typo in cql examples
[ https://issues.apache.org/jira/browse/CASSANDRA-12719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906738#comment-15906738 ]

Anthony Grasso commented on CASSANDRA-12719:
--------------------------------------------

+1

> typo in cql examples
> --------------------
>
> Key: CASSANDRA-12719
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12719
> Project: Cassandra
> Issue Type: Bug
> Components: Documentation and Website
> Reporter: suisuihan
> Priority: Trivial
> Attachments: 12719-3.11.txt, 12719-3.X.txt
>
> The Data Definition example uses a wrong definition.
[jira] [Updated] (CASSANDRA-12719) typo in cql examples
[ https://issues.apache.org/jira/browse/CASSANDRA-12719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anthony Grasso updated CASSANDRA-12719:
---------------------------------------
    Status: Ready to Commit  (was: Patch Available)

> typo in cql examples
>
> Key: CASSANDRA-12719
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12719
[Cassandra Wiki] Update of "ContributorsGroup" by DaveBrosius
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "ContributorsGroup" page has been changed by DaveBrosius:
https://wiki.apache.org/cassandra/ContributorsGroup?action=diff&rev1=69&rev2=70

   * AlekseyYeschenko
   * Alexis Wilke
   * AlicePorfirio
+  * AnthonyGrasso
   * bhamail
   * Ben McCann
   * BenedictElliottSmith
[jira] [Commented] (CASSANDRA-13304) Add checksumming to the native protocol
[ https://issues.apache.org/jira/browse/CASSANDRA-13304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906751#comment-15906751 ]

Michael Kjellman commented on CASSANDRA-13304:
----------------------------------------------

Yes, it's an actual issue [~tvdw]. TLS does solve it, but rotating keys and assigning certs isn't a solved problem. Best to be assured we don't corrupt people's data just because they don't have TLS.

> Add checksumming to the native protocol
>
> Key: CASSANDRA-13304
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13304
cassandra git commit: Add histogram for delay to deliver hints
Repository: cassandra Updated Branches: refs/heads/trunk 67e9a5ffd -> 0c5faef66 Add histogram for delay to deliver hints Patch by Jeff Jirsa; Reviewed by Stefan Podkowinski for CASSANDRA-13234 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0c5faef6 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0c5faef6 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0c5faef6 Branch: refs/heads/trunk Commit: 0c5faef664aa403998f59cc77c9b39861890cfa1 Parents: 67e9a5f Author: Jeff Jirsa Authored: Fri Feb 17 14:20:49 2017 -0800 Committer: Jeff Jirsa Committed: Sun Mar 12 21:37:11 2017 -0700 -- CHANGES.txt | 1 + doc/source/operating/metrics.rst| 25 .../cassandra/hints/EncodedHintMessage.java | 5 src/java/org/apache/cassandra/hints/Hint.java | 6 .../apache/cassandra/hints/HintsDispatcher.java | 3 ++ .../cassandra/metrics/HintsServiceMetrics.java | 31 6 files changed, 71 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/0c5faef6/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 3acc2b4..7247f36 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -38,6 +38,7 @@ * Conditionally update index built status to avoid unnecessary flushes (CASSANDRA-12969) * cqlsh auto completion: refactor definition of compaction strategy options (CASSANDRA-12946) * Add support for arithmetic operators (CASSANDRA-11935) + * Add histogram for delay to deliver hints (CASSANDRA-13234) 3.11.0 http://git-wip-us.apache.org/repos/asf/cassandra/blob/0c5faef6/doc/source/operating/metrics.rst -- diff --git a/doc/source/operating/metrics.rst b/doc/source/operating/metrics.rst index 373d4d2..af2e36e 100644 --- a/doc/source/operating/metrics.rst +++ b/doc/source/operating/metrics.rst @@ -524,6 +524,31 @@ Hints_created- CounterNumber of hints on disk for this pee Hints_not_stored-CounterNumber of hints not stored for this peer, due to being down past the configured hint window. 
=== == === +HintsService Metrics +^ + +Metrics specific to the Hints delivery service. There are also some metrics related to hints tracked in ``Storage Metrics`` + +These metrics include the peer endpoint **in the metric name** + +Reported name format: + +**Metric Name** +``org.apache.cassandra.metrics.HintsService.`` + +**JMX MBean** +``org.apache.cassandra.metrics:type=HintsService name=`` + +=== == === +NameType Description +=== == === +HintsSucceeded Meter A meter of the hints successfully delivered +HintsFailed Meter A meter of the hints that failed deliver +HintsTimedOutMeter A meter of the hints that timed out +Hints_delays Histogram Histogram of hint delivery delays (in milliseconds) +Hints_delays-Histogram Histogram of hint delivery delays (in milliseconds) per peer +=== == === + SSTable Index Metrics ^ http://git-wip-us.apache.org/repos/asf/cassandra/blob/0c5faef6/src/java/org/apache/cassandra/hints/EncodedHintMessage.java -- diff --git a/src/java/org/apache/cassandra/hints/EncodedHintMessage.java b/src/java/org/apache/cassandra/hints/EncodedHintMessage.java index 4fe05ac..50d1302 100644 --- a/src/java/org/apache/cassandra/hints/EncodedHintMessage.java +++ b/src/java/org/apache/cassandra/hints/EncodedHintMessage.java @@ -58,6 +58,11 @@ final class EncodedHintMessage return new MessageOut<>(MessagingService.Verb.HINT, this, serializer); } +public long getHintCreationTime() +{ +return Hint.serializer.getHintCreationTime(hint, version); +} + private static class Serializer implements IVersionedSerializer { public long serializedSize(EncodedHintMessage message, int version) http://git-wip-us.apache.org/repos/asf/cassandra/blob/0c5faef6/src/java/org/apache/cassandra/hints/Hint.java -- diff --git a/src/java/org/apache/cassandra/hints/Hint.java b/src/java/org/apache/cassandra/hints/Hint.java index 4e8f139..1582a3c 100644 --- a/src/java/org/apache/cassandra/hints/Hint.java +++ b/src/java/org/apache/cassandra/hints/Hint.java @@ -18,6 +18,7 @@ package org.apache.c
[jira] [Updated] (CASSANDRA-13234) Add histogram for delay to deliver hints
[ https://issues.apache.org/jira/browse/CASSANDRA-13234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Jirsa updated CASSANDRA-13234:
-----------------------------------
    Status: Ready to Commit  (was: Patch Available)

> Add histogram for delay to deliver hints
> ----------------------------------------
>
> Key: CASSANDRA-13234
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13234
> Project: Cassandra
> Issue Type: Improvement
> Components: Observability
> Reporter: Jeff Jirsa
> Assignee: Jeff Jirsa
> Priority: Minor
> Fix For: 4.0
>
> There is very little visibility into hint delivery in general - having histograms available to understand how long it takes to deliver hints is useful for operators to better identify problems.
[jira] [Comment Edited] (CASSANDRA-13234) Add histogram for delay to deliver hints
[ https://issues.apache.org/jira/browse/CASSANDRA-13234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906852#comment-15906852 ]

Jeff Jirsa edited comment on CASSANDRA-13234 at 3/13/17 4:39 AM:
-----------------------------------------------------------------

Committed as [0c5faef664aa403998f59cc77c9b39861890cfa1|https://github.com/apache/cassandra/commit/0c5faef664aa403998f59cc77c9b39861890cfa1] to trunk only (because [~iamaleksey] is right, it's a new feature, so it doesn't go into lower branches). Added CHANGES.txt entry and fixed up the extra nits at commit time.

Thanks for the review, [~spo...@gmail.com] !

was (Author: jjirsa):
Committed as [0c5faef664aa403998f59cc77c9b39861890cfa1|https://github.com/apache/cassandra/commit/0c5faef664aa403998f59cc77c9b39861890cfa1] to trunk only (because [~iamaleksey] , it's a new feature, doesnt go into lowers). Added changes.txt entry and fixed up the extra nits at commit time.

Thanks for the review, [~spo...@gmail.com] !

> Add histogram for delay to deliver hints
>
> Key: CASSANDRA-13234
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13234
[jira] [Updated] (CASSANDRA-13234) Add histogram for delay to deliver hints
[ https://issues.apache.org/jira/browse/CASSANDRA-13234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Jirsa updated CASSANDRA-13234:
-----------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Ready to Commit)

Committed as [0c5faef664aa403998f59cc77c9b39861890cfa1|https://github.com/apache/cassandra/commit/0c5faef664aa403998f59cc77c9b39861890cfa1] to trunk only (because [~iamaleksey] , it's a new feature, doesnt go into lowers). Added changes.txt entry and fixed up the extra nits at commit time.

Thanks for the review, [~spo...@gmail.com] !

> Add histogram for delay to deliver hints
>
> Key: CASSANDRA-13234
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13234
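For readers wondering what a hint delivery delay histogram measures, here is a rough Python sketch of the idea behind CASSANDRA-13234. It assumes the delay is the delivery time minus the hint's creation time, in milliseconds, recorded once globally and once per peer. The class and method names are hypothetical and are not the API of {{HintsServiceMetrics}}; a plain list of samples stands in for a real histogram.

```python
import math
from collections import defaultdict
from typing import Dict, List


class HintDelayMetrics:
    """Toy recorder of hint delivery delays (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.global_delays_ms: List[int] = []
        self.per_peer_delays_ms: Dict[str, List[int]] = defaultdict(list)

    def update_delay(self, peer: str, creation_time_ms: int, now_ms: int) -> None:
        # Clamp at zero so clock skew between nodes cannot record a negative delay.
        delay = max(0, now_ms - creation_time_ms)
        self.global_delays_ms.append(delay)
        self.per_peer_delays_ms[peer].append(delay)


def percentile(samples: List[int], p: float) -> int:
    """Nearest-rank percentile of a non-empty sample list, for 0 < p <= 100."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100.0 * len(ordered))
    return ordered[max(rank, 1) - 1]
```

An operator-facing view would then read off, say, the median and the 99th percentile per peer to spot a destination whose hints are being delivered late.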
[07/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/44f79bf2 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/44f79bf2 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/44f79bf2 Branch: refs/heads/trunk Commit: 44f79bf2f7a3a05f802014492ecbec67c49c02d0 Parents: aeca1d2 beb9658 Author: Jeff Jirsa Authored: Sun Mar 12 21:54:42 2017 -0700 Committer: Jeff Jirsa Committed: Sun Mar 12 21:56:00 2017 -0700 -- CHANGES.txt | 1 + .../cassandra/db/commitlog/CommitLogReplayer.java| 11 +++ .../apache/cassandra/db/commitlog/CommitLogTest.java | 15 ++- 3 files changed, 26 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/44f79bf2/CHANGES.txt -- diff --cc CHANGES.txt index 52a794b,2839291..140c860 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,21 -1,6 +1,22 @@@ -2.2.10 +3.0.13 + * Slice.isEmpty() returns false for some empty slices (CASSANDRA-13305) + * Add formatted row output to assertEmpty in CQL Tester (CASSANDRA-13238) +Merged from 2.2: ++ * Commitlog replay may fail if last mutation is within 4 bytes of end of segment (CASSANDRA-13282) * Fix queries updating multiple time the same list (CASSANDRA-13130) * Fix GRANT/REVOKE when keyspace isn't specified (CASSANDRA-13053) + + +3.0.12 + * Prevent data loss on upgrade 2.1 - 3.0 by adding component separator to LogRecord absolute path (CASSANDRA-13294) + * Improve testing on macOS by eliminating sigar logging (CASSANDRA-13233) + * Cqlsh copy-from should error out when csv contains invalid data for collections (CASSANDRA-13071) + * Update c.yaml doc for offheap memtables (CASSANDRA-13179) + * Faster StreamingHistogram (CASSANDRA-13038) + * Legacy deserializer can create unexpected boundary range tombstones (CASSANDRA-13237) + * Remove unnecessary assertion from AntiCompactionTest (CASSANDRA-13070) + * Fix cqlsh COPY for dates before 1900 (CASSANDRA-13185) 
+Merged from 2.2: * Avoid race on receiver by starting streaming sender thread after sending init message (CASSANDRA-12886) * Fix "multiple versions of ant detected..." when running ant test (CASSANDRA-13232) * Coalescing strategy sleeps too much (CASSANDRA-13090) http://git-wip-us.apache.org/repos/asf/cassandra/blob/44f79bf2/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java -- diff --cc src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java index d53f0f8,3cf4d0f..205c36a --- a/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java +++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java @@@ -483,6 -439,17 +483,17 @@@ public class CommitLogReplaye int serializedSize; try { + // We rely on reading serialized size == 0 (LEGACY_END_OF_SEGMENT_MARKER) to identify the end + // of a segment, which happens naturally due to the 0 padding of the empty segment on creation. -// However, with 2.1 era commitlogs it's possible that the last mutation ended less than 4 bytes ++// However, it's possible with 2.1 era commitlogs that the last mutation ended less than 4 bytes + // from the end of the file, which means that we'll be unable to read an a full int and instead + // read an EOF here + if(end - reader.getFilePointer() < 4) + { + logger.trace("Not enough bytes left for another mutation in this CommitLog segment, continuing"); + return false; + } + // any of the reads may hit EOF serializedSize = reader.readInt(); if (serializedSize == LEGACY_END_OF_SEGMENT_MARKER) http://git-wip-us.apache.org/repos/asf/cassandra/blob/44f79bf2/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java -- diff --cc test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java index c4ab6ab,9b63885..90dc258 --- a/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java +++ b/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java @@@ -143,11 -137,25 +143,24 @@@ public class CommitLogTes } @Test + public void 
testRecoveryWithShortPadding() throws Exception + { + // If we have 0-3 bytes remaining, commitlog replayer + // should pass, because there's insufficient room + // left in the segment for the legacy size marker. + test
[02/10] cassandra git commit: Commitlog replay may fail if last mutation is within 4 bytes of end of segment
Commitlog replay may fail if last mutation is within 4 bytes of end of segment

Patch by Jeff Jirsa; Reviewed by Branimir Lambov for CASSANDRA-13282

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/beb9658d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/beb9658d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/beb9658d

Branch: refs/heads/cassandra-3.0
Commit: beb9658dd5e18e3a6a4e8431b6549ae4c33365a9
Parents: 5ef8a8b
Author: Jeff Jirsa
Authored: Sun Mar 12 21:54:04 2017 -0700
Committer: Jeff Jirsa
Committed: Sun Mar 12 21:54:04 2017 -0700

--
 CHANGES.txt | 1 +
 .../db/commitlog/CommitLogReplayer.java | 11 +++
 .../cassandra/db/commitlog/CommitLogTest.java | 20
 3 files changed, 28 insertions(+), 4 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/beb9658d/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 09e4039..2839291 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -9,6 +9,7 @@
  * Fix failing COPY TO STDOUT (CASSANDRA-12497)
  * Fix ColumnCounter::countAll behaviour for reverse queries (CASSANDRA-13222)
  * Exceptions encountered calling getSeeds() breaks OTC thread (CASSANDRA-13018)
+ * Commitlog replay may fail if last mutation is within 4 bytes of end of segment (CASSANDRA-13282)
 Merged from 2.1:
  * Remove unused repositories (CASSANDRA-13278)
  * Log stacktrace of uncaught exceptions (CASSANDRA-13108)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/beb9658d/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java
--
diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java b/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java
index a58aeb4..3cf4d0f 100644
--- a/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java
+++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java
@@ -439,6 +439,17 @@ public class CommitLogReplayer
 int serializedSize;
 try
 {
+    // We rely on reading serialized size == 0 (LEGACY_END_OF_SEGMENT_MARKER) to identify the end
+    // of a segment, which happens naturally due to the 0 padding of the empty segment on creation.
+    // However, with 2.1 era commitlogs it's possible that the last mutation ended less than 4 bytes
+    // from the end of the file, which means that we'll be unable to read an a full int and instead
+    // read an EOF here
+    if(end - reader.getFilePointer() < 4)
+    {
+        logger.trace("Not enough bytes left for another mutation in this CommitLog segment, continuing");
+        return false;
+    }
+
     // any of the reads may hit EOF
     serializedSize = reader.readInt();
     if (serializedSize == LEGACY_END_OF_SEGMENT_MARKER)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/beb9658d/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java
--
diff --git a/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java b/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java
index b42..9b63885 100644
--- a/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java
+++ b/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java
@@ -137,12 +137,24 @@ public class CommitLogTest
     }
 
     @Test
+    public void testRecoveryWithShortPadding() throws Exception
+    {
+        // If we have 0-3 bytes remaining, commitlog replayer
+        // should pass, because there's insufficient room
+        // left in the segment for the legacy size marker.
+        testRecovery(new byte[1], null);
+        testRecovery(new byte[2], null);
+        testRecovery(new byte[3], null);
+    }
+
+    @Test
     public void testRecoveryWithShortSize() throws Exception
     {
-        runExpecting(new WrappedRunnable() {
-            public void runMayThrow() throws Exception
-            {
-                testRecovery(new byte[2], CommitLogDescriptor.VERSION_20);
+        runExpecting(new WrappedRunnable() {
+            public void runMayThrow() throws Exception {
+                byte[] data = new byte[5];
+                data[3] = 1; // Not a legacy marker, give it a fake (short) size
+                testRecovery(data, CommitLogDescriptor.VERSION_20);
             }
         }, CommitLogReplayException.class);
     }
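The guard added in this commit can be illustrated outside Cassandra with a minimal sketch. It assumes a segment laid out as zero-padded [int size][payload] records; the class name `SegmentScan`, the method `countRecords`, and the use of `ByteBuffer` in place of the replayer's file reader are hypothetical stand-ins, not the project's code.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not Cassandra code: walks a zero-padded buffer of
// [int size][payload] records and stops cleanly at the end of the segment.
final class SegmentScan
{
    // Assumed to play the role of LEGACY_END_OF_SEGMENT_MARKER in the patch
    static final int END_OF_SEGMENT_MARKER = 0;

    static int countRecords(byte[] segment)
    {
        ByteBuffer reader = ByteBuffer.wrap(segment); // big-endian by default
        int records = 0;
        while (true)
        {
            // Fewer than 4 bytes left: there is no room for a size field,
            // so treat this as the end of the segment instead of hitting EOF.
            if (reader.remaining() < 4)
                return records;

            int serializedSize = reader.getInt();
            if (serializedSize == END_OF_SEGMENT_MARKER)
                return records;

            if (serializedSize < 0 || serializedSize > reader.remaining())
                throw new IllegalStateException("corrupt size field: " + serializedSize);

            reader.position(reader.position() + serializedSize); // skip the payload
            records++;
        }
    }

    public static void main(String[] args)
    {
        // One 5-byte record followed by 3 bytes of padding (too short for a marker)
        byte[] segment = ByteBuffer.allocate(4 + 5 + 3).putInt(5).put(new byte[5]).array();
        System.out.println(countRecords(segment));
    }
}
```

Without the `remaining() < 4` check, the `getInt()` call on the trailing 3 bytes would throw, which is the analogue of the EOF described in the patch comment.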
[06/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/44f79bf2 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/44f79bf2 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/44f79bf2 Branch: refs/heads/cassandra-3.0 Commit: 44f79bf2f7a3a05f802014492ecbec67c49c02d0 Parents: aeca1d2 beb9658 Author: Jeff Jirsa Authored: Sun Mar 12 21:54:42 2017 -0700 Committer: Jeff Jirsa Committed: Sun Mar 12 21:56:00 2017 -0700 -- CHANGES.txt | 1 + .../cassandra/db/commitlog/CommitLogReplayer.java| 11 +++ .../apache/cassandra/db/commitlog/CommitLogTest.java | 15 ++- 3 files changed, 26 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/44f79bf2/CHANGES.txt -- diff --cc CHANGES.txt index 52a794b,2839291..140c860 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,21 -1,6 +1,22 @@@ -2.2.10 +3.0.13 + * Slice.isEmpty() returns false for some empty slices (CASSANDRA-13305) + * Add formatted row output to assertEmpty in CQL Tester (CASSANDRA-13238) +Merged from 2.2: ++ * Commitlog replay may fail if last mutation is within 4 bytes of end of segment (CASSANDRA-13282) * Fix queries updating multiple time the same list (CASSANDRA-13130) * Fix GRANT/REVOKE when keyspace isn't specified (CASSANDRA-13053) + + +3.0.12 + * Prevent data loss on upgrade 2.1 - 3.0 by adding component separator to LogRecord absolute path (CASSANDRA-13294) + * Improve testing on macOS by eliminating sigar logging (CASSANDRA-13233) + * Cqlsh copy-from should error out when csv contains invalid data for collections (CASSANDRA-13071) + * Update c.yaml doc for offheap memtables (CASSANDRA-13179) + * Faster StreamingHistogram (CASSANDRA-13038) + * Legacy deserializer can create unexpected boundary range tombstones (CASSANDRA-13237) + * Remove unnecessary assertion from AntiCompactionTest (CASSANDRA-13070) + * Fix cqlsh COPY for dates before 1900 
(CASSANDRA-13185) +Merged from 2.2: * Avoid race on receiver by starting streaming sender thread after sending init message (CASSANDRA-12886) * Fix "multiple versions of ant detected..." when running ant test (CASSANDRA-13232) * Coalescing strategy sleeps too much (CASSANDRA-13090) http://git-wip-us.apache.org/repos/asf/cassandra/blob/44f79bf2/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java -- diff --cc src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java index d53f0f8,3cf4d0f..205c36a --- a/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java +++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java @@@ -483,6 -439,17 +483,17 @@@ public class CommitLogReplaye int serializedSize; try { + // We rely on reading serialized size == 0 (LEGACY_END_OF_SEGMENT_MARKER) to identify the end + // of a segment, which happens naturally due to the 0 padding of the empty segment on creation. -// However, with 2.1 era commitlogs it's possible that the last mutation ended less than 4 bytes ++// However, it's possible with 2.1 era commitlogs that the last mutation ended less than 4 bytes + // from the end of the file, which means that we'll be unable to read an a full int and instead + // read an EOF here + if(end - reader.getFilePointer() < 4) + { + logger.trace("Not enough bytes left for another mutation in this CommitLog segment, continuing"); + return false; + } + // any of the reads may hit EOF serializedSize = reader.readInt(); if (serializedSize == LEGACY_END_OF_SEGMENT_MARKER) http://git-wip-us.apache.org/repos/asf/cassandra/blob/44f79bf2/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java -- diff --cc test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java index c4ab6ab,9b63885..90dc258 --- a/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java +++ b/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java @@@ -143,11 -137,25 +143,24 @@@ public class CommitLogTes } @Test + public void 
testRecoveryWithShortPadding() throws Exception + { + // If we have 0-3 bytes remaining, commitlog replayer + // should pass, because there's insufficient room + // left in the segment for the legacy size marker. +
[03/10] cassandra git commit: Commitlog replay may fail if last mutation is within 4 bytes of end of segment
Commitlog replay may fail if last mutation is within 4 bytes of end of segment

Patch by Jeff Jirsa; Reviewed by Branimir Lambov for CASSANDRA-13282

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/beb9658d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/beb9658d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/beb9658d

Branch: refs/heads/cassandra-3.11
Commit: beb9658dd5e18e3a6a4e8431b6549ae4c33365a9
Parents: 5ef8a8b
Author: Jeff Jirsa
Authored: Sun Mar 12 21:54:04 2017 -0700
Committer: Jeff Jirsa
Committed: Sun Mar 12 21:54:04 2017 -0700

--
 CHANGES.txt | 1 +
 .../db/commitlog/CommitLogReplayer.java | 11 +++
 .../cassandra/db/commitlog/CommitLogTest.java | 20
 3 files changed, 28 insertions(+), 4 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/beb9658d/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 09e4039..2839291 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -9,6 +9,7 @@
  * Fix failing COPY TO STDOUT (CASSANDRA-12497)
  * Fix ColumnCounter::countAll behaviour for reverse queries (CASSANDRA-13222)
  * Exceptions encountered calling getSeeds() breaks OTC thread (CASSANDRA-13018)
+ * Commitlog replay may fail if last mutation is within 4 bytes of end of segment (CASSANDRA-13282)
 Merged from 2.1:
  * Remove unused repositories (CASSANDRA-13278)
  * Log stacktrace of uncaught exceptions (CASSANDRA-13108)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/beb9658d/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java
--
diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java b/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java
index a58aeb4..3cf4d0f 100644
--- a/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java
+++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java
@@ -439,6 +439,17 @@ public class CommitLogReplayer
 int serializedSize;
 try
 {
+    // We rely on reading serialized size == 0 (LEGACY_END_OF_SEGMENT_MARKER) to identify the end
+    // of a segment, which happens naturally due to the 0 padding of the empty segment on creation.
+    // However, with 2.1 era commitlogs it's possible that the last mutation ended less than 4 bytes
+    // from the end of the file, which means that we'll be unable to read an a full int and instead
+    // read an EOF here
+    if(end - reader.getFilePointer() < 4)
+    {
+        logger.trace("Not enough bytes left for another mutation in this CommitLog segment, continuing");
+        return false;
+    }
+
     // any of the reads may hit EOF
     serializedSize = reader.readInt();
     if (serializedSize == LEGACY_END_OF_SEGMENT_MARKER)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/beb9658d/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java
--
diff --git a/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java b/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java
index b42..9b63885 100644
--- a/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java
+++ b/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java
@@ -137,12 +137,24 @@ public class CommitLogTest
     }
 
     @Test
+    public void testRecoveryWithShortPadding() throws Exception
+    {
+        // If we have 0-3 bytes remaining, commitlog replayer
+        // should pass, because there's insufficient room
+        // left in the segment for the legacy size marker.
+        testRecovery(new byte[1], null);
+        testRecovery(new byte[2], null);
+        testRecovery(new byte[3], null);
+    }
+
+    @Test
     public void testRecoveryWithShortSize() throws Exception
     {
-        runExpecting(new WrappedRunnable() {
-            public void runMayThrow() throws Exception
-            {
-                testRecovery(new byte[2], CommitLogDescriptor.VERSION_20);
+        runExpecting(new WrappedRunnable() {
+            public void runMayThrow() throws Exception {
+                byte[] data = new byte[5];
+                data[3] = 1; // Not a legacy marker, give it a fake (short) size
+                testRecovery(data, CommitLogDescriptor.VERSION_20);
             }
         }, CommitLogReplayException.class);
     }
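The test change in this commit swaps a 2-byte input for a 5-byte input with `data[3] = 1`. A plausible reading is that 2 bytes now count as short padding, so the failing case needs at least 4 bytes that decode to a non-zero size. The sketch below shows how such arrays decode, assuming a `DataInput`-style big-endian `int`; `SizeFieldDemo` and `leadingSize` are hypothetical names used only for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration, not part of CommitLogTest.
final class SizeFieldDemo
{
    // Returns the size field a big-endian reader would see at the start of
    // the data, or -1 when fewer than 4 bytes exist and no int can be read.
    static int leadingSize(byte[] data)
    {
        if (data.length < 4)
            return -1;
        return ByteBuffer.wrap(data).getInt(); // ByteBuffer is big-endian by default
    }

    public static void main(String[] args)
    {
        byte[] shortPadding = new byte[3];   // mirrors testRecoveryWithShortPadding
        byte[] shortSize = new byte[5];      // mirrors testRecoveryWithShortSize
        shortSize[3] = 1;                    // bytes 00 00 00 01 decode to size 1

        System.out.println(leadingSize(shortPadding));
        System.out.println(leadingSize(shortSize));
    }
}
```

Under these assumptions the 5-byte array carries a size of 1 rather than the zero marker, so a reader would go on to parse a record from it, which is the path the test expects to fail with `CommitLogReplayException`.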
[09/10] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11
Merge branch 'cassandra-3.0' into cassandra-3.11 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a2399d4d Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a2399d4d Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a2399d4d Branch: refs/heads/trunk Commit: a2399d4d309ac6b60a150ea20af8dc6f006d51ff Parents: 2c111d1 44f79bf Author: Jeff Jirsa Authored: Sun Mar 12 21:56:11 2017 -0700 Committer: Jeff Jirsa Committed: Sun Mar 12 21:57:25 2017 -0700 -- CHANGES.txt | 1 + .../cassandra/db/commitlog/CommitLogReader.java | 12 .../apache/cassandra/db/commitlog/CommitLogTest.java | 15 ++- 3 files changed, 27 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2399d4d/CHANGES.txt -- diff --cc CHANGES.txt index 302a028,140c860..ab28dd4 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -33,140 -43,6 +33,141 @@@ Merged from 3.0 live rows in sstabledump (CASSANDRA-13177) * Provide user workaround when system_schema.columns does not contain entries for a table that's in system_schema.tables (CASSANDRA-13180) +Merged from 2.2: ++ * Commitlog replay may fail if last mutation is within 4 bytes of end of segment (CASSANDRA-13282) + * Fix queries updating multiple time the same list (CASSANDRA-13130) + * Fix GRANT/REVOKE when keyspace isn't specified (CASSANDRA-13053) + * Fix flaky LongLeveledCompactionStrategyTest (CASSANDRA-12202) + * Fix failing COPY TO STDOUT (CASSANDRA-12497) + * Fix ColumnCounter::countAll behaviour for reverse queries (CASSANDRA-13222) + * Exceptions encountered calling getSeeds() breaks OTC thread (CASSANDRA-13018) + * Fix negative mean latency metric (CASSANDRA-12876) + * Use only one file pointer when creating commitlog segments (CASSANDRA-12539) +Merged from 2.1: + * Remove unused repositories (CASSANDRA-13278) + * Log stacktrace of uncaught exceptions (CASSANDRA-13108) + * Use portable stderr for java error in startup (CASSANDRA-13211) + 
* Fix Thread Leak in OutboundTcpConnection (CASSANDRA-13204) + * Coalescing strategy can enter infinite loop (CASSANDRA-13159) + + +3.10 + * Fix secondary index queries regression (CASSANDRA-13013) + * Add duration type to the protocol V5 (CASSANDRA-12850) + * Fix duration type validation (CASSANDRA-13143) + * Fix flaky GcCompactionTest (CASSANDRA-12664) + * Fix TestHintedHandoff.hintedhandoff_decom_test (CASSANDRA-13058) + * Fixed query monitoring for range queries (CASSANDRA-13050) + * Remove outboundBindAny configuration property (CASSANDRA-12673) + * Use correct bounds for all-data range when filtering (CASSANDRA-12666) + * Remove timing window in test case (CASSANDRA-12875) + * Resolve unit testing without JCE security libraries installed (CASSANDRA-12945) + * Fix inconsistencies in cassandra-stress load balancing policy (CASSANDRA-12919) + * Fix validation of non-frozen UDT cells (CASSANDRA-12916) + * Don't shut down socket input/output on StreamSession (CASSANDRA-12903) + * Fix Murmur3PartitionerTest (CASSANDRA-12858) + * Move cqlsh syntax rules into separate module and allow easier customization (CASSANDRA-12897) + * Fix CommitLogSegmentManagerTest (CASSANDRA-12283) + * Fix cassandra-stress truncate option (CASSANDRA-12695) + * Fix crossNode value when receiving messages (CASSANDRA-12791) + * Don't load MX4J beans twice (CASSANDRA-12869) + * Extend native protocol request flags, add versions to SUPPORTED, and introduce ProtocolVersion enum (CASSANDRA-12838) + * Set JOINING mode when running pre-join tasks (CASSANDRA-12836) + * remove net.mintern.primitive library due to license issue (CASSANDRA-12845) + * Properly format IPv6 addresses when logging JMX service URL (CASSANDRA-12454) + * Optimize the vnode allocation for single replica per DC (CASSANDRA-12777) + * Use non-token restrictions for bounds when token restrictions are overridden (CASSANDRA-12419) + * Fix CQLSH auto completion for PER PARTITION LIMIT (CASSANDRA-12803) + * Use different build 
directories for Eclipse and Ant (CASSANDRA-12466) + * Avoid potential AttributeError in cqlsh due to no table metadata (CASSANDRA-12815) + * Fix RandomReplicationAwareTokenAllocatorTest.testExistingCluster (CASSANDRA-12812) + * Upgrade commons-codec to 1.9 (CASSANDRA-12790) + * Make the fanout size for LeveledCompactionStrategy to be configurable (CASSANDRA-11550) + * Add duration data type (CASSANDRA-11873) + * Fix timeout in ReplicationAwareTokenAllocatorTest (CASSANDRA-12784) + * Improve sum aggregate functions (CASSANDRA-12417) + * Make cassandra.yaml docs for batch_size_*_threshold_in_kb reflect changes in CASSANDRA-10876 (CASSA
[10/10] cassandra git commit: Merge branch 'cassandra-3.11' into trunk
Merge branch 'cassandra-3.11' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/dd5251c4
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/dd5251c4
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/dd5251c4

Branch: refs/heads/trunk
Commit: dd5251c46073d75c09e47f645b1d0ebc3a135411
Parents: 0c5faef a2399d4
Author: Jeff Jirsa
Authored: Sun Mar 12 21:57:37 2017 -0700
Committer: Jeff Jirsa
Committed: Sun Mar 12 21:58:36 2017 -0700

--
 CHANGES.txt | 1 +
 .../apache/cassandra/db/commitlog/CommitLogReader.java | 12
 2 files changed, 13 insertions(+)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/dd5251c4/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/dd5251c4/src/java/org/apache/cassandra/db/commitlog/CommitLogReader.java
--
diff --cc src/java/org/apache/cassandra/db/commitlog/CommitLogReader.java
index 1da0cee,d1cb8d6..9eec477
--- a/src/java/org/apache/cassandra/db/commitlog/CommitLogReader.java
+++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogReader.java
@@@ -253,6 -265,18 +253,18 @@@ public class CommitLogReade
  int serializedSize;
  try
  {
+     // We rely on reading serialized size == 0 (LEGACY_END_OF_SEGMENT_MARKER) to identify the end
+     // of a segment, which happens naturally due to the 0 padding of the empty segment on creation.
+     // However, it's possible with 2.1 era commitlogs that the last mutation ended less than 4 bytes
+     // from the end of the file, which means that we'll be unable to read an a full int and instead
+     // read an EOF here
+     if(end - reader.getFilePointer() < 4)
+     {
 -        logger.trace("Not enough bytes left for another mutation in this CommitLog segment, continuing");
++        logger.trace("Not enough bytes left for another mutation in this CommitLog section, continuing");
+         statusTracker.requestTermination();
+         return;
+     }
+ 
      // any of the reads may hit EOF
      serializedSize = reader.readInt();
      if (serializedSize == LEGACY_END_OF_SEGMENT_MARKER)
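On trunk the same guard signals the end of the section through `statusTracker.requestTermination()` followed by a plain `return`, rather than returning `false`. The sketch below shows that control-flow shape under stated assumptions: the `StatusTracker` class, its `shouldContinue()` method, and the `[int size][payload]` layout are hypothetical and only mimic the pattern visible in the diff.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a reader loop driven by a termination flag.
final class SectionReaderSketch
{
    // Stand-in for the tracker used in the diff; only the flag is modelled.
    static final class StatusTracker
    {
        private boolean terminate;
        void requestTermination() { terminate = true; }
        boolean shouldContinue() { return !terminate; }
    }

    // Reads at most one record and records its size; asks the loop to stop
    // when the section ends (zero marker) or has fewer than 4 bytes left.
    static void readOne(ByteBuffer reader, StatusTracker tracker, List<Integer> sizes)
    {
        if (reader.remaining() < 4)
        {
            tracker.requestTermination();
            return;
        }
        int size = reader.getInt();
        if (size == 0)
        {
            tracker.requestTermination();
            return;
        }
        if (size < 0 || size > reader.remaining())
            throw new IllegalStateException("corrupt size field: " + size);
        sizes.add(size);
        reader.position(reader.position() + size); // skip the payload
    }

    static List<Integer> readAll(byte[] section)
    {
        ByteBuffer reader = ByteBuffer.wrap(section);
        StatusTracker tracker = new StatusTracker();
        List<Integer> sizes = new ArrayList<>();
        while (tracker.shouldContinue())
            readOne(reader, tracker, sizes);
        return sizes;
    }
}
```

The caller's loop never inspects a return value; it only polls the tracker, which is why the guard has to set the flag before returning.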
[04/10] cassandra git commit: Commitlog replay may fail if last mutation is within 4 bytes of end of segment
Commitlog replay may fail if last mutation is within 4 bytes of end of segment

Patch by Jeff Jirsa; Reviewed by Branimir Lambov for CASSANDRA-13282

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/beb9658d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/beb9658d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/beb9658d

Branch: refs/heads/trunk
Commit: beb9658dd5e18e3a6a4e8431b6549ae4c33365a9
Parents: 5ef8a8b
Author: Jeff Jirsa
Authored: Sun Mar 12 21:54:04 2017 -0700
Committer: Jeff Jirsa
Committed: Sun Mar 12 21:54:04 2017 -0700

--
 CHANGES.txt | 1 +
 .../db/commitlog/CommitLogReplayer.java | 11 +++
 .../cassandra/db/commitlog/CommitLogTest.java | 20
 3 files changed, 28 insertions(+), 4 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/beb9658d/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 09e4039..2839291 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -9,6 +9,7 @@
  * Fix failing COPY TO STDOUT (CASSANDRA-12497)
  * Fix ColumnCounter::countAll behaviour for reverse queries (CASSANDRA-13222)
  * Exceptions encountered calling getSeeds() breaks OTC thread (CASSANDRA-13018)
+ * Commitlog replay may fail if last mutation is within 4 bytes of end of segment (CASSANDRA-13282)
 Merged from 2.1:
  * Remove unused repositories (CASSANDRA-13278)
  * Log stacktrace of uncaught exceptions (CASSANDRA-13108)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/beb9658d/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java
--
diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java b/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java
index a58aeb4..3cf4d0f 100644
--- a/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java
+++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java
@@ -439,6 +439,17 @@ public class CommitLogReplayer
 int serializedSize;
 try
 {
+    // We rely on reading serialized size == 0 (LEGACY_END_OF_SEGMENT_MARKER) to identify the end
+    // of a segment, which happens naturally due to the 0 padding of the empty segment on creation.
+    // However, with 2.1 era commitlogs it's possible that the last mutation ended less than 4 bytes
+    // from the end of the file, which means that we'll be unable to read an a full int and instead
+    // read an EOF here
+    if(end - reader.getFilePointer() < 4)
+    {
+        logger.trace("Not enough bytes left for another mutation in this CommitLog segment, continuing");
+        return false;
+    }
+
     // any of the reads may hit EOF
     serializedSize = reader.readInt();
     if (serializedSize == LEGACY_END_OF_SEGMENT_MARKER)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/beb9658d/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java
--
diff --git a/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java b/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java
index b42..9b63885 100644
--- a/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java
+++ b/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java
@@ -137,12 +137,24 @@ public class CommitLogTest
     }
 
     @Test
+    public void testRecoveryWithShortPadding() throws Exception
+    {
+        // If we have 0-3 bytes remaining, commitlog replayer
+        // should pass, because there's insufficient room
+        // left in the segment for the legacy size marker.
+        testRecovery(new byte[1], null);
+        testRecovery(new byte[2], null);
+        testRecovery(new byte[3], null);
+    }
+
+    @Test
     public void testRecoveryWithShortSize() throws Exception
     {
-        runExpecting(new WrappedRunnable() {
-            public void runMayThrow() throws Exception
-            {
-                testRecovery(new byte[2], CommitLogDescriptor.VERSION_20);
+        runExpecting(new WrappedRunnable() {
+            public void runMayThrow() throws Exception {
+                byte[] data = new byte[5];
+                data[3] = 1; // Not a legacy marker, give it a fake (short) size
+                testRecovery(data, CommitLogDescriptor.VERSION_20);
             }
         }, CommitLogReplayException.class);
     }
[05/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/44f79bf2 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/44f79bf2 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/44f79bf2 Branch: refs/heads/cassandra-3.11 Commit: 44f79bf2f7a3a05f802014492ecbec67c49c02d0 Parents: aeca1d2 beb9658 Author: Jeff Jirsa Authored: Sun Mar 12 21:54:42 2017 -0700 Committer: Jeff Jirsa Committed: Sun Mar 12 21:56:00 2017 -0700 -- CHANGES.txt | 1 + .../cassandra/db/commitlog/CommitLogReplayer.java| 11 +++ .../apache/cassandra/db/commitlog/CommitLogTest.java | 15 ++- 3 files changed, 26 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/44f79bf2/CHANGES.txt -- diff --cc CHANGES.txt index 52a794b,2839291..140c860 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,21 -1,6 +1,22 @@@ -2.2.10 +3.0.13 + * Slice.isEmpty() returns false for some empty slices (CASSANDRA-13305) + * Add formatted row output to assertEmpty in CQL Tester (CASSANDRA-13238) +Merged from 2.2: ++ * Commitlog replay may fail if last mutation is within 4 bytes of end of segment (CASSANDRA-13282) * Fix queries updating multiple time the same list (CASSANDRA-13130) * Fix GRANT/REVOKE when keyspace isn't specified (CASSANDRA-13053) + + +3.0.12 + * Prevent data loss on upgrade 2.1 - 3.0 by adding component separator to LogRecord absolute path (CASSANDRA-13294) + * Improve testing on macOS by eliminating sigar logging (CASSANDRA-13233) + * Cqlsh copy-from should error out when csv contains invalid data for collections (CASSANDRA-13071) + * Update c.yaml doc for offheap memtables (CASSANDRA-13179) + * Faster StreamingHistogram (CASSANDRA-13038) + * Legacy deserializer can create unexpected boundary range tombstones (CASSANDRA-13237) + * Remove unnecessary assertion from AntiCompactionTest (CASSANDRA-13070) + * Fix cqlsh COPY for dates before 1900 
(CASSANDRA-13185) +Merged from 2.2: * Avoid race on receiver by starting streaming sender thread after sending init message (CASSANDRA-12886) * Fix "multiple versions of ant detected..." when running ant test (CASSANDRA-13232) * Coalescing strategy sleeps too much (CASSANDRA-13090) http://git-wip-us.apache.org/repos/asf/cassandra/blob/44f79bf2/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java -- diff --cc src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java index d53f0f8,3cf4d0f..205c36a --- a/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java +++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java @@@ -483,6 -439,17 +483,17 @@@ public class CommitLogReplaye int serializedSize; try { + // We rely on reading serialized size == 0 (LEGACY_END_OF_SEGMENT_MARKER) to identify the end + // of a segment, which happens naturally due to the 0 padding of the empty segment on creation. -// However, with 2.1 era commitlogs it's possible that the last mutation ended less than 4 bytes ++// However, it's possible with 2.1 era commitlogs that the last mutation ended less than 4 bytes + // from the end of the file, which means that we'll be unable to read an a full int and instead + // read an EOF here + if(end - reader.getFilePointer() < 4) + { + logger.trace("Not enough bytes left for another mutation in this CommitLog segment, continuing"); + return false; + } + // any of the reads may hit EOF serializedSize = reader.readInt(); if (serializedSize == LEGACY_END_OF_SEGMENT_MARKER) http://git-wip-us.apache.org/repos/asf/cassandra/blob/44f79bf2/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java -- diff --cc test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java index c4ab6ab,9b63885..90dc258 --- a/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java +++ b/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java @@@ -143,11 -137,25 +143,24 @@@ public class CommitLogTes } @Test + public void 
testRecoveryWithShortPadding() throws Exception + { + // If we have 0-3 bytes remaining, commitlog replayer + // should pass, because there's insufficient room + // left in the segment for the legacy size marker. +
[01/10] cassandra git commit: Commitlog replay may fail if last mutation is within 4 bytes of end of segment
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.2 5ef8a8b40 -> beb9658dd
  refs/heads/cassandra-3.0 aeca1d2bd -> 44f79bf2f
  refs/heads/cassandra-3.11 2c111d15b -> a2399d4d3
  refs/heads/trunk 0c5faef66 -> dd5251c46

Commitlog replay may fail if last mutation is within 4 bytes of end of segment

Patch by Jeff Jirsa; Reviewed by Branimir Lambov for CASSANDRA-13282

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/beb9658d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/beb9658d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/beb9658d

Branch: refs/heads/cassandra-2.2
Commit: beb9658dd5e18e3a6a4e8431b6549ae4c33365a9
Parents: 5ef8a8b
Author: Jeff Jirsa
Authored: Sun Mar 12 21:54:04 2017 -0700
Committer: Jeff Jirsa
Committed: Sun Mar 12 21:54:04 2017 -0700

--
 CHANGES.txt | 1 +
 .../db/commitlog/CommitLogReplayer.java | 11 +++
 .../cassandra/db/commitlog/CommitLogTest.java | 20
 3 files changed, 28 insertions(+), 4 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/beb9658d/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 09e4039..2839291 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -9,6 +9,7 @@
  * Fix failing COPY TO STDOUT (CASSANDRA-12497)
  * Fix ColumnCounter::countAll behaviour for reverse queries (CASSANDRA-13222)
  * Exceptions encountered calling getSeeds() breaks OTC thread (CASSANDRA-13018)
+ * Commitlog replay may fail if last mutation is within 4 bytes of end of segment (CASSANDRA-13282)
 Merged from 2.1:
  * Remove unused repositories (CASSANDRA-13278)
  * Log stacktrace of uncaught exceptions (CASSANDRA-13108)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/beb9658d/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java
--
diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java b/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java
index a58aeb4..3cf4d0f 100644
--- a/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java
+++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java
@@ -439,6 +439,17 @@ public class CommitLogReplayer
 int serializedSize;
 try
 {
+    // We rely on reading serialized size == 0 (LEGACY_END_OF_SEGMENT_MARKER) to identify the end
+    // of a segment, which happens naturally due to the 0 padding of the empty segment on creation.
+    // However, with 2.1 era commitlogs it's possible that the last mutation ended less than 4 bytes
+    // from the end of the file, which means that we'll be unable to read an a full int and instead
+    // read an EOF here
+    if(end - reader.getFilePointer() < 4)
+    {
+        logger.trace("Not enough bytes left for another mutation in this CommitLog segment, continuing");
+        return false;
+    }
+
     // any of the reads may hit EOF
     serializedSize = reader.readInt();
     if (serializedSize == LEGACY_END_OF_SEGMENT_MARKER)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/beb9658d/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java
--
diff --git a/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java b/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java
index b42..9b63885 100644
--- a/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java
+++ b/test/unit/org/apache/cassandra/db/commitlog/CommitLogTest.java
@@ -137,12 +137,24 @@ public class CommitLogTest
     }
 
     @Test
+    public void testRecoveryWithShortPadding() throws Exception
+    {
+        // If we have 0-3 bytes remaining, commitlog replayer
+        // should pass, because there's insufficient room
+        // left in the segment for the legacy size marker.
+        testRecovery(new byte[1], null);
+        testRecovery(new byte[2], null);
+        testRecovery(new byte[3], null);
+    }
+
+    @Test
     public void testRecoveryWithShortSize() throws Exception
     {
-        runExpecting(new WrappedRunnable() {
-            public void runMayThrow() throws Exception
-            {
-                testRecovery(new byte[2], CommitLogDescriptor.VERSION_20);
+        runExpecting(new WrappedRunnable() {
+            public void runMayThrow() throws Exception {
+                byte[] data = new byte[5];
+                data[3] = 1; // Not a legacy marker, give it a fake (
[08/10] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11
Merge branch 'cassandra-3.0' into cassandra-3.11

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a2399d4d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a2399d4d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a2399d4d

Branch: refs/heads/cassandra-3.11
Commit: a2399d4d309ac6b60a150ea20af8dc6f006d51ff
Parents: 2c111d1 44f79bf
Author: Jeff Jirsa
Authored: Sun Mar 12 21:56:11 2017 -0700
Committer: Jeff Jirsa
Committed: Sun Mar 12 21:57:25 2017 -0700

--
 CHANGES.txt | 1 +
 .../cassandra/db/commitlog/CommitLogReader.java | 12
 .../apache/cassandra/db/commitlog/CommitLogTest.java | 15 ++-
 3 files changed, 27 insertions(+), 1 deletion(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2399d4d/CHANGES.txt
--
diff --cc CHANGES.txt
index 302a028,140c860..ab28dd4
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -33,140 -43,6 +33,141 @@@ Merged from 3.0
 live rows in sstabledump (CASSANDRA-13177)
 * Provide user workaround when system_schema.columns does not contain entries for a table that's in system_schema.tables (CASSANDRA-13180)
+Merged from 2.2:
++ * Commitlog replay may fail if last mutation is within 4 bytes of end of segment (CASSANDRA-13282)
+ * Fix queries updating multiple time the same list (CASSANDRA-13130)
+ * Fix GRANT/REVOKE when keyspace isn't specified (CASSANDRA-13053)
+ * Fix flaky LongLeveledCompactionStrategyTest (CASSANDRA-12202)
+ * Fix failing COPY TO STDOUT (CASSANDRA-12497)
+ * Fix ColumnCounter::countAll behaviour for reverse queries (CASSANDRA-13222)
+ * Exceptions encountered calling getSeeds() breaks OTC thread (CASSANDRA-13018)
+ * Fix negative mean latency metric (CASSANDRA-12876)
+ * Use only one file pointer when creating commitlog segments (CASSANDRA-12539)
+Merged from 2.1:
+ * Remove unused repositories (CASSANDRA-13278)
+ * Log stacktrace of uncaught exceptions (CASSANDRA-13108)
+ * Use portable stderr for java error in startup (CASSANDRA-13211)
+ * Fix Thread Leak in OutboundTcpConnection (CASSANDRA-13204)
+ * Coalescing strategy can enter infinite loop (CASSANDRA-13159)
+
+
+3.10
+ * Fix secondary index queries regression (CASSANDRA-13013)
+ * Add duration type to the protocol V5 (CASSANDRA-12850)
+ * Fix duration type validation (CASSANDRA-13143)
+ * Fix flaky GcCompactionTest (CASSANDRA-12664)
+ * Fix TestHintedHandoff.hintedhandoff_decom_test (CASSANDRA-13058)
+ * Fixed query monitoring for range queries (CASSANDRA-13050)
+ * Remove outboundBindAny configuration property (CASSANDRA-12673)
+ * Use correct bounds for all-data range when filtering (CASSANDRA-12666)
+ * Remove timing window in test case (CASSANDRA-12875)
+ * Resolve unit testing without JCE security libraries installed (CASSANDRA-12945)
+ * Fix inconsistencies in cassandra-stress load balancing policy (CASSANDRA-12919)
+ * Fix validation of non-frozen UDT cells (CASSANDRA-12916)
+ * Don't shut down socket input/output on StreamSession (CASSANDRA-12903)
+ * Fix Murmur3PartitionerTest (CASSANDRA-12858)
+ * Move cqlsh syntax rules into separate module and allow easier customization (CASSANDRA-12897)
+ * Fix CommitLogSegmentManagerTest (CASSANDRA-12283)
+ * Fix cassandra-stress truncate option (CASSANDRA-12695)
+ * Fix crossNode value when receiving messages (CASSANDRA-12791)
+ * Don't load MX4J beans twice (CASSANDRA-12869)
+ * Extend native protocol request flags, add versions to SUPPORTED, and introduce ProtocolVersion enum (CASSANDRA-12838)
+ * Set JOINING mode when running pre-join tasks (CASSANDRA-12836)
+ * remove net.mintern.primitive library due to license issue (CASSANDRA-12845)
+ * Properly format IPv6 addresses when logging JMX service URL (CASSANDRA-12454)
+ * Optimize the vnode allocation for single replica per DC (CASSANDRA-12777)
+ * Use non-token restrictions for bounds when token restrictions are overridden (CASSANDRA-12419)
+ * Fix CQLSH auto completion for PER PARTITION LIMIT (CASSANDRA-12803)
+ * Use different build directories for Eclipse and Ant (CASSANDRA-12466)
+ * Avoid potential AttributeError in cqlsh due to no table metadata (CASSANDRA-12815)
+ * Fix RandomReplicationAwareTokenAllocatorTest.testExistingCluster (CASSANDRA-12812)
+ * Upgrade commons-codec to 1.9 (CASSANDRA-12790)
+ * Make the fanout size for LeveledCompactionStrategy to be configurable (CASSANDRA-11550)
+ * Add duration data type (CASSANDRA-11873)
+ * Fix timeout in ReplicationAwareTokenAllocatorTest (CASSANDRA-12784)
+ * Improve sum aggregate functions (CASSANDRA-12417)
+ * Make cassandra.yaml docs for batch_size_*_threshold_in_kb reflect changes in CASSANDRA-108
[jira] [Updated] (CASSANDRA-13282) Commitlog replay may fail if last mutation is within 4 bytes of end of segment
[ https://issues.apache.org/jira/browse/CASSANDRA-13282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Jirsa updated CASSANDRA-13282:
---
Resolution: Fixed
Fix Version/s: (was: 3.11.x)
               (was: 4.x)
               (was: 3.0.x)
               (was: 2.2.x)
               4.0
               3.11.0
               3.0.13
               2.2.10
Status: Resolved (was: Ready to Commit)

Committed to 2.2 as [beb9658dd5e18e3a6a4e8431b6549ae4c33365a9|https://github.com/apache/cassandra/commit/beb9658dd5e18e3a6a4e8431b6549ae4c33365a9] and merged up to trunk with the slightly more verbose comment to clarify it's from 2.1 era commitlog segments.

Thanks for the quick review, [~blambov]

> Commitlog replay may fail if last mutation is within 4 bytes of end of segment
> --
>
> Key: CASSANDRA-13282
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13282
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Jeff Jirsa
> Assignee: Jeff Jirsa
> Fix For: 2.2.10, 3.0.13, 3.11.0, 4.0
>
> Attachments: whiteboard.png
>
>
> Following CASSANDRA-9749, stricter correctness checks on commitlog replay
> can incorrectly detect "corrupt segments" and stop commitlog replay (and
> potentially stop cassandra, depending on the configured policy). In
> {{CommitLogReplayer#replaySyncSection}} we try to read a 4 byte int
> {{serializedSize}}, and if it's 0 (which will happen due to zeroing when the
> segment was created), we continue on to the next segment. However, it appears
> that if a mutation is sized such that it ends with 1, 2, or 3 bytes remaining
> in the segment, we'll pass the {{isEOF}} check on the while loop but fail to
> read the {{serializedSize}} int, and fail.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
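The failure mode described in CASSANDRA-13282 above can be illustrated with a small Python sketch. This is not the Cassandra implementation: it assumes a toy segment layout of `[4-byte big-endian size][payload]` records followed by zero padding, with no checksums, and the function name `read_serialized_sizes` is hypothetical. It only shows why a guard on the remaining byte count is needed before reading the size header.

```python
import io
import struct

# Assumption for this sketch: a serialized size of 0 marks the end of a segment.
LEGACY_END_OF_SEGMENT_MARKER = 0


def read_serialized_sizes(segment: bytes) -> list:
    """Collect mutation sizes from a toy segment of [4-byte size][payload] records."""
    reader = io.BytesIO(segment)
    end = len(segment)
    sizes = []
    while reader.tell() < end:
        # Fewer than 4 bytes left: there is no room for another size header,
        # so stop here instead of treating the short read as corruption.
        if end - reader.tell() < 4:
            break
        (serialized_size,) = struct.unpack(">i", reader.read(4))
        if serialized_size == LEGACY_END_OF_SEGMENT_MARKER:
            break
        if serialized_size < 0:
            raise ValueError("negative mutation size")
        payload = reader.read(serialized_size)
        if len(payload) < serialized_size:
            raise EOFError("truncated mutation")
        sizes.append(serialized_size)
    return sizes


if __name__ == "__main__":
    one_mutation = struct.pack(">i", 3) + b"abc"
    for padding in range(4):
        print(padding, read_serialized_sizes(one_mutation + b"\x00" * padding))
```

Without the `end - reader.tell() < 4` check, the 1, 2 and 3 byte padding cases would reach `struct.unpack` with a short buffer and raise, which is the toy analogue of the replay failure reported in the ticket.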
cassandra git commit: Fix typo in documentation
Repository: cassandra
Updated Branches:
  refs/heads/trunk dd5251c46 -> 666a00089

Fix typo in documentation

Patch by Suresh C; Reviewed by Anthony Grasso for CASSANDRA-12719

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/666a0008
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/666a0008
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/666a0008

Branch: refs/heads/trunk
Commit: 666a00089cb40f8d84f2974070b18c0744acbb9b
Parents: dd5251c
Author: Suresh
Authored: Mon Dec 12 22:39:43 2016 +
Committer: Jeff Jirsa
Committed: Sun Mar 12 22:02:26 2017 -0700

--
 doc/source/cql/ddl.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/666a0008/doc/source/cql/ddl.rst
--
diff --git a/doc/source/cql/ddl.rst b/doc/source/cql/ddl.rst
index fb97e54..756b18e 100644
--- a/doc/source/cql/ddl.rst
+++ b/doc/source/cql/ddl.rst
@@ -359,7 +359,7 @@ instance, given::
 a int,
 b int,
 c int,
-PRIMARY KEY (a, c, d)
+PRIMARY KEY (a, b, c)
 );
 SELECT * FROM t;
[jira] [Assigned] (CASSANDRA-12719) typo in cql examples
[ https://issues.apache.org/jira/browse/CASSANDRA-12719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Jirsa reassigned CASSANDRA-12719:
--
Assignee: suisuihan

> typo in cql examples
> --
>
> Key: CASSANDRA-12719
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12719
> Project: Cassandra
> Issue Type: Bug
> Components: Documentation and Website
> Reporter: suisuihan
> Assignee: suisuihan
> Priority: Trivial
> Attachments: 12719-3.11.txt, 12719-3.X.txt
>
>
> Data Definition example uses wrong definition

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Updated] (CASSANDRA-12719) typo in cql examples
[ https://issues.apache.org/jira/browse/CASSANDRA-12719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Jirsa updated CASSANDRA-12719:
---
Resolution: Fixed
Reviewer: Anthony Grasso
Status: Resolved (was: Ready to Commit)

Thanks to both of you, for the patch and the confirmation that it was accurate!

Committed as [666a00089cb40f8d84f2974070b18c0744acbb9b|https://github.com/apache/cassandra/commit/666a00089cb40f8d84f2974070b18c0744acbb9b]. It won't take effect immediately, but will be corrected in the very near future once the site docs are rebuilt.

> typo in cql examples
> --
>
> Key: CASSANDRA-12719
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12719
> Project: Cassandra
> Issue Type: Bug
> Components: Documentation and Website
> Reporter: suisuihan
> Priority: Trivial
> Attachments: 12719-3.11.txt, 12719-3.X.txt
>
>
> Data Definition example uses wrong definition

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (CASSANDRA-12653) In-flight shadow round requests
[ https://issues.apache.org/jira/browse/CASSANDRA-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906881#comment-15906881 ]

Jeff Jirsa commented on CASSANDRA-12653:
--

Which of you two new committers (congrats!) feels like committing this?

> In-flight shadow round requests
> ---
>
> Key: CASSANDRA-12653
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12653
> Project: Cassandra
> Issue Type: Bug
> Components: Distributed Metadata
> Reporter: Stefan Podkowinski
> Assignee: Stefan Podkowinski
> Priority: Minor
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
>
> Bootstrapping or replacing a node in the cluster requires gathering and
> checking some host IDs or tokens by doing a gossip "shadow round" once before
> joining the cluster. This is done by sending a gossip SYN to all seeds until
> we receive a response with the cluster state, from where we can move on in
> the bootstrap process. Receiving a response marks the shadow round as done
> and calls {{Gossiper.resetEndpointStateMap}} to clean up the received state
> again.
> The issue here is that at this point there might be other in-flight requests
> and it's very likely that shadow round responses from other seeds will be
> received afterwards, while the current state of the bootstrap process doesn't
> expect this to happen (e.g. gossiper may or may not be enabled).
> One side effect will be that MigrationTasks are spawned for each shadow round
> reply except the first. Tasks might or might not execute based on whether at
> execution time {{Gossiper.resetEndpointStateMap}} had been called, which
> affects the outcome of {{FailureDetector.instance.isAlive(endpoint)}} at the
> start of the task. You'll see error log messages such as the following when
> this happens:
> {noformat}
> INFO [SharedPool-Worker-1] 2016-09-08 08:36:39,255 Gossiper.java:993 - InetAddress /xx.xx.xx.xx is now UP
> ERROR [MigrationStage:1] 2016-09-08 08:36:39,255 FailureDetector.java:223 - unknown endpoint /xx.xx.xx.xx
> {noformat}
> Although it isn't pretty, I currently don't see any serious harm from this,
> but it would be good to get a second opinion (feel free to close as "won't
> fix").
> /cc [~Stefania] [~thobbs]

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (CASSANDRA-13262) Incorrect cqlsh results when selecting same columns multiple times
[ https://issues.apache.org/jira/browse/CASSANDRA-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906894#comment-15906894 ]

Murukesh Mohanan commented on CASSANDRA-13262:
--

This is on the Python side, specifically because the results are converted to an OrderedDict ([bin/cqlsh.py#L500|https://github.com/apache/cassandra/blob/trunk/bin/cqlsh.py#L500]):

{code}
self.session.row_factory = ordered_dict_factory
{code}

Dictionaries of course don't support duplicate keys. The default row_factory is a named tuple, which also doesn't like duplicate keys, so we have changes to the key names:

{code}
Row(rack=u'rack1', timeout=5000, rack_=u'rack1')
OrderedDict([(u'rack', u'rack1'), (u'timeout', 5000)])
{code}

The simple fix would be to explicitly list the values corresponding to each column in [print_static_result()|https://github.com/apache/cassandra/blob/trunk/bin/cqlsh.py#L1115]:

{code}
formatted_values = [map(self.myformat_value, [row[c] for c in column_names], cql_types) for row in result.current_rows]
{code}

And that sort of negates the point of using an OrderedDict in the first place.

> Incorrect cqlsh results when selecting same columns multiple times
> --
>
> Key: CASSANDRA-13262
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13262
> Project: Cassandra
> Issue Type: Bug
> Reporter: Stefan Podkowinski
> Priority: Minor
> Labels: lhf
>
> Just stumbled over this on trunk:
> {quote}
> cqlsh:test1> select a, b, c from table1;
>  a | b    | c
> ---+------+-----
>  1 |    b |   2
>  2 | null | 2.2
> (2 rows)
> cqlsh:test1> select a, a, b, c from table1;
>  a | a    | b   | c
> ---+------+-----+------
>  1 |    b |   2 | null
>  2 | null | 2.2 | null
> (2 rows)
> cqlsh:test1> select a, a, a, b, c from table1;
>  a | a    | a         | b    | c
> ---+------+-----------+------+------
>  1 |    b |       2.0 | null | null
>  2 | null | 2.2004768 | null | null
> {quote}
> My guess is that this is on the Python side, but haven't really looked into it.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
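The effect described in the CASSANDRA-13262 comment above can be reproduced with the standard library alone. The sketch below is only an illustration: the helper names `values_in_dict_order` and `values_by_column` and the sample row are hypothetical, and it does not use the Cassandra driver or cqlsh code. It shows how building an `OrderedDict` from a row with repeated column names drops a column, and how looking values up by column name, as the comment suggests, restores one value per selected column.

```python
from collections import OrderedDict


def values_in_dict_order(column_names, raw_row):
    """Build an OrderedDict row and return its values; duplicate names collapse."""
    row = OrderedDict(zip(column_names, raw_row))
    return list(row.values())


def values_by_column(column_names, raw_row):
    """Look each selected column up by name, so repeated columns repeat their value."""
    row = OrderedDict(zip(column_names, raw_row))
    return [row[c] for c in column_names]


if __name__ == "__main__":
    # Hypothetical result of "SELECT a, a, b": three selected columns, two distinct names.
    names = ["a", "a", "b"]
    raw = (1, 1, "x")
    print(values_in_dict_order(names, raw))  # two values, one short of the header
    print(values_by_column(names, raw))      # three values, aligned with the header
```

With only two values for a three-column header, the remaining cells shift left and the last one is left empty, which would be consistent with the trailing `null` columns in the cqlsh output quoted in the ticket.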
[jira] [Updated] (CASSANDRA-13262) Incorrect cqlsh results when selecting same columns multiple times
[ https://issues.apache.org/jira/browse/CASSANDRA-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Murukesh Mohanan updated CASSANDRA-13262:
-
Attachment: 0001-Fix-incorrect-cqlsh-results-when-selecting-same-colu.patch

> Incorrect cqlsh results when selecting same columns multiple times
> --
>
> Key: CASSANDRA-13262
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13262
> Project: Cassandra
> Issue Type: Bug
> Reporter: Stefan Podkowinski
> Priority: Minor
> Labels: lhf
> Attachments: 0001-Fix-incorrect-cqlsh-results-when-selecting-same-colu.patch
>
>
> Just stumbled over this on trunk:
> {quote}
> cqlsh:test1> select a, b, c from table1;
>  a | b    | c
> ---+------+-----
>  1 |    b |   2
>  2 | null | 2.2
> (2 rows)
> cqlsh:test1> select a, a, b, c from table1;
>  a | a    | b   | c
> ---+------+-----+------
>  1 |    b |   2 | null
>  2 | null | 2.2 | null
> (2 rows)
> cqlsh:test1> select a, a, a, b, c from table1;
>  a | a    | a         | b    | c
> ---+------+-----------+------+------
>  1 |    b |       2.0 | null | null
>  2 | null | 2.2004768 | null | null
> {quote}
> My guess is that this is on the Python side, but haven't really looked into it.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Updated] (CASSANDRA-13262) Incorrect cqlsh results when selecting same columns multiple times
[ https://issues.apache.org/jira/browse/CASSANDRA-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Murukesh Mohanan updated CASSANDRA-13262:
-
Assignee: Murukesh Mohanan
Status: Patch Available (was: Open)

> Incorrect cqlsh results when selecting same columns multiple times
> --
>
> Key: CASSANDRA-13262
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13262
> Project: Cassandra
> Issue Type: Bug
> Reporter: Stefan Podkowinski
> Assignee: Murukesh Mohanan
> Priority: Minor
> Labels: lhf
> Attachments: 0001-Fix-incorrect-cqlsh-results-when-selecting-same-colu.patch
>
>
> Just stumbled over this on trunk:
> {quote}
> cqlsh:test1> select a, b, c from table1;
>  a | b    | c
> ---+------+-----
>  1 |    b |   2
>  2 | null | 2.2
> (2 rows)
> cqlsh:test1> select a, a, b, c from table1;
>  a | a    | b   | c
> ---+------+-----+------
>  1 |    b |   2 | null
>  2 | null | 2.2 | null
> (2 rows)
> cqlsh:test1> select a, a, a, b, c from table1;
>  a | a    | a         | b    | c
> ---+------+-----------+------+------
>  1 |    b |       2.0 | null | null
>  2 | null | 2.2004768 | null | null
> {quote}
> My guess is that this is on the Python side, but haven't really looked into it.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Updated] (CASSANDRA-13262) Incorrect cqlsh results when selecting same columns multiple times
[ https://issues.apache.org/jira/browse/CASSANDRA-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Murukesh Mohanan updated CASSANDRA-13262:
-
Attachment: 0001-Fix-incorrect-cqlsh-results-when-selecting-same-colu.patch

Sorry for the double upload; there was noise due to some whitespace differences in the previous one.

> Incorrect cqlsh results when selecting same columns multiple times
> --
>
> Key: CASSANDRA-13262
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13262
> Project: Cassandra
> Issue Type: Bug
> Reporter: Stefan Podkowinski
> Assignee: Murukesh Mohanan
> Priority: Minor
> Labels: lhf
> Attachments: 0001-Fix-incorrect-cqlsh-results-when-selecting-same-colu.patch
>
>
> Just stumbled over this on trunk:
> {quote}
> cqlsh:test1> select a, b, c from table1;
>  a | b    | c
> ---+------+-----
>  1 |    b |   2
>  2 | null | 2.2
> (2 rows)
> cqlsh:test1> select a, a, b, c from table1;
>  a | a    | b   | c
> ---+------+-----+------
>  1 |    b |   2 | null
>  2 | null | 2.2 | null
> (2 rows)
> cqlsh:test1> select a, a, a, b, c from table1;
>  a | a    | a         | b    | c
> ---+------+-----------+------+------
>  1 |    b |       2.0 | null | null
>  2 | null | 2.2004768 | null | null
> {quote}
> My guess is that this is on the Python side, but haven't really looked into it.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Updated] (CASSANDRA-13262) Incorrect cqlsh results when selecting same columns multiple times
[ https://issues.apache.org/jira/browse/CASSANDRA-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Murukesh Mohanan updated CASSANDRA-13262:
-
Attachment: (was: 0001-Fix-incorrect-cqlsh-results-when-selecting-same-colu.patch)

> Incorrect cqlsh results when selecting same columns multiple times
> --
>
> Key: CASSANDRA-13262
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13262
> Project: Cassandra
> Issue Type: Bug
> Reporter: Stefan Podkowinski
> Assignee: Murukesh Mohanan
> Priority: Minor
> Labels: lhf
> Attachments: 0001-Fix-incorrect-cqlsh-results-when-selecting-same-colu.patch
>
>
> Just stumbled over this on trunk:
> {quote}
> cqlsh:test1> select a, b, c from table1;
>  a | b    | c
> ---+------+-----
>  1 |    b |   2
>  2 | null | 2.2
> (2 rows)
> cqlsh:test1> select a, a, b, c from table1;
>  a | a    | b   | c
> ---+------+-----+------
>  1 |    b |   2 | null
>  2 | null | 2.2 | null
> (2 rows)
> cqlsh:test1> select a, a, a, b, c from table1;
>  a | a    | a         | b    | c
> ---+------+-----------+------+------
>  1 |    b |       2.0 | null | null
>  2 | null | 2.2004768 | null | null
> {quote}
> My guess is that this is on the Python side, but haven't really looked into it.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)