[jira] [Commented] (CASSANDRA-5632) Cross-DC bandwidth-saving broken
[ https://issues.apache.org/jira/browse/CASSANDRA-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773487#comment-13773487 ]

Jeremy Hanna commented on CASSANDRA-5632:
-----------------------------------------

I believe the issue fixed here originated in 1.2 and was present up through 1.2.5.

> Cross-DC bandwidth-saving broken
> --------------------------------
>
>                 Key: CASSANDRA-5632
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5632
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.2.0
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 1.2.6
>
>         Attachments: 5632.txt, 5632-v2.txt, cassandra-topology.properties, fix_patch_bug.log
>
>
> We group messages by destination as follows to avoid sending multiple messages to a remote datacenter:
> {code}
> // Multimap that holds onto all the messages and addresses meant for a specific datacenter
> Map> dcMessages
> {code}
> When we cleaned out the MessageProducer stuff for 2.0, this code
> {code}
> Multimap messages = dcMessages.get(dc);
> ...
> messages.put(producer.getMessage(Gossiper.instance.getVersion(destination)), destination);
> {code}
> turned into
> {code}
> Multimap messages = dcMessages.get(dc);
> ...
> messages.put(rm.createMessage(), destination);
> {code}
> Thus, we weren't actually grouping anything anymore -- each destination replica was stored under a separate Message key, unlike under the old CachingMessageProducer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
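The grouping failure described in the issue can be illustrated with a small, self-contained sketch. It is not the Cassandra code: the Message class, the method names, and the node labels are hypothetical, and a plain Map of lists stands in for the Multimap. It only shows why creating a new message object per destination yields one key per replica, while reusing a single message yields one key for the whole datacenter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupingSketch {
    // Hypothetical stand-in for a message. It does not override
    // equals/hashCode, so two instances are never equal as map keys.
    static final class Message {
        final String payload;
        Message(String payload) { this.payload = payload; }
    }

    // Broken shape: a fresh Message per destination, so every key is distinct
    // and nothing is grouped.
    static Map<Message, List<String>> groupWithFreshMessages(String payload, List<String> destinations) {
        Map<Message, List<String>> messages = new LinkedHashMap<>();
        for (String destination : destinations) {
            messages.computeIfAbsent(new Message(payload), k -> new ArrayList<>()).add(destination);
        }
        return messages;
    }

    // Grouping shape: one Message instance is reused, so all destinations
    // end up under the same key.
    static Map<Message, List<String>> groupWithSharedMessage(String payload, List<String> destinations) {
        Map<Message, List<String>> messages = new LinkedHashMap<>();
        Message shared = new Message(payload);
        for (String destination : destinations) {
            messages.computeIfAbsent(shared, k -> new ArrayList<>()).add(destination);
        }
        return messages;
    }

    public static void main(String[] args) {
        // Hypothetical replica labels for one remote datacenter.
        List<String> remoteReplicas = Arrays.asList("dc2-a", "dc2-b", "dc2-c");
        System.out.println(groupWithFreshMessages("mutation", remoteReplicas).size());  // prints 3
        System.out.println(groupWithSharedMessage("mutation", remoteReplicas).size());  // prints 1
    }
}
```

With three keys the coordinator would send three cross-DC messages; with one key it can send a single message and let a node in the remote datacenter forward it.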
[jira] [Commented] (CASSANDRA-5632) Cross-DC bandwidth-saving broken
[ https://issues.apache.org/jira/browse/CASSANDRA-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773293#comment-13773293 ]

Robert Coli commented on CASSANDRA-5632:
----------------------------------------

Do you have an "affects" version for this issue? Description says it started when a re-write for 2.0 started, but it affects 1.2.x so I'm confused? :D
[jira] [Commented] (CASSANDRA-5632) Cross-DC bandwidth-saving broken
[ https://issues.apache.org/jira/browse/CASSANDRA-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688840#comment-13688840 ]

Jonathan Ellis commented on CASSANDRA-5632:
-------------------------------------------

Thanks, Ryan. Go ahead and create a ticket for that and I'll put my next junior hire on it. :)
[jira] [Commented] (CASSANDRA-5632) Cross-DC bandwidth-saving broken
[ https://issues.apache.org/jira/browse/CASSANDRA-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688838#comment-13688838 ]

Ryan McGuire commented on CASSANDRA-5632:
-----------------------------------------

I've [written a dtest|https://github.com/riptano/cassandra-dtest/pull/13/files] that automates the testing of this issue. This test clearly shows that the coordinator was talking to more than one node in a different datacenter, and the patch resolves that issue. It also verifies that [~hayato.shimizu]'s comment about using the same forwarder is not happening now.

[~jbellis] - I noticed in your [blog post about tracing|http://www.datastax.com/dev/blog/advanced-request-tracing-in-cassandra-1-2] you said not to rely on the activity field, well, that's exactly what I'm doing here. So, +1 to the idea of making those enums so this doesn't break in the future.
[jira] [Commented] (CASSANDRA-5632) Cross-DC bandwidth-saving broken
[ https://issues.apache.org/jira/browse/CASSANDRA-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688451#comment-13688451 ]

Hayato Shimizu commented on CASSANDRA-5632:
-------------------------------------------

It seems that issue 1 in my earlier comment was fixed in 1.2.5 by Yuki (CASSANDRA-5424), where the HashSet replicas in 1.2.4's NetworkTopologyStrategy.calculateNaturalEndpoints was changed to a LinkedHashSet, so please ignore it.
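As background on why a HashSet to LinkedHashSet change can matter for replica selection: a HashSet makes no guarantee about iteration order, while a LinkedHashSet iterates in insertion order. The sketch below is only an illustration of that property, not the Cassandra code; the class, method, and node names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ReplicaOrderSketch {
    // Deduplicates the endpoints seen while walking the ring, keeping the
    // order in which they were first encountered.
    static List<String> orderedReplicas(List<String> ringWalk) {
        Set<String> replicas = new LinkedHashSet<>(ringWalk);
        return new ArrayList<>(replicas);
    }

    public static void main(String[] args) {
        // Hypothetical ring walk that meets node1 twice.
        List<String> ringWalk = Arrays.asList("node3", "node1", "node2", "node1");
        System.out.println(orderedReplicas(ringWalk));  // prints [node3, node1, node2]
    }
}
```

Any code that treats the first element of the result as special sees a stable, ring-ordered choice only when the set preserves insertion order.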
[jira] [Commented] (CASSANDRA-5632) Cross-DC bandwidth-saving broken
[ https://issues.apache.org/jira/browse/CASSANDRA-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13687511#comment-13687511 ]

Dave Brosius commented on CASSANDRA-5632:
-----------------------------------------

other than simple FF, +LGTM

-import org.apache.cassandra.tracing.Tracing;
+import org.apache.cassandra.tracing.4Tracing;
[jira] [Commented] (CASSANDRA-5632) Cross-DC bandwidth-saving broken
[ https://issues.apache.org/jira/browse/CASSANDRA-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13686986#comment-13686986 ]

Jonathan Ellis commented on CASSANDRA-5632:
-------------------------------------------

I note that .55 doesn't ever log "Sending message" to .50 either. So the message gets dropped somewhere inside .55's MessagingService. cross_node_timeout is my best guess. Next-best guess is that there's a reconnect somehow dropping the message a la CASSANDRA-5393.
[jira] [Commented] (CASSANDRA-5632) Cross-DC bandwidth-saving broken
[ https://issues.apache.org/jira/browse/CASSANDRA-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13686886#comment-13686886 ]

Jonathan Ellis commented on CASSANDRA-5632:
-------------------------------------------

You're not running with cross_node_timeout enabled, are you? Because some of these clocks are minutes apart.

{noformat}
# Enable operation timeout information exchange between nodes to accurately
# measure request timeouts, If disabled cassandra will assuming the request
# was forwarded to the replica instantly by the coordinator
#
# Warning: before enabling this property make sure to ntp is installed
# and the times are synchronized between the nodes.
cross_node_timeout: false
{noformat}
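To make the clock-skew concern concrete, here is a minimal sketch of how a receiver might decide that a message has expired. It assumes, as the configuration comment suggests, that with cross_node_timeout enabled the age of a message is measured from the sender's timestamp, and otherwise from the local arrival time. The class, method, and parameter names are hypothetical, not Cassandra's API.

```java
class CrossNodeTimeoutSketch {
    // Returns true when the receiver would treat the message as timed out.
    // All times are in milliseconds; sentAtSenderClock is read from the
    // sender's clock, the other two from the receiver's clock.
    static boolean isExpired(boolean crossNodeTimeout,
                             long sentAtSenderClock,
                             long arrivedAtReceiverClock,
                             long nowReceiverClock,
                             long timeoutMillis) {
        long start = crossNodeTimeout ? sentAtSenderClock : arrivedAtReceiverClock;
        return nowReceiverClock - start > timeoutMillis;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: the sender's clock runs two minutes behind the
        // receiver's, and the message is processed 5 ms after it arrives.
        long sentAt = 0L;
        long arrivedAt = 120_005L;
        long now = 120_010L;
        long timeout = 10_000L;
        System.out.println(isExpired(true, sentAt, arrivedAt, now, timeout));   // prints true
        System.out.println(isExpired(false, sentAt, arrivedAt, now, timeout));  // prints false
    }
}
```

Under this assumption, a skew of minutes between nodes is enough to make every message look older than the timeout when the option is enabled, which is why the configuration comment warns about synchronizing clocks first.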
[jira] [Commented] (CASSANDRA-5632) Cross-DC bandwidth-saving broken
[ https://issues.apache.org/jira/browse/CASSANDRA-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13686817#comment-13686817 ]

Jonathan Ellis commented on CASSANDRA-5632:
-------------------------------------------

bq. Secondary DC coordinator node is always the same node. This introduces a bottleneck in the secondary DC.

It's the same node for a given token range. When all token ranges are considered, it is evenly spread.

bq. RPC timeout occurs from a node that is not verifiable in the trace output.

Well. That's not a very useful error message, is it. :)
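The "same node per token range, evenly spread overall" point can be sketched with a toy model. It assumes, purely for illustration, that the forwarding node for a range is the first remote-DC replica for that range and that the first replica rotates around the ring from one range to the next. The class, method, and node names are hypothetical and this is not how Cassandra computes replicas.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ForwarderSpreadSketch {
    // Counts how many token ranges each remote-DC node forwards for, under
    // the toy assumption that range i's first replica is node (i mod n).
    static Map<String, Integer> forwarderCounts(List<String> dcNodes, int tokenRanges) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int range = 0; range < tokenRanges; range++) {
            String forwarder = dcNodes.get(range % dcNodes.size());
            counts.merge(forwarder, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical three-node secondary datacenter and six token ranges.
        List<String> dc2 = Arrays.asList("node4", "node5", "node6");
        System.out.println(forwarderCounts(dc2, 6));  // prints {node4=2, node5=2, node6=2}
    }
}
```

A single row always maps to one range and therefore one forwarder, which matches what a single traced insert shows, while a workload spread over many ranges uses every node in the secondary datacenter.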
[jira] [Commented] (CASSANDRA-5632) Cross-DC bandwidth-saving broken
[ https://issues.apache.org/jira/browse/CASSANDRA-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13686828#comment-13686828 ]

Jonathan Ellis commented on CASSANDRA-5632:
-------------------------------------------

.55 is the forwarding node in DC2. It logs that it applies the mutation and acks it:

{noformat}
Enqueuing response to /192.168.56.50 | 05:57:33,825 | 192.168.56.55 | 14785
{noformat}

But there is no "Processing response from /192.168.56.55" line logged by .50. Hmm.
[jira] [Commented] (CASSANDRA-5632) Cross-DC bandwidth-saving broken
[ https://issues.apache.org/jira/browse/CASSANDRA-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13686809#comment-13686809 ]

Hayato Shimizu commented on CASSANDRA-5632:
-------------------------------------------

The patch fixes the bandwidth-saving issue. However, it seems to introduce two regressions:

1. Secondary DC coordinator node is always the same node. This introduces a bottleneck in the secondary DC.
2. When using cqlsh with EACH_QUORUM/ALL and tracing on, a row insert gets an RPC timeout from a node that is not verifiable in the trace output.

Trace output has been attached for a 6-node cluster with a DC1:3, DC2:3 replication factor configuration. The network-topology configuration is also attached for clarity.
[jira] [Commented] (CASSANDRA-5632) Cross-DC bandwidth-saving broken
[ https://issues.apache.org/jira/browse/CASSANDRA-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13682842#comment-13682842 ]

Hayato Shimizu commented on CASSANDRA-5632:
-------------------------------------------

Tested and fixed the issue.