[jira] [Commented] (CASSANDRA-14389) Resolve local address binding in 4.0
[ https://issues.apache.org/jira/browse/CASSANDRA-14389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16447020#comment-16447020 ] Dinesh Joshi commented on CASSANDRA-14389: -- Yes that looks good. I was fiddling with different ways of writing the test and the constructor that I introduced in {{StreamSession}} made the factory redundant. Thank you for taking care of it. > Resolve local address binding in 4.0 > > > Key: CASSANDRA-14389 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14389 > Project: Cassandra > Issue Type: Bug >Reporter: Jason Brown >Assignee: Dinesh Joshi >Priority: Minor > Fix For: 4.x > > > CASSANDRA-8457/CASSANDRA-12229 introduced a regression against > CASSANDRA-12673. This was discovered with CASSANDRA-14362 and moved here for > resolution independent of that ticket. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14389) Resolve local address binding in 4.0
[ https://issues.apache.org/jira/browse/CASSANDRA-14389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16446874#comment-16446874 ] Jason Brown commented on CASSANDRA-14389: - I didn't understand the purpose of {{NettyStreamingMessageSenderFactory}} and the subclassing of {{NettyStreamingMessageSender}} in the {{StreamSessionTest}}. You weren't really taking any advantage of the subclass, as all {{NSMSStub}} did was override the parent's constructor, only to call the parent's constructor. Thus I've removed {{NettyStreamingMessageSenderFactory}} (ignore the comment on the commit) and cleaned up the test. I've pushed a commit to my branch, on top of your squash, and ran the tests again. If you are good with that, I'm +1 on the rest of the patch.
[jira] [Commented] (CASSANDRA-14389) Resolve local address binding in 4.0
[ https://issues.apache.org/jira/browse/CASSANDRA-14389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16446553#comment-16446553 ] Dinesh Joshi commented on CASSANDRA-14389: -- I have incorporated the changes from your commit and added tests for {{MessagingService#getPreferredRemoteAddr}}. I have also squashed all commits into one.
[jira] [Commented] (CASSANDRA-14389) Resolve local address binding in 4.0
[ https://issues.apache.org/jira/browse/CASSANDRA-14389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16446396#comment-16446396 ] Dinesh Joshi commented on CASSANDRA-14389: -- Hi [~jasobrown], thank you for reviewing. I have incorporated your changes. I also added a test for {{StreamSession}}.
[jira] [Commented] (CASSANDRA-14389) Resolve local address binding in 4.0
[ https://issues.apache.org/jira/browse/CASSANDRA-14389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16446288#comment-16446288 ] Jason Brown commented on CASSANDRA-14389: -
@Dinesh, this looks pretty good. However, I wonder if we can dump all the places where we naively plumb the 'connecting' address through. I took a pass at it here:
||14389||
|[branch|https://github.com/jasobrown/cassandra/tree/14389]|
|[utests & dtests|https://circleci.com/gh/jasobrown/workflows/cassandra/tree/14389]|
||
The only change of interest is in {{MessagingService#getPreferredRemoteAddr()}}: I check {{SystemKeyspace.getPreferredIP(to)}} if there was no {{OutboundMessagingPool}} set up in {{MessagingService#channelManagers}}. wdyt?
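For illustration only, the fallback described above for {{MessagingService#getPreferredRemoteAddr()}} could be sketched roughly as follows. This is not the code from the branch: the class {{PreferredAddressSketch}}, its two maps, and the use of plain strings for addresses are all hypothetical stand-ins for {{MessagingService#channelManagers}} and {{SystemKeyspace.getPreferredIP(to)}}, and returning the endpoint itself when nothing is recorded is an assumption.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class PreferredAddressSketch
{
    // Hypothetical stand-in for MessagingService#channelManagers:
    // endpoint -> preferred address held by its already established outbound pool.
    final Map<String, String> pools = new ConcurrentHashMap<>();

    // Hypothetical stand-in for SystemKeyspace.getPreferredIP(to):
    // endpoint -> preferred address persisted locally.
    final Map<String, String> persistedPreferred = new ConcurrentHashMap<>();

    /** Returns the address to use when opening a connection to {@code to}. */
    String getPreferredRemoteAddr(String to)
    {
        // 1. If an outbound pool already exists for the endpoint, use its address.
        String fromPool = pools.get(to);
        if (fromPool != null)
            return fromPool;

        // 2. Otherwise fall back to the persisted preferred address.
        //    Assumption: with nothing recorded, the endpoint itself is returned.
        return persistedPreferred.getOrDefault(to, to);
    }

    public static void main(String[] args)
    {
        PreferredAddressSketch sketch = new PreferredAddressSketch();
        sketch.persistedPreferred.put("10.0.0.5", "192.168.1.5");
        System.out.println(sketch.getPreferredRemoteAddr("10.0.0.5"));
        System.out.println(sketch.getPreferredRemoteAddr("10.0.0.6"));
    }
}
```

The point of the ordering is that callers no longer need to carry a separate 'connecting' address around: the lookup is keyed only by the endpoint.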
[jira] [Commented] (CASSANDRA-14389) Resolve local address binding in 4.0
[ https://issues.apache.org/jira/browse/CASSANDRA-14389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16445297#comment-16445297 ] Dinesh Joshi commented on CASSANDRA-14389: -- ||trunk|| |[branch|https://github.com/dineshjoshi/cassandra/tree/CASSANDRA-14389-trunk-fix-streaming]| |[utests & dtests|https://circleci.com/gh/dineshjoshi/workflows/cassandra/tree/CASSANDRA-14389-trunk-fix-streaming]| ||
[jira] [Commented] (CASSANDRA-14389) Resolve local address binding in 4.0
[ https://issues.apache.org/jira/browse/CASSANDRA-14389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16444266#comment-16444266 ] Dinesh Joshi commented on CASSANDRA-14389: --
I found the issue. When you leave the local side of the socket unbound, the kernel will prefer the IP address that matches the remote IP. Say node1 with IP {{127.0.0.1}} wants to open a connection to node2 with IP {{127.0.0.2}}: the socket would look like {{<127.0.0.2:61002, 127.0.0.2:7000>}} on node1. This seems to confuse the streaming code. Here's how.

Say we have three nodes, node1, node2 & node3, with IPs {{127.0.0.1}}, {{127.0.0.2}} and {{127.0.0.3}}. node1 has data and node3 is bootstrapping, so node3 requests a stream from node1. node3 is the {{peer}} in this case, and node1's code execution is described below:

* node1 receives the request ({{StreamingInboundHandler#deriveSession}}), and {{StreamResultFuture#initReceivingSide}} creates a new {{StreamResultFuture}} and calls {{attachConnection()}}. At this point it has two sets of IPs & ports for the peer. They are identified by the variable {{from}} and the expression {{channel.remoteAddress()}} (a.k.a. {{connecting}}).
* {{StreamResultFuture#attachConnection}} calls {{StreamCoordinator#getOrCreateSessionById}}, passing the {{from}} IP & {{InetAddressAndPort.getByAddressOverrideDefaults(connecting, from.port)}} (!!!)
* The key observation here is that {{from}} is the IP that the peer sent in the {{StreamMessageHeader}}, while {{connecting}} is the remote IP of the peer as seen on the channel.
* {{StreamCoordinator#getOrCreateSessionById}} subsequently calls {{StreamCoordinator#getOrCreateHostData(peer)}}. So we're indexing the {{peerSessions}} by the {{peer}} IP address. We also end up creating a {{StreamSession}} in the process.
* During {{StreamSession}} creation, we end up passing the {{peer}} and {{connecting}} IPs. We use the {{connecting}} IP to establish the outbound connection to the peer ({{NettyStreamingMessageSender}} is now connected to the {{connecting}} IP on port {{7000}}).

In our case, since we leave the local side of the socket unbound, although the {{peer}} correctly sets its IP to {{127.0.0.3}} in the {{StreamMessageHeader}}, the {{localAddress}} that the kernel chooses for it is {{127.0.0.1}}. On the inbound side, node1 therefore takes the {{peer}} to be {{127.0.0.3}} but the connecting IP address to be {{127.0.0.1}}, and it prefers the latter when trying to establish an outbound session. In effect it establishes a connection to itself, leading to the {{Unknown peer requested: 127.0.0.1:7000}} exception. Note that along the way it actually drops the ephemeral port and instead uses the port returned by {{MessagingService#portFor}}.

Streaming code seems to rely on the perceived remote IP address of the host rather than the one that is set in the message header. I am not sure if preferring the IP address set in the header is the correct approach.

> Resolve local address binding in 4.0
>
> Key: CASSANDRA-14389
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14389
> Project: Cassandra
> Issue Type: Bug
> Reporter: Jason Brown
> Assignee: Jason Brown
> Priority: Minor
> Fix For: 4.x
>
> CASSANDRA-8457/CASSANDRA-12229 introduced a regression against
> CASSANDRA-12673. This was discovered with CASSANDRA-14362 and moved here for
> resolution independent of that ticket.
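To make the difference between an unbound and a bound local side concrete, here is a minimal, self-contained sketch using plain {{java.net}} sockets. It is not Cassandra code: the class {{LocalBindDemo}} and the method {{localSideOfConnection}} are hypothetical names, and it only uses the default loopback address so that it runs without extra interface aliases. Which address the kernel picks for an unbound socket when several local addresses exist depends on the operating system.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

class LocalBindDemo
{
    /**
     * Opens a listener on the loopback address, connects to it, and returns the
     * local address the connecting socket ended up with.
     *
     * @param bindTo address to bind the local side to before connecting,
     *               or null to leave it unbound and let the kernel choose
     */
    static InetSocketAddress localSideOfConnection(InetAddress bindTo)
    {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 50, loopback);
             Socket client = new Socket())
        {
            if (bindTo != null)
                client.bind(new InetSocketAddress(bindTo, 0)); // port 0 = ephemeral port

            client.connect(new InetSocketAddress(loopback, server.getLocalPort()), 2000);

            // This is the address the remote side observes as the connecting address.
            return (InetSocketAddress) client.getLocalSocketAddress();
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args)
    {
        System.out.println("unbound local side: " + localSideOfConnection(null));
        System.out.println("bound local side:   " + localSideOfConnection(InetAddress.getLoopbackAddress()));
    }
}
```

With an explicit {{bind}}, the address the remote side sees on the channel is the one the connecting node chose, rather than whatever the kernel selected; in both cases the local port is ephemeral and differs from the listening port.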