[ https://issues.apache.org/jira/browse/KUDU-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16147745#comment-16147745 ]
Michael Ho commented on KUDU-2086:
----------------------------------

Actually, I wonder if it has to do with the endianness. Network addresses are usually represented in big-endian order, so a contiguous range of IP addresses would actually differ in the most significant byte (of a 32-bit integer) when interpreted as little endian. Need to run some simple experiments to verify the behavior.

> Uneven assignment of connections to Reactor threads creates skew and limits
> transfer throughput
> -----------------------------------------------------------------------------------------------
>
>                 Key: KUDU-2086
>                 URL: https://issues.apache.org/jira/browse/KUDU-2086
>             Project: Kudu
>          Issue Type: Bug
>          Components: rpc
>    Affects Versions: 1.4.0
>            Reporter: Mostafa Mokhtar
>            Assignee: Michael Ho
>
> Uneven assignment of connections to Reactor threads causes a couple of
> reactor threads to run @100%, which limits overall system throughput.
> Increasing the number of reactor threads alleviates the problem, but some
> threads are still running much hotter than others.
> Snapshot below is from a 20 node cluster
> {code}
> ps -T -p 69387 | grep rpc | grep -v "00:00" | awk '{print $4,$0}' | sort
> 00:03:17 69387 69596 ? 00:03:17 rpc reactor-695
> 00:03:20 69387 69632 ? 00:03:20 rpc reactor-696
> 00:03:21 69387 69607 ? 00:03:21 rpc reactor-696
> 00:03:25 69387 69629 ? 00:03:25 rpc reactor-696
> 00:03:26 69387 69594 ? 00:03:26 rpc reactor-695
> 00:03:34 69387 69595 ? 00:03:34 rpc reactor-695
> 00:03:35 69387 69625 ? 00:03:35 rpc reactor-696
> 00:03:38 69387 69570 ? 00:03:38 rpc reactor-695
> 00:03:38 69387 69620 ? 00:03:38 rpc reactor-696
> 00:03:47 69387 69639 ? 00:03:47 rpc reactor-696
> 00:03:48 69387 69593 ? 00:03:48 rpc reactor-695
> 00:03:49 69387 69591 ? 00:03:49 rpc reactor-695
> 00:04:04 69387 69600 ? 00:04:04 rpc reactor-696
> 00:07:16 69387 69640 ? 00:07:16 rpc reactor-696
> 00:07:39 69387 69616 ? 00:07:39 rpc reactor-696
> 00:07:54 69387 69572 ? 00:07:54 rpc reactor-695
> 00:09:10 69387 69613 ? 00:09:10 rpc reactor-696
> 00:09:28 69387 69567 ? 00:09:28 rpc reactor-695
> 00:09:39 69387 69603 ? 00:09:39 rpc reactor-696
> 00:09:42 69387 69641 ? 00:09:42 rpc reactor-696
> 00:09:59 69387 69604 ? 00:09:59 rpc reactor-696
> 00:10:06 69387 69623 ? 00:10:06 rpc reactor-696
> 00:10:43 69387 69636 ? 00:10:43 rpc reactor-696
> 00:10:59 69387 69642 ? 00:10:59 rpc reactor-696
> 00:11:28 69387 69585 ? 00:11:28 rpc reactor-695
> 00:12:43 69387 69598 ? 00:12:43 rpc reactor-695
> 00:15:42 69387 69578 ? 00:15:42 rpc reactor-695
> 00:16:10 69387 69614 ? 00:16:10 rpc reactor-696
> 00:17:43 69387 69575 ? 00:17:43 rpc reactor-695
> {code}
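
To make the endianness point in the comment above concrete, here is a minimal sketch of the byte-order effect only. It does not reproduce how Kudu assigns connections to reactor threads; the address range and the helper name `as_uint32` are hypothetical and chosen for illustration. It shows that when the four network-order bytes of contiguous IPv4 addresses are read as a little-endian 32-bit integer, the addresses differ only in the most significant byte, so the low-order bits are identical.

```python
import socket
import struct


def as_uint32(ip: str, byte_order: str) -> int:
    """Read the 4 network-order bytes of an IPv4 address as a 32-bit integer.

    byte_order is a struct prefix: '>' for big endian, '<' for little endian.
    """
    packed = socket.inet_aton(ip)  # 4 bytes, always in network (big-endian) order
    return struct.unpack(byte_order + "I", packed)[0]


# Hypothetical contiguous range, not taken from the cluster in this issue.
ips = ["10.0.0.%d" % i for i in range(1, 5)]

big_values = [as_uint32(ip, ">") for ip in ips]
little_values = [as_uint32(ip, "<") for ip in ips]

for ip, big, little in zip(ips, big_values, little_values):
    print("%-10s big=0x%08x little=0x%08x" % (ip, big, little))

# Read as big endian, the hosts differ in the lowest byte.
big_low_bytes = {v & 0xFF for v in big_values}

# Read as little endian, the low 24 bits are the same for every host;
# only the most significant byte changes.
little_low_bits = {v & 0x00FFFFFF for v in little_values}

print("distinct low bytes (big endian):", len(big_low_bytes))
print("distinct low 24 bits (little endian):", len(little_low_bits))
```

If the integer were reduced with anything that depends mostly on low-order bits (an assumption, not something confirmed in this issue), the little-endian reading would map all of these hosts to the same value.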