[ https://issues.apache.org/jira/browse/KUDU-3008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexey Serbin updated KUDU-3008:
--------------------------------
    Labels: high-availability location-awareness master  (was: )

> Don't put all replicas into one location with 2 locations and odd replica factor.
> ---------------------------------------------------------------------------------
>
>                 Key: KUDU-3008
>                 URL: https://issues.apache.org/jira/browse/KUDU-3008
>             Project: Kudu
>          Issue Type: Improvement
>            Reporter: ZhangYao
>            Assignee: ZhangYao
>            Priority: Minor
>              Labels: high-availability, location-awareness, master
>
> I accidentally found that Kudu puts all replicas of a new table into one
> location when there are only 2 locations and the replication factor is odd.
> The case is as follows:
> {{location /DEFAULT/22254 has 3 tservers}}
> {{location /DEFAULT/22255 has 3 tservers}}
> {{Table created: replication factor = 3, tablets = 8.}}
> {{Before I created the table, the ksck tablet summary was:}}
>
> {code:java}
> Tablet Replica Count by Tablet Server
>                UUID               |                Host                | Replica Count | Location
> ----------------------------------+------------------------------------+---------------+----------------
>  5f5ddec364834ce59282d37388010f06 | opencomputeoffline.xxxxxx.net:7056 | 10            | /DEFAULT/22255
>  00f24c36d39a49e8b77ff43b3bcbf0c9 | opencomputeoffline.xxxxxx.net:7054 | 10            | /DEFAULT/22255
>  d0091ae869704458865b9b079ad2389e | opencomputeoffline.xxxxxx.net:7055 | 9             | /DEFAULT/22255
>  507547dd183c4474855d55f7bdd9d526 | opencomputeoffline.xxxxxx.net:7052 | 7             | /DEFAULT/22254
>  c6a2b6e99f0a43308d9e5773b2d8c729 | opencomputeoffline.xxxxxx.net:7053 | 6             | /DEFAULT/22254
>  031808c37385477fb063e50fbc614f44 | opencomputeoffline.xxxxxx.net:7050 | 6             | /DEFAULT/22254
> {code}
> {{After I created the table, the ksck tablet summary was:}}
>
> {code:java}
> Tablet Replica Count by Tablet Server
>                UUID               |                Host                | Replica Count | Location
> ----------------------------------+------------------------------------+---------------+----------------
>  507547dd183c4474855d55f7bdd9d526 | opencomputeoffline.xxxxxx.net:7052 | 15            | /DEFAULT/22254
>  c6a2b6e99f0a43308d9e5773b2d8c729 | opencomputeoffline.xxxxxx.net:7053 | 14            | /DEFAULT/22254
>  031808c37385477fb063e50fbc614f44 | opencomputeoffline.xxxxxx.net:7050 | 14            | /DEFAULT/22254
>  5f5ddec364834ce59282d37388010f06 | opencomputeoffline.xxxxxx.net:7056 | 10            | /DEFAULT/22255
>  00f24c36d39a49e8b77ff43b3bcbf0c9 | opencomputeoffline.xxxxxx.net:7054 | 10            | /DEFAULT/22255
>  d0091ae869704458865b9b079ad2389e | opencomputeoffline.xxxxxx.net:7055 | 9             | /DEFAULT/22255
> {code}
> /DEFAULT/22255 received no new replicas: all 24 new replicas (8 tablets x 3
> replicas) were placed in /DEFAULT/22254. Looking into the code, I found that
> when the number of locations is 2, PlacementPolicy::SelectLocation only
> handles an even replication factor, trying to spread the replicas evenly
> across the 2 locations. I think we should also handle an odd replication
> factor. With 2 locations, one location must inevitably hold more than half of
> the replicas, but that is still better than one location holding all of them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
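
The behavior requested in the description can be illustrated with a small sketch. This is not Kudu's PlacementPolicy::SelectLocation and does not reproduce its logic; the function name `split_between_two_locations` and the "least-loaded location gets the majority" tie-break are assumptions made only for illustration. It shows how an odd replication factor could be split across exactly 2 locations so that neither location holds every replica.

```python
def split_between_two_locations(replication_factor, load_by_location):
    """Return how many replicas of one tablet each of 2 locations receives.

    Hypothetical helper, not part of Kudu. `load_by_location` maps a
    location name to the number of replicas it currently hosts.
    """
    if replication_factor < 1:
        raise ValueError("replication factor must be positive")
    if len(load_by_location) != 2:
        raise ValueError("this sketch only covers the two-location case")

    # Order locations from least to most loaded; break ties by name so the
    # result is deterministic.
    lighter, heavier = sorted(
        load_by_location, key=lambda loc: (load_by_location[loc], loc)
    )

    # Even factor: equal halves. Odd factor: one location unavoidably gets
    # the majority, so give it to the less loaded location (an assumption).
    minority = replication_factor // 2
    majority = replication_factor - minority
    return {lighter: majority, heavier: minority}


if __name__ == "__main__":
    # Per-location totals taken from the "before" table in the report:
    # 7 + 6 + 6 = 19 and 10 + 10 + 9 = 29.
    loads = {"/DEFAULT/22254": 19, "/DEFAULT/22255": 29}
    print(split_between_two_locations(3, loads))
```

With the loads from the report and a replication factor of 3, the sketch assigns 2 replicas per tablet to /DEFAULT/22254 and 1 to /DEFAULT/22255. It deliberately ignores per-tserver capacity and the order in which tablets are placed, both of which a real placement policy has to consider.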