Hello,

I am trying without success to set up a user-defined rebalancer, and I don't understand what the problem could be.

I set up a standalone cluster with 3 instances and added a resource with 20 partitions (in a nutshell, I followed the user-defined rebalancer tutorial). With the same code, when I used the semi-auto rebalancer, I got this ideal state:

IdealState for crawlDB:
{
  "id" : "crawlDB",
  "mapFields" : {
    "crawlDB_0" : { "standalone-1.localhost" : "MASTER" },
    "crawlDB_1" : { "standalone-1.localhost" : "MASTER" },
    "crawlDB_10" : { "standalone-1.localhost" : "MASTER" },
    "crawlDB_11" : { "standalone-1.localhost" : "MASTER" },
    "crawlDB_12" : { "standalone-1.localhost" : "MASTER" },
    "crawlDB_13" : { "standalone-1.localhost" : "MASTER" },
    "crawlDB_14" : { "standalone-1.localhost" : "MASTER" },
    "crawlDB_15" : { "standalone-2.localhost" : "MASTER" },
    "crawlDB_16" : { "standalone-2.localhost" : "MASTER" },
    "crawlDB_17" : { "standalone-3.localhost" : "MASTER" },
    "crawlDB_18" : { "standalone-2.localhost" : "MASTER" },
    "crawlDB_19" : { "standalone-2.localhost" : "MASTER" },
    "crawlDB_2" : { "standalone-3.localhost" : "MASTER" },
    "crawlDB_3" : { "standalone-2.localhost" : "MASTER" },
    "crawlDB_4" : { "standalone-3.localhost" : "MASTER" },
    "crawlDB_5" : { "standalone-3.localhost" : "MASTER" },
    "crawlDB_6" : { "standalone-3.localhost" : "MASTER" },
    "crawlDB_7" : { "standalone-2.localhost" : "MASTER" },
    "crawlDB_8" : { "standalone-3.localhost" : "MASTER" },
    "crawlDB_9" : { "standalone-2.localhost" : "MASTER" }
  },
  "listFields" : {
    "crawlDB_0" : [ "standalone-1.localhost" ],
    "crawlDB_1" : [ "standalone-1.localhost" ],
    "crawlDB_10" : [ "standalone-1.localhost" ],
    "crawlDB_11" : [ "standalone-1.localhost" ],
    "crawlDB_12" : [ "standalone-1.localhost" ],
    "crawlDB_13" : [ "standalone-1.localhost" ],
    "crawlDB_14" : [ "standalone-1.localhost" ],
    "crawlDB_15" : [ "standalone-2.localhost" ],
    "crawlDB_16" : [ "standalone-2.localhost" ],
    "crawlDB_17" : [ "standalone-3.localhost" ],
    "crawlDB_18" : [ "standalone-2.localhost" ],
    "crawlDB_19" : [ "standalone-2.localhost" ],
    "crawlDB_2" : [ "standalone-3.localhost" ],
    "crawlDB_3" : [ "standalone-2.localhost" ],
    "crawlDB_4" : [ "standalone-3.localhost" ],
    "crawlDB_5" : [ "standalone-3.localhost" ],
    "crawlDB_6" : [ "standalone-3.localhost" ],
    "crawlDB_7" : [ "standalone-2.localhost" ],
    "crawlDB_8" : [ "standalone-3.localhost" ],
    "crawlDB_9" : [ "standalone-2.localhost" ]
  },
  "simpleFields" : {
    "IDEAL_STATE_MODE" : "AUTO",
    "NUM_PARTITIONS" : "20",
    "REBALANCE_MODE" : "SEMI_AUTO",
    "REBALANCE_STRATEGY" : "DEFAULT",
    "REPLICAS" : "1",
    "STATE_MODEL_DEF_REF" : "MasterSlave",
    "STATE_MODEL_FACTORY_NAME" : "DEFAULT"
  }
}

which is correct. Now, if I use my own rebalancer (based on a simple modulo to compute the preference lists and the state map), the generated mapping remains empty. Just for testing purposes, I ran a test using the SemiAutoRebalancer class as the user-defined rebalancer, and I got the same result:

IdealState for crawlDB:
{
  "id" : "crawlDB",
  "mapFields" : {
    "crawlDB_0" : { },
    "crawlDB_1" : { },
    "crawlDB_10" : { },
    "crawlDB_11" : { },
    "crawlDB_12" : { },
    "crawlDB_13" : { },
    "crawlDB_14" : { },
    "crawlDB_15" : { },
    "crawlDB_16" : { },
    "crawlDB_17" : { },
    "crawlDB_18" : { },
    "crawlDB_19" : { },
    "crawlDB_2" : { },
    "crawlDB_3" : { },
    "crawlDB_4" : { },
    "crawlDB_5" : { },
    "crawlDB_6" : { },
    "crawlDB_7" : { },
    "crawlDB_8" : { },
    "crawlDB_9" : { }
  },
  "listFields" : {
    "crawlDB_0" : [ ],
    "crawlDB_1" : [ ],
    "crawlDB_10" : [ ],
    "crawlDB_11" : [ ],
    "crawlDB_12" : [ ],
    "crawlDB_13" : [ ],
    "crawlDB_14" : [ ],
    "crawlDB_15" : [ ],
    "crawlDB_16" : [ ],
    "crawlDB_17" : [ ],
    "crawlDB_18" : [ ],
    "crawlDB_19" : [ ],
    "crawlDB_2" : [ ],
    "crawlDB_3" : [ ],
    "crawlDB_4" : [ ],
    "crawlDB_5" : [ ],
    "crawlDB_6" : [ ],
    "crawlDB_7" : [ ],
    "crawlDB_8" : [ ],
    "crawlDB_9" : [ ]
  },
  "simpleFields" : {
    "IDEAL_STATE_MODE" : "AUTO",
    "NUM_PARTITIONS" : "20",
    "REBALANCER_CLASS_NAME" : "org.apache.helix.controller.rebalancer.SemiAutoRebalancer",
    "REBALANCE_MODE" : "USER_DEFINED",
    "REBALANCE_STRATEGY" : "DEFAULT",
    "REPLICAS" : "1",
    "STATE_MODEL_DEF_REF" : "MasterSlave",
    "STATE_MODEL_FACTORY_NAME" : "DEFAULT"
  }
}

I also tried changing the rebalance strategy, since the default one (auto) doesn't seem to compute a mapping if no live instances are present. But the same strategy is used with the semi-auto rebalancer above, and it works. Any clue?
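
To make the modulo idea concrete, a minimal standalone sketch of such an assignment follows. It deliberately uses no Helix classes, so it says nothing about how the controller invokes a rebalancer; the names ModuloAssignment and assign are made up for illustration, and the real rebalancer would have to return Helix's own types instead of plain maps.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Helix: partition i goes to instance (i % n).
final class ModuloAssignment {

    /** Returns partitionName -> {instanceName: state}, one replica per partition. */
    static Map<String, Map<String, String>> assign(
            String resource, int numPartitions, List<String> liveInstances, String state) {
        // Sort a copy so the result does not depend on the order instances are reported in.
        List<String> instances = new ArrayList<>(liveInstances);
        Collections.sort(instances);

        Map<String, Map<String, String>> mapping = new LinkedHashMap<>();
        for (int i = 0; i < numPartitions; i++) {
            Map<String, String> replicaMap = new LinkedHashMap<>();
            // With no instances available, the per-partition map stays empty.
            if (!instances.isEmpty()) {
                replicaMap.put(instances.get(i % instances.size()), state);
            }
            mapping.put(resource + "_" + i, replicaMap);
        }
        return mapping;
    }

    public static void main(String[] args) {
        List<String> instances = new ArrayList<>();
        instances.add("standalone-1.localhost");
        instances.add("standalone-2.localhost");
        instances.add("standalone-3.localhost");
        assign("crawlDB", 20, instances, "MASTER")
                .forEach((partition, replicas) -> System.out.println(partition + " -> " + replicas));
    }
}
```

Note that in this sketch an empty instance list yields an empty map for every partition, which looks like the second dump above. Whether that is what actually happens inside the controller is only a guess, but it may be worth logging the instance list the rebalancer receives.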

Thanks!



