Thanks Anshum,

They don't seem to be numbered consistently on any particular collection
creation, but the same numbers do get reused (eventually). After about 3
or 4 tries, I got the same numbered replica on the same machine, so
something is being cleared out. The numbers are never consecutive, though:
they start around 1, stay roughly sequential with gaps until about 120 or
so, and then are all over the place. One other thing that seems to be
consistent on each new collection: the number at the end of "core_node#"
never appears as the number at the end of
"testcollection_shard1_replica_n#". Parts of the cluster state are below.

 "shard1":{
            "range":"80000000-8146ffff", "state":"active", "replicas":{
              "core_node2":{"core":"testcollection_shard1_replica_n1",
"base_url":"http://host5:8080/solr";, "node_name":"host5:8080_solr",
"state":"active","type":"NRT", "leader":"true"},
              "core_node4":{"core":"testcollection_shard1_replica_n3",
"base_url":"http://host3:8080/solr";, "node_name":"host3:8080_solr",
"state":"active","type":"NRT"}}},
"shard2":{
            "range":"81470000-828effff", "state":"active", "replicas":{
              "core_node6":{"core":"testcollection_shard2_replica_n5",
"base_url":"http://host1:8080/solr";, "node_name":"host1:8080_solr",
"state":"active","type":"NRT"},
              "core_node8":{"core":"testcollection_shard2_replica_n7",
"base_url":"http://host2:8080/solr";, "node_name":"host2:8080_solr",
"state":"active","type":"NRT", "leader":"true"}}}
...
"shard170":{
            "range":"58510000-5998ffff", "state":"active", "replicas":{

"core_node800109264":{"core":"testcollection_shard170_replica_n-2046950790
<(204)%20695-0790>", "base_url":"http://host2:8080/solr";,
"node_name":"host2:8080_solr","state":"active", "type":"NRT",
"leader":"true"},

"core_node766423250":{"core":"testcollection_shard170_replica_n-2080505220",
"base_url":"http://host4:8080/solr";,
"node_name":"host4:8080_solr","state":"active", "type":"NRT"}}}
...

Is there a way to view the counter in a deployed environment, or is it only
accessible by debugging Solr?
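
In case it helps with an answer: my assumption is that the counter is kept
somewhere under the collection's znode in ZooKeeper, so something like the
session below would show it, though I haven't confirmed that a /counter
child actually exists at that path:

  # Assumption: the per-collection counter is a child znode of the
  # collection; the exact path is a guess and may differ (or not exist).
  $ ./zkCli.sh -server host1:2181
  [zk: host1:2181(CONNECTED) 0] get /collections/testcollection/counter

If it is there, the Admin UI's Cloud > Tree view should show the same znode.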

The setup I've been testing is 200 shards with 2 replicas each, but creating
a collection with 1 shard and 200 replicas of it results in the same
situation with abnormal numbers.
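
For concreteness, the create calls were along these lines (the exact
maxShardsPerNode value here is only illustrative; it just needs to be high
enough to fit 400 cores on 5 nodes):

http://host1:8080/solr/admin/collections?action=CREATE&name=testcollection&numShards=200&replicationFactor=2&maxShardsPerNode=100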

A few other details on the setup: 5 Solr nodes (v7.1.0), 3 ZooKeeper nodes
(v3.4.11), Ubuntu 16.04; all hosts (ZK & Solr) are machines in Google's
Cloud environment.


On Thu, Jan 4, 2018 at 5:53 PM Anshum Gupta <ansh...@apple.com> wrote:

> Hi Chris,
>
> The core node numbers should be cleared out when the collection is
> deleted. Is that something you see consistently?
>
> P.S: I just tried creating a collection with 1 shard and 200 replicas and
> saw the core node numbers as expected. On deleting and recreating the
> collection, I saw that the counter was reset. Just to be clear, I tried
> this on master.
>
> -Anshum
>
>
>
> On Jan 4, 2018, at 12:16 PM, Chris Ulicny <culicny@iq.media> wrote:
>
> Hi,
>
> In 7.1, how does solr determine the numbers that are assigned to the
> replicas? I'm familiar with the earlier naming conventions from 6.3, but I
> wanted to know if there was supposed to be any connection between the
> "_n##" suffix and the number assigned to the "core_node##" name since they
> don't seem to follow the old convention. As an example, here is a node from
> clusterstatus for a testcollection with replication factor 2:
>
> "core_node91":{
>                "core":"testcollection_shard22_replica_n84",
>                "base_url":"http://host:8080/solr",
>                "node_name":"host:8080_solr",
>                "state":"active",
>                "type":"NRT",
>                "leader":"true"}
>
> Along the same lines, when creating the testcollection with 200 shards and
> a replication factor of 2, I am also getting replicas that have negative
> numbers assigned to them, which looks a lot like an int overflow issue. From the
> cluster status:
>
>          "shard157":{
>            "range":"47ae0000-48f4ffff",
>            "state":"active",
>            "replicas":{
>              "core_node1675945628":{
>                "core":"testcollection_shard157_replica_n-1174535610",
>                "base_url":"http://host1:8080/solr",
>                "node_name":"host1:8080_solr",
>                "state":"active",
>                "type":"NRT"},
>              "core_node1642259614":{
>                "core":"testcollection_shard157_replica_n-1208090040",
>                "base_url":"http://host2:8080/solr",
>                "node_name":"host2:8080_solr",
>                "state":"active",
>                "type":"NRT",
>                "leader":"true"}}}
>
> This keeps happening even when the collection is successfully deleted (no
> directories or files left on disk), the entire cluster is shut down, and the
> ZooKeeper chroot path is cleared of all content. The only thing that
> happened prior to this cycle was a single failed collection creation, which
> seemed to clean itself up properly; after that, everything was shut down and
> cleaned out of ZooKeeper as well.
>
> Is there something else that is keeping track of those values that wasn't
> cleared out? Or is this now the expected behavior for the numerical
> assignments to replicas?
>
> Thanks,
> Chris
>
>
>
