I'm not aware of any that survive a node restart, though in the past there were races around starting an expansion while one node was partitioned/down (and missed the initial gossip / UP state). A heap dump could have told us a bit more conclusively, but it's hard to guess for now.
On Mon, Oct 23, 2023 at 3:22 PM Jaydeep Chovatia <chovatia.jayd...@gmail.com> wrote:

> The issue was persisting on a few nodes despite no changes to the topology. Even restarting the nodes did not help. Only after we evacuated those nodes was the issue resolved.
>
> Can you think of a possible situation under which this could happen?
>
> Jaydeep
>
> On Sat, Oct 21, 2023 at 10:25 AM Jaydeep Chovatia <chovatia.jayd...@gmail.com> wrote:
>
>> Thanks, Jeff!
>> I will keep this thread updated on our findings.
>>
>> Jaydeep
>>
>> On Sat, Oct 21, 2023 at 9:37 AM Jeff Jirsa <jji...@gmail.com> wrote:
>>
>>> That code path was added to protect against invalid gossip states.
>>>
>>> For this log line to be issued, the coordinator receiving the query must identify a set of replicas holding the data to serve the read, and one of the selected replicas must disagree that it's a replica based on its view of the token ring.
>>>
>>> This probably means that at least one node in your cluster has an invalid view of the ring - if you issue a "nodetool ring" from every host and compare the outputs, you'll probably notice one or more is wrong.
>>>
>>> It's also possible this happens for a few seconds while adding / moving / removing hosts.
>>>
>>> If you weren't changing the topology of the cluster, it's likely that bouncing the cluster fixes it.
>>>
>>> (I'm unsure of the defaults and not able to look it up, but Cassandra can either log the read or log and drop it - you probably want log-and-drop, which is the right solution: it keeps the node from accidentally returning a missing / empty result set as a valid query result, and instead forces the coordinator to read from other replicas or time out.)
>>>
>>> On Oct 20, 2023, at 10:57 PM, Jaydeep Chovatia <chovatia.jayd...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> I am using Cassandra 4.0.6 in production, and I am receiving the following error. This indicates that Cassandra nodes have a mismatch in token ownership.
>>>
>>> Has anyone seen this issue before?
>>>
>>> Received a read request from /XX.XX.XXX.XXX:YYYYY for a range that is not owned by the current replica Read(keyspace.table columns=*/[c1] rowFilter= limits=LIMIT 100 key=7BE78B90-AD66-406B-AA05-6A062F72F542:0 filter=slice(slices=ALL, reversed=false), nowInSec=1697751757).
>>>
>>> Jaydeep
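The ring comparison suggested above - collect `nodetool ring` output from every host and look for a node whose view disagrees - can be automated. Below is a minimal Python sketch of that diff; the parsing of the ring output is a loose heuristic (keep rows that end in a numeric token, pair the owning address with the token) and the hostnames are illustrative, not from this cluster:

```python
from collections import Counter

def find_divergent_views(ring_outputs):
    """Given a mapping of host -> raw `nodetool ring` output, return the
    hosts whose (address, token) view differs from the majority view.

    Parsing is deliberately loose: only lines that look like data rows
    (ending in a possibly-negative numeric token) are kept, and each view
    is normalized to a frozenset so ordering and header differences do
    not cause false positives."""
    views = {}
    for host, text in ring_outputs.items():
        rows = set()
        for line in text.splitlines():
            parts = line.split()
            # Heuristic: a data row ends with a (possibly negative) integer token.
            if parts and parts[-1].lstrip('-').isdigit():
                # The (owning address, token) pairs are what must agree cluster-wide.
                rows.add((parts[0], parts[-1]))
        views[host] = frozenset(rows)

    # Treat the most common view as the reference ring.
    majority, _ = Counter(views.values()).most_common(1)[0]
    return sorted(h for h, v in views.items() if v != majority)
```

Feed it the captured output from each node (e.g. gathered over ssh); any host it returns is one whose ring view disagrees with the rest of the cluster and is worth inspecting first.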