[ 
https://issues.apache.org/jira/browse/CASSANDRA-16961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johnny Miller updated CASSANDRA-16961:
--------------------------------------
    Description: 
When compaction encounters a large partition, it outputs a warning in the logs 
e.g.:
 (Apologies, had to redact some information)

WARN [CompactionExecutor:343] 2021-09-16 09:28:43,539 BigTableWriter.java:211 - 
Writing large partition XXX/XXXX:sourceid:{color:#de350b}*2021-09-16 
05\:00Z*{color} (1.381GiB) to sstable 
/mnt/var/lib/cassandra/data/segment/message-336c5ff04db211ebbffc2980407d44d6/md-58982-big-Data.db

i.e. 
[https://github.com/apache/cassandra/blob/cassandra-3.11.5/src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java#L211]
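To work from the logs, the warning line quoted above can be split into its parts. The following is only a minimal sketch: the regular expression is derived from this single example line, the function name is hypothetical, and it assumes that colons inside a key component are escaped with a backslash (as the "05\:00Z" in the log suggests) while unescaped colons separate the components.

```python
import re

# Pattern derived from the one warning line quoted in this report
# (Cassandra 3.11.x); other versions may word the message differently.
LARGE_PARTITION = re.compile(
    r"Writing large partition (?P<keyspace>[^/\s]+)/(?P<table>[^:\s]+):"
    r"(?P<key>.+?) \((?P<size>[^)]+)\) to sstable (?P<sstable>\S+)"
)


def parse_large_partition_warning(line):
    """Return the fields of a large-partition warning, or None if no match."""
    match = LARGE_PARTITION.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Assumption: ':' separates partition key components and a literal ':'
    # inside a component is written as '\:'.
    parts = re.split(r"(?<!\\):", fields["key"])
    fields["components"] = [part.replace("\\:", ":") for part in parts]
    return fields


example = (
    r"WARN [CompactionExecutor:343] 2021-09-16 09:28:43,539 "
    r"BigTableWriter.java:211 - Writing large partition "
    r"XXX/XXXX:sourceid:2021-09-16 05\:00Z (1.381GiB) to sstable "
    r"/mnt/var/lib/cassandra/data/segment/"
    r"message-336c5ff04db211ebbffc2980407d44d6/md-58982-big-Data.db"
)
parsed = parse_large_partition_warning(example)
```

The "components" entry then holds the textual key values as they appear in the log, which is all the log provides. It does not recover any precision that the log statement dropped.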

*Example Table/insert*

CREATE TABLE myks.mytable (
 sourceid text,
 {color:#de350b}*messagehour timestamp,*{color}
 messagetime timestamp,
 messageid text,
 PRIMARY KEY ((sourceid, messagehour), messagetime, messageid)
 ) ;

 

insert into myks.mytable (sourceid, messagehour, messagetime, messageid) values 
('sourceid', '{color:#de350b}*2021-09-16 05:00Z*{color}', '2021-09-16 
05:00:31Z', '123ABC');

If I then need to work out (from the logs) which nodes in the cluster hold the 
replica data for this partition, I get the token via CQL

e.g.:
 select distinct token(sourceid,messagehour) from myks.mytable where 
sourceid='sourceid' and messagehour='{color:#de350b}*2021-09-16 05:00Z*{color}';

system.token(sourceid, messagehour)
 -------------------------------------
 {color:#de350b}*7663675819538124697*{color}

I then run nodetool to get the endpoints for this token/ks/table

e.g.:
 nodetool getendpoints myks mytable {color:#de350b}*7663675819538124697*{color}
 172.31.10.187
 172.31.12.193
 172.31.13.91

*The list of endpoints is not correct.* I suspect the timestamp value printed in 
the warning log entry is missing information/precision, so querying with it 
returns the wrong token and hence the wrong endpoints.
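The suspected precision loss can be illustrated with a small sketch. It is not Cassandra code: the stored value (with 123 ms of sub-second data) is hypothetical, and the minute-precision format string is an assumption chosen to match the text seen in the log. What is certain is that a CQL timestamp is serialized as an 8-byte big-endian count of milliseconds since the epoch, so any value that does not round-trip through the logged text produces different key bytes, and the token is computed from those bytes.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
# Assumed rendering: minute precision, as in the "2021-09-16 05:00Z" log text.
LOG_FORMAT = "%Y-%m-%d %H:%MZ"


def epoch_millis(dt):
    """Milliseconds since the Unix epoch, using integer arithmetic."""
    return (dt - EPOCH) // timedelta(milliseconds=1)


def serialize_timestamp(dt):
    """8-byte big-endian signed millisecond count (CQL timestamp encoding)."""
    return struct.pack(">q", epoch_millis(dt))


# Hypothetical stored partition key value with sub-minute precision.
stored = datetime(2021, 9, 16, 5, 0, 0, 123000, tzinfo=timezone.utc)

# What a minute-precision log statement would print, and what a user gets
# back when pasting that text into a query.
logged_text = stored.strftime(LOG_FORMAT)
reparsed = datetime.strptime(logged_text, LOG_FORMAT).replace(tzinfo=timezone.utc)

lost_millis = epoch_millis(stored) - epoch_millis(reparsed)
same_key_bytes = serialize_timestamp(stored) == serialize_timestamp(reparsed)
```

Under these assumptions the two serialized values differ, so a token computed from the logged text would belong to a different partition key than the one compaction actually wrote.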

Possibly this warning log statement should also output the actual partition key 
token, to avoid confusion, and the string representation of the timestamp 
should be corrected.

 

  was:
When compaction encounters a large partition, it outputs a warning in the logs 
e.g.:
(Apologies, had to redact some information)


WARN [CompactionExecutor:343] 2021-09-16 09:28:43,539 BigTableWriter.java:211 - 
Writing large partition XXX/XXXX:PROsVuVbHju33:{color:#de350b}*2021-09-16 
05\:00Z*{color} (1.381GiB) to sstable 
/mnt/var/lib/cassandra/data/segment/message-336c5ff04db211ebbffc2980407d44d6/md-58982-big-Data.db


i.e 
[https://github.com/apache/cassandra/blob/cassandra-3.11.5/src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java#L211]


*Example Table/insert*


CREATE TABLE myks.mytable (
 sourceid text,
 {color:#de350b}*messagehour timestamp,*{color}
 messagetime timestamp,
 messageid text
 PRIMARY KEY ((sourceid, messagehour), messagetime, messageid)
) ;

 

insert into myks.mytable (sourceid, messagehour, messagetime, messageid) values 
('PROsVuVbHju33', '{color:#de350b}*2021-09-16 05:00Z'*{color}, '2021-09-16 
05:00:31Z', '123ABC');


If I then need to try and work out which nodes in the cluster contain the 
replica data for this partition (from the logs), I will get the token via CQL

eg:
select distinct token(sourceid,messagehour) from myks.mytable where 
sourceid='PROsVuVbHju33' and messagehour='{color:#de350b}*2021-09-16 
05:00Z*{color}';
 
 system.token(sourceid, messagehour)
-------------------------------------
 {color:#de350b}*7663675819538124697*{color}
 
I then run nodetool to get the endpoints for this token/ks/table
 
eg
nodetool getendpoints myks mytable {color:#de350b}*7663675819538124697*{color}
172.31.10.187
172.31.12.193
172.31.13.91
 
And *the list of endpoints is not correct* as the value outputted in the 
timestamp warning log entry, I suspect, is missing additional 
information/precision so obviously will give back the wrong token and hence the 
wrong endpoints.
 
Possibly this warning log statement should output the actual partition key 
token in addition to the other information to avoid confusion and the string 
representation of the timestamp be correct.

 


> Timestamp String displayed for partition compaction warnings is not correct
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16961
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16961
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Johnny Miller
>            Priority: Normal



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

