[ https://issues.apache.org/jira/browse/CASSANDRA-19448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17874309#comment-17874309 ]
Maxwell Guo edited comment on CASSANDRA-19448 at 8/16/24 4:17 PM:
------------------------------------------------------------------

Hi [~brandon.williams], I think I found the reason for these [failures|https://app.circleci.com/pipelines/github/driftx/cassandra/1704/workflows/bd8b0614-0b2a-4231-aca7-22688e0a06b8/jobs/97629/tests]. I created a branch with a simple modification ([adding a sleep here|https://github.com/Maxwell-Guo/cassandra/blob/CASSANDRA-19448-test-repeat/test/unit/org/apache/cassandra/cql3/validation/operations/DropRecreateAndRestoreTest.java#L47]), ran the repeated test according to your CI configuration, and it was [successful|https://app.circleci.com/pipelines/github/Maxwell-Guo/cassandra/631/workflows/64d0372c-48e5-448c-99ae-87e326e5f09e].

||Branch||PR||
|trunk|[trunk|https://github.com/apache/cassandra/pull/3215/files]|
|5.0|[5.0|https://github.com/apache/cassandra/pull/3236/]|
|4.1|[4.1|https://github.com/apache/cassandra/pull/3237/files]|
|4.0|[4.0|https://github.com/apache/cassandra/pull/3238/files]|

The problem arises because I changed the granularity of the recovery point in time to [microseconds|https://github.com/Maxwell-Guo/cassandra/blob/CASSANDRA-19448/test/unit/org/apache/cassandra/cql3/validation/operations/DropRecreateAndRestoreTest.java#L81] (the original test case works at [millisecond|https://github.com/Maxwell-Guo/cassandra/blob/trunk/test/unit/org/apache/cassandra/cql3/validation/operations/DropRecreateAndRestoreTest.java#L48] level). The three actions in the test case, [two INSERTs followed by reading the current millisecond timestamp|https://github.com/Maxwell-Guo/cassandra/blob/trunk/test/unit/org/apache/cassandra/cql3/validation/operations/DropRecreateAndRestoreTest.java#L44-L48], can all happen within the same millisecond.
When that happens there is a problem: because C* timestamps are in microseconds, the timestamp of the second [INSERT|https://github.com/Maxwell-Guo/cassandra/blob/trunk/test/unit/org/apache/cassandra/cql3/validation/operations/DropRecreateAndRestoreTest.java#L45] will be 1 greater than both the timestamp of the first [INSERT|https://github.com/Maxwell-Guo/cassandra/blob/trunk/test/unit/org/apache/cassandra/cql3/validation/operations/DropRecreateAndRestoreTest.java#L44] and the [current millisecond timestamp multiplied by 1000|https://github.com/Maxwell-Guo/cassandra/blob/CASSANDRA-19448/test/unit/org/apache/cassandra/cql3/validation/operations/DropRecreateAndRestoreTest.java#L81], so the second INSERT falls after the restore point and the recovery fails. With the original millisecond recovery granularity, this problem would not exist.

[~blambov] I found that the [DropRecreateAndRestoreTest|https://github.com/Maxwell-Guo/cassandra/blob/CASSANDRA-19448/test/unit/org/apache/cassandra/cql3/validation/operations/DropRecreateAndRestoreTest.java#L52] test case was written by you. Would you mind if I add a sleep here?

EDIT: I will modify the test case by inserting values with specified timestamps.

> CommitlogArchiver only has granularity to seconds for restore_point_in_time
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19448
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19448
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Commit Log
>            Reporter: Jeremy Hanna
>            Assignee: Maxwell Guo
>            Priority: Normal
>             Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Commitlog archiver allows users to back up commitlog files for the purpose of
> doing point in time restores. The [configuration file|https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties]
> gives an example with granularity down to the second, but then asks
> whether the timestamps are microseconds or milliseconds - defaulting to
> microseconds. Because the [CommitLogArchiver uses a second based date format|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java#L52],
> if a user specifies a restore point at a finer granularity like
> milliseconds or microseconds, it will truncate everything
> after the second and restore to that second. So say you specify a
> restore_point_in_time like this:
> restore_point_in_time=2024:01:18 17:01:01.623392
> it will silently truncate everything after the 01 seconds. So effectively to
> the user, it is missing updates between 01 and 01.623392.
> This appears to be a bug in the intent. We should allow users to specify
> down to the millisecond or even microsecond level.
> If we allow them to specify down to microseconds for the restore point in time,
> then it may internally need to change from a long.
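
The silent truncation described in the issue can be reproduced outside Cassandra with a minimal sketch. It assumes a seconds-based `SimpleDateFormat` pattern of `yyyy:MM:dd HH:mm:ss` in GMT, matching the format shown in the `restore_point_in_time` example; the class and method names (`RestorePointTruncation`, `parseToMillis`) are hypothetical and are not part of CommitLogArchiver.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class; it only illustrates seconds-based parsing, not Cassandra code.
class RestorePointTruncation {
    // Assumption: a pattern with seconds as its finest field, as in the issue's example.
    static final String SECONDS_PATTERN = "yyyy:MM:dd HH:mm:ss";

    // Parses a restore point and returns epoch milliseconds.
    static long parseToMillis(String restorePoint) {
        SimpleDateFormat format = new SimpleDateFormat(SECONDS_PATTERN, Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        try {
            // DateFormat.parse(String) may stop before the end of the input,
            // so a trailing ".623392" is ignored rather than rejected.
            return format.parse(restorePoint).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable restore point: " + restorePoint, e);
        }
    }

    public static void main(String[] args) {
        long withFraction = parseToMillis("2024:01:18 17:01:01.623392");
        long wholeSecond = parseToMillis("2024:01:18 17:01:01");
        System.out.println(withFraction == wholeSecond); // prints true
        System.out.println(withFraction % 1000);         // prints 0
    }
}
```

Both inputs resolve to the same instant, so any mutation written between 17:01:01 and 17:01:01.623392 would sit after the parsed restore point, which is the missing-updates symptom the description reports.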