[ https://issues.apache.org/jira/browse/CASSANDRA-19448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17832876#comment-17832876 ]
Tiago L. Alves commented on CASSANDRA-19448:
--------------------------------------------

[~maxwellguo] Thanks for providing a more general patch.

I understand that C* timestamps allow microsecond granularity, and aligning the granularity of all those timestamps sounds good. What I wonder is: in which scenarios would a microsecond-level PIT restore be useful? According to [https://stackoverflow.com/questions/97853/whats-the-best-way-to-synchronize-times-to-millisecond-accuracy-and-precision-b] it's possible to synchronize time to tenths of milliseconds. Wouldn't this mean that millisecond-level PIT restore would suffice? On the other hand, Amazon claims that microsecond-accurate clocks are possible, according to [https://aws.amazon.com/blogs/compute/its-about-time-microsecond-accurate-clocks-on-amazon-ec2-instances/], which opens the door for microsecond-level PIT restore. I would be happy to hear your thoughts on this.

Regarding the code changes, couldn't we automatically detect the granularity of the PIT restore from the value a user specifies? We could make our parsing more lenient by just using `DateTimeFormatter.ofPattern("yyyy:MM:dd HH:mm:ss[.[SSSSSS][SSS]]")`, which accepts seconds, milliseconds, and microseconds. Also, could we use the `Instant` datatype instead of `long` in `CommitLogArchiver` and postpone conversions to `CommitLogRestorer`? Wdyt?
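To make the lenient-parsing suggestion concrete, here is a minimal sketch of how the proposed pattern could be used. It is not the patch under review: the class name `PitTimestampSketch` and both helper methods are hypothetical, and interpreting the value as UTC is an assumption. The nested optional sections try six fractional digits first and fall back to three, so the same formatter accepts second, millisecond, and microsecond input.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

// Hypothetical helper for illustration only; not part of Cassandra.
final class PitTimestampSketch {
    // Pattern quoted in the comment: the fraction is optional and may be
    // either 6 digits (microseconds) or 3 digits (milliseconds).
    private static final DateTimeFormatter LENIENT =
        DateTimeFormatter.ofPattern("yyyy:MM:dd HH:mm:ss[.[SSSSSS][SSS]]");

    // Parses a restore point into an Instant.
    // Assumption: the configured value is interpreted as UTC.
    static Instant parseRestorePoint(String text) {
        return LocalDateTime.parse(text, LENIENT).toInstant(ZoneOffset.UTC);
    }

    // Converts to microseconds since the epoch, done as late as possible so
    // that callers can carry an Instant instead of a unit-ambiguous long.
    static long toEpochMicros(Instant instant) {
        return ChronoUnit.MICROS.between(Instant.EPOCH, instant);
    }

    public static void main(String[] args) {
        String[] inputs = {
            "2024:01:18 17:01:01",
            "2024:01:18 17:01:01.623",
            "2024:01:18 17:01:01.623392"
        };
        for (String input : inputs) {
            Instant instant = parseRestorePoint(input);
            System.out.println(input + " -> " + instant + " (" + toEpochMicros(instant) + " us)");
        }
    }
}
```

One thing to watch with this pattern: fraction lengths other than three or six digits (for example `.6`) are rejected with a `DateTimeParseException`, which is arguably preferable to silent truncation.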
> CommitlogArchiver only has granularity to seconds for restore_point_in_time
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19448
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19448
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Commit Log
>            Reporter: Jeremy Hanna
>            Assignee: Maxwell Guo
>            Priority: Normal
>             Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Commitlog archiver allows users to back up commitlog files for the purpose of
> doing point-in-time restores. The [configuration
> file|https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties]
> gives an example with granularity down to the second, but then asks
> whether the timestamps are microseconds or milliseconds, defaulting to
> microseconds. Because the [CommitLogArchiver uses a second-based date
> format|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java#L52],
> if a user specifies a restore point at a finer granularity, such as
> milliseconds or microseconds, it will truncate everything
> after the second and restore to that second. So say you specify a
> restore_point_in_time like this:
> restore_point_in_time=2024:01:18 17:01:01.623392
> It will silently truncate everything after the 01 seconds. So effectively, to
> the user, it is missing updates between 01 and 01.623392.
> This appears to be a bug in the intent. We should allow users to specify
> down to the millisecond or even microsecond level. If we allow them to
> specify down to microseconds for the restore point in time, then it may
> internally need to change from a long.
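The silent truncation described in the issue can be reproduced with a small sketch. This is not the Cassandra source: the class and method names are hypothetical, and the GMT time zone is an assumption. It relies on the documented behavior that `DateFormat.parse(String)` may stop before the end of the input, so a second-based pattern simply ignores the fractional part.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

// Hypothetical reproduction for illustration only; not the Cassandra code.
final class SecondGranularityTruncationSketch {
    // Parses with a second-based pattern like the one the issue describes.
    // Assumption: the value is interpreted in GMT.
    static long parseEpochMillis(String text) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy:MM:dd HH:mm:ss");
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        try {
            // parse(String) reads from the start and does not require the
            // whole string to be consumed, so ".623392" is dropped silently.
            return format.parse(text).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable restore point: " + text, e);
        }
    }

    public static void main(String[] args) {
        long withFraction = parseEpochMillis("2024:01:18 17:01:01.623392");
        long wholeSecond = parseEpochMillis("2024:01:18 17:01:01");
        // Both inputs resolve to the same whole second.
        System.out.println(withFraction == wholeSecond);
    }
}
```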