On Wed, Sep 4, 2024 at 12:23 PM shveta malik <shveta.ma...@gmail.com> wrote:
>
> Hello hackers,
> (Cc people involved in the earlier discussion)
>
> I would like to discuss the $Subject.
>
> While discussing Logical Replication's Conflict Detection and
> Resolution (CDR) design in [1], it came to our notice that the
> commit LSN and timestamp may not correlate perfectly, i.e., commits may
> happen with LSN1 < LSN2 but with Ts1 > Ts2. This issue may arise
> because, during the commit process, the timestamp (xactStopTimestamp)
> is captured slightly earlier than when space is reserved in the WAL.
>
>  ~~
>
>  Reproducibility of conflict-resolution problem due to the timestamp inversion
> ------------------------------------------------
> It was suggested that timestamp inversion *may* impact the time-based
> resolutions such as last_update_wins (targeted to be implemented in
> [1]) as we may end up making wrong decisions if timestamps and LSNs
> are not correctly ordered. And thus we tried some tests but failed to
> find any practical scenario where it could be a problem.
>
> Basically, the proposed conflict resolution is a row-level resolution,
> and to cause the row value to be inconsistent, we need to modify the
> same row in concurrent transactions and commit the changes
> concurrently. But this doesn't seem possible because concurrent
> updates on the same row are disallowed (e.g., the later update will be
> blocked due to the row lock).  See [2] for the details.
>
> We also considered multi-table cases, e.g., updating table A, which
> has a foreign key, and updating table B, which table A references.
> But the update on table A will block the update on table B as well,
> so we could not reproduce data divergence due to the LSN/timestamp
> mismatch issue there.
>
>  ~~
>
> Idea proposed to fix the timestamp inversion issue
> ------------------------------------------------
> There was a suggestion in [3] to acquire the timestamp while reserving
> the space (because that happens in LSN order). The clock would need to
> be monotonic (easy enough with CLOCK_MONOTONIC), but also cheap. The
> main reason it's being done outside the critical section is that
> gettimeofday() may be quite expensive. There's a concept of hybrid
> clock, combining "time" and logical counter, which might be useful
> independently of CDR.
>
> On further analysis of this idea, we found that CLOCK_MONOTONIC is
> accepted only by clock_gettime(), which has higher precision than
> gettimeofday() and thus is, in theory, equally or more expensive (we
> plan to test it and post the results). It does not look like a good
> idea to call either of these while holding the spinlock that reserves
> the WAL position. As for the suggested "hybrid clock" solution, it
> might not help here because the logical counter is only used to order
> transactions with the same timestamp. The problem here is how to get
> the timestamp along with the WAL position reservation
> (ReserveXLogInsertLocation).
>
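
For readers unfamiliar with the "hybrid clock" idea quoted above, here is
a minimal sketch of how such a clock is commonly structured, assuming a
wall-clock reading in microseconds paired with a logical counter. The
names HybridTs, hybrid_next and hybrid_cmp are hypothetical and are not
PostgreSQL APIs. It only illustrates that the counter breaks ties when
the wall-clock reading has not advanced; it does not address how to take
that reading cheaply at WAL reservation time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical hybrid timestamp: physical time plus a logical counter. */
typedef struct HybridTs
{
    int64_t  wall_us;   /* wall-clock reading, in microseconds */
    uint32_t counter;   /* tie-breaker when wall_us does not advance */
} HybridTs;

/*
 * Produce the next timestamp for a local event, given the previous
 * timestamp and a fresh wall-clock reading (now_us).
 */
static HybridTs
hybrid_next(HybridTs last, int64_t now_us)
{
    HybridTs next;

    if (now_us > last.wall_us)
    {
        /* Physical time moved forward: take it and reset the counter. */
        next.wall_us = now_us;
        next.counter = 0;
    }
    else
    {
        /* Same or older reading: keep the wall part, bump the counter. */
        assert(last.counter < UINT32_MAX);
        next.wall_us = last.wall_us;
        next.counter = last.counter + 1;
    }
    return next;
}

/* Compare two hybrid timestamps; returns <0, 0 or >0. */
static int
hybrid_cmp(HybridTs a, HybridTs b)
{
    if (a.wall_us != b.wall_us)
        return (a.wall_us < b.wall_us) ? -1 : 1;
    if (a.counter != b.counter)
        return (a.counter < b.counter) ? -1 : 1;
    return 0;
}
```

In this sketch the ordering is strict only relative to the previous
value passed in, so the caller would still need to serialize calls to
hybrid_next(), which is the same placement problem discussed above.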

Here are the tests done to compare clock_gettime() and gettimeofday()
performance.

Machine details:
Intel(R) Xeon(R) CPU E7-4890 v2 @ 2.80GHz
CPU(s): 120; 800GB RAM

Three functions were tested across three different call volumes (1
million, 100 million, and 1 billion):
1) clock_gettime() with CLOCK_REALTIME
2) clock_gettime() with CLOCK_MONOTONIC
3) gettimeofday()
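
A minimal sketch of this kind of timing loop is shown below. It is not
the attached script; the ClockKind enum and the time_calls() helper are
hypothetical names, and it assumes a POSIX system where clock_gettime()
and gettimeofday() are available.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose clock_gettime() and gettimeofday() */
#endif

#include <assert.h>
#include <stddef.h>
#include <sys/time.h>
#include <time.h>

/* Which time source to exercise in the loop (hypothetical enum). */
typedef enum ClockKind
{
    USE_REALTIME,       /* clock_gettime(CLOCK_REALTIME) */
    USE_MONOTONIC,      /* clock_gettime(CLOCK_MONOTONIC) */
    USE_GETTIMEOFDAY    /* gettimeofday() */
} ClockKind;

/*
 * Call the chosen time function ncalls times back to back and return
 * the elapsed time in seconds, measured with CLOCK_MONOTONIC.
 */
static double
time_calls(ClockKind kind, long ncalls)
{
    struct timespec start, end, ts;
    struct timeval  tv;
    long            i;

    assert(ncalls > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < ncalls; i++)
    {
        switch (kind)
        {
            case USE_REALTIME:
                clock_gettime(CLOCK_REALTIME, &ts);
                break;
            case USE_MONOTONIC:
                clock_gettime(CLOCK_MONOTONIC, &ts);
                break;
            case USE_GETTIMEOFDAY:
                gettimeofday(&tv, NULL);
                break;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    return (double) (end.tv_sec - start.tv_sec) +
           (double) (end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling, for example, time_calls(USE_MONOTONIC, 1000000) from main() and
printing the result gives one data point; the per-call cost includes the
loop and switch overhead, which is the same for all three variants.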

--> clock_gettime() with CLOCK_MONOTONIC sometimes shows slightly
better performance, but not consistently. The difference in time taken
by the three functions is minimal, with averages varying by no more
than ~2.5%. Overall, the performance of clock_gettime() with
CLOCK_MONOTONIC and gettimeofday() is essentially the same.

Below are the test results -
(each test was run twice for consistency)

1) For 1 million calls:
 1a) clock_gettime() with CLOCK_REALTIME:
    - Run 1: 0.01770 seconds, Run 2: 0.01772 seconds, Average: 0.01771 seconds.
 1b) clock_gettime() with CLOCK_MONOTONIC:
    - Run 1: 0.01753 seconds, Run 2: 0.01748 seconds, Average: 0.01750 seconds.
 1c) gettimeofday():
    - Run 1: 0.01742 seconds, Run 2: 0.01777 seconds, Average: 0.01760 seconds.

2) For 100 million calls:
 2a) clock_gettime() with CLOCK_REALTIME:
    - Run 1: 1.76649 seconds, Run 2: 1.76602 seconds, Average: 1.76625 seconds.
 2b) clock_gettime() with CLOCK_MONOTONIC:
    - Run 1: 1.72768 seconds, Run 2: 1.72988 seconds, Average: 1.72878 seconds.
 2c) gettimeofday():
    - Run 1: 1.72436 seconds, Run 2: 1.72174 seconds, Average: 1.72305 seconds.

3) For 1 billion calls:
 3a) clock_gettime() with CLOCK_REALTIME:
    - Run 1: 17.63859 seconds, Run 2: 17.65529 seconds, Average: 17.64694 seconds.
 3b) clock_gettime() with CLOCK_MONOTONIC:
    - Run 1: 17.15109 seconds, Run 2: 17.27406 seconds, Average: 17.21257 seconds.
 3c) gettimeofday():
    - Run 1: 17.21368 seconds, Run 2: 17.22983 seconds, Average: 17.22175 seconds.
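
As a cross-check of the ~2.5% figure, the spread between the three
averages at each call volume can be recomputed as sketched below. The
helper name spread_percent() is hypothetical, and taking the fastest
average as the base is an assumption about how the percentage was
derived.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Relative spread of n positive values, in percent:
 * (max - min) / min * 100. Using min as the base is an assumption.
 */
static double
spread_percent(const double *avg, size_t n)
{
    double min, max;
    size_t i;

    assert(n > 0);

    min = max = avg[0];
    for (i = 1; i < n; i++)
    {
        if (avg[i] < min)
            min = avg[i];
        if (avg[i] > max)
            max = avg[i];
    }
    return (max - min) / min * 100.0;
}
```

Fed with the averages listed above, this gives roughly 1.2% for 1
million calls and roughly 2.5% for both 100 million and 1 billion calls.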
~~~~
Attached are the scripts used for the tests.

--
Thanks,
Nisha

<<attachment: clock_gettime_test.zip>>
