On Wed, 10 Sep 2025 19:46:12 GMT, Kieran Farrell <[email protected]> wrote:

>> With the recent approval of UUIDv7 
>> (https://datatracker.ietf.org/doc/rfc9562/), this PR aims to add a new 
>> static method UUID.timestampUUID() which constructs and returns a UUID in 
>> support of the new time generated UUID version. 
>> 
>> The specification requires embedding the current timestamp in milliseconds 
>> in bits 0–47, with the version number in bits 48–51. Bits 52–63 are 
>> available for sub-millisecond precision or for pseudorandom data. The 
>> variant is set in bits 64–65. The remaining bits 66–127 are free to use for 
>> more pseudorandom data or to employ a counter-based approach for increased 
>> time precision 
>> (https://www.rfc-editor.org/rfc/rfc9562.html#name-uuid-version-7).
>> 
>> The choice of implementation comes down to balancing performance against 
>> the ability to distinguish UUIDs created less than 1 ms apart. 
>> A test simulating a high-concurrency environment, with 4 threads generating 
>> 10000 UUIDv7 values in parallel, measured the collision rate of each 
>> implementation (the number of times the time-based portion of the UUID was 
>> not unique and entries could not be distinguished by time) and yielded the 
>> following results for each implementation:
>> 
>> 
>> - random-byte-only - 99.8%
>> - higher-precision - 3.5%
>> - counter-based - 0%
>> 
>> 
>> Performance tests show a decrease in performance as expected with the 
>> counter based implementation due to the introduction of synchronization:
>> 
>> - random-byte-only   143.487 ± 10.932 ns/op
>> - higher-precision   149.651 ±  8.438 ns/op
>> - counter-based      245.036 ±  2.943 ns/op
>> 
>> The best balance here might be to employ a higher-precision implementation 
>> as the large increase in time sensitivity comes at a very slight performance 
>> cost.
>
> Kieran Farrell has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   update method name

An initial remark about the APIs being proposed in this PR. Reading through the 
motivation section of RFC 9562 
https://www.rfc-editor.org/rfc/rfc9562.html#name-update-motivation, I think 
there are a few important things we should consider for the API we are 
proposing:


> Many things have changed in the time since UUIDs were originally created. 
> Modern applications have a need to create and utilize UUIDs as the primary 
> identifier for a variety of different items ... 
> In such cases, "auto-increment" schemes that are often used by databases do 
> not work well: the effort required to coordinate sequential numeric 
> identifiers across a network can easily become a burden.
> The fact that UUIDs can be used to create unique, reasonably short values in 
> distributed systems without requiring coordination makes them a good 
> alternative, but UUID versions 1-5, which were originally defined by 
> [RFC4122], lack certain other desirable characteristics...
>
> ...
> Due to the aforementioned issues, many widely distributed database 
> applications and large application vendors have sought to solve the problem 
> of creating a better time-based, sortable unique identifier for use as a 
> database key. This has led to numerous implementations over the past 10+ 
> years solving the same problem in slightly different ways ...

Then later, in sections 6.1 and 6.2 
https://www.rfc-editor.org/rfc/rfc9562.html#section-6.1, it's further stated 
that:

> UUID timestamp source, precision, and length were topics of great debate 
> while creating UUIDv7 for this specification. Choosing the right timestamp 
> for your application is very important.
> ...
> Monotonicity (each subsequent value being greater than the last) is the 
> backbone of time-based sortable UUIDs.

Given all this, I think the API we provide must try to serve these primary 
motivations. That means not allowing applications to pass arbitrary values 
when generating a UUIDv7 `UUID` instance. So I think we shouldn't introduce 
the method:


public static UUID epochMillis(long timestamp)


being proposed in this PR. The implementation of this method will have no 
control (unless we add some logic to keep track of each call) over what 
"timestamp" gets passed in subsequent calls, and thus cannot guarantee that 
the generated UUIDv7 values are monotonic. Of course, we could expect 
applications to make sure they pass the right timestamp(s) for each call, but 
that brings us back to what the RFC motivation stated: several libraries 
solve this differently. So I think having libraries/applications do the work 
of passing the right timestamp may not be a useful way to expose UUIDv7 
generation.
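
To illustrate the concern, here is a minimal sketch of a method that accepts a 
caller-supplied timestamp. It is not the code from this PR: 
`fromEpochMillis` and the class name are hypothetical, and the bit layout 
simply follows the RFC 9562 UUIDv7 description (48-bit timestamp, version, 12 
random bits, variant, 62 random bits). Nothing in such a method stops a 
caller from passing an earlier timestamp on a later call, so the second UUID 
can sort before the first.

```java
import java.security.SecureRandom;
import java.util.UUID;

class CallerSuppliedTimestampDemo {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Hypothetical stand-in for a timestamp-accepting factory; not the PR's code.
    static UUID fromEpochMillis(long timestamp) {
        long randA = RANDOM.nextInt(1 << 12);                    // 12 random bits (bits 52-63)
        long msb = ((timestamp & 0xFFFF_FFFF_FFFFL) << 16)       // 48-bit timestamp (bits 0-47)
                 | (0x7L << 12)                                  // version 7 (bits 48-51)
                 | randA;
        long lsb = (RANDOM.nextLong() & 0x3FFF_FFFF_FFFF_FFFFL)  // 62 random bits (bits 66-127)
                 | 0x8000_0000_0000_0000L;                       // variant bits 10 (bits 64-65)
        return new UUID(msb, lsb);
    }

    public static void main(String[] args) {
        UUID first = fromEpochMillis(2_000L);
        UUID second = fromEpochMillis(1_000L); // later call, earlier timestamp
        // The later-created UUID sorts before the earlier one.
        System.out.println(second.compareTo(first) < 0); // prints "true"
    }
}
```

The method itself is well-formed UUIDv7 generation; the loss of ordering comes 
entirely from the caller's choice of arguments, which the API cannot see.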

I think the other API being proposed in this PR:


public static UUID epochMillis()


is the only one we should introduce. I'm still reviewing the monotonicity 
implementation and discussion of this `epochMillis()` method in this PR and 
will reply separately on that.
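
For comparison, a method that reads the clock itself can keep the state 
needed for ordering. The following is only a sketch under stated assumptions, 
not the implementation under review in this PR: the class and method names 
are hypothetical, and it uses a simple 12-bit counter in bits 52–63 (one of 
the approaches RFC 9562 section 6.2 describes), starting the counter at zero 
for brevity. The `nextAt(long)` helper exists only so the sketch can be 
exercised deterministically.

```java
import java.security.SecureRandom;
import java.util.UUID;

class MonotonicV7Sketch {
    private static final SecureRandom RANDOM = new SecureRandom();
    private static long lastMillis = -1L; // timestamp embedded in the previous UUID
    private static int counter;           // 12-bit counter kept in bits 52-63

    // Hypothetical no-argument entry point: the caller never supplies a timestamp.
    static UUID next() {
        return nextAt(System.currentTimeMillis());
    }

    // The clock reading is a parameter only for deterministic testing of the
    // sketch; it is not meant as public API.
    static synchronized UUID nextAt(long now) {
        if (now > lastMillis) {
            lastMillis = now;   // clock advanced: restart the counter
            counter = 0;
        } else if (++counter > 0xFFF) {
            lastMillis++;       // counter exhausted: borrow the next millisecond
            counter = 0;
        }
        long msb = (lastMillis << 16) | (0x7L << 12) | counter;
        long lsb = (RANDOM.nextLong() & 0x3FFF_FFFF_FFFF_FFFFL)
                 | 0x8000_0000_0000_0000L; // variant bits 10
        return new UUID(msb, lsb);
    }

    public static void main(String[] args) {
        UUID a = nextAt(5L);
        UUID b = nextAt(5L); // same millisecond
        UUID c = nextAt(3L); // clock reading went backwards
        System.out.println(a.compareTo(b) < 0 && b.compareTo(c) < 0); // prints "true"
    }
}
```

The synchronization here is the kind of cost the counter-based numbers in the 
PR description reflect; whether that trade-off is acceptable is a separate 
question from the API shape.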

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25303#issuecomment-3324223138
