Hi Adam,

I prefer to go for this approach.


   -

   *Snowflake BIGINT* → no migration, performance-friendly, but requires
   care around machine ID/clock handling.

The machine Id and clock handling are items, that I consider, are already
managed in production environments. And if the system is already connected
to a payment system, the IT admins are already aware or are more familiar
about having clock synchronizarion while working with distributed systems
connected to Apache Fineract.

Regards

Victor


El El vie, 5 de sep de 2025 a la(s) 2:03 a.m., Ádám Sághy <
adamsa...@gmail.com> escribió:

> Hi Paul, Tom, James, Victor,
>
> Thanks again for sharing your experience!
>
> I see why UUIDv7 can be attractive: time-ordered, globally unique, and
> avoids the pitfalls of machine IDs and clock sync.
>
> That said, one major consideration for us is migration effort.
>
>    -
>
>    *UUIDv7* would require a *full schema and data migration*, touching
>    every table, index, and foreign key. For a system like Fineract this is a
>    massive undertaking, realistically close to a year of work before things
>    are stable, with high risk and ongoing maintenance cost for projects that
>    are using Fineract for a while or using it with massive amount of data.
>    -
>
>    *Snowflake-style BIGINTs*, on the other hand, can be introduced
>    without changing the existing PK data type. We can keep the schema as-is,
>    avoid re-indexing terabytes of data, and stay compatible with all current
>    integrations.
>
> Performance is the other angle: UUIDv7 doubles index size (128-bit vs
> 64-bit) and tends to fragment B-tree indexes more than sequential or
> roughly sequential IDs. Snowflake IDs remain compact BIGINTs, time-ordered
> enough for efficient indexing, and guarantee immediate availability in
> memory without requiring a flush.
>
> There’s another very important angle I’d like to add to this discussion:
> our current *PK generation strategy*. Today Fineract relies on IDENTITY 
> columns,
> which means the database only provides an ID during flush/commit. The
> backend cannot obtain an ID before that point. Unfortunately, the platform
> depends heavily on having an ID *earlier* in the lifecycle for things
> like:
>
>    -
>
>    Creating and sending business events with the new entity ID
>    -
>
>    Populating result objects with the resource ID immediately after
>    creation
>    -
>
>    Enabling true batch write operations (currently blocked)
>    -
>
>    Avoiding unnecessary flushes purely to retrieve generated IDs
>
> This limitation has significant impact on performance, testability, and
> developer experience.
>
> That’s why I’ve been leaning toward application-level ID generation
> (Snowflake-style BIGINTs). They solve the “flush dependency” problem while
> retaining our current schema and data type, so *no massive migration* is
> required. UUIDv7 would address uniqueness and ordering, but it would mean
> replacing all PK columns, indexes, and FKs — realistically a year-long,
> high-risk migration for a project of this size.
>
> Of course, Snowflake IDs aren’t perfect — after ~50 years we’d eventually
> hit their limits, but realistically no system of this kind goes half a
> century without a redesign anyway.
>
> So to me the trade-off looks like:
>
>    -
>
>    *UUIDv7* → simpler conceptually, but very heavy migration + larger
>    index/storage footprint.
>    -
>
>    *Snowflake BIGINT* → no migration, performance-friendly, but requires
>    care around machine ID/clock handling.
>
>
> *At the end the question is whether we want to fix 1 problem with a short
> implementation timeline, or fix 2 problems with a pretty lengthy
> implementation timeline and with the risk of a very complex data migration!*
>
>
> *As an ending note, I really love this conversation and all the ideas
> shared — that was exactly my intention in starting the thread. Please keep
> it going; we should consider all angles and hear all voices in order to
> select the best option for the community and the project!*
>
> Best,
> Adam
>
> On 2025. Sep 4., at 13:36, Paul <pchristi...@gmail.com> wrote:
>
> Hi Adam,
> I'm not super technical, but have faced similar issues in the past and
> will try to regurgitate what we discovered as a "better way" which might be
> useful.
>
> First, let me share using the Machine ID in the sequence ends up being an
> ugly pain in the ass point and was a near catastrophic issue for us . . .
> so while your solution would be an improvement, it creates new risks which
> are just as onerous. That was on metal servers. With cloud services,
> scaling and churn would seem to multiply the risk? While I'm sure there are
> ways around each challenge, again, each "way around" complicates the
> solution and worse, increases time, cost and cognitive load.
>
> I going to suggest (PLEASE double check my sanity) 128-bit IDs using
> time-ordered UUIDs v7. This provides global uniqueness with the only
> negative being a slight increase in index size (16 vs 8 bit). It should
> avoid several "what ifs" on clock issues (jumps and fall backs) and risk of
> using Machin ID, you will discover a LOT of extra effort is required to
> make those work. THEN after ALL that, will find that uniqueness is NOT
> certain. I found this out the hard way. In short, we tried what sounded
> like your method and failed to enjoy a truly scalable solution. In fact, we
> made 3 major efforts to get it to "play nice" and failed.
>
> This simplified method provides immediate ops benefit, is a low risk
> implementation. (Lead Developer, suggested it was zero risk, but I don't
> beleive in such a thing)  We were working with Postgres, Java app, NOT
> Fineract, but assuming its generally applicable.
>
> Assumptions:
> 1) You're not trying to remove or replace primary keys and will keep them
> for internal use
> 2) Postgres handles indexing of uuid auto-magically; as long as the column
> is marked unique
> 3) Use uuid as the public ID in APIs
>
> If you think this is helpful or a better option, feel free to quiz me.
>
> Regards
> Paul
>
>
>

Reply via email to