Hi Adam,

I prefer to go with this approach:
- *Snowflake BIGINT* → no migration, performance-friendly, but requires care around machine ID/clock handling.

Machine ID and clock handling are concerns that, in my view, are already managed in production environments. And if the system is already connected to a payment system, the IT admins are already aware of, or at least familiar with, keeping clocks synchronized across the distributed systems connected to Apache Fineract.

Regards
Victor

On Fri, Sep 5, 2025 at 2:03 a.m., Ádám Sághy <adamsa...@gmail.com> wrote:

> Hi Paul, Tom, James, Victor,
>
> Thanks again for sharing your experience!
>
> I see why UUIDv7 can be attractive: time-ordered, globally unique, and avoids the pitfalls of machine IDs and clock sync.
>
> That said, one major consideration for us is migration effort.
>
> - *UUIDv7* would require a *full schema and data migration*, touching every table, index, and foreign key. For a system like Fineract this is a massive undertaking, realistically close to a year of work before things are stable, with high risk and ongoing maintenance cost for projects that have been using Fineract for a while or that run it with a massive amount of data.
>
> - *Snowflake-style BIGINTs*, on the other hand, can be introduced without changing the existing PK data type. We can keep the schema as-is, avoid re-indexing terabytes of data, and stay compatible with all current integrations.
>
> Performance is the other angle: UUIDv7 doubles index size (128-bit vs 64-bit) and tends to fragment B-tree indexes more than sequential or roughly sequential IDs. Snowflake IDs remain compact BIGINTs, are time-ordered enough for efficient indexing, and are immediately available in memory without requiring a flush.
>
> There’s another very important angle I’d like to add to this discussion: our current *PK generation strategy*. Today Fineract relies on IDENTITY columns, which means the database only provides an ID during flush/commit. The backend cannot obtain an ID before that point. Unfortunately, the platform depends heavily on having an ID *earlier* in the lifecycle for things like:
>
> - Creating and sending business events with the new entity ID
>
> - Populating result objects with the resource ID immediately after creation
>
> - Enabling true batch write operations (currently blocked)
>
> - Avoiding unnecessary flushes purely to retrieve generated IDs
>
> This limitation has significant impact on performance, testability, and developer experience.
>
> That’s why I’ve been leaning toward application-level ID generation (Snowflake-style BIGINTs). They solve the “flush dependency” problem while retaining our current schema and data type, so *no massive migration* is required. UUIDv7 would address uniqueness and ordering, but it would mean replacing all PK columns, indexes, and FKs — realistically a year-long, high-risk migration for a project of this size.
>
> Of course, Snowflake IDs aren’t perfect — after ~50 years we’d eventually hit their limits, but realistically no system of this kind goes half a century without a redesign anyway.
>
> So to me the trade-off looks like:
>
> - *UUIDv7* → simpler conceptually, but very heavy migration + larger index/storage footprint.
>
> - *Snowflake BIGINT* → no migration, performance-friendly, but requires care around machine ID/clock handling.
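
(For illustration only: a minimal sketch of what an application-level Snowflake-style generator could look like. The class name, custom epoch, and 41/10/12 bit layout below are assumptions for the example, not an existing Fineract implementation.)

    // Sketch of a Snowflake-style 64-bit ID generator (assumed layout:
    // 41-bit millisecond timestamp, 10-bit node/machine ID, 12-bit sequence).
    public final class SnowflakeIdGenerator {

        private static final long CUSTOM_EPOCH = 1704067200000L; // 2024-01-01T00:00:00Z, assumed
        private static final long NODE_ID_BITS = 10;
        private static final long SEQUENCE_BITS = 12;
        private static final long MAX_NODE_ID = (1L << NODE_ID_BITS) - 1;   // 1023
        private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1; // 4095

        private final long nodeId;
        private long lastTimestamp = -1L;
        private long sequence = 0L;

        public SnowflakeIdGenerator(long nodeId) {
            if (nodeId < 0 || nodeId > MAX_NODE_ID) {
                throw new IllegalArgumentException("nodeId must be between 0 and " + MAX_NODE_ID);
            }
            this.nodeId = nodeId;
        }

        public synchronized long nextId() {
            long now = System.currentTimeMillis();
            if (now < lastTimestamp) {
                // This is the "clock handling" concern: refuse to generate IDs
                // while the clock is behind the last observed timestamp.
                throw new IllegalStateException("Clock moved backwards by " + (lastTimestamp - now) + " ms");
            }
            if (now == lastTimestamp) {
                sequence = (sequence + 1) & MAX_SEQUENCE;
                if (sequence == 0) {
                    // Sequence exhausted within this millisecond: wait for the next one.
                    while ((now = System.currentTimeMillis()) <= lastTimestamp) { /* spin */ }
                }
            } else {
                sequence = 0;
            }
            lastTimestamp = now;
            return ((now - CUSTOM_EPOCH) << (NODE_ID_BITS + SEQUENCE_BITS))
                    | (nodeId << SEQUENCE_BITS)
                    | sequence;
        }
    }

Because the ID is assembled in the application, it is available before any flush (for business events, result objects, batch writes), and the node-ID assignment plus the clock-moved-backwards branch are exactly the operational pieces that need care.
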
>
> *At the end the question is whether we want to fix 1 problem with a short implementation timeline, or fix 2 problems with a pretty lengthy implementation timeline and with the risk of a very complex data migration!*
>
> *As an ending note, I really love this conversation and all the ideas shared — that was exactly my intention in starting the thread. Please keep it going; we should consider all angles and hear all voices in order to select the best option for the community and the project!*
>
> Best,
> Adam
>
> On 2025. Sep 4., at 13:36, Paul <pchristi...@gmail.com> wrote:
>
> Hi Adam,
> I'm not super technical, but I have faced similar issues in the past and will try to regurgitate what we discovered as a "better way", which might be useful.
>
> First, let me share that using the Machine ID in the sequence ends up being an ugly pain in the ass and was a near-catastrophic issue for us . . . so while your solution would be an improvement, it creates new risks which are just as onerous. That was on metal servers. With cloud services, scaling and churn would seem to multiply the risk? While I'm sure there are ways around each challenge, again, each "way around" complicates the solution and, worse, increases time, cost and cognitive load.
>
> I'm going to suggest (PLEASE double check my sanity) 128-bit IDs using time-ordered UUIDv7. This provides global uniqueness, with the only negative being a slight increase in index size (16 vs 8 bytes). It should avoid several "what ifs" around clock issues (jumps and fall-backs) and the risks of using a Machine ID; you will discover a LOT of extra effort is required to make those work, and THEN, after ALL that, you will find that uniqueness is still NOT certain. I found this out the hard way. In short, we tried what sounded like your method and never ended up with a truly scalable solution. In fact, we made 3 major efforts to get it to "play nice" and failed.
>
> This simplified method provides immediate ops benefit and is a low-risk implementation. (Our lead developer suggested it was zero risk, but I don't believe in such a thing.) We were working with Postgres and a Java app, NOT Fineract, but I assume it's generally applicable.
>
> Assumptions:
> 1) You're not trying to remove or replace primary keys and will keep them for internal use
> 2) Postgres handles indexing of uuid auto-magically, as long as the column is marked unique
> 3) Use uuid as the public ID in APIs
>
> If you think this is helpful or a better option, feel free to quiz me.
>
> Regards
> Paul
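
(As a point of reference for Paul's suggestion: the JDK's java.util.UUID has no built-in v7 factory as of Java 21, so a project would either use a maintained library or assemble the bits itself. The sketch below is a minimal, assumed-name illustration of the latter, following the RFC 9562 field layout; it is not Fineract code.)

    import java.security.SecureRandom;
    import java.util.UUID;

    // Sketch of assembling a UUIDv7-style value: 48-bit Unix ms timestamp,
    // version nibble, then random bits, with the RFC variant bits forced in.
    public final class UuidV7Sketch {

        private static final SecureRandom RANDOM = new SecureRandom();

        private UuidV7Sketch() {}

        public static UUID next() {
            long now = System.currentTimeMillis();
            byte[] rnd = new byte[10];
            RANDOM.nextBytes(rnd);

            // Most significant 64 bits: 48-bit timestamp | version 0111 | 12 random bits.
            long msb = (now << 16)
                    | 0x7000L
                    | ((rnd[0] & 0x0FL) << 8)
                    | (rnd[1] & 0xFFL);

            // Least significant 64 bits: 62 random bits, top two bits set to variant "10".
            long lsb = 0L;
            for (int i = 2; i < 10; i++) {
                lsb = (lsb << 8) | (rnd[i] & 0xFFL);
            }
            lsb = (lsb & 0x3FFFFFFFFFFFFFFFL) | 0x8000000000000000L;

            return new UUID(msb, lsb);
        }
    }

On the Postgres side, Paul's assumptions map to a plain uuid column with a UNIQUE constraint (which gives it a B-tree index); the time-ordered prefix is what keeps that index from fragmenting the way random UUIDv4 values do.
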