Hello,

I'm going to dig up this thread. I've been playing with some ideas for
changes around the Loan COB job and its Business Steps, and, as some might
expect, it's not that easy.

I've been wondering whether my testing methodology is correct. I'm using
the DB shared by Adam Saghy in FINERACT-2234, with the Fineract stack
deployed on local Docker and some artificial delay added to the
communication between Fineract and its DB, and I'm looking at the built-in
logs. Some of the ideas I was trying out (e.g., delaying the loading of
Loan's reference collections) either have no noticeable impact on times or
are slower.

So I figured I'd measure the changes implemented as part of FINERACT-2272,
FINERACT-2276, FINERACT-2277 and FINERACT-2278 (child tickets of
FINERACT-2234) to test the test methodology. The combined result of those
four tickets is an improvement of maybe 10% at most (from ~2m30s to ~2m20s
for 300+ days past the DB from FINERACT-2234).

*Does* this measured performance improvement make sense?

Also, is Apache Issues the main place to track progress/tasks? A lot of the
tickets under FINERACT-2234 are not updated, even though the commits are
already in the main branch.

Best regards,
PiotrW

On Tue, Apr 8, 2025 at 10:58 AM Jakub Sławiński <[email protected]> wrote:

> Thank you Piotr for the update.
>
> Adam, thank you for summing up the possible improvements!
>
> If you still would like to add something to that list, please do so.
>
>
> Regards,
> Jakub.
>
>
> On Tue, Apr 8, 2025 at 10:55 AM Piotr Wargulak
> <[email protected]> wrote:
>
>> Hello,
>>
>> Just an FYI - I've started work/discussion under FINERACT-2234, and keep
>> the JIRA updated with progress.
>>
>> Best regards
>> PiotrW
>>
>> On Mon, Apr 7, 2025 at 3:34 PM Ádám Sághy <[email protected]> wrote:
>>
>>> Hi Jakub,
>>>
>>> During recent discussions, several areas for improvement in Fineract
>>> were mentioned. I’d like to highlight two recommendations in particular
>>> that I believe are excellent candidates for your performance optimization
>>> initiative:
>>> ------------------------------
>>> *1. Job Performance Optimization (Originally suggested by Arnold)*
>>>
>>> “One of the things Fineract clearly struggles with is the performance of
>>> jobs. These are mostly single-threaded implementations that begin to suffer
>>> even under moderate data volumes.
>>> Many could be rewritten or enhanced to run as Spring Batch partitioned
>>> and chunked jobs.
>>> While Fineract does include one scaled job (Loan COB), the rest are
>>> implemented as Spring Batch tasklets, which are either single-threaded or
>>> only parallelized within the tasklet itself. Neither approach is
>>> well-suited to handling large-scale datasets.”
>>>
>>> Fineract’s job system plays a critical role in core functions like
>>> interest calculation, accrual recognition, event dispatching, report
>>> generation, and dividend payouts. As Arnold noted, the current
>>> implementations are suboptimal in terms of performance.
>>>
>>> Redesigning and rewriting these jobs with scalability in mind would be a
>>> highly valuable contribution to the project - one with clear and measurable
>>> impact.
>>> ------------------------------
>>> *2. JPA Usage and Entity Fetching Patterns*
>>>
>>> This is another area with significant room for improvement. Most
>>> database interactions in Fineract go through JPA. For instance, submitting
>>> a new loan involves creating a Loan entity, setting its fields, and
>>> persisting it via the entity manager.
>>>
>>> When performing operations on existing loans, Fineract often loads the
>>> Loan entity along with many associated entities - far more than
>>> typically necessary.
>>>
>>> Example: Making a Loan Repayment
>>>
>>> When fetching a loan, the following associated data may also be loaded:
>>>
>>> - *Client info* → Office, Image, Staff
>>> - *Group info* → Office, Staff, Members, Group level
>>> - Group Loan Individual Monitoring Account
>>> - Fund info
>>> - Loan Officer info → Staff
>>> - Interest Recalculation, Top-Up Details
>>>
>>> Despite many of these associations being marked as LAZY, the method
>>> LoanAssemblerImpl#assembleFrom(Long) contains logic that explicitly
>>> fetches extensive related data, including:
>>>
>>> - Loan charges → charge details, tax group, payment type, etc.
>>> - Tranche charges
>>> - Repayment installments
>>> - Transactions → office, charge mappings, etc.
>>> - Disbursement details
>>> - Term variations
>>> - Collaterals and related management
>>> - Loan officer assignment history
>>>
>>> As you can see, a large amount of data is fetched - much of which is *not
>>> necessary* for a simple repayment operation. For example, top-up
>>> details, disbursement info, or officer assignment history are likely
>>> irrelevant in this context.
>>>
>>> This is just one use case. I strongly believe that by carefully
>>> reviewing each operation and selectively loading only the necessary data,
>>> we can significantly improve performance and reduce infrastructure overhead.
>>>
>>> If tackling every case individually feels too complex, a good starting
>>> point could be:
>>>
>>> - Removing some of the unnecessary associations from the Loan entity
>>> - Minimizing eager loading
>>> - Fetching related data only when explicitly needed
>>>
>>> *Example:*
>>> When creating a new loan transaction, there's no need to fetch *all* loan
>>> transactions - exceptions may apply.
>>> ------------------------------
>>> *3. Additional Note: Primary Key Generation Strategy*
>>>
>>> One issue that hasn’t been discussed yet is the sub-optimal strategy
>>> used for primary key generation.
>>>
>>> Fineract currently supports three database engines:
>>>
>>> - MySQL (original)
>>> - MariaDB (added later)
>>> - PostgreSQL (added more recently)
>>>
>>> Because MySQL and MariaDB do not support sequences, Fineract relies on
>>> identity columns for PK generation. This means:
>>>
>>> - You must flush the persistence context to retrieve the generated ID.
>>> - As a result, multiple flushes occur during transactions - especially
>>> when IDs are needed immediately (e.g., for external events or JDBC
>>> queries).
>>>
>>> Currently, PK fields are Long and auto-generated.
>>> My recommendation:
>>>
>>> - Switch from Long to String or UUID
>>> - Stop relying on database-generated IDs
>>> - Generate IDs within Fineract itself to avoid unnecessary flushes and
>>> improve consistency across database engines
>>>
>>> Whether we use UUIDs, NanoIDs, or other formats (e.g., VARCHAR(22)), is
>>> a topic for broader discussion - perhaps via the mailing list. But moving
>>> away from auto-generated, database-dependent, and easily guessable IDs
>>> would be a step forward for both performance and architecture.
>>> ------------------------------
>>>
>>> I hope this provides some helpful context and direction!
>>>
>>> Best regards,
>>> Adam
>>>
>>
>>
>> *SolDevelo* Sp. z o.o. [LLC] / www.soldevelo.com
>> Al. Zwycięstwa 96/98, 81-451, Gdynia, Poland
>> Phone: +48 58 782 45 40 / Fax: +48 58 782 45 41
>>
>
>
> --
>
> *Jakub Sławiński*
> Chief Technical Officer
> [email protected] / +48 514 780 384
>
>
> *SolDevelo* Sp. z o.o. [LLC] / www.soldevelo.com
> Al. Zwycięstwa 96/98, 81-451, Gdynia, Poland
> Phone: +48 58 782 45 40 / Fax: +48 58 782 45 41
>

--

*SolDevelo* Sp. z o.o. [LLC] / www.soldevelo.com <http://www.soldevelo.com>
Al. Zwycięstwa 96/98, 81-451, Gdynia, Poland
Phone: +48 58 782 45 40 / Fax: +48 58 782 45 41
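
For reference, the arithmetic behind the timing comparison at the top of this message: going from ~2m30s (150 s) to ~2m20s (140 s) is a reduction of 10/150, i.e. roughly 6.7%. A minimal plain-Java sketch of that calculation follows; the class and method names (`CobTimingSketch`, `percentReduction`) are made up for illustration and are not part of Fineract, and the two durations are just the approximate figures quoted above.

```java
import java.time.Duration;
import java.util.Locale;

// Hypothetical helper, not part of Fineract: compares two measured run times.
final class CobTimingSketch {

    // Reduction in run time, expressed as a percentage of the baseline duration.
    static double percentReduction(Duration baseline, Duration improved) {
        long baseMillis = baseline.toMillis();
        if (baseMillis <= 0) {
            throw new IllegalArgumentException("baseline must be at least 1 ms");
        }
        return (baseMillis - improved.toMillis()) / (double) baseMillis * 100.0;
    }

    public static void main(String[] args) {
        Duration before = Duration.ofMinutes(2).plusSeconds(30); // ~2m30s, as reported
        Duration after = Duration.ofMinutes(2).plusSeconds(20);  // ~2m20s, as reported
        System.out.println(String.format(Locale.ROOT, "reduction: %.1f%%",
                percentReduction(before, after)));
    }
}
```

Since both figures are approximate, the result is only indicative; averaging several runs per variant would probably give a steadier comparison.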

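
As a rough illustration of the primary-key point in the quoted thread (generating IDs inside the application instead of relying on identity columns), here is a minimal plain-Java sketch. `AssignedIdSketch` and `LoanTransactionStub` are hypothetical names, not Fineract classes, and the sketch deliberately leaves out JPA mappings and any data migration; it only shows that an application-assigned ID is available before anything is written to the database.

```java
import java.util.UUID;

// Hypothetical sketch, not Fineract code: the ID is assigned by the application
// when the object is created, so it is known before any INSERT or flush happens.
final class AssignedIdSketch {

    // Stand-in for an entity whose primary key is generated in application code.
    static final class LoanTransactionStub {
        private final UUID id;
        private final long amountMinor;

        LoanTransactionStub(long amountMinor) {
            this.id = UUID.randomUUID(); // random (version 4) UUID from the JDK
            this.amountMinor = amountMinor;
        }

        UUID getId() {
            return id;
        }

        long getAmountMinor() {
            return amountMinor;
        }
    }

    public static void main(String[] args) {
        LoanTransactionStub tx = new LoanTransactionStub(10_000L);
        // The ID can be used right away, e.g. in an event payload, without a DB round trip.
        System.out.println("transaction id = " + tx.getId());
    }
}
```

Whether UUIDs or some other format would suit Fineract is left open in the thread; `UUID.randomUUID()` is used here only because it ships with the JDK.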