Re: [DISCUSSION] Performance issues with data-index persistence addon

Francisco Javier Tirado Sarti Wed, 21 Feb 2024 03:52:57 -0800

Hi Enrique,
If we configure such policies through properties, won't it be enough to
define a naming convention and rely on the configuration capabilities of
the target platform (Spring Boot or Quarkus)?
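
A minimal sketch of what such a naming convention could look like. The
property keys and the simplified duration format are assumptions made up
for illustration; they are not existing Kogito, Spring Boot or Quarkus
properties. The platform would supply the flat key/value map; the
convention only fixes how every component looks up its retention.

```python
from datetime import timedelta
from typing import Mapping, Optional

# Hypothetical naming convention; these keys are NOT real Kogito properties.
COMPONENT_KEY = "kogito.{component}.purge.completed-retention"
GLOBAL_KEY = "kogito.purge.completed-retention"

_UNITS = {"d": "days", "h": "hours", "m": "minutes"}


def parse_retention(value: str) -> timedelta:
    """Parse a simplified duration such as '30d', '12h' or '45m'."""
    text = value.strip().lower()
    amount, unit = text[:-1], text[-1:]
    if unit not in _UNITS or not amount.isdigit():
        raise ValueError(f"unsupported retention value: {value!r}")
    return timedelta(**{_UNITS[unit]: int(amount)})


def retention_for(component: str, props: Mapping[str, str]) -> Optional[timedelta]:
    """Component-specific key wins, then the global key; None means keep forever."""
    raw = props.get(COMPONENT_KEY.format(component=component), props.get(GLOBAL_KEY))
    return None if raw is None else parse_retention(raw)
```

With one shared key pattern, data-index and data-audit could be given
different retention values without each component inventing its own
property name.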


On Wed, Feb 21, 2024 at 12:36 PM Enrique Gonzalez Martinez <
[email protected]> wrote:

> Hi Francisco
>
> The discussion we need to have, before deciding how to achieve certain
> features, is about the overall user experience. If you want to clean it up
> that way, it is fine by me. I have nothing against setting a policy for
> cleaning up processes that completed some time ago.
>
> As we discussed previously, the data index is a snapshot of the last state
> of each process instance, completed ones included, but nothing was said
> about when to clean an instance up once it has completed, so any policy is
> welcome. How to achieve that policy is something completely different.
>
> The main problem is how every microservice is exposing that API to the end
> user or consumer (other system) and which system is going to be the façade
> for complex deployments and operations. That is the discussion I was
> mentioning before.
>
> What I want to avoid is setting different policies for different components.
> We should strive, as much as possible, to offer certain capabilities in the
> same fashion, e.g. the clean-up mechanism.
> In the same way, I want to avoid making the current systems more complex than
> they are. So far they are aligned, each offering one simple responsibility,
> and I would be against mixing, for instance, the job service with the data
> index.
>
> On Wed, Feb 21, 2024 at 12:13, Francisco Javier Tirado Sarti (<
> [email protected]>) wrote:
>
> > Hi Enrique,
> > In the case of data index I think the data to be purged is finished
> > process instances (I do not think we should remove a process instance
> > that has been alive for ages, even if it is very likely that it is never
> > going to be completed).
> > Once you delete those process instances, you also delete the associated
> > user tasks and jobs.
> > Therefore the problem is relatively simple: being able to configure how
> > much time a completed process instance should remain in the data index
> > database. We can take a simple approach: a property with a min duration
> > that cannot be changed once data index is started; a slightly more complex
> > one: the same property, but watched so we can react to changes; or the
> > full suite: an admin API to be able to change the policy at any moment.
> > I think this discussion is worth having.
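
The simplest of the three options above (a fixed minimum duration) can be
sketched as a single delete of completed instances older than a cutoff.
This is only an illustration under stated assumptions: SQLite stands in
for the real database, the table and column names are hypothetical rather
than the actual data-index schema, and state 2 is taken to mean
"completed" as in the log line quoted further down the thread.

```python
import sqlite3
from datetime import datetime, timedelta

COMPLETED = 2  # assumption: matches "state: 2" in the thread's log line


def create_schema(conn: sqlite3.Connection) -> None:
    """Hypothetical two-table schema: processes and their node instances."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this for cascades
    conn.execute(
        "CREATE TABLE processes ("
        " id TEXT PRIMARY KEY, state INTEGER NOT NULL, end_time REAL)"
    )
    conn.execute(
        "CREATE TABLE nodes ("
        " id TEXT PRIMARY KEY,"
        " process_instance_id TEXT NOT NULL"
        "   REFERENCES processes(id) ON DELETE CASCADE)"
    )


def purge_completed(conn: sqlite3.Connection, retention: timedelta,
                    now: datetime) -> int:
    """Delete completed instances that ended before now - retention.

    Active instances are never touched, however old they are.
    Returns the number of process rows deleted (nodes go via the cascade).
    """
    cutoff = (now - retention).timestamp()
    cur = conn.execute(
        "DELETE FROM processes"
        " WHERE state = ? AND end_time IS NOT NULL AND end_time < ?",
        (COMPLETED, cutoff),
    )
    conn.commit()
    return cur.rowcount
```

The other two options would only change where `retention` comes from (a
watched property or an admin API); the delete itself stays the same.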
> >
> > On Wed, Feb 21, 2024 at 6:15 AM Enrique Gonzalez Martinez <
> > [email protected]> wrote:
> >
> > > Hi Martin, the main problem regarding the purge is that the policy on
> > > which technology to use, and on the future components, is still unclear.
> > >
> > > Recently we had a discussion about proposing GraphQL for this sort of
> > > admin task. So far, for subsystems we have been using REST endpoints
> > > (like update timers, modify human task or change processes). There is
> > > one exception, the gateway, which is pure GraphQL and uses GraphQL for
> > > everything, performing complex operations under the hood.
> > >
> > > This has somewhat frozen the purge for data audit for a bit. The
> > > proposal was to use REST endpoints to do the clean up in the component
> > > and offer the GraphQL counterpart in the gateway, promoting it to a
> > > first class citizen component instead of having it embedded in the data
> > > index.
> > >
> > > I would suggest that we first come up with at least a policy on the
> > > convention every component should follow to address this.
> > >
> > >
> > > On Tue, Feb 20, 2024 at 23:39, Martin Weiler <[email protected]>
> > > wrote:
> > >
> > > > IMO, it is good to have this discussion around data sanity now instead
> > > > of putting it off until later when data has already accumulated in
> > > > production environments.
> > > >
> > > > Based on the input here, we are dealing with three types of data:
> > > > 1. Runtime data - active instances only, engine cleans up the data
> > > > automatically at process instance end
> > > > 2. Historic log data - data created by data-audit intended for long
> > > > term storage
> > > > 3. Data-index data - somehow this data falls in between the two
> > > > aforementioned categories, with the idea of the data being "recent",
> > > > but not restricted to active instances only
> > > >
> > > > We'd need purge strategies for both #2 and #3 (perhaps different ones,
> > > > or with different config settings) in order to prevent unlimited data
> > > > growth.
> > > >
> > > > ________________________________________
> > > > From: Enrique Gonzalez Martinez <[email protected]>
> > > > Sent: Monday, February 19, 2024 7:11 AM
> > > > To: [email protected]
> > > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with
> > > > data-index persistence addon
> > > >
> > > > Hi Francisco,
> > > > To give you more context about this.
> > > >
> > > > STP is a concept, a process with certain constraints: no persistence
> > > > and returning the outcome in the call (sync execution with no idle
> > > > states). It was a requirement from a user in the past. One of the
> > > > requirements was leaving no trail. In v7 that was easy because you
> > > > could disable the audit in that case. Actually, we have the same way
> > > > to do here what we did in v7, as you can add or remove the index just
> > > > by adding or removing deps.
> > > >
> > > > We have the same outcome with different approaches and STP is already
> > > > delivered.
> > > >
> > > > On Mon, Feb 19, 2024 at 14:46, Francisco Javier Tirado Sarti (<
> > > > [email protected]>) wrote:
> > > >
> > > > > Regarding STP (which is not a concept that we have in the code. I
> > > > > mean, STP are processes just as non-STP are), I guess that, like all
> > > > > processes, they were kept in DataIndex once completed because users
> > > > > wanted (and still want) to check the result once the call had been
> > > > > performed. If we want to leave no trace of them in DataIndex for
> > > > > some reason, we will need to make it a Runtimes concept so DataIndex
> > > > > can handle them in a different way.
> > > > >
> > > > > On Mon, Feb 19, 2024 at 2:27 PM Enrique Gonzalez Martinez <
> > > > > [email protected]> wrote:
> > > > >
> > > > > > Alex:
> > > > > > Right now the data index is working in the same way as it did in
> > > > > > v7 with the emitters. The only difference between the two impls
> > > > > > is that here the storage is pgsql instead of Elasticsearch. You
> > > > > > are right that it is a snapshot of the last state of the process,
> > > > > > but we never defined how long that data would be kept alive.
> > > > > > Honestly, I am happy right now with the way it works. The clean up
> > > > > > mechanism is still tbd because we still need to discuss other
> > > > > > stuff first.
> > > > > >
> > > > > >
> > > > > > Regarding STP, the point is to leave no trail, because you can get
> > > > > > the outcome directly from the call. It was defined like that in
> > > > > > v7. So there is no use for the index or the audit.
> > > > > >
> > > > > > On Mon, Feb 19, 2024 at 14:13, Francisco Javier Tirado Sarti <
> > > > > > [email protected]> wrote:
> > > > > >
> > > > > > > Hi Alex,
> > > > > > > There has been some confusion about the purpose of DataIndex. To
> > > > > > > be honest, I believed it was already sorted out, but your e-mail
> > > > > > > makes me think that is not the case ;). I will let Kris clarify
> > > > > > > that with you. My view is that data-index is a way to query
> > > > > > > recently closed and active processes (the key here is the
> > > > > > > definition of recently, which in my opinion should be
> > > > > > > configurable).
> > > > > > > But, besides that discussion and being pragmatic, keeping
> > > > > > > finished process instances "for a while" in DataIndex was the
> > > > > > > only way for users to query the result of straight-through
> > > > > > > processes. That's a function that cannot be removed right now.
> > > > > > >
> > > > > > > On Mon, Feb 19, 2024 at 1:33 PM Alex Porcelli <[email protected]>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > if data index was supposed to provide a snapshot view of the
> > > > > > > > process instance… why do we keep it after the process instance
> > > > > > > > is finished?
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Feb 19, 2024 at 7:12 AM Francisco Javier Tirado Sarti <
> > > > > > > > [email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hi Martin.
> > > > > > > > > After taking a deeper look at this, I realize that the
> > > > > > > > > behaviour is the expected one.
> > > > > > > > > Runtimes DB does not track the completed process instance
> > > > > > > > > (that's what the JDBCProcessInstances warning is telling us),
> > > > > > > > > but DataIndex, as expected, is tracking it in the processes
> > > > > > > > > and nodes tables. And yes, it will grow over time.
> > > > > > > > > What we need is some configurable purge mechanism for
> > > > > > > > > DataIndex, so it eventually removes older completed process
> > > > > > > > > instances.
> > > > > > > > >
> > > > > > > > > On Tue, Feb 13, 2024 at 12:59 PM Francisco Javier Tirado Sarti <
> > > > > > > > > [email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Martin,
> > > > > > > > > > Good catch! Looks like the skipping performed for process
> > > > > > > > > > instances is not applied to node instances. Something we
> > > > > > > > > > definitely need to review on the runtimes side.
> > > > > > > > > >
> > > > > > > > > > On Mon, Feb 12, 2024 at 11:59 PM Martin Weiler <[email protected]>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > >> On a somewhat related note, testing a simple workflow (start ->
> > > > > > > > > > >> script node -> end), I see the following messages in the logs:
> > > > > > > > > > >> 2024-02-12 22:49:50,493 28758dde544c WARN
> > > > > > > > > > >> [org.kie.kogito.persistence.jdbc.JDBCProcessInstances:-1]
> > > > > > > > > > >> (executor-thread-3) Skipping create of process instance id:
> > > > > > > > > > >> 7083088e-b899-47cb-b85c-5d9ccb0aa166, state: 2
> > > > > > > > > > >>
> > > > > > > > > > >> So far, so good. And I'd expect to see no trace of this process in
> > > > > > > > > > >> the database if I don't have data audit enabled.
> > > > > > > > > > >>
> > > > > > > > > > >> However, the 'processes' table contains a row with state=2, with
> > > > > > > > > > >> related entries in the 'nodes' table. In a load test, I see these
> > > > > > > > > > >> tables grow significantly over time. Am I missing something to have
> > > > > > > > > > >> these entries cleaned up automatically?
> > > > > > > > > >>
> > > > > > > > > >> ________________________________________
> > > > > > > > > >> From: Martin Weiler <[email protected]>
> > > > > > > > > >> Sent: Monday, February 12, 2024 3:40 PM
> > > > > > > > > >> To: [email protected]
> > > > > > > > > > >> Subject: [EXTERNAL] RE: [DISCUSSION] Performance issues with
> > > > > > > > > > >> data-index persistence addon
> > > > > > > > > >>
> > > > > > > > > > >> Thanks everyone for your input. Based on this discussion, I opened
> > > > > > > > > > >> the following PR:
> > > > > > > > > > >> https://github.com/apache/incubator-kie-kogito-apps/pull/1985
> > > > > > > > > > >>
> > > > > > > > > > >> With this change, the performance seems to be stable over time:
> > > > > > > > > > >> https://drive.google.com/file/d/1zkullvfrJpRp7TRjxDa41ok6kEIR7Fty/view?usp=sharing
> > > > > > > > > >>
> > > > > > > > > >> Martin
> > > > > > > > > >>
> > > > > > > > > >> ________________________________________
> > > > > > > > > >> From: Gonzalo Muñoz <[email protected]>
> > > > > > > > > >> Sent: Friday, February 9, 2024 9:42 AM
> > > > > > > > > >> To: [email protected]
> > > > > > > > > > >> Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with
> > > > > > > > > > >> data-index persistence addon
> > > > > > > > > >>
> > > > > > > > > >> Great work Francisco,
> > > > > > > > > > >> Martin, take a look at this link with some related tips (in case you
> > > > > > > > > > >> find it useful):
> > > > > > > > > > >> https://www.cybertec-postgresql.com/en/index-your-foreign-key/
> > > > > > > > > >>
> > > > > > > > > > >> On Fri, Feb 9, 2024 at 17:20, Francisco Javier Tirado Sarti (<
> > > > > > > > > > >> [email protected]>) wrote:
> > > > > > > > > >>
> > > > > > > > > > >> > For the time being, we will keep JPA till we exhaust all
> > > > > > > > > > >> > possibilities; let's call switching from JPA to JDBC our hidden
> > > > > > > > > > >> > plan B ;)
> > > > > > > > > > >> > I already told Martin, but so that everyone knows: just after
> > > > > > > > > > >> > writing the previous email, I thought "what if Postgres is not
> > > > > > > > > > >> > automatically indexing foreign keys like MySQL?" and, eureka.
> > > > > > > > > > >> > Postgres doc:
> > > > > > > > > > >> > https://www.postgresql.org/docs/current/ddl-constraints.html
> > > > > > > > > > >> > MySQL doc:
> > > > > > > > > > >> > https://dev.mysql.com/doc/refman/8.0/en/constraint-foreign-key.html
> > > > > > > > > > >> > These are the relevant excerpts:
> > > > > > > > > > >> >
> > > > > > > > > > >> > *Postgresql*
> > > > > > > > > > >> > *A foreign key must reference columns that either are a primary
> > > > > > > > > > >> > key or form a unique constraint, or are columns from a non-partial
> > > > > > > > > > >> > unique index. This means that the referenced columns always have
> > > > > > > > > > >> > an index to allow efficient lookups on whether a referencing row
> > > > > > > > > > >> > has a match. Since a DELETE of a row from the referenced table or
> > > > > > > > > > >> > an UPDATE of a referenced column will require a scan of the
> > > > > > > > > > >> > referencing table for rows matching the old value, it is often a
> > > > > > > > > > >> > good idea to index the referencing columns too. Because this is
> > > > > > > > > > >> > not always needed, and there are many choices available on how to
> > > > > > > > > > >> > index, the declaration of a foreign key constraint does not
> > > > > > > > > > >> > automatically create an index on the referencing columns.*
> > > > > > > > > > >> > *Mysql*
> > > > > > > > > > >> > *MySQL requires that foreign key columns be indexed; if you create
> > > > > > > > > > >> > a table with a foreign key constraint but no index on a given
> > > > > > > > > > >> > column, an index is created.*
> > > > > > > > > > >> >
> > > > > > > > > > >> > So I asked Martin to explicitly create an index for the
> > > > > > > > > > >> > process_instance_id column on the nodes table.
> > > > > > > > > > >> > I think that will fix the problem detected in the thread dump.
> > > > > > > > > > >> > The simpler process test to verify queries are fine still stands,
> > > > > > > > > > >> > though ;)
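
The effect of the missing index can be reproduced in miniature. The sketch
below uses SQLite as a stand-in (an assumption; the thread is about
Postgres), because SQLite likewise does not create an index on the
referencing column of a foreign key. The schema and the index name are
hypothetical. It compares the query plan for a lookup by
process_instance_id before and after creating the index.

```python
import sqlite3


def build_db() -> sqlite3.Connection:
    """Hypothetical parent/child schema with a foreign key and no extra index."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE processes (id TEXT PRIMARY KEY, state INTEGER);
        CREATE TABLE nodes (
            id TEXT PRIMARY KEY,
            process_instance_id TEXT NOT NULL REFERENCES processes(id)
        );
        """
    )
    return conn


def query_plan(conn: sqlite3.Connection, sql: str, params=()) -> str:
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


LOOKUP = "SELECT * FROM nodes WHERE process_instance_id = ?"

conn = build_db()
# Without an index on the referencing column the lookup scans the table.
plan_without_index = query_plan(conn, LOOKUP, ("some-id",))

# The index name is made up; the column is the one named in the thread.
conn.execute(
    "CREATE INDEX idx_nodes_process_instance_id ON nodes (process_instance_id)"
)
plan_with_index = query_plan(conn, LOOKUP, ("some-id",))

print(plan_without_index)
print(plan_with_index)
```

The same `CREATE INDEX ... ON nodes (process_instance_id)` statement shape
applies to Postgres, where `EXPLAIN` on the lookup would be the way to
confirm the planner actually uses the index.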
> > > > > > > > > >> >
> > > > > > > > > >> >
> > > > > > > > > > >> > On Fri, Feb 9, 2024 at 5:10 PM Tibor Zimányi <[email protected]>
> > > > > > > > > > >> > wrote:
> > > > > > > > > >> >
> > > > > > > > > > >> > > I always preferred pure JDBC over Hibernate myself, just for the
> > > > > > > > > > >> > > sake of control of what is happening :) So I would not -1 that
> > > > > > > > > > >> > > myself.
> > > > > > > > > >> > >
> > > > > > > > > >> > > Tibor
> > > > > > > > > >> > >
> > > > > > > > > > >> > > On Fri, Feb 9, 2024 at 17:00, Francisco Javier Tirado Sarti <
> > > > > > > > > > >> > > [email protected]> wrote:
> > > > > > > > > >> > >
> > > > > > > > > >> > > > Hi,
> > > > > > > > > > >> > > > Usually I do not want to talk about work in progress because
> > > > > > > > > > >> > > > preliminary conclusions are pretty volatile but, well, there are
> > > > > > > > > > >> > > > a couple of things that can be concluded from the really
> > > > > > > > > > >> > > > valuable information that Martin provided.
> > > > > > > > > > >> > > > 1) In order to be able to determine if the number of statements
> > > > > > > > > > >> > > > is larger than expected, I asked Martin to test with a simpler
> > > > > > > > > > >> > > > process definition. One with just three nodes: start, script
> > > > > > > > > > >> > > > and end. The script one should change just one variable. This
> > > > > > > > > > >> > > > way we can analyze if the number of queries is the expected
> > > > > > > > > > >> > > > one. From the single log (audit was activated then) my
> > > > > > > > > > >> > > > conclusion is that the number of inserts/updates over processes
> > > > > > > > > > >> > > > and nodes (there are a lot over tasks, which I prefer to skip
> > > > > > > > > > >> > > > for now, baby steps) is the expected one.
> > > > > > > > > > >> > > > 2) Analysing the thread dump, we see around 15 threads
> > > > > > > > > > >> > > > executing this line at
> > > > > > > > > > >> > > > org.kie.kogito.index.jpa.storage.ProcessInstanceEntityStorage.indexNode(ProcessInstanceEntityStorage.java:125),
> > > > > > > > > > >> > > > so it's pretty clear which code needs to be optimized ;). I'm
> > > > > > > > > > >> > > > evaluating possibilities within JPA/Hibernate, but I'm starting
> > > > > > > > > > >> > > > to think that it might be better to switch to JDBC and skip
> > > > > > > > > > >> > > > Hibernate. Our lives will be simpler, especially with a
> > > > > > > > > > >> > > > relatively simple schema like ours (that would be my
> > > > > > > > > > >> > > > recommendation if I were an external consultant).
> > > > > > > > > >> > > >
> > > > > > > > > > >> > > > On Fri, Feb 9, 2024 at 4:15 PM Tibor Zimányi <[email protected]>
> > > > > > > > > > >> > > > wrote:
> > > > > > > > > >> > > >
> > > > > > > > > >> > > > > Hi,
> > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > this will be a bit off-topic. However, as far as performance
> > > > > > > > > > >> > > > > goes, I think we should consider that we have string primary
> > > > > > > > > > >> > > > > keys (IDs). I would expect database systems to be much better
> > > > > > > > > > >> > > > > at indexing numeric keys than strings. I remember from the
> > > > > > > > > > >> > > > > past, when I was working with DBs, that using strings as keys
> > > > > > > > > > >> > > > > or indexes was a discouraged practice.
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > > > Best regards,
> > > > > > > > > >> > > > > Tibor
> > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > On Thu, Feb 8, 2024 at 22:45, Martin Weiler <[email protected]>
> > > > > > > > > > >> > > > > wrote:
> > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > > I changed the test to use MongoDB [1] and I don't see a
> > > > > > > > > > >> > > > > > performance degradation with this setup [2].
> > > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > Please keep us posted of your findings. Thanks!
> > > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > Martin
> > > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > [1] https://github.com/martinweiler/job-service-refactor-test/tree/mongodb
> > > > > > > > > > >> > > > > > [2] https://drive.google.com/file/d/1NfacXaxJlgRMw4OQ5S20cvkzvaUKUVFj/view?usp=sharing
> > > > > > > > > >> > > > > >
> > > > > > > > > >> > > > > > ________________________________________
> > > > > > > > > > >> > > > > > From: Francisco Javier Tirado Sarti <[email protected]>
> > > > > > > > > > >> > > > > > Sent: Wednesday, February 7, 2024 11:40 AM
> > > > > > > > > > >> > > > > > To: [email protected]
> > > > > > > > > > >> > > > > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with
> > > > > > > > > > >> > > > > > data-index persistence addon
> > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > yes, it can be index degradation because of size, but I
> > > > > > > > > > >> > > > > > believe (I might be wrong) the db is too small (yet) for that.
> > > > > > > > > > >> > > > > > But, eventually, Postgres, when the DB is huge enough,
> > > > > > > > > > >> > > > > > unavoidably will behave like the graph that Martin sent.
> > > > > > > > > > >> > > > > > Since I believe we are not huge enough (yet), let's rule out
> > > > > > > > > > >> > > > > > another issue by analysing the SQL logs (I requested those
> > > > > > > > > > >> > > > > > from Martin offline and he is going to kindly collect them).
> > > > > > > > > > >> > > > > > Also, I'm curious to know if Mongo behaves in the same way.
> > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > On Wed, Feb 7, 2024 at 7:25 PM Enrique Gonzalez Martinez <
> > > > > > > > > > >> > > > > > [email protected]> wrote:
> > > > > > > > > >> > > > > >
> > > > > > > > > >> > > > > > > Hi Francisco,
> > > > > > > > > > >> > > > > > > I would highly recommend checking the indexes and how the
> > > > > > > > > > >> > > > > > > updates work in data index, to avoid full table scans and
> > > > > > > > > > >> > > > > > > locking the full table. Some DBs are very sensitive to that.
> > > > > > > > > >> > > > > > >
> > > > > > > > > > >> > > > > > > On Wed, Feb 7, 2024 at 18:41, Francisco Javier Tirado Sarti <
> > > > > > > > > > >> > > > > > > [email protected]> wrote:
> > > > > > > > > >> > > > > > >
> > > > > > > > > >> > > > > > > > Hi Martin,
> > > > > > > > > > >> > > > > > > > While I analyze the data, let me ask you if it is possible
> > > > > > > > > > >> > > > > > > > to perform another check (similar in a way to disabling
> > > > > > > > > > >> > > > > > > > data-index like you do). Can you switch to MongoDB
> > > > > > > > > > >> > > > > > > > persistence and check if the same degradation that is
> > > > > > > > > > >> > > > > > > > there for postgres remains?
> > > > > > > > > > >> > > > > > > > I do not know if this is feasible, but it will certainly
> > > > > > > > > > >> > > > > > > > indicate whether the problem is on the postgres storage
> > > > > > > > > > >> > > > > > > > layer, and I do not have a clear prediction of what we
> > > > > > > > > > >> > > > > > > > will see when doing this switch.
> > > > > > > > > >> > > > > > > >
> > > > > > > > > > >> > > > > > > > On Wed, Feb 7, 2024 at 6:37 PM Martin Weiler <[email protected]>
> > > > > > > > > > >> > > > > > > > wrote:
> > > > > > > > > >> > > > > > > >
> > > > > > > > > >> > > > > > > > > Hi Francisco,
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > > >> > > > > > > > > thanks for your work on this important topic!
> > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > >> > > > > > > > > I would like to share some test results here, which might
> > > > > > > > > > >> > > > > > > > > help to improve the codebase even further. I am using the
> > > > > > > > > > >> > > > > > > > > jmeter based test case from Pere and Enrique (thanks
> > > > > > > > > > >> > > > > > > > > guys!) [1] which uses a load of 30 threads to
> > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > >> > > > > > > > > 1) start a new process instance (POST)
> > > > > > > > > > >> > > > > > > > > 2) retrieve tasks for a user (GET)
> > > > > > > > > > >> > > > > > > > > 3) fetch task details (GET)
> > > > > > > > > > >> > > > > > > > > 4) complete a task (POST)
> > > > > > > > > > >> > > > > > > > > 5) execute a query on data-audit
> > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > >> > > > > > > > > With this test setup, I noticed that the performance for
> > > > > > > > > > >> > > > > > > > > the POST requests, in particular the one to start a new
> > > > > > > > > > >> > > > > > > > > process instance, degrades over time - see graph [2]. If
> > > > > > > > > > >> > > > > > > > > I run the same test without data-index, then there is no
> > > > > > > > > > >> > > > > > > > > such performance degradation [3]. You can find a thread
> > > > > > > > > > >> > > > > > > > > dump captured a few minutes into the first test here [4]
> > > > > > > > > > >> > > > > > > > > that might help to see some of the contention points.
> > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > >> > > > > > > > > I'd appreciate if you could take a look and see if there
> > > > > > > > > > >> > > > > > > > > is something that can be further improved based on your
> > > > > > > > > > >> > > > > > > > > previous work. If you need any additional data, let me
> > > > > > > > > > >> > > > > > > > > know, but otherwise it is straightforward to run the
> > > > > > > > > > >> > > > > > > > > jmeter test as well.
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > Thanks,
> > > > > > > > > >> > > > > > > > > Martin
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > [1]
> > > > > > > > > >> https://github.com/pefernan/job-service-refactor-test/
> > > > > > > > > >> > > > > > > > > [2]
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > >
> > > > > > > > > >> > > > > > >
> > > > > > > > > >> > > > > >
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > >
> > > > > > > > > >> > >
> > > > > > > > > >> >
> > > > > > > > > >>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://drive.google.com/file/d/1Gqn-ixE05kXv2jdssAUlnMuUVcHxIYZ0/view?usp=sharing
> > > > > > > > > >> > > > > > > > > [3]
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > >
> > > > > > > > > >> > > > > > >
> > > > > > > > > >> > > > > >
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > >
> > > > > > > > > >> > >
> > > > > > > > > >> >
> > > > > > > > > >>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://drive.google.com/file/d/10gVNyb4JYg_bA18bNhY9dEDbPn3TOxL7/view?usp=sharing
> > > > > > > > > >> > > > > > > > > [4]
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > >
> > > > > > > > > >> > > > > > >
> > > > > > > > > >> > > > > >
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > >
> > > > > > > > > >> > >
> > > > > > > > > >> >
> > > > > > > > > >>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://drive.google.com/file/d/1jVrtsO49gCvUlnaC9AUAtkVKTm4PbdUv/view?usp=sharing
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > ________________________________________
> > > > > > > > > > >> > > > > > > > > From: Francisco Javier Tirado Sarti <[email protected]>
> > > > > > > > > > >> > > > > > > > > Sent: Wednesday, January 17, 2024 9:13 AM
> > > > > > > > > > >> > > > > > > > > To: [email protected]
> > > > > > > > > > >> > > > > > > > > Cc: Pere Fernandez Perez
> > > > > > > > > > >> > > > > > > > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues
> > > > > > > > > > >> > > > > > > > > with data-index persistence addon
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > Hi Alex,
> > > > > > > > > >> > > > > > > > > I did not take times (which depends on a
> > > > number
> > > > > of
> > > > > > > > > >> variables
> > > > > > > > > >> > > that
> > > > > > > > > >> > > > > > > > > drastically change between
> environments),
> > > but
> > > > > > verify
> > > > > > > > > that
> > > > > > > > > >> the
> > > > > > > > > >> > > > > number
> > > > > > > > > >> > > > > > of
> > > > > > > > > >> > > > > > > > > updates has been reduced drastically
> > without
> > > > > > losing
> > > > > > > > > >> > > > functionality,
> > > > > > > > > >> > > > > > > which
> > > > > > > > > >> > > > > > > > is
> > > > > > > > > >> > > > > > > > > objectively a good thing. If before the
> > > > change,
> > > > > > for
> > > > > > > > > every
> > > > > > > > > >> > node
> > > > > > > > > >> > > > > > > executed,
> > > > > > > > > >> > > > > > > > we
> > > > > > > > > >> > > > > > > > > have an update for every node previously
> > > > > executed,
> > > > > > > so
> > > > > > > > > if a
> > > > > > > > > >> > > > process
> > > > > > > > > >> > > > > > have
> > > > > > > > > >> > > > > > > > 50
> > > > > > > > > >> > > > > > > > > nodes to execute, we were performing
> > nearly
> > > > > > 50*51/2
> > > > > > > > > >> updates,
> > > > > > > > > >> > > > which
> > > > > > > > > >> > > > > > > gives
> > > > > > > > > >> > > > > > > > us
> > > > > > > > > >> > > > > > > > > a total of  1275 updates, now we have
> just
> > > one
> > > > > for
> > > > > > > > every
> > > > > > > > > >> node
> > > > > > > > > >> > > > being
> > > > > > > > > >> > > > > > > > > executed, implying a total of 50
> updates.
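
The figures quoted in the message above can be reproduced with a short worked sum. This assumes, as an interpretation of the message rather than a measurement, that executing the k-th node used to trigger k updates (one for the new node plus one for each node already stored), and that it now triggers exactly one:

```latex
\text{before: } \sum_{k=1}^{n} k = \frac{n(n+1)}{2}
\qquad
\text{after: } \sum_{k=1}^{n} 1 = n
\qquad
n = 50:\ \frac{50 \cdot 51}{2} = 1275 \ \text{ versus } \ 50
```

Under that assumption the ratio between the two counts is (n+1)/2, so the saving grows linearly with the number of nodes in the process.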
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > On Wed, Jan 17, 2024 at 3:18 PM Alex
> > > Porcelli
> > > > <
> > > > > > > > > >> > > [email protected]>
> > > > > > > > > >> > > > > > > wrote:
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > > Francisco,
> > > > > > > > > >> > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > I noticed that your PR has been
> merged,
> > > but
> > > > I
> > > > > > was
> > > > > > > > > >> expecting
> > > > > > > > > >> > > (at
> > > > > > > > > >> > > > > > least
> > > > > > > > > >> > > > > > > > > > was my understanding from this thread)
> > > that
> > > > > > before
> > > > > > > > > >> merging
> > > > > > > > > >> > > some
> > > > > > > > > >> > > > > > > > > > benchmark data would be shared in
> > advance
> > > -
> > > > to
> > > > > > > > assess
> > > > > > > > > >> the
> > > > > > > > > >> > > > > > > cost/benefit
> > > > > > > > > >> > > > > > > > > > of such a decent size change.
> > > > > > > > > >> > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > Do you have any information to share?
> > > > > > > > > >> > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > On Sat, Dec 23, 2023 at 4:02 AM
> > Francisco
> > > > > Javier
> > > > > > > > > Tirado
> > > > > > > > > >> > Sarti
> > > > > > > > > >> > > > > > > > > > <[email protected]> wrote:
> > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > Yes, as intended, now we have one
> > select
> > > > and
> > > > > > one
> > > > > > > > > >> > > > insert/update
> > > > > > > > > >> > > > > > per
> > > > > > > > > >> > > > > > > > node
> > > > > > > > > >> > > > > > > > > > > event.
> > > > > > > > > >> > > > > > > > > > > I moved the PR as ready for review
> and
> > > > give
> > > > > > > @Pere
> > > > > > > > > >> > Fernandez
> > > > > > > > > >> > > > > Perez
> > > > > > > > > >> > > > > > > > > > > <[email protected]> permission to
> > the
> > > > > > branch
> > > > > > > so
> > > > > > > > > he
> > > > > > > > > >> can
> > > > > > > > > >> > > > edit
> > > > > > > > > >> > > > > it
> > > > > > > > > >> > > > > > > in
> > > > > > > > > >> > > > > > > > > the
> > > > > > > > > >> > > > > > > > > > > next two weeks (Ill be on PTO)  if
> > > > desired,
> > > > > > > before
> > > > > > > > > >> > merging.
> > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > On Thu, Dec 21, 2023 at 5:58 PM Alex
> > > > > Porcelli
> > > > > > <
> > > > > > > > > >> > > > > [email protected]>
> > > > > > > > > >> > > > > > > > > wrote:
> > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > Cool, thank you Francisco!
> > > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > Did you manage to get some
> > preliminary
> > > > > data
> > > > > > > > about
> > > > > > > > > >> > > > > improvements?
> > > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > On Thu, Dec 21, 2023 at 11:52 AM
> > > > Francisco
> > > > > > > > Javier
> > > > > > > > > >> > Tirado
> > > > > > > > > >> > > > > Sarti
> > > > > > > > > >> > > > > > > > > > > > <[email protected]> wrote:
> > > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > Yes, after some delay because of
> > > > > quarkus 3
> > > > > > > > > >> migration.
> > > > > > > > > >> > > Im
> > > > > > > > > >> > > > > > > refining
> > > > > > > > > >> > > > > > > > > > this
> > > > > > > > > >> > > > > > > > > > > > > draft PR
> > > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > > >> > > > > >
> > > > > > > >
> https://github.com/apache/incubator-kie-kogito-apps/pull/1941
> > > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > On Thu, Dec 21, 2023 at 5:48 PM
> > Alex
> > > > > > > Porcelli
> > > > > > > > <
> > > > > > > > > >> > > > > > > [email protected]>
> > > > > > > > > >> > > > > > > > > > wrote:
> > > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > Any update or new findings on
> > this
> > > > > > topic?
> > > > > > > > > >> > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > On Tue, Nov 28, 2023 at
> 8:38 AM
> > > > > > Francisco
> > > > > > > > > Javier
> > > > > > > > > >> > > Tirado
> > > > > > > > > >> > > > > > Sarti
> > > > > > > > > >> > > > > > > > > > > > > > <[email protected]> wrote:
> > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > Hi Alex,
> > > > > > > > > >> > > > > > > > > > > > > > > After considering different
> > > > options
> > > > > to
> > > > > > > > > improve
> > > > > > > > > >> > > > > > performance,
> > > > > > > > > >> > > > > > > > we
> > > > > > > > > >> > > > > > > > > > feel
> > > > > > > > > >> > > > > > > > > > > > that
> > > > > > > > > >> > > > > > > > > > > > > > it
> > > > > > > > > >> > > > > > > > > > > > > > > is time to "partially" move
> > away
> > > > > from
> > > > > > > the
> > > > > > > > > >> current
> > > > > > > > > >> > > Map
> > > > > > > > > >> > > > > > style
> > > > > > > > > >> > > > > > > > > > > > interface (
> > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > >
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > >
> > > > > > > > > >> > > > > > >
> > > > > > > > > >> > > > > >
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > >
> > > > > > > > > >> > >
> > > > > > > > > >> >
> > > > > > > > > >>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/incubator-kie-kogito-apps/blob/main/persistence-commons/persistence-commons-api/src/main/java/org/kie/kogito/persistence/api/Storage.java
> > > > > > > > > >> > > > > > > > > > > > > > )
> > > > > > > > > >> > > > > > > > > > > > > > > which was shared with
> Trusty,
> > to
> > > > one
> > > > > > > more
> > > > > > > > > >> > suitable
> > > > > > > > > >> > > > for
> > > > > > > > > >> > > > > > > usage
> > > > > > > > > >> > > > > > > > > > with a
> > > > > > > > > >> > > > > > > > > > > > > > > relational DB like
> postgresql
> > > (but
> > > > > > still
> > > > > > > > > >> > compatible
> > > > > > > > > >> > > > > with
> > > > > > > > > >> > > > > > > big
> > > > > > > > > >> > > > > > > > > > table
> > > > > > > > > >> > > > > > > > > > > > dbs).
> > > > > > > > > >> > > > > > > > > > > > > > > The idea will be to replace
> > > > generic
> > > > > > > > Storage
> > > > > > > > > >> > > interface
> > > > > > > > > >> > > > > by
> > > > > > > > > >> > > > > > > four
> > > > > > > > > >> > > > > > > > > > > > specific
> > > > > > > > > >> > > > > > > > > > > > > > > interfaces (which will
> inherit
> > > > from
> > > > > a
> > > > > > > > common
> > > > > > > > > >> one
> > > > > > > > > >> > > that
> > > > > > > > > >> > > > > > keeps
> > > > > > > > > >> > > > > > > > the
> > > > > > > > > >> > > > > > > > > > query
> > > > > > > > > >> > > > > > > > > > > > > > part
> > > > > > > > > >> > > > > > > > > > > > > > > at is it. with get and query
> > > > > methods),
> > > > > > > > that
> > > > > > > > > >> will
> > > > > > > > > >> > > > > include
> > > > > > > > > >> > > > > > > the
> > > > > > > > > >> > > > > > > > > > required
> > > > > > > > > >> > > > > > > > > > > > > > > modification operations for
> > the
> > > > four
> > > > > > > > > DataIndex
> > > > > > > > > >> > > > > "domains":
> > > > > > > > > >> > > > > > > > > > > > > > processinstance,
> > > > > > > > > >> > > > > > > > > > > > > > > usertask, processdefinitions
> > and
> > > > > jobs.
> > > > > > > > Those
> > > > > > > > > >> > > > interfaces
> > > > > > > > > >> > > > > > > will
> > > > > > > > > >> > > > > > > > > > define
> > > > > > > > > >> > > > > > > > > > > > > > methods
> > > > > > > > > >> > > > > > > > > > > > > > > like addNode, addVariable,
> > > > > updateTask,
> > > > > > > > > >> > > > > addAttachment.....
> > > > > > > > > >> > > > > > > > that
> > > > > > > > > >> > > > > > > > > > will
> > > > > > > > > >> > > > > > > > > > > > allow
> > > > > > > > > >> > > > > > > > > > > > > > > the persistent layer
> > > > implementation
> > > > > > to
> > > > > > > > just
> > > > > > > > > >> > update
> > > > > > > > > >> > > > the
> > > > > > > > > >> > > > > > > > needed
> > > > > > > > > >> > > > > > > > > > info
> > > > > > > > > >> > > > > > > > > > > > in
> > > > > > > > > >> > > > > > > > > > > > > > the
> > > > > > > > > >> > > > > > > > > > > > > > > DB  (for example, for
> addNode
> > in
> > > > > > > Postgres,
> > > > > > > > > >> just
> > > > > > > > > >> > > > insert
> > > > > > > > > >> > > > > a
> > > > > > > > > >> > > > > > > row
> > > > > > > > > >> > > > > > > > > into
> > > > > > > > > >> > > > > > > > > > > > nodes
> > > > > > > > > >> > > > > > > > > > > > > > > table, for addNode in Mongo,
> > > > > basically
> > > > > > > the
> > > > > > > > > >> same
> > > > > > > > > >> > > > atomic
> > > > > > > > > >> > > > > > > upsert
> > > > > > > > > >> > > > > > > > > > > > operation
> > > > > > > > > >> > > > > > > > > > > > > > > that is currently done).
> > > > Therefore,
> > > > > we
> > > > > > > > > >> increase
> > > > > > > > > >> > > > > > performance
> > > > > > > > > >> > > > > > > > for
> > > > > > > > > >> > > > > > > > > > > > Postgres
> > > > > > > > > >> > > > > > > > > > > > > > > and keep the current one for
> > > > Mongo.
> > > > > > The
> > > > > > > > > >> current
> > > > > > > > > >> > DB
> > > > > > > > > >> > > > > > schemas
> > > > > > > > > >> > > > > > > > > won't
> > > > > > > > > >> > > > > > > > > > be
> > > > > > > > > >> > > > > > > > > > > > > > > touched.
> > > > > > > > > >> > > > > > > > > > > > > > > Since the code change is
> > large,
> > > I
> > > > do
> > > > > > not
> > > > > > > > > think
> > > > > > > > > >> > I'll
> > > > > > > > > >> > > > be
> > > > > > > > > >> > > > > > able
> > > > > > > > > >> > > > > > > > to
> > > > > > > > > >> > > > > > > > > > have
> > > > > > > > > >> > > > > > > > > > > > the
> > > > > > > > > >> > > > > > > > > > > > > > PR
> > > > > > > > > >> > > > > > > > > > > > > > > ready till next week.
> > > > > > > > > >> > > > > > > > > > > > > > > But before starting, please
> > let
> > > me
> > > > > > know
> > > > > > > if
> > > > > > > > > >> that
> > > > > > > > > >> > > > > approach
> > > > > > > > > >> > > > > > is
> > > > > > > > > >> > > > > > > > > fine
> > > > > > > > > >> > > > > > > > > > for
> > > > > > > > > >> > > > > > > > > > > > you.
> > > > > > > > > >> > > > > > > > > > > > > > > Best regards.
> > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > On Fri, Nov 24, 2023 at
> > 6:55 PM
> > > > Alex
> > > > > > > > > Porcelli
> > > > > > > > > >> <
> > > > > > > > > >> > > > > > > > > [email protected]>
> > > > > > > > > >> > > > > > > > > > > > wrote:
> > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > > Thank you Francisco to
> > getting
> > > > > > deeper
> > > > > > > on
> > > > > > > > > >> this…
> > > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > > Looking forward to see the
> > > > results
> > > > > > of
> > > > > > > > your
> > > > > > > > > >> > > > suggested
> > > > > > > > > >> > > > > > > > > > improvements.
> > > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > > On Fri, Nov 24, 2023 at
> > > 9:40 AM
> > > > > > > > Francisco
> > > > > > > > > >> > Javier
> > > > > > > > > >> > > > > Tirado
> > > > > > > > > >> > > > > > > > > Sarti <
> > > > > > > > > >> > > > > > > > > > > > > > > > [email protected]>
> wrote:
> > > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > > > I forgot to attach the
> > > queries
> > > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > > > On Fri, Nov 24, 2023 at
> > > > 3:04 PM
> > > > > > > > > Francisco
> > > > > > > > > >> > > Javier
> > > > > > > > > >> > > > > > Tirado
> > > > > > > > > >> > > > > > > > > > Sarti <
> > > > > > > > > >> > > > > > > > > > > > > > > > > [email protected]>
> > wrote:
> > > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > > >> Hi,
> > > > > > > > > >> > > > > > > > > > > > > > > > >> A brief update on this
> > > topic.
> > > > > > > > > >> > > > > > > > > > > > > > > > >> After doing a simple
> test
> > > > with
> > > > > > > > example
> > > > > > > > > >> > > > > > > > > > > > > > > > >>
> > > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > >
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > >
> > > > > > > > > >> > > > > > >
> > > > > > > > > >> > > > > >
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > >
> > > > > > > > > >> > >
> > > > > > > > > >> >
> > > > > > > > > >>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/incubator-kie-kogito-examples/tree/stable/serverless-workflow-examples/serverless-workflow-data-index-quarkus
> > > > > > > > > >> > > > > > > > > > > > > > > > ,
> > > > > > > > > >> > > > > > > > > > > > > > > > >> the number of updates
> > over
> > > > > Nodes
> > > > > > > > table
> > > > > > > > > is
> > > > > > > > > >> > n*n,
> > > > > > > > > >> > > > so
> > > > > > > > > >> > > > > we
> > > > > > > > > >> > > > > > > > > manage
> > > > > > > > > >> > > > > > > > > > to
> > > > > > > > > >> > > > > > > > > > > > > > obtain a
> > > > > > > > > >> > > > > > > > > > > > > > > > >> perfect quadratic
> > > performance
> > > > > > > > > >> degradation.
> > > > > > > > > >> > The
> > > > > > > > > >> > > > > > problem
> > > > > > > > > >> > > > > > > > is
> > > > > > > > > >> > > > > > > > > > worse
> > > > > > > > > >> > > > > > > > > > > > in
> > > > > > > > > >> > > > > > > > > > > > > > the
> > > > > > > > > >> > > > > > > > > > > > > > > > case
> > > > > > > > > >> > > > > > > > > > > > > > > > >> of Serverless Workflow
> > than
> > > > in
> > > > > > BPMN
> > > > > > > > > >> because
> > > > > > > > > >> > we
> > > > > > > > > >> > > > the
> > > > > > > > > >> > > > > > > > number
> > > > > > > > > >> > > > > > > > > of
> > > > > > > > > >> > > > > > > > > > > > nodes
> > > > > > > > > >> > > > > > > > > > > > > > is
> > > > > > > > > >> > > > > > > > > > > > > > > > >> greater than the number
> > of
> > > > > > states.
> > > > > > > In
> > > > > > > > > >> that
> > > > > > > > > >> > > > > example N
> > > > > > > > > >> > > > > > > is
> > > > > > > > > >> > > > > > > > > 16,
> > > > > > > > > >> > > > > > > > > > but
> > > > > > > > > >> > > > > > > > > > > > for
> > > > > > > > > >> > > > > > > > > > > > > > a
> > > > > > > > > >> > > > > > > > > > > > > > > > more
> > > > > > > > > >> > > > > > > > > > > > > > > > >> complex workflow it
> would
> > > be
> > > > > > > > certainly
> > > > > > > > > >> > large.
> > > > > > > > > >> > > > > > > > > > > > > > > > >> I think that this is
> more
> > > > > related
> > > > > > > to
> > > > > > > > > how
> > > > > > > > > >> we
> > > > > > > > > >> > > are
> > > > > > > > > >> > > > > > > handling
> > > > > > > > > >> > > > > > > > > > JPA in
> > > > > > > > > >> > > > > > > > > > > > the
> > > > > > > > > >> > > > > > > > > > > > > > > > code,
> > > > > > > > > >> > > > > > > > > > > > > > > > >> in particular the
> mapping
> > > > from
> > > > > > > model
> > > > > > > > to
> > > > > > > > > >> > entity
> > > > > > > > > >> > > > > > > > (basically
> > > > > > > > > >> > > > > > > > > > JPA is
> > > > > > > > > >> > > > > > > > > > > > > > blind
> > > > > > > > > >> > > > > > > > > > > > > > > > and
> > > > > > > > > >> > > > > > > > > > > > > > > > >> has to update all nodes
> > for
> > > > > every
> > > > > > > > write
> > > > > > > > > >> > > because
> > > > > > > > > >> > > > it
> > > > > > > > > >> > > > > > > > > believes
> > > > > > > > > >> > > > > > > > > > the
> > > > > > > > > >> > > > > > > > > > > > > > node has
> > > > > > > > > >> > > > > > > > > > > > > > > > >> been updated, although
> it
> > > is
> > > > > not)
> > > > > > > > than
> > > > > > > > > an
> > > > > > > > > >> > > issue
> > > > > > > > > >> > > > in
> > > > > > > > > >> > > > > > the
> > > > > > > > > >> > > > > > > > > table
> > > > > > > > > >> > > > > > > > > > > > > > definition.
> > > > > > > > > >> > > > > > > > > > > > > > > > >> In fact, when using
> JPA,
> > > > > > separating
> > > > > > > > the
> > > > > > > > > >> > server
> > > > > > > > > >> > > > > model
> > > > > > > > > >> > > > > > > > from
> > > > > > > > > >> > > > > > > > > > the
> > > > > > > > > >> > > > > > > > > > > > JPA
> > > > > > > > > >> > > > > > > > > > > > > > > > entity is
> > > > > > > > > >> > > > > > > > > > > > > > > > >> not a good idea,
> > especially
> > > > if
> > > > > > the
> > > > > > > > > entity
> > > > > > > > > >> > > > contains
> > > > > > > > > >> > > > > > > > > > collections.
> > > > > > > > > >> > > > > > > > > > > > I
> > > > > > > > > >> > > > > > > > > > > > > > will
> > > > > > > > > >> > > > > > > > > > > > > > > > try
> > > > > > > > > >> > > > > > > > > > > > > > > > >> to change that without
> > > > breaking
> > > > > > > > > anything.
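
The quadratic growth described above can be illustrated with a small counting model. This is a hypothetical sketch, not Kogito or Hibernate code: it assumes that each node event causes every node row stored so far to be written again, and compares that with writing only the row that changed. Under this model the first count is n(n+1)/2, which is of the same quadratic order as the n*n quoted in the message.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical counting model for node-row writes; names are illustrative only. */
class NodeWriteCounter {

    /** Each node event rewrites every node row known so far (the "blind" mapping). */
    static long rewriteAllPerEvent(int nodeCount) {
        List<String> rows = new ArrayList<>();
        long writes = 0;
        for (int i = 0; i < nodeCount; i++) {
            rows.add("node-" + i);
            writes += rows.size(); // every stored row is treated as modified
        }
        return writes;
    }

    /** Each node event writes only the row of the node that actually changed. */
    static long writeOnlyChanged(int nodeCount) {
        long writes = 0;
        for (int i = 0; i < nodeCount; i++) {
            writes += 1;
        }
        return writes;
    }

    public static void main(String[] args) {
        for (int n : new int[] {16, 50}) {
            System.out.println(n + " nodes: rewrite-all=" + rewriteAllPerEvent(n)
                    + ", changed-only=" + writeOnlyChanged(n));
        }
    }
}
```

The model ignores concurrency, so it says nothing about the lock exceptions and deadlocks reported earlier in the thread; it only shows why the number of statements grows much faster than the number of nodes.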
> > > > > > > > > >> > > > > > > > > > > > > > > > >>
> > > > > > > > > >> > > > > > > > > > > > > > > > >> On Wed, Nov 22, 2023 at
> > > > > 12:10 PM
> > > > > > > > > Enrique
> > > > > > > > > >> > > > Gonzalez
> > > > > > > > > >> > > > > > > > > Martinez <
> > > > > > > > > >> > > > > > > > > > > > > > > > >> [email protected]>
> > > wrote:
> > > > > > > > > >> > > > > > > > > > > > > > > > >>
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> After the events split
> > you
> > > > now
> > > > > > > will
> > > > > > > > > >> need to
> > > > > > > > > >> > > > > create
> > > > > > > > > >> > > > > > a
> > > > > > > > > >> > > > > > > > node
> > > > > > > > > >> > > > > > > > > > > > instance
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> model instance of
> making
> > > > > > > independent
> > > > > > > > > >> from
> > > > > > > > > >> > the
> > > > > > > > > >> > > > > > process
> > > > > > > > > >> > > > > > > > > > instance.
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> That should do the
> > trick.
> > > > > > > > > >> > > > > > > > > > > > > > > > >>>
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> Regarding
> > > deleting/inserting
> > > > > it
> > > > > > > was
> > > > > > > > > >> fixed
> > > > > > > > > >> > at
> > > > > > > > > >> > > > some
> > > > > > > > > >> > > > > > > > point.
> > > > > > > > > >> > > > > > > > > > > > > > > > >>>
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> El mar, 21 nov 2023 a
> > las
> > > > > 20:22,
> > > > > > > > > >> Francisco
> > > > > > > > > >> > > > Javier
> > > > > > > > > >> > > > > > > > Tirado
> > > > > > > > > >> > > > > > > > > > Sarti
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> (<[email protected]
> >)
> > > > > > escribió:
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> >
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > Hi Martin,
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > I have a task to
> > review
> > > > > > > > performance
> > > > > > > > > of
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> >
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> >
> > > > > > > ProcessInstanceNodeDataEventMerger
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > My idea is to reduce
> > the
> > > > > > number
> > > > > > > of
> > > > > > > > > >> delete
> > > > > > > > > >> > > > > inserts
> > > > > > > > > >> > > > > > > > when
> > > > > > > > > >> > > > > > > > > > > > processing
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> events
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > and try to do it
> > > > > incremental.
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > That should improve
> > > > > > performance.
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > PS:
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > I was planning to
> send
> > > an
> > > > > > e-mail
> > > > > > > > > >> tomorrow
> > > > > > > > > >> > > > > > > announcing
> > > > > > > > > >> > > > > > > > > > that in
> > > > > > > > > >> > > > > > > > > > > > > > case you
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> were
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > already working on a
> > fix
> > > > for
> > > > > > > > that. I
> > > > > > > > > >> > assume
> > > > > > > > > >> > > > you
> > > > > > > > > >> > > > > > are
> > > > > > > > > >> > > > > > > > not
> > > > > > > > > >> > > > > > > > > > and I
> > > > > > > > > >> > > > > > > > > > > > > > would
> > > > > > > > > >> > > > > > > > > > > > > > > > be
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > sending a PR soon.
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> >
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > On Tue, Nov 21, 2023
> > at
> > > > > > 6:09 PM
> > > > > > > > > Martin
> > > > > > > > > >> > > Weiler
> > > > > > > > > >> > > > > > > > > > > > > > > > <[email protected]
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> >
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > wrote:
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> >
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > > I looked into the
> > new
> > > > > > examples
> > > > > > > > > using
> > > > > > > > > >> > > > > data-index
> > > > > > > > > >> > > > > > > > > > persistence
> > > > > > > > > >> > > > > > > > > > > > > > addon -
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> Neus'
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > > PR#1813 [1] for
> > > > serverless
> > > > > > and
> > > > > > > > > >> Pere's
> > > > > > > > > >> > > > branch
> > > > > > > > > >> > > > > > [2]
> > > > > > > > > >> > > > > > > > for
> > > > > > > > > >> > > > > > > > > > > > workflow
> > > > > > > > > >> > > > > > > > > > > > > > > > (great
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> job
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > > both!) - and they
> > work
> > > > > > without
> > > > > > > > > >> issues
> > > > > > > > > >> > > using
> > > > > > > > > >> > > > > > > single
> > > > > > > > > >> > > > > > > > > > > > requests.
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> However, under
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > > some load (I used
> > 'ab'
> > > > for
> > > > > > > > testing
> > > > > > > > > >> > with a
> > > > > > > > > >> > > > > light
> > > > > > > > > >> > > > > > > > > > > > concurrency of
> > > > > > > > > >> > > > > > > > > > > > > > 10
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> parallel
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > > requests) I ran
> into
> > > the
> > > > > > > > following
> > > > > > > > > >> > > > problems:
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > >
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > > (1) Large number
> of
> > > > > > > > insert/delete
> > > > > > > > > >> calls
> > > > > > > > > >> > > > (eg.
> > > > > > > > > >> > > > > > for
> > > > > > > > > >> > > > > > > > > tables
> > > > > > > > > >> > > > > > > > > > > > such as
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> nodes,
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > > definitions, etc)
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > >
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > > (2) Hibernate
> > > > > > > > > >> OptimisticLockExceptions
> > > > > > > > > >> > /
> > > > > > > > > >> > > > > > > > > > > > StaleStateExceptions
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > >
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > > (3) DB deadlocks
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > >
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > > (4) Error
> responses,
> > > > slow
> > > > > > > > response
> > > > > > > > > >> > times
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > >
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > > The reason I am
> > > reaching
> > > > > out
> > > > > > > > with
> > > > > > > > > >> this
> > > > > > > > > >> > > > topic
> > > > > > > > > >> > > > > > here
> > > > > > > > > >> > > > > > > > is
> > > > > > > > > >> > > > > > > > > to
> > > > > > > > > >> > > > > > > > > > > > find
> > > > > > > > > >> > > > > > > > > > > > > > out if
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> we are
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > > aware of this
> issue,
> > > and
> > > > > if
> > > > > > > > > someone
> > > > > > > > > >> is
> > > > > > > > > >> > > > > already
> > > > > > > > > >> > > > > > > > > looking
> > > > > > > > > >> > > > > > > > > > > > into or
> > > > > > > > > >> > > > > > > > > > > > > > > > being
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > > assigned to it?
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > >
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > > Thanks,
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > > Martin
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > >
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > > [1]
> > > > > > > > > >> > > > > > > > > > > > > > > > >>>
> > > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > >> > > > > > >
> > > > > > > > > >> >
> > > > > >
> https://github.com/apache/incubator-kie-kogito-examples/pull/1813
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > > [2]
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > >
> > > > > > > > > >> > > > > > > > > > > > > > > > >>>
> > > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > >
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > >
> > > > > > > > > >> > > > > > >
> > > > > > > > > >> > > > > >
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > >
> > > > > > > > > >> > >
> > > > > > > > > >> >
> > > > > > > > > >>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/pefernan/kogito-examples/tree/example_data-index_persistence
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > >
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > >
> > > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > >> > > > > > > >
> > > > > > > > > >> > > >
> > > > > > > > > >>
> > > > > > >
> > > ---------------------------------------------------------------------
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > > To unsubscribe,
> > > e-mail:
> > > > > > > > > >> > > > > > > > > [email protected]
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > > For additional
> > > commands,
> > > > > > > e-mail:
> > > > > > > > > >> > > > > > > > > > [email protected]
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > >
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > >
> > > > > > > > > >> > > > > > > > > > > > > > > > >>>
> > > > > > > > > >> > > > > > > > > > > > > > > > >>>
> > > > > > > > > >> > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > >
> > > > > > > > > >> > > > > >
> > > > > > > > > >> >
> > > > > > > >
> > > > ---------------------------------------------------------------------
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> To unsubscribe,
> e-mail:
> > > > > > > > > >> > > > > > > [email protected]
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> For additional
> commands,
> > > > > e-mail:
> > > > > > > > > >> > > > > > > > [email protected]
> > > > > > > > > >> > > > > > > > > > > > > > > > >>>
> > > > > > > > > >> > > > > > > > > > > > > > > > >>>
> > > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > >> > > > > > > >
> > > > > > > > > >> > > >
> > > > > > > > > >>
> > > > > > >
> > > ---------------------------------------------------------------------
> > > > > > > > > >> > > > > > > > > > > > > > > > > To unsubscribe, e-mail:
> > > > > > > > > >> > > > > > [email protected]
> > > > > > > > > >> > > > > > > > > > > > > > > > > For additional commands,
> > > > e-mail:
> > > > > > > > > >> > > > > > > [email protected]
> > > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > >
> > > > > > > > > >> > > > > >
> > > > > > > > > >> >
> > > > > > > >
> > > > ---------------------------------------------------------------------
> > > > > > > > > >> > > > > > > > > > > > > > To unsubscribe, e-mail:
> > > > >
> > >
> >
>
