Re: [DISCUSSION] Performance issues with data-index persistence addon

Francisco Javier Tirado Sarti Thu, 22 Feb 2024 04:04:07 -0800

Yes, I understood the idea was to use a k8s cronjob (
https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/), but
we should not let users set up DB deletion on their own; therefore we
need to provide the script that the cronjob manifest will invoke to delete
the DB rows (and one of the parameters of the script has to be the duration
interval). I got your point: this way we avoid adding a background thread
to our microservice dedicated to the purge (and based on a config value).
About bulk, note that both operations I described in my latest e-mail (a
list of process instances to be deleted, or a calculated list of process
instances to be deleted based on completion date) are bulk. My point was
that we need to carefully decide which bulk API this REST interface
provides.
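
As a rough illustration of the date-based bulk purge described above, here is a minimal Python sketch against an in-memory SQLite database. The `processes` and `nodes` table names, the `process_instance_id` column and `state = 2` for a completed instance are taken from this thread; the `end_time` column, the `purge_completed` helper and the retention parameter are hypothetical, and the real data-index schema may well differ.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

COMPLETED = 2  # state value seen in the thread's log line ("state: 2")


def purge_completed(conn, retention, now=None):
    """Delete completed process instances that ended before now - retention.

    Hypothetical helper: `end_time` is an assumed column holding an
    ISO-8601 UTC timestamp. Returns the number of process rows removed.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = (now - retention).isoformat()
    with conn:  # single transaction: child rows first, then the parents
        conn.execute(
            "DELETE FROM nodes WHERE process_instance_id IN ("
            " SELECT id FROM processes WHERE state = ? AND end_time < ?)",
            (COMPLETED, cutoff),
        )
        cur = conn.execute(
            "DELETE FROM processes WHERE state = ? AND end_time < ?",
            (COMPLETED, cutoff),
        )
    return cur.rowcount


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE processes (id TEXT PRIMARY KEY, state INTEGER, end_time TEXT);
        CREATE TABLE nodes (id TEXT PRIMARY KEY,
                            process_instance_id TEXT REFERENCES processes(id));
        CREATE INDEX idx_nodes_pid ON nodes (process_instance_id);
        """
    )
    now = datetime(2024, 2, 22, tzinfo=timezone.utc)
    old = (now - timedelta(days=40)).isoformat()
    recent = (now - timedelta(days=1)).isoformat()
    conn.executemany(
        "INSERT INTO processes VALUES (?, ?, ?)",
        [("p-old", COMPLETED, old),
         ("p-recent", COMPLETED, recent),
         ("p-active", 1, None)],  # 1 stands for any non-completed state
    )
    conn.executemany(
        "INSERT INTO nodes VALUES (?, ?)",
        [("n1", "p-old"), ("n2", "p-old"),
         ("n3", "p-recent"), ("n4", "p-active")],
    )
    removed = purge_completed(conn, timedelta(days=30), now=now)
    remaining = [r[0] for r in conn.execute("SELECT id FROM processes ORDER BY id")]
    nodes_left = conn.execute("SELECT COUNT(*) FROM nodes").fetchone()[0]
    return removed, remaining, nodes_left
```

With a 30-day retention, only the instance completed 40 days earlier is removed, together with its node rows; active instances are never touched. Either a script invoked by a cron job or a REST handler could wrap a routine of this shape, with the retention passed in as the duration parameter.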


On Thu, Feb 22, 2024 at 12:18 PM Alex Porcelli <[email protected]> wrote:

> Francisco,
>
> We won’t use cron ourselves; we’ll recommend it to users who want a regular
> cleanup. We don’t need to own this: Kubernetes has native cron and OSs also
> have their native cron, so we don’t need to replicate such functionality.
>
> I think the cleanup we are talking about here is a bulk operation;
> individual instance deletion should be handled differently.
>
> On Thu, Feb 22, 2024 at 6:03 AM Francisco Javier Tirado Sarti <
> [email protected]> wrote:
>
> > Ok, so we have two flavours: clean up on demand through a REST user call
> > (that's fine, the only issue is its scope. For example, should we allow
> > a bulk delete API where the user specifies the process instances to
> > delete, if the user knows them somehow, or just, as we are proposing,
> > delete all instances that have been there for a while?) and a cron
> > process (that is a replacement of the properties approach I recommend).
> > In my opinion, if we have the explicit REST API, we should not use the
> > cron.
> >
> > On Thu, Feb 22, 2024 at 6:32 AM Enrique Gonzalez Martinez <
> > [email protected]> wrote:
> >
> > > +1 alex
> > >
> > > On Thu, Feb 22, 2024, 0:14, Alex Porcelli <[email protected]> wrote:
> > >
> > > > I think I need to clarify what I mean by REST...
> > > >
> > > > The REST endpoint I have in mind would execute the necessary cleaning
> > > > based on an input parameter (i.e. instances closed for more than X
> > > > days or something like that).
> > > >
> > > > The use of cron is that an external cron could call the cleaning REST
> > > > endpoint from time to time (once a month, every Sunday night, etc.).
> > > >
> > > >
> > > > On Wed, Feb 21, 2024 at 10:50 AM Francisco Javier Tirado Sarti
> > > > <[email protected]> wrote:
> > > > >
> > > > > Hi Alex,
> > > > > I'm wondering how cron and REST can live together.
> > > > > With cron, the responsibility for the clean up lies with a process
> > > > > external to the microservice that accesses the DB to be purged,
> > > > > while with a REST API the process performing the clean up is
> > > > > typically (but not necessarily) the same microservice that accesses
> > > > > the DB. In that sense, the REST API is similar to using properties
> > > > > (in both approaches the microservice is responsible for cleaning up
> > > > > the data), but they differ because REST is more sophisticated (yet
> > > > > compatible: implementation-wise we can use a ConfigSource that is
> > > > > updated by the REST call and read, as a property, by the cleanup
> > > > > routine). So, probably I'm missing something, but I see the cron
> > > > > approach as not reconcilable with the REST approach.
> > > > > Anyway, just a detail: I agree the purge approach should be similar
> > > > > for all microservices.
> > > > >
> > > > > On Wed, Feb 21, 2024 at 4:37 PM Alex Porcelli <[email protected]> wrote:
> > > > >
> > > > > > +1 for providing consistent behavior across the board
> > > > > >
> > > > > > my 2c:
> > > > > >
> > > > > > +1 for REST interface (preferred from an external general
> > > > > > management perspective)
> > > > > > -1 for properties to auto clean; this can be achieved with cron
> > > > > > and a REST call
> > > > > >
> > > > > > Now about GraphQL: the platform today is heavily invested in such
> > > > > > an interface, so it might make sense. BUT I’d consider it at a
> > > > > > later stage.
> > > > > >
> > > > > >
> > > > > > On Wed, Feb 21, 2024 at 6:52 AM Francisco Javier Tirado Sarti <
> > > > > > [email protected]> wrote:
> > > > > >
> > > > > > > Hi Enrique,
> > > > > > > If we configure such policies through properties, won't it be
> > > > > > > enough to define a naming convention and rely on the target
> > > > > > > platform's configuration capabilities (SpringBoot or Quarkus)?
> > > > > > >
> > > > > > > On Wed, Feb 21, 2024 at 12:36 PM Enrique Gonzalez Martinez <
> > > > > > > [email protected]> wrote:
> > > > > > >
> > > > > > > > Hi Francisco
> > > > > > > >
> > > > > > > > The discussion we need to have, before how to achieve certain
> > > > > > > > features, is about the overall user experience. If you want to
> > > > > > > > clean it up that way, it is fine by me. I have nothing against
> > > > > > > > setting a policy for some sort of clean up of processes
> > > > > > > > completed some time ago.
> > > > > > > >
> > > > > > > > As we discussed previously, the data index is a snapshot of the
> > > > > > > > last state of the process instance, completed ones included,
> > > > > > > > but nothing was said about when completed instances need to be
> > > > > > > > cleaned up, so any policy is welcome. How to achieve that
> > > > > > > > policy is something completely different.
> > > > > > > >
> > > > > > > > The main problem is how every microservice exposes that API to
> > > > > > > > the end user or consumer (another system) and which system is
> > > > > > > > going to be the façade for complex deployments and operations.
> > > > > > > > That is the discussion I was mentioning before.
> > > > > > > >
> > > > > > > > What I want to avoid is setting different policies among
> > > > > > > > components; we should strive, as much as possible, to offer
> > > > > > > > certain capabilities in the same fashion, e.g. the clean up
> > > > > > > > mechanism.
> > > > > > > > In the same way, I want to avoid making the current systems
> > > > > > > > more complex than they are. So far they are aligned, each
> > > > > > > > offering one simple responsibility, but I would be against
> > > > > > > > mixing, for instance, the job service with the data index.
> > > > > > > >
> > > > > > > > On Wed, Feb 21, 2024 at 12:13, Francisco Javier Tirado Sarti
> > > > > > > > (<[email protected]>) wrote:
> > > > > > > >
> > > > > > > > > Hi Enrique,
> > > > > > > > > In the case of data index, I think the data to be purged is
> > > > > > > > > finished process instances (I do not think we should remove
> > > > > > > > > process instances that have been alive for ages, even if it
> > > > > > > > > is very likely they are never going to be completed).
> > > > > > > > > Once you delete those process instances, you also delete the
> > > > > > > > > associated user tasks and jobs.
> > > > > > > > > Therefore the problem is relatively simple: being able to
> > > > > > > > > configure how much time a completed process instance should
> > > > > > > > > remain in the data index database. We can take a simple
> > > > > > > > > approach: a property with a min duration that cannot be
> > > > > > > > > changed once data index is started; a slightly more complex
> > > > > > > > > one: the same property, but watched so we can react to
> > > > > > > > > changes; or the full suite: an admin API to be able to
> > > > > > > > > change the policy at any moment.
> > > > > > > > > I think this discussion is worth having.
> > > > > > > > >
> > > > > > > > > On Wed, Feb 21, 2024 at 6:15 AM Enrique Gonzalez Martinez <
> > > > > > > > > [email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Martin, the main problem regarding the purge is that the
> > > > > > > > > > policy on what tech to use, and on the future components,
> > > > > > > > > > is still unclear.
> > > > > > > > > >
> > > > > > > > > > Recently we had a discussion about proposing GraphQL for
> > > > > > > > > > this sort of admin task. So far, for subsystems, we have
> > > > > > > > > > been using REST endpoints (like update timers, modify human
> > > > > > > > > > tasks or change processes). There is one exception, the
> > > > > > > > > > gateway, which is pure GraphQL and somehow uses GraphQL for
> > > > > > > > > > everything, performing complex operations under the hood.
> > > > > > > > > >
> > > > > > > > > > This has somehow frozen the purge for data audit for a bit,
> > > > > > > > > > and the proposal was to use REST endpoints to do the clean
> > > > > > > > > > up in the component and offer the GraphQL counterpart in
> > > > > > > > > > the gateway, promoting it to a first-class citizen
> > > > > > > > > > component instead of having it embedded in the data index.
> > > > > > > > > >
> > > > > > > > > > I would suggest coming up, at least, with a policy first
> > > > > > > > > > regarding the convention by which every component should
> > > > > > > > > > address this.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Tue, Feb 20, 2024, 23:39, Martin Weiler
> > > > > > > > > > <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > > IMO, it is good to have this discussion around data
> > > > > > > > > > > sanity now instead of putting it off until later when
> > > > > > > > > > > data has already accumulated in production environments.
> > > > > > > > > > >
> > > > > > > > > > > Based on the input here, we are dealing with three types
> > > > > > > > > > > of data:
> > > > > > > > > > > 1. Runtime data - active instances only, engine cleans up
> > > > > > > > > > > the data automatically at process instance end
> > > > > > > > > > > 2. Historic log data - data created by data-audit
> > > > > > > > > > > intended for long term storage
> > > > > > > > > > > 3. Data-index data - somehow this data falls in between
> > > > > > > > > > > the two aforementioned categories, with the idea of the
> > > > > > > > > > > data being "recent", but not restricted to active
> > > > > > > > > > > instances only
> > > > > > > > > > >
> > > > > > > > > > > We'd need purge strategies for both #2 and #3 (perhaps
> > > > > > > > > > > different ones, or with different config settings) in
> > > > > > > > > > > order to prevent unlimited data growth.
> > > > > > > > > > >
> > > > > > > > > > > ________________________________________
> > > > > > > > > > > From: Enrique Gonzalez Martinez <[email protected]>
> > > > > > > > > > > Sent: Monday, February 19, 2024 7:11 AM
> > > > > > > > > > > To: [email protected]
> > > > > > > > > > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues
> > > > > > > > > > > with data-index persistence addon
> > > > > > > > > > >
> > > > > > > > > > > Hi Francisco,
> > > > > > > > > > > To give you more context about this.
> > > > > > > > > > >
> > > > > > > > > > > STP is a concept: a process with certain constraints,
> > > > > > > > > > > namely no persistence and returning the outcome in the
> > > > > > > > > > > call (sync execution with no idle states). It was a
> > > > > > > > > > > requirement from a user in the past. One of the
> > > > > > > > > > > requirements was leaving no trail. In v7 that was easy
> > > > > > > > > > > because you could disable the audit in that case.
> > > > > > > > > > > Actually, we have the same way here to do what we did in
> > > > > > > > > > > v7, as you can add/remove the index just by removing
> > > > > > > > > > > deps.
> > > > > > > > > > >
> > > > > > > > > > > We have the same outcome with different approaches, and
> > > > > > > > > > > STP is already delivered.
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Feb 19, 2024 at 14:46, Francisco Javier Tirado
> > > > > > > > > > > Sarti (<[email protected]>) wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Regarding STP (which is not a concept that we have in
> > > > > > > > > > > > the code; I mean, STP are processes just as non-STP
> > > > > > > > > > > > are), I guess that, like all processes, they were kept
> > > > > > > > > > > > in DataIndex once completed because users wanted (and
> > > > > > > > > > > > still want) to check the result once the call had been
> > > > > > > > > > > > performed. If we want to leave no trace of them in
> > > > > > > > > > > > DataIndex for some reason, we will need to make it a
> > > > > > > > > > > > Runtimes concept so DataIndex can handle them in a
> > > > > > > > > > > > different way.
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Feb 19, 2024 at 2:27 PM Enrique Gonzalez
> > > > > > > > > > > > Martinez <[email protected]> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Alex:
> > > > > > > > > > > > > Right now the data index is working in the same way
> > > > > > > > > > > > > as it did in v7 with the emitters. The only
> > > > > > > > > > > > > difference between the two impls is that here the
> > > > > > > > > > > > > storage is pgsql instead of Elasticsearch. You are
> > > > > > > > > > > > > right that it is a snapshot of the last state of the
> > > > > > > > > > > > > process, but we never defined how long that data
> > > > > > > > > > > > > would be kept alive. Honestly, I am happy right now
> > > > > > > > > > > > > with the way it works. The clean up mechanism is
> > > > > > > > > > > > > still TBD because we still need to discuss other
> > > > > > > > > > > > > stuff first.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Regarding STP, the point is to leave no trail,
> > > > > > > > > > > > > because you can get the outcome directly from the
> > > > > > > > > > > > > call. It was defined like that in v7. So there is no
> > > > > > > > > > > > > use for the index or the audit.
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, Feb 19, 2024, 14:13, Francisco Javier Tirado
> > > > > > > > > > > > > Sarti <[email protected]> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Alex,
> > > > > > > > > > > > > > There has been some confusion about the purpose of
> > > > > > > > > > > > > > DataIndex. To be honest, I believed it was already
> > > > > > > > > > > > > > sorted out, but your e-mail makes me think that is
> > > > > > > > > > > > > > not the case ;). I'll let Kris clarify that with
> > > > > > > > > > > > > > you. My view is that data-index is a way to query
> > > > > > > > > > > > > > recently closed and active processes (the key here
> > > > > > > > > > > > > > is the definition of recently, which in my opinion
> > > > > > > > > > > > > > should be configurable).
> > > > > > > > > > > > > > But, besides that discussion and being pragmatic,
> > > > > > > > > > > > > > keeping finished process instances "for a while" in
> > > > > > > > > > > > > > DataIndex was the only way for users to query the
> > > > > > > > > > > > > > result of straight through processes. That's a
> > > > > > > > > > > > > > function that cannot be removed right now.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Mon, Feb 19, 2024 at 1:33 PM Alex Porcelli
> > > > > > > > > > > > > > <[email protected]> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > if data index was supposed to provide a snapshot
> > > > > > > > > > > > > > > view of the process instance… why do we keep it
> > > > > > > > > > > > > > > after the process instance is finished?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Mon, Feb 19, 2024 at 7:12 AM Francisco Javier
> > > > > > > > > > > > > > > Tirado Sarti <[email protected]> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi Martin.
> > > > > > > > > > > > > > > > After taking a deeper look at this, I realize
> > > > > > > > > > > > > > > > that the behaviour is the expected one.
> > > > > > > > > > > > > > > > Runtimes DB does not track the completed process
> > > > > > > > > > > > > > > > instance (that's what the JDBCProcessInstances
> > > > > > > > > > > > > > > > warning is telling us), but DataIndex, as
> > > > > > > > > > > > > > > > expected, is tracking it in the processes and
> > > > > > > > > > > > > > > > nodes tables. And yes, it will grow over time.
> > > > > > > > > > > > > > > > What we need is some configurable purge
> > > > > > > > > > > > > > > > mechanism for DataIndex, so it eventually
> > > > > > > > > > > > > > > > removes older completed process instances.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Tue, Feb 13, 2024 at 12:59 PM Francisco Javier
> > > > > > > > > > > > > > > > Tirado Sarti <[email protected]> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hi Martin,
> > > > > > > > > > > > > > > > > Good catch! Looks like the skipping performed
> > > > > > > > > > > > > > > > > for process instances is not applied to node
> > > > > > > > > > > > > > > > > instances. Something we definitely need to
> > > > > > > > > > > > > > > > > review on the runtimes side.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Mon, Feb 12, 2024 at 11:59 PM Martin Weiler
> > > > > > > > > > > > > > > > > <[email protected]> wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >> On a somewhat related note, testing a simple
> > > > > > > > > > > > > > > > >> workflow (start -> script node -> end), I see
> > > > > > > > > > > > > > > > >> the following messages in the logs:
> > > > > > > > > > > > > > > > >> 2024-02-12 22:49:50,493 28758dde544c WARN
> > > > > > > > > > > > > > > > >> [org.kie.kogito.persistence.jdbc.JDBCProcessInstances:-1]
> > > > > > > > > > > > > > > > >> (executor-thread-3) Skipping create of process
> > > > > > > > > > > > > > > > >> instance id:
> > > > > > > > > > > > > > > > >> 7083088e-b899-47cb-b85c-5d9ccb0aa166, state: 2
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> So far, so good. And I'd expect to see no trace
> > > > > > > > > > > > > > > > >> of this process in the database if I don't have
> > > > > > > > > > > > > > > > >> data audit enabled.
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> However, the 'processes' table contains a row
> > > > > > > > > > > > > > > > >> with state=2, with related entries in the
> > > > > > > > > > > > > > > > >> 'nodes' table. In a load test, I see these
> > > > > > > > > > > > > > > > >> tables grow significantly over time. Am I
> > > > > > > > > > > > > > > > >> missing something to have these entries cleaned
> > > > > > > > > > > > > > > > >> up automatically?
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> ________________________________________
> > > > > > > > > > > > > > > > >> From: Martin Weiler <[email protected]>
> > > > > > > > > > > > > > > > >> Sent: Monday, February 12, 2024 3:40 PM
> > > > > > > > > > > > > > > > >> To: [email protected]
> > > > > > > > > > > > > > > > >> Subject: [EXTERNAL] RE: [DISCUSSION] Performance
> > > > > > > > > > > > > > > > >> issues with data-index persistence addon
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> Thanks everyone for your input. Based on this
> > > > > > > > > > > > > > > > >> discussion, I opened the following PR:
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> https://github.com/apache/incubator-kie-kogito-apps/pull/1985
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> With this change, the performance seems to be
> > > > > > > > > > > > > > > > >> stable over time:
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> https://drive.google.com/file/d/1zkullvfrJpRp7TRjxDa41ok6kEIR7Fty/view?usp=sharing
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> Martin
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> ________________________________________
> > > > > > > > > > > > > > > > >> From: Gonzalo Muñoz <[email protected]>
> > > > > > > > > > > > > > > > >> Sent: Friday, February 9, 2024 9:42 AM
> > > > > > > > > > > > > > > > >> To: [email protected]
> > > > > > > > > > > > > > > > >> Subject: [EXTERNAL] Re: [DISCUSSION] Performance
> > > > > > > > > > > > > > > > >> issues with data-index persistence addon
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> Great work Francisco,
> > > > > > > > > > > > > > > > >> Martin, take a look at this link with some
> > > > > > > > > > > > > > > > >> related tips (in case you find it useful):
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> https://www.cybertec-postgresql.com/en/index-your-foreign-key/
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> On Fri, Feb 9, 2024 at 17:20, Francisco Javier
> > > > > > > > > > > > > > > > >> Tirado Sarti (<[email protected]>) wrote:
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> > For the time being, we will keep JPA till we
> > > > > > > > > > > > > > > > >> > exhaust all possibilities; let's call
> > > > > > > > > > > > > > > > >> > switching from JPA to JDBC our hidden plan
> > > > > > > > > > > > > > > > >> > B ;)
> > > > > > > > > > > > > > > > >> > I already told Martin, but so that everyone
> > > > > > > > > > > > > > > > >> > knows: just after writing the previous email,
> > > > > > > > > > > > > > > > >> > I thought "what if Postgres is not
> > > > > > > > > > > > > > > > >> > automatically indexing foreign keys like
> > > > > > > > > > > > > > > > >> > MySQL?" and, eureka
> > > > > > > > > > > > > > > > >> > Postgres doc
> > > > > > > > > > > > > > > > >> > https://www.postgresql.org/docs/current/ddl-constraints.html
> > > > > > > > > > > > > > > > >> > MySQL doc
> > > > > > > > > > > > > > > > >> > https://dev.mysql.com/doc/refman/8.0/en/constraint-foreign-key.html
> > > > > > > > > > > > > > > > >> > These are the relevant excerpts
> > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > >> > *Postgresql*
> > > > > > > > > > > > > > > > >> > *A foreign key must reference columns that
> > > > > > > > > > > > > > > > >> > either are a primary key or form a unique
> > > > > > > > > > > > > > > > >> > constraint, or are columns from a non-partial
> > > > > > > > > > > > > > > > >> > unique index. This means that the referenced
> > > > > > > > > > > > > > > > >> > columns always have an index to allow
> > > > > > > > > > > > > > > > >> > efficient lookups on whether a referencing
> > > > > > > > > > > > > > > > >> > row has a match. Since a DELETE of a row from
> > > > > > > > > > > > > > > > >> > the referenced table or an UPDATE of a
> > > > > > > > > > > > > > > > >> > referenced column will require a scan of the
> > > > > > > > > > > > > > > > >> > referencing table for rows matching the old
> > > > > > > > > > > > > > > > >> > value, it is often a good idea to index the
> > > > > > > > > > > > > > > > >> > referencing columns too. Because this is not
> > > > > > > > > > > > > > > > >> > always needed, and there are many choices
> > > > > > > > > > > > > > > > >> > available on how to index, the declaration of
> > > > > > > > > > > > > > > > >> > a foreign key constraint does not
> > > > > > > > > > > > > > > > >> > automatically create an index on the
> > > > > > > > > > > > > > > > >> > referencing columns.*
> > > > > > > > > > > > > > > > >> > *Mysql*
> > > > > > > > > > > > > > > > >> > *MySQL requires that foreign key columns be
> > > > > > > > > > > > > > > > >> > indexed; if you create a table with a foreign
> > > > > > > > > > > > > > > > >> > key constraint but no index on a given
> > > > > > > > > > > > > > > > >> > column, an index is created.*
> > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > >> > So I asked Martin to explicitly create an
> > > > > > > > > > > > > > > > >> > index for the process_instance_id column on
> > > > > > > > > > > > > > > > >> > the nodes table.
> > > > > > > > > > > > > > > > >> > I think that will fix the problem detected in
> > > > > > > > > > > > > > > > >> > the thread dump.
> > > > > > > > > > > > > > > > >> > The simpler process test to verify queries
> > > > > > > > > > > > > > > > >> > are fine still stands, though ;)
> > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > >> > On Fri, Feb 9, 2024 at 5:10 PM Tibor Zimányi
> > > > > > > > > > > > > > > > >> > <[email protected]> wrote:
> > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > >> > > I always preferred pure JDBC over Hibernate
> > > > > > > > > > > > > > > > >> > > myself, just for the sake of control of
> > > > > > > > > > > > > > > > >> > > what is happening :) So I would not -1 that
> > > > > > > > > > > > > > > > >> > > myself.
> > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > >> > > Tibor
> > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > >> > > On Fri, Feb 9, 2024, 17:00 Francisco Javier
> > > > > > > > > > > > > > > > >> > > Tirado Sarti <[email protected]> wrote:
> > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > > >> > > > Hi,
> > > > > > > > > > > > > > > > > >> > > > Usually I do not want to talk about work in progress because
> > > > > > > > > > > > > > > > > >> > > > preliminary conclusions are pretty volatile but, well, there are
> > > > > > > > > > > > > > > > > >> > > > a couple of things that can be concluded from the really valuable
> > > > > > > > > > > > > > > > > >> > > > information that Martin provided.
> > > > > > > > > > > > > > > > > >> > > > 1) In order to determine whether the number of statements is
> > > > > > > > > > > > > > > > > >> > > > larger than expected, I asked Martin to test with a simpler
> > > > > > > > > > > > > > > > > >> > > > process definition, one with just three nodes: start, script and
> > > > > > > > > > > > > > > > > >> > > > end. The script node should change just one variable. This way we
> > > > > > > > > > > > > > > > > >> > > > can analyze whether the number of queries is the expected one.
> > > > > > > > > > > > > > > > > >> > > > From the single log (audit was activated then), my conclusion is
> > > > > > > > > > > > > > > > > >> > > > that the number of inserts/updates over processes and nodes is
> > > > > > > > > > > > > > > > > >> > > > the expected one (there are a lot over tasks, which I would
> > > > > > > > > > > > > > > > > >> > > > prefer to skip for now, baby steps).
> > > > > > > > > > > > > > > > > >> > > > 2) Analysing the thread dump, we see around 15 threads executing
> > > > > > > > > > > > > > > > > >> > > > this line at
> > > > > > > > > > > > > > > > > >> > > > org.kie.kogito.index.jpa.storage.ProcessInstanceEntityStorage.indexNode(ProcessInstanceEntityStorage.java:125),
> > > > > > > > > > > > > > > > > >> > > > so it's pretty clear which code needs to be optimized ;). I'm
> > > > > > > > > > > > > > > > > >> > > > evaluating possibilities within JPA/Hibernate, but I'm starting
> > > > > > > > > > > > > > > > > >> > > > to think that it might be better to switch to JDBC and skip
> > > > > > > > > > > > > > > > > >> > > > Hibernate. Our lives will be simpler, especially with a
> > > > > > > > > > > > > > > > > >> > > > relatively simple schema like ours (that would be my
> > > > > > > > > > > > > > > > > >> > > > recommendation if I were an external consultant).
> > > > > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > > > > >> > > > On Fri, Feb 9, 2024 at 4:15 PM Tibor Zimányi <[email protected]>
> > > > > > > > > > > > > > > > > >> > > > wrote:
> > > > > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > > > > >> > > > > Hi,
> > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > >> > > > > this will be a bit off-topic. However, as far as performance
> > > > > > > > > > > > > > > > > >> > > > > goes, I think we should consider the fact that we have string
> > > > > > > > > > > > > > > > > >> > > > > primary keys (IDs). I would expect database systems to be much
> > > > > > > > > > > > > > > > > >> > > > > better at indexing numeric keys than strings. I remember from
> > > > > > > > > > > > > > > > > >> > > > > the past, when I was working with DBs, that using strings as
> > > > > > > > > > > > > > > > > >> > > > > keys or indexes was a discouraged practice.
> > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > >> > > > > Best regards,
> > > > > > > > > > > > > > > > > >> > > > > Tibor
> > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > >> > > > > On Thu, Feb 8, 2024 at 22:45 Martin Weiler <[email protected]>
> > > > > > > > > > > > > > > > > >> > > > > wrote:
> > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > >> > > > > > I changed the test to use MongoDB [1] and I don't see a
> > > > > > > > > > > > > > > > > >> > > > > > performance degradation with this setup [2].
> > > > > > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > > > > > >> > > > > > Please keep us posted of your findings. Thanks!
> > > > > > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > > > > > >> > > > > > Martin
> > > > > > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > > > > > >> > > > > > [1] https://github.com/martinweiler/job-service-refactor-test/tree/mongodb
> > > > > > > > > > > > > > > > > >> > > > > > [2] https://drive.google.com/file/d/1NfacXaxJlgRMw4OQ5S20cvkzvaUKUVFj/view?usp=sharing
> > > > > > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > > > > > >> > > > > > ________________________________________
> > > > > > > > > > > > > > > > > >> > > > > > From: Francisco Javier Tirado Sarti <[email protected]>
> > > > > > > > > > > > > > > > > >> > > > > > Sent: Wednesday, February 7, 2024 11:40 AM
> > > > > > > > > > > > > > > > > >> > > > > > To: [email protected]
> > > > > > > > > > > > > > > > > >> > > > > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with
> > > > > > > > > > > > > > > > > >> > > > > > data-index persistence addon
> > > > > > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > > > > > >> > > > > > Yes, it can be index degradation because of size, but I
> > > > > > > > > > > > > > > > > >> > > > > > believe (I might be wrong) the DB is too small (yet) for that.
> > > > > > > > > > > > > > > > > >> > > > > > But, eventually, Postgres, when the DB is huge enough, will
> > > > > > > > > > > > > > > > > >> > > > > > unavoidably behave like the graph that Martin sent.
> > > > > > > > > > > > > > > > > >> > > > > > Since I believe we are not huge enough (yet), let's rule out
> > > > > > > > > > > > > > > > > >> > > > > > another issue by analysing the SQL logs (I requested those
> > > > > > > > > > > > > > > > > >> > > > > > from Martin offline and he is kindly going to collect them).
> > > > > > > > > > > > > > > > > >> > > > > > Also, I'm curious to know whether Mongo behaves in the same
> > > > > > > > > > > > > > > > > >> > > > > > way.
> > > > > > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > > > > > >> > > > > > On Wed, Feb 7, 2024 at 7:25 PM Enrique Gonzalez Martinez
> > > > > > > > > > > > > > > > > >> > > > > > <[email protected]> wrote:
> > > > > > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > > > > > >> > > > > > > Hi Francisco,
> > > > > > > > > > > > > > > > > >> > > > > > > I would highly recommend checking the indexes and how the
> > > > > > > > > > > > > > > > > >> > > > > > > updates work in data-index, to avoid full table scans and
> > > > > > > > > > > > > > > > > >> > > > > > > locking the full table. Some DBs are very sensitive to that.
> > > > > > > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > > > > > > >> > > > > > > On Wed, Feb 7, 2024 at 18:41, Francisco Javier Tirado Sarti
> > > > > > > > > > > > > > > > > >> > > > > > > <[email protected]> wrote:
> > > > > > > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > > > > > > >> > > > > > > > Hi Martin,
> > > > > > > > > > > > > > > > > >> > > > > > > > While I analyze the data, let me ask you if it is possible
> > > > > > > > > > > > > > > > > >> > > > > > > > to perform another check (similar in a way to disabling
> > > > > > > > > > > > > > > > > >> > > > > > > > data-index, as you did). Can you switch to MongoDB
> > > > > > > > > > > > > > > > > >> > > > > > > > persistence and check whether the same degradation that is
> > > > > > > > > > > > > > > > > >> > > > > > > > there for Postgres remains?
> > > > > > > > > > > > > > > > > >> > > > > > > > I do not know if this is feasible, but it would certainly
> > > > > > > > > > > > > > > > > >> > > > > > > > indicate whether the problem is in the Postgres storage
> > > > > > > > > > > > > > > > > >> > > > > > > > layer, and I do not have a clear prediction of what we will
> > > > > > > > > > > > > > > > > >> > > > > > > > see when making this switch.
> > > > > > > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > > > > > > >> > > > > > > > On Wed, Feb 7, 2024 at 6:37 PM Martin Weiler
> > > > > > > > > > > > > > > > > >> > > > > > > > <[email protected]> wrote:
> > > > > > > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > > > > > > >> > > > > > > > > Hi Francisco,
> > > > > > > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > > > > > > >> > > > > > > > > thanks for your work on this important topic!
> > > > > > > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > > > > > > >> > > > > > > > > I would like to share some test results here, which might
> > > > > > > > > > > > > > > > > >> > > > > > > > > help to improve the codebase even further. I am using the
> > > > > > > > > > > > > > > > > >> > > > > > > > > jmeter based test case from Pere and Enrique (thanks guys!)
> > > > > > > > > > > > > > > > > >> > > > > > > > > [1] which uses a load of 30 threads to
> > > > > > > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > > > > > > >> > > > > > > > > 1) start a new process instance (POST)
> > > > > > > > > > > > > > > > > >> > > > > > > > > 2) retrieve tasks for a user (GET)
> > > > > > > > > > > > > > > > > >> > > > > > > > > 3) fetches task details (GET)
> > > > > > > > > > > > > > > > > >> > > > > > > > > 4) complete a task (POST)
> > > > > > > > > > > > > > > > > >> > > > > > > > > 5) execute a query on data-audit
> > > > > > > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > > > > > > >> > > > > > > > > With this test setup, I noticed that the performance for
> > > > > > > > > > > > > > > > > >> > > > > > > > > the POST requests, in particular the one to start a new
> > > > > > > > > > > > > > > > > >> > > > > > > > > process instance, degrades over time - see graph [2]. If I
> > > > > > > > > > > > > > > > > >> > > > > > > > > run the same test without data-index, then there is no such
> > > > > > > > > > > > > > > > > >> > > > > > > > > performance degradation [3]. You can find a thread dump
> > > > > > > > > > > > > > > > > >> > > > > > > > > captured a few minutes into the first test here [4] that
> > > > > > > > > > > > > > > > > >> > > > > > > > > might help to see some of the contention points.
> > > > > > > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > > > > > > >> > > > > > > > > I'd appreciate if you could take a look and see if there is
> > > > > > > > > > > > > > > > > >> > > > > > > > > something that can be further improved based on your
> > > > > > > > > > > > > > > > > >> > > > > > > > > previous work. If you need any additional data, let me
> > > > > > > > > > > > > > > > > >> > > > > > > > > know, but otherwise it is straightforward to run the jmeter
> > > > > > > > > > > > > > > > > >> > > > > > > > > test as well.
> > > > > > > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > > > > > > >> > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > >> > > > > > > > > Martin
> > > > > > > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > > > > > > >> > > > > > > > > [1] https://github.com/pefernan/job-service-refactor-test/
> > > > > > > > > > > > > > > > > >> > > > > > > > > [2] https://drive.google.com/file/d/1Gqn-ixE05kXv2jdssAUlnMuUVcHxIYZ0/view?usp=sharing
> > > > > > > > > > > > > > > > > >> > > > > > > > > [3] https://drive.google.com/file/d/10gVNyb4JYg_bA18bNhY9dEDbPn3TOxL7/view?usp=sharing
> > > > > > > > > > > > > > > > > >> > > > > > > > > [4] https://drive.google.com/file/d/1jVrtsO49gCvUlnaC9AUAtkVKTm4PbdUv/view?usp=sharing
> > > > > > > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > > > > > > >> > > > > > > > > ________________________________________
> > > > > > > > > > > > > > > > > >> > > > > > > > > From: Francisco Javier Tirado Sarti <[email protected]>
> > > > > > > > > > > > > > > > > >> > > > > > > > > Sent: Wednesday, January 17, 2024 9:13 AM
> > > > > > > > > > > > > > > > > >> > > > > > > > > To: [email protected]
> > > > > > > > > > > > > > > > > >> > > > > > > > > Cc: Pere Fernandez Perez
> > > > > > > > > > > > > > > > > >> > > > > > > > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues
> > > > > > > > > > > > > > > > > >> > > > > > > > > with data-index persistence addon
> > > > > > > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > > > > > > >> > > > > > > > > Hi Alex,
> > > > > > > > > > > > > > > > > >> > > > > > > > > I did not take timings (which depend on a number of
> > > > > > > > > > > > > > > > > >> > > > > > > > > variables that change drastically between environments),
> > > > > > > > > > > > > > > > > >> > > > > > > > > but I verified that the number of updates has been reduced
> > > > > > > > > > > > > > > > > >> > > > > > > > > drastically without losing functionality, which is
> > > > > > > > > > > > > > > > > >> > > > > > > > > objectively a good thing. Before the change, for every node
> > > > > > > > > > > > > > > > > >> > > > > > > > > executed, we had an update for every node previously
> > > > > > > > > > > > > > > > > >> > > > > > > > > executed, so if a process has 50 nodes to execute, we were
> > > > > > > > > > > > > > > > > >> > > > > > > > > performing nearly 50*51/2 updates, which gives us a total
> > > > > > > > > > > > > > > > > >> > > > > > > > > of 1275 updates. Now we have just one for every node being
> > > > > > > > > > > > > > > > > >> > > > > > > > > executed, implying a total of 50
> >
>
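
As a rough check of the arithmetic in the January 17 message quoted above: if the k-th executed node triggers one update for itself plus one for each node executed before it (an assumed reading of "an update for every node previously executed"), then a 50-node process costs 1 + 2 + ... + 50 = 50*51/2 = 1275 updates, against 50 once each executed node is updated only once. A minimal Python sketch of that count; the function names are hypothetical and this is not code from data-index:

```python
def updates_before_change(nodes: int) -> int:
    """Assumed model: executing the k-th node issues k updates
    (itself plus every node executed before it)."""
    return sum(k for k in range(1, nodes + 1))  # equals nodes * (nodes + 1) // 2


def updates_after_change(nodes: int) -> int:
    """One update per executed node."""
    return nodes


if __name__ == "__main__":
    n = 50
    print(updates_before_change(n), updates_after_change(n))
```

Under this model the update count grows quadratically with the number of nodes before the change and linearly after it, which is why the saving matters more for longer processes.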
