[389-devel] Re: Implementing Referential Integrity by using a queue instead of a file

Ludwig Mon, 10 Jul 2017 01:03:08 -0700


On 07/10/2017 12:13 AM, William Brown wrote:

On Fri, 2017-07-07 at 11:44 +0200, Ludwig Krispenz wrote:

On 07/07/2017 10:44 AM, Ludwig Krispenz wrote:

On 07/07/2017 07:10 AM, William Brown wrote:

Any thoughts or objections on the above would be welcome.

The only problem with going to a queue is if the server goes down
unexpectedly.  In such a case those RI updates would be lost.
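
To make that failure mode concrete, here is a minimal sketch of a toy model. It is not the refint plugin's actual code, and the class and method names (DelayedRefint, process_pending, crash, dangling) are hypothetical. It assumes the delete is applied immediately while the reference cleanup sits in an in-memory queue until a later pass.

```python
from collections import deque


class DelayedRefint:
    """Toy model (hypothetical): the delete is applied at once,
    the referential integrity cleanup is queued in memory."""

    def __init__(self, entries, groups):
        self.entries = set(entries)  # DNs that currently exist
        self.groups = {g: set(m) for g, m in groups.items()}  # group -> member DNs
        self.pending = deque()  # in-memory only, so lost on a crash

    def delete(self, dn):
        # The primary change is visible immediately.
        self.entries.discard(dn)
        # The cleanup of references to dn is deferred.
        self.pending.append(dn)

    def process_pending(self):
        # The delayed pass: remove references to every queued DN.
        while self.pending:
            dn = self.pending.popleft()
            for members in self.groups.values():
                members.discard(dn)

    def crash(self):
        # Simulated unexpected shutdown: queued work is gone.
        self.pending.clear()

    def dangling(self):
        # References that point at entries which no longer exist.
        return {dn for members in self.groups.values()
                for dn in members if dn not in self.entries}


srv = DelayedRefint(["uid=a", "uid=b"], {"cn=g": ["uid=a", "uid=b"]})
srv.delete("uid=a")
srv.process_pending()  # cleanup ran, no dangling reference to uid=a
srv.delete("uid=b")
srv.crash()            # cleanup for uid=b never ran
print(srv.dangling())  # {'uid=b'}
```

In this model the dangling reference stays until something rescans the groups, which is the role a fixup task would play.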

We already have this issue because there is a delay between the change
to the object and the log being sync() to disk. So we can already lose
changes here. TBH the only fix is to remove the async model. I actually
question why we still need async/delay processing of the refint
plugin ...

Historically speaking, a long time ago, we used to see high CPU when the
RI plugin was engaged.  Setting the delay to 1 second, and allowing the
log thread to do the work, improved performance.  Of course this is now
obsolete with the betxn plugin model and other improvements, but I
wanted to share why the feature even existed.

I guess that would be related to internal op searches / lack of
indexing. These days it's not as big of an issue.

Boldly said. How do you know? Did you verify it?
We have seen many customer issues with refint which were resolved
by making it async; just removing this option without proof of a
better solution is no good.
I am also not sure that we need to tie anything into the betxn. There
are operations which, in my opinion, can be delayed, even redone by
fixup tasks, so it is not necessary to have it in one txn. And if the
option is there to delay it if you want, we should not take it away.

So, let's open a ticket to remove delayed processing mode then?

You can, but I will oppose implementing it :-)

I would even go and suggest to implement the delay feature for memberof,
like a continuous fixup task.

I disagree: Right now a delayed / async mode causes us issues because you
have a separation between "the change" and the database reflecting
it, be it through memberof or refint. We have lost the atomic nature
of the DB as a result.

Yes and no. A client operation does not know which plugins are enabled,
whether they are delayed, betxn or postop plugins, or whether they are
temporarily disabled. We still have the ACID property for the client.
If we set the delay, the main operation and the secondary operation
(e.g. memberof or refint) are no longer atomic. But this is a deliberate
decision; you also cannot prevent an admin from disabling memberof for
some time and later running fixup, and in that period group changes and
memberof are not atomic either.


Memberof especially so because SSSD uses this for membership. If we
implemented this we could cause undue delays to access controls be it
addition or removal while a consumer waits for the processing to occur.

Who said that we have to use a delay in this deployment?


Additionally, the async modes *force* the plugin to run on a single
master, rather than all masters lest we cause many replication
conflicts.

No.


For the sake of replication master consistency being able to run the
same configuration on all masters is extremely important.

Yes, but your argument about repl conflicts is not correct.


Finally, any delays in refint/memberof processing:

* We have a better MO algorithm there, it's just not been implemented
yet.
* Perhaps refint is missing indexes, or needs an algorithmic change of
its own.


I think that it's a better investment of time to *fix* our problems,
rather than hide and delay them behind async processing tasks (which
arguably, will cause other random delays by being batched, and holding
the write lock for a longer time period than a single write).

I fully agree. But there is no need for investment for a delay in
refint; it is there and is used by a few deployments. And you want to
invest in taking this away and/or replacing it while there is no need
or requirement to do so.


So, I will open this ticket, and I think that given your concerns about
this Ludwig, we'll need to keep in mind that refint processing
performance is a key aspect of this change, and that we will need to
perform some significant load testing to guarantee we will not cause a
regression in this space.



_______________________________________________
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
