On Thu, Jun 23, 2016 at 12:40:28PM +0200, Sumit Bose wrote:
> On Thu, Jun 23, 2016 at 09:52:46AM +0200, Jakub Hrozek wrote:
> > On Tue, Jun 21, 2016 at 10:28:52PM +0200, Lukas Slebodnik wrote:
> > > On (21/06/16 22:08), Jakub Hrozek wrote:
> > > >On Tue, Jun 21, 2016 at 11:46:51AM +0200, Jakub Hrozek wrote:
> > > >> On Mon, Jun 20, 2016 at 06:57:55PM +0200, Jakub Hrozek wrote:
> > > >> > On Fri, Jun 10, 2016 at 01:33:49PM +0200, Jakub Hrozek wrote:
> > > >> > > On Fri, Jun 03, 2016 at 12:43:41PM +0200, Jakub Hrozek wrote:
> > > >> > > > Hi,
> > > >> > > > 
> > > >> > > > the attached patchset improves SSSD by performance by adding a 
> > > >> > > > second
> > > >> > > > ldb cache file that is opened with LDB_FLG_NOSYNC and contains 
> > > >> > > > empeheral
> > > >> > > > data like timestamps. The reason is to avoid updating the 
> > > >> > > > sync-cache in
> > > >> > > > case nothing in fact changed and SSSD is only refreshing the 
> > > >> > > > cache to
> > > >> > > > add new timestamps. Especially for group lookups, this can 
> > > >> > > > improve
> > > >> > > > performance a bit. The first lookup, or any where the actual data
> > > >> > > > changes is still slow, though.
> > > >> > > > 
> > > >> > > > I don't have a CI link yet, because I'm having issues installing 
> > > >> > > > pyldb on
> > > >> > > > our test machines.
> > > >> > > > 
> > > >> > > > The last patch doesn't touch sysdb, but the LDAP provider. I 
> > > >> > > > hope it's
> > > >> > > > fine to include it in the same set, though.
> > > >> > > > 
> > > >> > > > The full design document is at:
> > > >> > > >     
> > > >> > > > https://fedorahosted.org/sssd/wiki/DesignDocs/OneFourteenPerformanceImprovements
> > > >> > > 
> > > >> > > the attached patches fix one bug in the NSS responder related to 
> > > >> > > MPG
> > > >> > > domains and one issue with adding incomplete groups.
> > > >> > 
> > > >> > I found another bug:
> > > >> >     if you upgrade to this version from one that didn't have the
> > > >> >     timestamp cache, sss_cache doesn't work until the cache is
> > > >> >     invalidated.
> > > >> > 
> > > >> > I'm not sure how to fix this yet, whether to create the timestamp
> > > >> > entries on upgrade to be safe or maybe invalidate all entries during
> > > >> > upgrade..but I would like to wait if there are any review comments 
> > > >> > from
> > > >> > anybody before proceeding.
> > > >> 
> > > >> Attached are patches rebased on today's origin/master.
> > > >
> > > >btw the integration test I wrote seems a bit flaky in CI:
> > > >    http://sssd-ci.duckdns.org/logs/job/45/68/rhel6/ci.html
> > > >although it just works for me locally. I wonder if it would be
> > > >acceptable to use the test only during patch review and fix it up later.
> > > 
> > > You needn't remove test from patch.
> > > Just remove file with test.
> > > It should not start with "test_" or finish with "_test.py"
> > 
> > sleep(1) after the modification and before requesting the entry "solved"
> > it.
> > 
> > The attached patches also fix several Coverity issues and other review
> > comments Sumit found during his review.
> > 
> > CI: http://sssd-ci.duckdns.org/logs/job/45/80/summary.html
> 
> Thank you for the patches. In my tests I've seen 5-10x faster lookups of
> expired cached objects which didn't change on the server.
> 
> All my comments I discussed with Jakub on irc are included in the latest
> patch set and CI in general passed as well
> http://sssd-ci.duckdns.org/logs/job/45/81/rhel7/ci.html (the failure on
> RHEL7 is in test_add_remove_membership_rfc2307_bis where I've seen
> similar issues with other runs, maybe it is a timing issue as well, i.e.
> we re-read the data from the server before 389ds was able to properly
> update the related object?).
> 
> ACK
> 
> bye,
> Sumit

Thank you very much for the review. I pushed all the patches to master:
    * beec1ee5799570f34a51ea57674c7291c15f7022
    * b8946a5dbde01a87465de707092716349a35248b
    * d36f4db9bb5efc63b94190cca25adb08ee56971c
    * 3bd9da80f71a6794af0a6b3fbc11bc3a2da64638
    * 4016c7dd288d379118b47ecbe7d8f46cfcb0d400
    * 40de79d69860ec7f04bf7795bd88b641ec42fd23
    * a257259b05d62ebe548b6c798a3aa03a97dbc0c2
    * dd285415d7a8d8376207960cfa3e977524c3b98c
    * 13d7df10bf4d76c333a9169f9fcbeb891d870351
    * f983b400bf4f6fb14a2174d6f58071e06e9ec832
    * f21b3cce14055e77af8ccb98dd8e0fa1ec1f7944
    * 72dbcd0a3361f1c0f0c3e348aa2fbcabd926188b
    * e732d23f3ec986a463d757781a334040e03d1f59
    * 6e9d7cbe43fdfc866b18f9ef0779bbfc10ad6f3a

On IRC we talked about adding more debugging to the code. Because I'm sure
I would forgot otherwise, I filed:
  https://fedorahosted.org/sssd/ticket/3060 - better debugging of timestamp
  cache modifications
and also:
  https://fedorahosted.org/sssd/ticket/3061 - Add systemtap probes into
  the top-level data provider requests
The latter would be useful in case some users don't see the promised
performance enhancement, at least it would allow us to see which requests
in general take so long. We already have probes for LDAP provider, but
not the others.
_______________________________________________
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org

Reply via email to