On Wed, 2011-12-21 at 15:33 -0500, Dan Scott wrote:
> On Wed, Dec 21, 2011 at 14:10, Dan Scott <danieljamessc...@gmail.com> wrote:
> > On Mon, Dec 19, 2011 at 15:26, Dan Scott <danieljamessc...@gmail.com> wrote:
> >> On Mon, Dec 19, 2011 at 14:14, Simo Sorce <s...@redhat.com> wrote:
> >>> On Mon, 2011-12-19 at 11:01 -0500, Dan Scott wrote:
> >>>> On Thu, Dec 15, 2011 at 11:51, Rich Megginson <rmegg...@redhat.com> wrote:
> >>>> > On 12/15/2011 09:48 AM, Dan Scott wrote:
> >>>> >>
> >>>> >> Hi,
> >>>> >>
> >>>> >> On Thu, Dec 15, 2011 at 10:58, Rich Megginson<rmegg...@redhat.com> wrote:
> >>>> >>>
> >>>> >>> On 12/15/2011 08:41 AM, Dan Scott wrote:
> >>>> >>>>
> >>>> >>>> Hi,
> >>>> >>>>
> >>>> >>>> On my Fedora 15 FreeIPA server, I'm having some problems with
> >>>> >>>> stability. The server appears to 'hang' and stops responding to LDAP
> >>>> >>>> lookups. When I restart the dirsrv service, I get:
> >>>> >>>>
> >>>> >>>> Dec 15 09:40:02 ohm kernel: [254566.011404] ns-slapd[28910]: segfault
> >>>> >>>> at 17d ip 00007f00dbc0208c sp 00007fff929b7848 error 4 in
> >>>> >>>> libc-2.14.so[7f00dbb87000+18f000]
> >>>> >>>>
> >>>> >>>> and the /var/log/dirsrv/slapd-EXAMPLE-COM/errors contains
> >>>> >>>>
> >>>> >>>> [15/Dec/2011:09:47:35 -0500] set_krb5_creds - Could not get initial
> >>>> >>>> credentials for principal [ldap/example....@example.com] in keytab
> >>>> >>>> [WRFILE:/etc/dirsrv/ds.keytab]: -1765328228 (Cannot contact any KDC
> >>>> >>>> for requested realm)
> >>>> >>>> [15/Dec/2011:09:47:35 -0500] slapd_ldap_sasl_interactive_bind - Error:
> >>>> >>>> could not perform interactive bind for id [] mech [GSSAPI]: error -2
> >>>> >>>> (Local error) (SASL(-1): generic failure: GSSAPI Error: Unspecified
> >>>> >>>> GSS failure. Minor code may provide more information (Credentials
> >>>> >>>> cache file '/tmp/krb5cc_496' not found))
> >>>> >>>>
> >>>> >>>> This is happening very frequently, I'm having to restart the dirsrv
> >>>> >>>> process once an hour, otherwise people start complaining.
> >>>> >>>>
> >>>> >>>> I experienced similar problems with FreeIPA 1, when I was using Fedora
> >>>> >>>> 14 and earlier, and had to regularly (also once per hour) restart the
> >>>> >>>> dirsrv process. Could this be related?
> >>>> >>>>
> >>>> >>>> I also noticed this:
> >>>> >>>> https://bugzilla.redhat.com/show_bug.cgi?id=730387
> >>>> >>>>
> >>>> >>>> There are updates in 'updates-testing' which I believe fix the above
> >>>> >>>> issue, but I'm reluctant to install from a testing repo on my
> >>>> >>>> production server, can anyone report any feedback on this?
> >>>> >>>
> >>>> >>> The above bug does not cause a segfault.
> >>>> >>> What version of 389-ds-base are you using?
> >>>> >>
> >>>> >> [root@ohm ~]# rpm -qa|grep 389
> >>>> >> 389-ds-base-libs-1.2.10-0.4.a4.fc15.x86_64
> >>>> >> 389-ds-base-1.2.10-0.4.a4.fc15.x86_64
> >>>> >> [root@ohm ~]#
> >>>> >
> >>>> > a4 is alpha software. Not sure how that got released to stable.
> >>>> >
> >>>> >>> Please enable the collection of core dumps so we can debug the crash - see
> >>>> >>> http://directory.fedoraproject.org/wiki/FAQ#Debugging_Crashes
> >>>> >>
> >>>> >> OK. I think there is a small typo in the instructions:
> >>>> >>
> >>>> >> 'debuginfo-install 389-ds-base-debuginfo' should be 'debuginfo-install
> >>>> >> 389-ds-base'
> >>>> >
> >>>> > Thanks. Fixed.
> >>>> >
> >>>> >> I managed to get the core dump (attached - so I only sent this message
> >>>> >> to you, not the list as well), but it doesn't contain much
> >>>> >> information.
> >>>> >
> >>>> > This is https://bugzilla.redhat.com/show_bug.cgi?id=755725
> >>>> >
> >>>> > Will be fixed in 1.2.10.a6
> >>>> >
> >>>> > But this still doesn't explain your kerberos errors.
> >>>>
> >>>> An additional problem is also occurring. I've been finding that the:
> >>>>
> >>>> /etc/dirsrv/slapd-EXAMPLE-COM/dse.ldif
> >>>>
> >>>> file is empty and prevents dirsrv from starting. I can restore it from
> >>>> dse.ldif.bak or dse.ldif.startOK, but this may be related to the LDAP
> >>>> problems that I'm having?
> >>>
> >>> This is an upgrade time problem, it should be fixed in latest packages.
> >>> Did you recently upgrade freeipa packages if so from what version to
> >>> what version ?
> >>
> >> The 0 length file doesn't appear related to upgrades. Possibly it only
> >> happens on the first service restart after an upgrade?
> >>
> >> It's happened at least 4 times since the last freeipa package upgrade
> >> on 4th November, so it seems to be happening too regularly to be the
> >> result of an upgrade.
> >>
> >> [root@curie ~]# grep freeipa /var/log/yum.log
> >> Sep 06 16:56:51 Installed: freeipa-python-2.0.1-2.fc15.x86_64
> >> Sep 06 17:00:13 Installed: freeipa-client-2.0.1-2.fc15.x86_64
> >> Sep 06 17:00:14 Installed: freeipa-admintools-2.0.1-2.fc15.x86_64
> >> Sep 06 17:01:52 Installed: freeipa-server-selinux-2.0.1-2.fc15.x86_64
> >> Sep 06 17:01:56 Installed: freeipa-server-2.0.1-2.fc15.x86_64
> >> Sep 08 11:23:35 Updated: freeipa-python-2.1.0-1.fc15.x86_64
> >> Sep 08 11:23:41 Updated: freeipa-client-2.1.0-1.fc15.x86_64
> >> Sep 08 11:23:41 Updated: freeipa-admintools-2.1.0-1.fc15.x86_64
> >> Sep 08 11:25:00 Updated: freeipa-server-selinux-2.1.0-1.fc15.x86_64
> >> Sep 08 11:26:06 Updated: freeipa-server-2.1.0-1.fc15.x86_64
> >> Nov 04 15:46:43 Updated: freeipa-python-2.1.3-2.fc15.x86_64
> >> Nov 04 15:52:48 Updated: freeipa-client-2.1.3-2.fc15.x86_64
> >> Nov 04 15:52:48 Updated: freeipa-admintools-2.1.3-2.fc15.x86_64
> >> Nov 04 15:54:47 Updated: freeipa-server-2.1.3-2.fc15.x86_64
> >> Nov 04 15:56:02 Updated: freeipa-server-selinux-2.1.3-2.fc15.x86_64
> >>
> >> Dan
> >
> > I'm still having fairly serious problems. I keep getting:
> >
> > ipa: ERROR: Kerberos error: Kerberos error: ('Unspecified GSS failure.
> > Minor code may provide more information', 851968)/('Cannot contact
> > any KDC for requested realm', -1765328228)/
> >
> > Whenever I try and run IPA commands on either of my servers, or a
> > client with the admin tools installed.
> >
> > The server logs contain:
> >
> > slapd_ldap_sasl_interactive_bind - Error: could not perform
> > interactive bind for id [] mech [GSSAPI]: error -1 (Can't contact LDAP
> > server) ((null))
> > slapi_ldap_bind - Error: could not perform interactive bind for id []
> > mech [GSSAPI]: error -1 (Can't contact LDAP server)
> >
> > And I can't create new replicas because they fail with:
> >
> > 2011-12-21 11:25:58,356 DEBUG Failed to start replication
> >   File "/usr/sbin/ipa-replica-install", line 484, in <module>
> >     main()
> >
> >   File "/usr/sbin/ipa-replica-install", line 435, in main
> >     ds = install_replica_ds(config)
> >
> >   File "/usr/sbin/ipa-replica-install", line 137, in install_replica_ds
> >     pkcs12_info)
> >
> >   File "/usr/lib/python2.7/site-packages/ipaserver/install/dsinstance.py",
> > line 284, in create_replica
> >     self.start_creation("Configuring directory server", 60)
> >
> >   File "/usr/lib/python2.7/site-packages/ipaserver/install/service.py",
> > line 248, in start_creation
> >     method()
> >
> >   File "/usr/lib/python2.7/site-packages/ipaserver/install/dsinstance.py",
> > line 297, in __setup_replica
> >     r_bindpw=self.dm_password)
> >
> >   File "/usr/lib/python2.7/site-packages/ipaserver/install/replication.py",
> > line 694, in setup_replication
> >     raise RuntimeError("Failed to start replication")
> >
> > Can someone help me? This is getting fairly serious because I can't
> > create/modify anything and I'm worried that there will be problems
> > with existing users soon as well.
>
> OK, I think I'm narrowing in on this. It looks like the replication
> agreement is broken and the servers have got out of sync:
odd

> On the 'master' server (which contains the PKI dirsrv process):

The PKI instance uses a different set of replication agreements, so you
can't see those agreements with ipa-replica-manage, which handles only
the IPA IdM instance.

> [root@fileserver1 ~]# ipa-replica-manage list
> fileserver1.example.com: master
>
> On the other server:
>
> [root@fileserver2 ~]# ipa-replica-manage list
> fileserver1.example.com: master
> fileserver2.example.com: master

strange indeed.

> When I try and add the missing replication:
>
> [root@fileserver1 ~]# ipa-replica-manage connect fileserver2.example.com
> unexpected error: list index out of range
>
> Do I need to delete the replication from fileserver2?

You can't remove a replication agreement if it is the only agreement you
have. This is to avoid split-brain situations.

Not sure how to handle a disappeared agreement, though. It's theoretically
not possible unless you 'inadvertently' ran
ipa-replica-manage --force del fileserver2
on fileserver1 ...

Can you look into cn=config and see if you have references to
fileserver2? Maybe it is just a bug in displaying the actually active
replicas.

> As an aside, there are some errors in the documentation for
> ipa-replica-manage. Some of the examples have 'ipa replica-manage'
> instead of 'ipa-replica-manage' (space instead of '-').

Thanks, will file a doc bug.

Simo.

--
Simo Sorce * Red Hat, Inc * New York

_______________________________________________
Freeipa-users mailing list
Freeipa-users@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-users
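
For the cn=config check suggested above, a rough sketch of one way to look for leftover references to fileserver2 follows. It is not part of the IPA tooling. It assumes the agreements live under "cn=mapping tree,cn=config" as entries with objectClass nsds5ReplicationAgreement and the peer host in nsDS5ReplicaHost (the usual 389-ds layout, but please verify on your instance), that ldapsearch is installed, and that you can bind as "cn=Directory Manager". The function names and the default URI are made up for illustration.

```python
import subprocess

# Assumptions (check against your own instance): where 389-ds keeps
# replication agreements, and the attribute that names the peer host.
BASE_DN = "cn=mapping tree,cn=config"
FILTER = "(objectClass=nsds5ReplicationAgreement)"
HOST_ATTR = "nsds5replicahost"  # compared case-insensitively


def fetch_agreements_ldif(uri="ldap://localhost"):
    """Run ldapsearch as Directory Manager; -W prompts for the password."""
    cmd = ["ldapsearch", "-LLL", "-x", "-H", uri,
           "-D", "cn=Directory Manager", "-W",
           "-b", BASE_DN, FILTER, "dn", "nsDS5ReplicaHost"]
    result = subprocess.run(cmd, check=True, stdout=subprocess.PIPE, text=True)
    return result.stdout


def parse_ldif(text):
    """Parse simple LDIF into a list of {lowercased attr: [values]} dicts.

    Base64 values (attr:: ...) are left undecoded; that is enough for
    plain host names.
    """
    entries, current, last_key = [], {}, None
    for raw in text.splitlines():
        if raw.startswith("#"):
            last_key = None
            continue
        if raw.startswith(" "):
            # Folded line: glue it onto the previous value, if any.
            if last_key is not None:
                current[last_key][-1] += raw[1:]
            continue
        if not raw.strip():
            # A blank line ends the current entry.
            if current:
                entries.append(current)
            current, last_key = {}, None
            continue
        key, sep, value = raw.partition(":")
        if not sep:
            continue
        last_key = key.strip().lower()
        current.setdefault(last_key, []).append(value.strip())
    if current:
        entries.append(current)
    return entries


def agreements_for_host(entries, host):
    """Return DNs of agreements whose peer is host (short name or FQDN)."""
    host = host.lower()
    found = []
    for entry in entries:
        for value in entry.get(HOST_ATTR, []):
            value = value.lower()
            if "dn" in entry and (value == host or value.startswith(host + ".")):
                found.append(entry["dn"][0])
                break
    return found
```

Run on fileserver1, something like agreements_for_host(parse_ldif(fetch_agreements_ldif()), "fileserver2") would list any agreement entries that still point at fileserver2. An empty list would suggest the agreement really is gone on that side rather than just not being displayed.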