Hi Thierry,

Because of the issues, we had to revert to the OpenLDAP setup we were running earlier in production.

I have now made a few TCP-related changes in sysctl.conf and have also
increased nsslapd-dbcachesize and nsslapd-cachememsize to 200 MB.

I will start migrating hosts back to IPA again and see whether I hit the
earlier issue. I will update the thread once I have something.

Thanks,
Rakesh

On Thu, Aug 25, 2016 at 2:17 PM, thierry bordaz <tbor...@redhat.com> wrote:

>
> On 08/25/2016 10:15 AM, Rakesh Rajasekharan wrote:
>
> All of the troubleshooting seems fine.
>
> However, running logconv.pl gives me this output:
>
> ----- Recommendations -----
>
> 1. You have unindexed components, this can be caused from a search on an
> unindexed attribute, or your returned results exceeded the
> allidsthreshold. Unindexed components are not recommended. To refuse
> unindexed searches, switch 'nsslapd-require-index' to 'on' under your
> database entry (e.g. cn=UserRoot,cn=ldbm database,cn=plugins,cn=config).
>
> 2. You have a significant difference between binds and unbinds. You may
> want to investigate this difference.
>
> I feel this could be a pointer to things going slow and IPA hanging. I
> think I now have something that I can try in order to nail down this issue.
>
> On a side note, I was earlier running OpenLDAP and migrated over to
> FreeIPA.
>
> Thanks
> Rakesh
>
>
> On Wed, Aug 24, 2016 at 12:38 PM, Petr Spacek <pspa...@redhat.com> wrote:
>
>> On 23.8.2016 18:44, Rakesh Rajasekharan wrote:
>> > I think there's something seriously wrong with my system
>> >
>> > not able to run any IPA commands
>> >
>> > klist
>> > Ticket cache: KEYRING:persistent:0:0
>> > Default principal: ad...@xyz.com
>> >
>> > Valid starting Expires Service principal
>> > 2016-08-23T16:26:36 2016-08-24T16:26:22 krbtgt/ <xyz....@xyz.com>
>> xyz....@xyz.com
>> >
>> >
>> > [root@prod-ipa-master-1a :~] ipactl status
>> > Directory Service: RUNNING
>> > krb5kdc Service: RUNNING
>> > kadmin Service: RUNNING
>> > ipa_memcached Service: RUNNING
>> > httpd Service: RUNNING
>> > pki-tomcatd Service: RUNNING
>> > ipa-otpd Service: RUNNING
>> > ipa: INFO: The ipactl command was successful
>> >
>> >
>> > [root@prod-ipa-master :~] ipa user-find p-testuser
>> > ipa: ERROR: Kerberos error: ('Unspecified GSS failure. Minor code may
>> > provide more information', 851968)/("Cannot contact any KDC for realm '
>> > XYZ.COM'", -1765328228)
>> >
>
> Hi Rakesh,
>
> Having a reproducible test case, would you rerun the command above?
> During its processing you may monitor the DS process load (top). If it is
> high, you may get some pstacks of it.
> Also, would you attach the part of the DS access logs taken during the command?
>
> regards
> thierry
>
>>
>> This is weird because the server seems to be up.
>>
>> Please follow
>> http://www.freeipa.org/page/Troubleshooting#Authentication.2FKerberos
>>
>> Petr^2 Spacek
>>
>> >
>> > Thanks
>> >
>> > Rakesh
>> >
>> > On Tue, Aug 23, 2016 at 10:01 PM, Rakesh Rajasekharan <
>> > rakesh.rajasekha...@gmail.com> wrote:
>> >
>> >> I changed the logging level to 4 by modifying nsslapd-accesslog-level.
>> >>
>> >> But the hang is still there,
>> >> though I don't see the segfault now.
>> >>
>> >> On Tue, Aug 23, 2016 at 9:02 PM, Rakesh Rajasekharan <
>> >> rakesh.rajasekha...@gmail.com> wrote:
>> >>
>> >>> My disk was getting filled too fast.
>> >>>
>> >>> Logs under /var/log/dirsrv were growing to around 5 GB, quickly filling up the disk.
>> >>>
>> >>> Is there a way to make the logging less verbose?
>> >>>
>> >>> On Tue, Aug 23, 2016 at 6:41 PM, Petr Spacek <pspa...@redhat.com> wrote:
>> >>>
>> >>>> On 23.8.2016 15:07, Rakesh Rajasekharan wrote:
>> >>>>> I was able to fix that, maybe temporarily. When I checked the network,
>> >>>>> there was another process running and consuming a lot of network
>> >>>>> bandwidth (I have no idea who started it; I need to seriously start
>> >>>>> restricting people's access to this machine).
>> >>>>>
>> >>>>> After killing it, performance improved drastically.
>> >>>>>
>> >>>>> But now I have suddenly started experiencing the same hang.
>> >>>>>
>> >>>>> This time, I get the following errors when I check dmesg:
>> >>>>>
>> >>>>> [ 301.236976] ns-slapd[3124]: segfault at 0 ip 00007f1de416951c sp 00007f1dee1dba70 error 4 in libcos-plugin.so[7f1de4166000+b000]
>> >>>>> [ 1116.248431] TCP: request_sock_TCP: Possible SYN flooding on port 88. Sending cookies. Check SNMP counters.
>> >>>>> [11831.397037] ns-slapd[22550]: segfault at 0 ip 00007f533d82251c sp 00007f5347894a70 error 4 in libcos-plugin.so[7f533d81f000+b000]
>> >>>>> [11832.727989] ns-slapd[22606]: segfault at 0 ip 00007f6231eb951c sp 00007f623bf2ba70 error 4 in libcos-plugin.so[7f6231eb6000+b00
>> >>>>
>> >>>> Okay, this one is serious. The LDAP server crashed.
>> >>>>
>> >>>> 1. Make sure all your packages are up-to-date.
>> >>>>
>> >>>> Please see
>> >>>> http://directory.fedoraproject.org/docs/389ds/FAQ/faq.html#debugging-crashes
>> >>>> for further instructions how to debug this.
>> >>>>
>> >>>> Petr^2 Spacek
>> >>>>
>> >>>>>
>> >>>>> and in /var/log/dirsrv/example-com/errors
>> >>>>>
>> >>>>> [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291138 (rc: 32)
>> >>>>> [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291139 (rc: 32)
>> >>>>> [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291140 (rc: 32)
>> >>>>> [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291141 (rc: 32)
>> >>>>> [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291142 (rc: 32)
>> >>>>> [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291143 (rc: 32)
>> >>>>> [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291144 (rc: 32)
>> >>>>> [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291145 (rc: 32)
>> >>>>> [23/Aug/2016:12:49:50 +0000] - Retry count exceeded in delete
>> >>>>> [23/Aug/2016:12:49:50 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3292734 (rc: 51)
>> >>>>>
>> >>>>> Can I do something about this error? I tried to restart ipa a couple of
>> >>>>> times but that did not help.
>> >>>>>
>> >>>>> Thanks
>> >>>>> Rakesh
>> >>>>>
>> >>>>> On Mon, Aug 22, 2016 at 2:27 PM, Petr Spacek <pspa...@redhat.com> wrote:
>> >>>>>
>> >>>>>> On 19.8.2016 19:32, Rakesh Rajasekharan wrote:
>> >>>>>>> I am running my setup on AWS cloud, and entropy is low at around 180.
>> >>>>>>>
>> >>>>>>> I plan to increase it by installing haveged.
>> >>>>>>> But would low entropy by any chance cause this issue of intermittent hangs?
>> >>>>>>> Also, the hang is mostly observed when registering around 20 clients
>> >>>>>>> together.
>> >>>>>>
>> >>>>>> Possibly, I'm not sure. If you want to dig into this, I would do this:
>> >>>>>> 1. look what process hangs on client (using pstree command or so)
>> >>>>>> $ pstree
>> >>>>>>
>> >>>>>> 2. look to what server and port is the hanging client connected to
>> >>>>>> $ lsof -p <PID of the hanging process>
>> >>>>>>
>> >>>>>> 3. jump to server and see what process is bound to the target port
>> >>>>>> $ netstat -pn
>> >>>>>>
>> >>>>>> 4. see where the process is hanging
>> >>>>>> $ strace -p <PID of the hanging process>
>> >>>>>>
>> >>>>>> I hope it helps.
>> >>>>>>
>> >>>>>> Petr^2 Spacek
>> >>>>>>
>> >>>>>>> On Fri, Aug 19, 2016 at 7:24 PM, Rakesh Rajasekharan <
>> >>>>>>> rakesh.rajasekha...@gmail.com> wrote:
>> >>>>>>>
>> >>>>>>>> Yes, there seems to be something that's worrying. I have faced this today
>> >>>>>>>> as well.
>> >>>>>>>> There are a few hosts left, around 280-odd, and when I try adding them to IPA,
>> >>>>>>>> the slowness begins.
>> >>>>>>>>
>> >>>>>>>> All the ipa commands, like ipa user-find, become very slow in
>> >>>>>>>> responding.
>> >>>>>>>>
>> >>>>>>>> The SYN_RECV connections are not many though, just around 80-90, and today
>> >>>>>>>> there were only around 20.
>> >>>>>>>>
>> >>>>>>>> I have for now increased tcp_max_syn_backlog to 5000.
>> >>>>>>>> For now the slowness seems to have gone,
>> >>>>>>>> but I will try adding the
>> >>>>>>>> clients again tomorrow and see how it goes.
>> >>>>>>>>
>> >>>>>>>> Thanks
>> >>>>>>>> Rakesh
>> >>>>>>>>
>> >>>>>>>> The issues
>> >>>>>>>>
>> >>>>>>>> On Fri, Aug 19, 2016 at 12:58 PM, Petr Spacek <pspa...@redhat.com> wrote:
>> >>>>>>>>
>> >>>>>>>>> On 18.8.2016 17:23, Rakesh Rajasekharan wrote:
>> >>>>>>>>>> Hi
>> >>>>>>>>>>
>> >>>>>>>>>> I am migrating to FreeIPA from OpenLDAP and have around 4000 clients.
>> >>>>>>>>>>
>> >>>>>>>>>> I had opened another thread on that, but chose to start a new one here
>> >>>>>>>>>> as it's a separate issue.
>> >>>>>>>>>>
>> >>>>>>>>>> I was able to change nsslapd-maxdescriptors by adding an LDIF file
>> >>>>>>>>>>
>> >>>>>>>>>> cat nsslapd-modify.ldif
>> >>>>>>>>>> dn: cn=config
>> >>>>>>>>>> changetype: modify
>> >>>>>>>>>> replace: nsslapd-maxdescriptors
>> >>>>>>>>>> nsslapd-maxdescriptors: 17000
>> >>>>>>>>>>
>> >>>>>>>>>> and running the ldapmodify command.
>> >>>>>>>>>>
>> >>>>>>>>>> I have now started moving clients running OpenLDAP to FreeIPA and have
>> >>>>>>>>>> today moved close to 2000 clients.
>> >>>>>>>>>>
>> >>>>>>>>>> However, I have noticed that IPA hangs intermittently.
>> >>>>>>>>>>
>> >>>>>>>>>> Running kinit admin returns the below error:
>> >>>>>>>>>> kinit: Generic error (see e-text) while getting initial credentials
>> >>>>>>>>>>
>> >>>>>>>>>> From /var/log/messages, I see this entry:
>> >>>>>>>>>>
>> >>>>>>>>>> prod-ipa-master-int kernel: [104090.315801] TCP: request_sock_TCP: Possible SYN flooding on port 88. Sending cookies. Check SNMP counters.
>> >>>>>>>>>
>> >>>>>>>>> I would be worried about this message. Maybe kernel/firewall is doing
>> >>>>>>>>> something fishy behind your back and blocking some connections or so.
>> >>>>>>>>>
>> >>>>>>>>> Petr^2 Spacek
>> >>>>>>>>>
>> >>>>>>>>>> Aug 18 13:00:01 prod-ipa-master-int systemd[1]: Started Session 4885 of user root.
>> >>>>>>>>>> Aug 18 13:00:01 prod-ipa-master-int systemd[1]: Starting Session 4885 of user root.
>> >>>>>>>>>> Aug 18 13:01:01 prod-ipa-master-int systemd[1]: Started Session 4886 of user root.
>> >>>>>>>>>> Aug 18 13:01:01 prod-ipa-master-int systemd[1]: Starting Session 4886 of user root.
>> >>>>>>>>>> Aug 18 13:02:40 prod-ipa-master-int python[28984]: ansible-command Invoked with creates=None executable=None shell=True args= removes=None warn=True chdir=None
>> >>>>>>>>>> Aug 18 13:04:37 prod-ipa-master-int sssd_be: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (KDC returned error string: PROCESS_TGS)
>> >>>>>>>>>>
>> >>>>>>>>>> Could it be that it's due to the initial load of adding the clients,
>> >>>>>>>>>> or is there something else that I need to take care of?
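
For reference, here is a minimal sketch of the cache tuning mentioned at the top of this thread (nsslapd-dbcachesize and nsslapd-cachememsize raised to 200 MB), written in the same LDIF form as the nsslapd-maxdescriptors change quoted earlier. It is not the exact change that was applied. It assumes the standard 389 Directory Server entry layout, the userRoot backend named in the logconv.pl recommendation, and that 200 MB means 200 * 1024 * 1024 = 209715200 bytes. The file name cache-tuning.ldif used below is hypothetical.

```ldif
# Sketch only: the DNs assume the default 389 DS layout; adjust to your instance.

# Database (Berkeley DB) cache, set on the global ldbm configuration entry.
dn: cn=config,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-dbcachesize
nsslapd-dbcachesize: 209715200

# Entry cache for the userRoot backend (assumed to be the backend in use).
dn: cn=userRoot,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-cachememsize
nsslapd-cachememsize: 209715200
```

A file like this would typically be applied as Directory Manager with something like ldapmodify -x -D "cn=Directory Manager" -W -f cache-tuning.ldif. Depending on the 389 DS version, cache autosizing may override manual values and the database cache change may need a directory server restart to take effect, so it is worth checking the values afterwards.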
--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project