Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-09-14 Thread Ludwig Krispenz

Hi,
On 09/13/2016 07:37 PM, Rakesh Rajasekharan wrote:

Hi All,

I have finally made some progress with this: after changing the
checkpoint interval to 180, the hangs have become less frequent.


However, I faced a similar hang yesterday: users were not able to
log in, though this time ns-slapd itself did not have any issues and
ldapsearch worked fine, possibly due to the checkpoint changes. So I
think I hit some other issue this time.


This is a bit confusing: if your server crashes with the attached
stack trace, ldapsearch cannot work.


About the core, it looks like you are hitting this  issue: 
https://fedorahosted.org/389/ticket/48388


I had a core generated, and this is its stack trace. Can you
please go through it and help me identify what could be causing the
issue this time? I have put a lot of effort into debugging this and would
really love to have it working in my prod env, as it does in my
other envs.


GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-80.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 


This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /usr/sbin/ns-slapd...
warning: the debug information found in 
"/usr/lib/debug//usr/sbin/ns-slapd.debug" does not match 
"/usr/sbin/ns-slapd" (CRC mismatch).





Reading symbols from /usr/sbin/ns-slapd...(no debugging symbols 
found)...done.

(no debugging symbols found)...done.
[New LWP 15255]
[New LWP 15286]
[New LWP 15245]
[New LWP 15246]
[New LWP 15247]
[New LWP 15248]
[New LWP 15243]

warning: the debug information found in 
"/usr/lib/debug//usr/lib64/dirsrv/libslapd.so.0.0.0.debug" does not 
match "/usr/lib64/dirsrv/libslapd.so.0" (CRC mismatch).





[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

warning: the debug information found in 
"/usr/lib/debug//usr/lib64/dirsrv/plugins/libsyntax-plugin.so.debug" 
does not match "/usr/lib64/dirsrv/plugins/libsyntax-plugin.so" (CRC 
mismatch).






warning: the debug information found in 
"/usr/lib/debug//usr/lib64/dirsrv/plugins/libbitwise-plugin.so.debug" 
does not match "/usr/lib64/dirsrv/plugins/libbitwise-plugin.so" (CRC 
mismatch).





...skipping...
-rw---. 1 dirsrv dirsrv  0 Sep  8 02:55 audit
-rw---. 1 dirsrv dirsrv 2551824384 Sep 12 17:32 core.10450
-rw---. 1 dirsrv dirsrv 1464463360 Sep 12 19:35 core.14709
-rw---. 1 dirsrv dirsrv 4483862528 Sep 13 01:05 core.15243
-rw---. 1 dirsrv dirsrv   66288165 Sep 13 02:10 errors
-rw---. 1 dirsrv dirsrv  104964391 Sep 13 08:30 access.20160913-074214
-rw---. 1 dirsrv dirsrv  105021859 Sep 13 09:26 access.20160913-083046
-rw---. 1 dirsrv dirsrv  104861746 Sep 13 10:31 access.20160913-092646
-rw---. 1 dirsrv dirsrv  105069140 Sep 13 11:36 access.20160913-103137
-rw---. 1 dirsrv dirsrv  104913480 Sep 13 12:41 access.20160913-113638
-rw---. 1 dirsrv dirsrv  105186788 Sep 13 13:46 access.20160913-124118
-rw---. 1 dirsrv dirsrv  105162159 Sep 13 14:51 access.20160913-134619
-rw---. 1 dirsrv dirsrv  105256624 Sep 13 15:56 access.20160913-145120
-rw---. 1 dirsrv dirsrv  105231158 Sep 13 17:01 access.20160913-155620
-rw---. 1 dirsrv dirsrv   1044 Sep 13 17:01 access.rotationinfo
-rw-r--r--. 1 root   root       19287 Sep 13 17:28 stacktrace.1473787719.txt

-rw---. 1 dirsrv dirsrv   45608914 Sep 13 17:29 access
[root@prod-ipa-master-int slapd-SPRINKLR-COM]# gdb -ex 'set confirm 
off' -ex 'set pagination off' -ex 'thread apply all bt full' -ex 
'quit' /usr/sbin/ns-slapd 
/var/log/dirsrv/slapd-SPRINKLR-COM/core.15243 stacktrace.`date 
+%s`.txt 2>&1^C
[root@prod-ipa-master-int slapd-SPRINKLR-COM]# gdb -ex 'set confirm 
off' -ex 'set pagination off' -ex 'thread apply all bt full' -ex 
'quit' /usr/sbin/ns-slapd 
/var/log/dirsrv/slapd-SPRINKLR-COM/core.15243 > stacktrace.`date 
+%s`.txt 2>&1

[root@prod-ipa-master-int slapd-SPRINKLR-COM]# ls -ltr
total 6404952
-rw---. 1 dirsrv dirsrv   
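
The two gdb invocations in the transcript differ only in the `>` redirect: the first was interrupted with ^C, the second wrote the backtrace to a timestamped file. A minimal sketch of that second command as reusable shell functions follows. The function names `gdb_bt_command` and `dump_core_stacks` are hypothetical; the gdb options are the ones used above. Note that the CRC mismatch warnings mean the installed debuginfo does not match the running binaries, so the resulting trace will probably lack symbols until matching debuginfo packages are installed.

```shell
#!/bin/sh
# Hypothetical helper: print the gdb command line from the transcript for a
# given binary and core file, so it can be reviewed before it is run.
gdb_bt_command() {
    binary=$1
    core=$2
    printf "gdb -ex 'set confirm off' -ex 'set pagination off' -ex 'thread apply all bt full' -ex 'quit' %s %s\n" "$binary" "$core"
}

# Hypothetical helper: run that command and save stdout and stderr to a
# timestamped file, then print the file name.
dump_core_stacks() {
    binary=$1
    core=$2
    out="stacktrace.$(date +%s).txt"
    gdb -ex 'set confirm off' -ex 'set pagination off' \
        -ex 'thread apply all bt full' -ex 'quit' \
        "$binary" "$core" > "$out" 2>&1
    echo "$out"
}

# Usage (not run here):
#   dump_core_stacks /usr/sbin/ns-slapd /var/log/dirsrv/slapd-SPRINKLR-COM/core.15243
```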

Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-09-07 Thread Rob Crittenden

Rakesh Rajasekharan wrote:

I feel that, since I migrated from my earlier OpenLDAP, there might be some
issues with the indexes. So maybe creating fresh users might help.
Is there a way to read the clear-text user password? That way I could
just recreate the users and set their older passwords.


Migrated users are added one-by-one in pretty much the same way that ipa 
user-add works so there shouldn't be any issues with the indexes, at 
least not due to the migration.


There is no way to retrieve the cleartext password.

rob



Just trying whatever I could think of to fix this :)


On Sep 7, 2016 5:16 PM, "Rakesh Rajasekharan"
<rakesh.rajasekha...@gmail.com>
wrote:

I changed the nsslapd-db-checkpoint interval first to 80 and then
increased it up to 160; however, the hang has not gone away yet.
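
A change like this can be expressed as an LDIF modify. The sketch below only generates the LDIF; it assumes the attribute is named `nsslapd-db-checkpoint-interval` and lives on the `cn=config,cn=ldbm database,cn=plugins,cn=config` entry, which should be confirmed against the 389-ds documentation for the installed version. The helper name `checkpoint_ldif` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: emit the LDIF for a new checkpoint interval (seconds).
# Assumption: attribute name and DN as described in the lead-in.
checkpoint_ldif() {
    interval=$1
    cat <<EOF
dn: cn=config,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-db-checkpoint-interval
nsslapd-db-checkpoint-interval: $interval
EOF
}

# Usage (not run here):
#   checkpoint_ldif 180 | ldapmodify -x -D "cn=Directory Manager" -W
```

Generating the LDIF separately makes it easy to review the exact change before it is piped to ldapmodify.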


I migrated my data from OpenLDAP; can that cause issues like
this? If so, I could try creating users from scratch directly on
IPA rather than migrating, and test it.

But before that, I have taken a fresh pstack just in case it helps
to nail down the issue.

Thread 44 (Thread 0x7f53ed52f700 (LWP 128165)):
#0  0x7f54047649b3 in select () from /lib64/libc.so.6
#1  0x7f5406eef0e9 in DS_Sleep () from
/usr/lib64/dirsrv/libslapd.so.0
#2  0x7f53fa53d907 in deadlock_threadmain () from
/usr/lib64/dirsrv/plugins/libback-ldbm.so
#3  0x7f540509e7bb in _pt_root () from /lib64/libnspr4.so
#4  0x7f5404a3fdc5 in start_thread () from /lib64/libpthread.so.0
#5  0x7f540476d28d in clone () from /lib64/libc.so.6
Thread 43 (Thread 0x7f53ecd2e700 (LWP 128166)):
#0  0x7f54047649b3 in select () from /lib64/libc.so.6
#1  0x7f53ff0cbbad in __os_yield () from /lib64/libdb-5.3.so

#2  0x7f53ff0c72b3 in __memp_sync_int () from
/lib64/libdb-5.3.so 
#3  0x7f53ff0d7752 in __txn_checkpoint () from
/lib64/libdb-5.3.so 
#4  0x7f53ff0d7b74 in __txn_checkpoint_pp () from
/lib64/libdb-5.3.so 
#5  0x7f53fa541a87 in checkpoint_threadmain () from
/usr/lib64/dirsrv/plugins/libback-ldbm.so
#6  0x7f540509e7bb in _pt_root () from /lib64/libnspr4.so
#7  0x7f5404a3fdc5 in start_thread () from /lib64/libpthread.so.0
#8  0x7f540476d28d in clone () from /lib64/libc.so.6
Thread 42 (Thread 0x7f53ec52d700 (LWP 128167)):
#0  0x7f54047649b3 in select () from /lib64/libc.so.6
#1  0x7f5406eef0e9 in DS_Sleep () from
/usr/lib64/dirsrv/libslapd.so.0
#2  0x7f53fa53db7f in trickle_threadmain () from
/usr/lib64/dirsrv/plugins/libback-ldbm.so
#3  0x7f540509e7bb in _pt_root () from /lib64/libnspr4.so
#4  0x7f5404a3fdc5 in start_thread () from /lib64/libpthread.so.0
#5  0x7f540476d28d in clone () from /lib64/libc.so.6
Thread 41 (Thread 0x7f53ebd2c700 (LWP 128168)):
#0  0x7f54047649b3 in select () from /lib64/libc.so.6
#1  0x7f5406eef0e9 in DS_Sleep () from
/usr/lib64/dirsrv/libslapd.so.0
#2  0x7f53fa538707 in perf_threadmain () from
/usr/lib64/dirsrv/plugins/libback-ldbm.so
#3  0x7f540509e7bb in _pt_root () from /lib64/libnspr4.so
#4  0x7f5404a3fdc5 in start_thread () from /lib64/libpthread.so.0
#5  0x7f540476d28d in clone () from /lib64/libc.so.6
Thread 40 (Thread 0x7f53eb322700 (LWP 128204)):
#0  0x7f5404a436d5 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
#1  0x7f5405099050 in PR_WaitCondVar () from /lib64/libnspr4.so
#2  0x7f5406ede198 in slapi_wait_condvar () from
/usr/lib64/dirsrv/libslapd.so.0
#3  0x7f53fd27062e in cos_cache_wait_on_change () from
/usr/lib64/dirsrv/plugins/libcos-plugin.so
#4  0x7f540509e7bb in _pt_root () from /lib64/libnspr4.so
#5  0x7f5404a3fdc5 in start_thread () from /lib64/libpthread.so.0
#6  0x7f540476d28d in clone () from /lib64/libc.so.6
Thread 39 (Thread 0x7f53eab21700 (LWP 128206)):
#0  0x7f5404a436d5 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
#1  0x7f5405099050 in PR_WaitCondVar () from /lib64/libnspr4.so
#2  0x7f5406ede198 in slapi_wait_condvar () from
/usr/lib64/dirsrv/libslapd.so.0
#3  0x7f53f8bdbeed in roles_cache_wait_on_change () from
/usr/lib64/dirsrv/plugins/libroles-plugin.so
#4  0x7f540509e7bb in _pt_root () from /lib64/libnspr4.so
#5  0x7f5404a3fdc5 in start_thread () from /lib64/libpthread.so.0
#6  0x7f540476d28d in clone () from /lib64/libc.so.6
Thread 38 (Thread 0x7f53ea320700 (LWP 128207)):
#0  0x7f5404a436d5 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
#1  0x7f5405099050 in PR_WaitCondVar () from /lib64/libnspr4.so
#2  0x7f5406ede198 in slapi_wait_condvar () from
/usr/lib64/dirsrv/libslapd.so.0
#3  

Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-09-05 Thread Rakesh Rajasekharan
Hi Thierry,


I was getting the hang issue while running ipa-client-install
simultaneously on a few clients.
However, today I am not able to replicate that.

I could not get a gdb stack trace, but I will try getting one the next time I face
this issue.

The CPU does not stay high; it just momentarily touches a high value and
then drops down to around 2-7%.

One question I have: is it OK to set nsslapd-threadnumber to a very
high value?
I have around 4000 clients, with nsslapd-maxthreadsperconn set to 5. So,
can I set nsslapd-threadnumber to around 25000?

Thanks

On Mon, Sep 5, 2016 at 1:03 PM, thierry bordaz  wrote:

>
> Hi Rakesh,
>
> Were you able to get a pstack or full stack with gdb (
> http://www.port389.org/docs/389ds/FAQ/faq.html#debugging-crashes) when
> the server hangs ?
>
> If it happens with 500 threads as well as with 30, using 30 threads is a
> better choice to debug this issue.
> I will try to reproduce using 150 parallel 'ipa user-find p-testipa'
> commands
>
> Something I am unsure is if the CPU consumption stays high (you mentioned
> 340% CPU usage) as long as the hang happens or if after a suddent shot up
> to 340% (that marks the beginning of the hang) it drops and stay hanging ?
>
> thanks
> thierry
>
> On 09/04/2016 08:40 PM, Rakesh Rajasekharan wrote:
>
> starce on the slapd process actually had this in the output..
> FUTEX_WAIT_PRIVATE
>
> and checking for the number of threads slapd had.. there were 5015 threads
>
> ps -efL|grep slapd|wc -l
> 5015
>
> strace on most of the threads gave this output
>
> strace -p 67411
> Process 67411 attached
> futex(0x7f3f0226b41c, FUTEX_WAIT_PRIVATE, 1, NULL) = -1 EAGAIN (Resource
> temporarily unavailable)
> futex(0x7f3f0226b41c, FUTEX_WAIT_PRIVATE, 2, NULL^CProcess 67411 detached
>
>
>
>
>
> On Sun, Sep 4, 2016 at 5:34 PM, Rakesh Rajasekharan <
> rakesh.rajasekha...@gmail.com> wrote:
>
>> I have again got the issue of IPA hanging.. The issue came up when i
>> tried to run ipa-client-isntall on 142 clients simultaneously
>>
>>
>> None of the IPA commands are responding,  and I see this error
>>
>> ipa user-find p-testipa
>> ipa: ERROR: Insufficient access: SASL(-1): generic failure: GSSAPI Error:
>> Unspecified GSS failure.  Minor code may provide more information (KDC
>> returned error string: PROCESS_TGS)
>>
>>  KRB5_TRACE=/dev/stdout kinit admin
>> [41178] 1472984115.233214: Getting initial credentials for ad...@xyz.com
>> [41178] 1472984115.235257: Sending request (167 bytes) to XYZ.COM
>> [41178] 1472984115.235419: Initiating TCP connection to stream
>> 10.1.3.36:88
>> [41178] 1472984115.235685: Sending TCP request to stream 10.1.3.36:88
>> [41178] 1472984120.238914: Received answer (174 bytes) from stream
>> 10.1.3.36:88
>> [41178] 1472984120.238925: Terminating TCP connection to stream
>> 10.1.3.36:88
>> [41178] 1472984120.238993: Response was from master KDC
>> [41
>>
>>
>> Running an ldapsearch to see the db.. does not give any results and just
>> hangs there
>>
>> ldapsearch -x -D 'cn=Directory Manager' -W -s one -b
>> 'cn=kerberos,dc=xyz,dc=com'
>> Enter LDAP Password:
>>
>> even an ldapsearch -x does not respond
>> At this point, am sure that slapd is the one causing issues
>>
>> Running an strace against the hung slapd itself seems to get stuck does
>> not proceed after saying "attaching to process"
>>
>> From some others posts I read Thierry suggesting to increase the
>> nsslapd-threadnumber value
>>
>> It was set to 30, I think that might be too low.
>>
>> I have raised it to  500
>>
>> Now after restarting the service .. ldapsearch starts responding.
>> But running the test to add a sudden high number of clients again left
>> ns-slapd to hung state
>>
>> When i attempted adding the clients.. the ns-slapd cpu usage shot up to
>> 340% and after that ns-slapd stopped responding
>>
>> So now, atleast I know what might be causing the issue and I can now
>> easily reproduce it.
>>
>> Is there a way I can make ns-slapd handle a sudden bump in incoming
>> request for ipa-client-install
>>
>> Thanks
>> Rakesh
>>
>>
>>
>>
>>
>>
>> On Mon, Aug 29, 2016 at 11:18 PM, Rich Megginson < 
>> rmegg...@redhat.com> wrote:
>>
>>> On 08/29/2016 10:53 AM, Rakesh Rajasekharan wrote:
>>>
>>> Hi Thierry,
>>>
>>> My machine has 30GB RAM ..and  389-ds version is 1.3.4
>>>
>>> ldapsearch shows the values for nsslapd-cachememsize updated to 200MB.
>>>
>>> ldapsearch -LLL -o ldif-wrap=no -D "cn=directory manager" -w
>>> 'mypassword' -b 'cn=userRoot,cn=ldbm database,cn=plugins,cn=config'|grep
>>> nsslapd-cachememsize
>>> nsslapd-cachememsize: 209715200
>>>
>>>
>>> So, it seems to have updated though seeing that warning(WARNING: ipaca:
>>> entry cache size 10485760B is less than db size 11599872B) in the log
>>> confuses me a bit.
>>>
>>> Thers one more entry that I found from the ldapsearch to be bit low
>>>
>>> nsslapd-dncachememsize: 10485760
>>> maxdncachesize: 10485760
>>>
>>> Should I update these as well to a higher value
>>>
>>> At the time when the is

Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-09-05 Thread thierry bordaz


Hi Rakesh,

Were you able to get a pstack or full stack with gdb 
(http://www.port389.org/docs/389ds/FAQ/faq.html#debugging-crashes) when 
the server hangs ?
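
Since the hangs are intermittent, it can help to take several stack snapshots a few seconds apart while the server is hung and compare them. A minimal sketch follows; the helper name `capture_snapshots` and the output file naming are hypothetical, and the command to run (for example pstack on the ns-slapd PID) is passed in as arguments.

```shell
#!/bin/sh
# Hypothetical helper: run a command COUNT times, INTERVAL seconds apart,
# saving each run's output to OUTDIR/snapshot.N.txt.
capture_snapshots() {
    count=$1
    interval=$2
    outdir=$3
    shift 3
    i=1
    while [ "$i" -le "$count" ]; do
        # Keep going even if one run fails; the error text lands in the file.
        "$@" > "$outdir/snapshot.$i.txt" 2>&1 || true
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}

# Usage during a hang (not run here):
#   capture_snapshots 3 5 /tmp pstack "$(pidof ns-slapd)"
```

Threads that sit in the same frames across all snapshots are the likely candidates for whatever is blocking the server.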


If it happens with 500 threads as well as with 30, using 30 threads is a 
better choice to debug this issue.
I will try to reproduce using 150 parallel 'ipa user-find p-testipa' 
commands


Something I am unsure of is whether the CPU consumption stays high (you
mentioned 340% CPU usage) for as long as the hang lasts, or whether, after a
sudden spike to 340% (that marks the beginning of the hang), it drops
while the server stays hung?


thanks
thierry

On 09/04/2016 08:40 PM, Rakesh Rajasekharan wrote:

strace on the slapd process actually had this in the output:
FUTEX_WAIT_PRIVATE

and checking for the number of threads slapd had.. there were 5015 threads

ps -efL|grep slapd|wc -l
5015

strace on most of the threads gave this output

strace -p 67411
Process 67411 attached
futex(0x7f3f0226b41c, FUTEX_WAIT_PRIVATE, 1, NULL) = -1 EAGAIN 
(Resource temporarily unavailable)

futex(0x7f3f0226b41c, FUTEX_WAIT_PRIVATE, 2, NULL^CProcess 67411 detached





On Sun, Sep 4, 2016 at 5:34 PM, Rakesh Rajasekharan
<rakesh.rajasekha...@gmail.com>
wrote:


I have again got the issue of IPA hanging. The issue came up when
I tried to run ipa-client-install on 142 clients simultaneously.


None of the IPA commands are responding,  and I see this error

ipa user-find p-testipa
ipa: ERROR: Insufficient access: SASL(-1): generic failure: GSSAPI
Error: Unspecified GSS failure.  Minor code may provide more
information (KDC returned error string: PROCESS_TGS)

 KRB5_TRACE=/dev/stdout kinit admin
[41178] 1472984115.233214: Getting initial credentials for
ad...@xyz.com 
[41178] 1472984115.235257: Sending request (167 bytes) to XYZ.COM

[41178] 1472984115.235419: Initiating TCP connection to stream
10.1.3.36:88 
[41178] 1472984115.235685: Sending TCP request to stream
10.1.3.36:88 
[41178] 1472984120.238914: Received answer (174 bytes) from stream
10.1.3.36:88 
[41178] 1472984120.238925: Terminating TCP connection to stream
10.1.3.36:88 
[41178] 1472984120.238993: Response was from master KDC
[41


Running an ldapsearch to see the db.. does not give any results
and just hangs there

ldapsearch -x -D 'cn=Directory Manager' -W -s one -b
'cn=kerberos,dc=xyz,dc=com'
Enter LDAP Password:

even an ldapsearch -x does not respond
At this point, am sure that slapd is the one causing issues

Running an strace against the hung slapd itself seems to get stuck
does not proceed after saying "attaching to process"

From some others posts I read Thierry suggesting to increase the
nsslapd-threadnumber value

It was set to 30, I think that might be too low.

I have raised it to  500

Now after restarting the service .. ldapsearch starts responding.
But running the test to add a sudden high number of clients again
left ns-slapd to hung state

When i attempted adding the clients.. the ns-slapd cpu usage shot
up to 340% and after that ns-slapd stopped responding

So now, atleast I know what might be causing the issue and I can
now easily reproduce it.

Is there a way I can make ns-slapd handle a sudden bump in
incoming request for ipa-client-install

Thanks
Rakesh






On Mon, Aug 29, 2016 at 11:18 PM, Rich Megginson
<rmegg...@redhat.com> wrote:

On 08/29/2016 10:53 AM, Rakesh Rajasekharan wrote:

Hi Thierry,

My machine has 30GB RAM ..and  389-ds version is 1.3.4

ldapsearch shows the values for nsslapd-cachememsize updated
to 200MB.

ldapsearch -LLL -o ldif-wrap=no -D "cn=directory manager" -w
'mypassword' -b 'cn=userRoot,cn=ldbm
database,cn=plugins,cn=config'|grep nsslapd-cachememsize
nsslapd-cachememsize: 209715200


So, it seems to have updated though seeing that
warning(WARNING: ipaca: entry cache size 10485760B is less
than db size 11599872B) in the log confuses me a bit.

Thers one more entry that I found from the ldapsearch to be
bit low

nsslapd-dncachememsize: 10485760
maxdncachesize: 10485760

Should I update these as well to a higher value

At the time when the issue happened, the memory usage as well
as the overall load of the system was very low .
I will try reproducing the issue atleast in my QA
env..probably by trying to mock  simultaneous parallel logins
to a large number of hosts


To monitor your cache sizes, please use the dbmon.sh tool
provided with your distro.  If that is not available with your
particular distro, see
https://github.com/richm/scripts/wiki/dbmon.sh
 

Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-09-04 Thread Rakesh Rajasekharan
strace on the slapd process actually had this in the output:
FUTEX_WAIT_PRIVATE

and checking for the number of threads slapd had.. there were 5015 threads

ps -efL|grep slapd|wc -l
5015
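
Note that `ps -efL | grep slapd | wc -l` may also count the grep process itself and any other process whose command line contains "slapd", so 5015 is best read as an upper bound. A minimal sketch of a per-process count on Linux follows; the helper name `count_threads` is hypothetical, and it relies on `/proc/PID/task` containing one directory per thread.

```shell
#!/bin/sh
# Hypothetical helper: count the threads of one process via /proc (Linux only).
count_threads() {
    pid=$1
    ls "/proc/$pid/task" | wc -l | tr -d ' '
}

# Usage (not run here):
#   count_threads "$(pidof ns-slapd)"
```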

strace on most of the threads gave this output

strace -p 67411
Process 67411 attached
futex(0x7f3f0226b41c, FUTEX_WAIT_PRIVATE, 1, NULL) = -1 EAGAIN (Resource
temporarily unavailable)
futex(0x7f3f0226b41c, FUTEX_WAIT_PRIVATE, 2, NULL^CProcess 67411 detached





On Sun, Sep 4, 2016 at 5:34 PM, Rakesh Rajasekharan <
rakesh.rajasekha...@gmail.com> wrote:

> I have again got the issue of IPA hanging.. The issue came up when i tried
> to run ipa-client-isntall on 142 clients simultaneously
>
>
> None of the IPA commands are responding,  and I see this error
>
> ipa user-find p-testipa
> ipa: ERROR: Insufficient access: SASL(-1): generic failure: GSSAPI Error:
> Unspecified GSS failure.  Minor code may provide more information (KDC
> returned error string: PROCESS_TGS)
>
>  KRB5_TRACE=/dev/stdout kinit admin
> [41178] 1472984115.233214: Getting initial credentials for ad...@xyz.com
> [41178] 1472984115.235257: Sending request (167 bytes) to XYZ.COM
> [41178] 1472984115.235419: Initiating TCP connection to stream
> 10.1.3.36:88
> [41178] 1472984115.235685: Sending TCP request to stream 10.1.3.36:88
> [41178] 1472984120.238914: Received answer (174 bytes) from stream
> 10.1.3.36:88
> [41178] 1472984120.238925: Terminating TCP connection to stream
> 10.1.3.36:88
> [41178] 1472984120.238993: Response was from master KDC
> [41
>
>
> Running an ldapsearch to see the db.. does not give any results and just
> hangs there
>
> ldapsearch -x -D 'cn=Directory Manager' -W -s one -b
> 'cn=kerberos,dc=xyz,dc=com'
> Enter LDAP Password:
>
> even an ldapsearch -x does not respond
> At this point, am sure that slapd is the one causing issues
>
> Running an strace against the hung slapd itself seems to get stuck does
> not proceed after saying "attaching to process"
>
> From some others posts I read Thierry suggesting to increase the
> nsslapd-threadnumber value
>
> It was set to 30, I think that might be too low.
>
> I have raised it to  500
>
> Now after restarting the service .. ldapsearch starts responding.
> But running the test to add a sudden high number of clients again left
> ns-slapd to hung state
>
> When i attempted adding the clients.. the ns-slapd cpu usage shot up to
> 340% and after that ns-slapd stopped responding
>
> So now, atleast I know what might be causing the issue and I can now
> easily reproduce it.
>
> Is there a way I can make ns-slapd handle a sudden bump in incoming
> request for ipa-client-install
>
> Thanks
> Rakesh
>
>
>
>
>
>
> On Mon, Aug 29, 2016 at 11:18 PM, Rich Megginson 
> wrote:
>
>> On 08/29/2016 10:53 AM, Rakesh Rajasekharan wrote:
>>
>> Hi Thierry,
>>
>> My machine has 30GB RAM ..and  389-ds version is 1.3.4
>>
>> ldapsearch shows the values for nsslapd-cachememsize updated to 200MB.
>>
>> ldapsearch -LLL -o ldif-wrap=no -D "cn=directory manager" -w 'mypassword'
>> -b 'cn=userRoot,cn=ldbm database,cn=plugins,cn=config'|grep
>> nsslapd-cachememsize
>> nsslapd-cachememsize: 209715200
>>
>>
>> So, it seems to have updated though seeing that warning(WARNING: ipaca:
>> entry cache size 10485760B is less than db size 11599872B) in the log
>> confuses me a bit.
>>
>> Thers one more entry that I found from the ldapsearch to be bit low
>>
>> nsslapd-dncachememsize: 10485760
>> maxdncachesize: 10485760
>>
>> Should I update these as well to a higher value
>>
>> At the time when the issue happened, the memory usage as well as the
>> overall load of the system was very low .
>> I will try reproducing the issue atleast in my QA env..probably by trying
>> to mock  simultaneous parallel logins to a large number of hosts
>>
>>
>> To monitor your cache sizes, please use the dbmon.sh tool provided with
>> your distro.  If that is not available with your particular distro, see
>> https://github.com/richm/scripts/wiki/dbmon.sh
>>
>>
>>
>>
>> thanks
>> Rakesh
>>
>>
>>
>>
>> On Mon, Aug 29, 2016 at 8:16 PM, thierry bordaz 
>> wrote:
>>
>>> Hi Rakesh,
>>>
>>> Those tuning may depend on the memory available on your machine.
>>> nsslapd-cachememsize allows the entry cache to consume up to 200Mb but
>>> its memory footprint is known to go above.
>>> 200Mb both looks pretty good to me. How large is your machine ? What is
>>> your version of 389-ds ?
>>>
>>> Those warnings do not change your settings. It just raise that entry
>>> cache of 'ipaca' and 'retrocl' are small but it is fine. The size of the
>>> entry cache is important mostly in userRoot.
>>> You may double check the actual values, after restart, with ldapsearch
>>> on 'cn=userRoot,cn=ldbm database,cn=plugins,cn=config' and
>>> 'cn=config,cn=ldbm database,cn=plugins,cn=config'.
>>>
>>> A step is to know what will be response time of DS to know if it is
>>> responsible of the hang or not.
>>> The logs and possibly pstack during those intermittent hangs will help
>>> t

Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-09-04 Thread Rakesh Rajasekharan
I have again got the issue of IPA hanging. The issue came up when I tried
to run ipa-client-install on 142 clients simultaneously.


None of the IPA commands are responding,  and I see this error

ipa user-find p-testipa
ipa: ERROR: Insufficient access: SASL(-1): generic failure: GSSAPI Error:
Unspecified GSS failure.  Minor code may provide more information (KDC
returned error string: PROCESS_TGS)

 KRB5_TRACE=/dev/stdout kinit admin
[41178] 1472984115.233214: Getting initial credentials for ad...@xyz.com
[41178] 1472984115.235257: Sending request (167 bytes) to XYZ.COM
[41178] 1472984115.235419: Initiating TCP connection to stream 10.1.3.36:88
[41178] 1472984115.235685: Sending TCP request to stream 10.1.3.36:88
[41178] 1472984120.238914: Received answer (174 bytes) from stream
10.1.3.36:88
[41178] 1472984120.238925: Terminating TCP connection to stream 10.1.3.36:88
[41178] 1472984120.238993: Response was from master KDC
[41


Running an ldapsearch to see the db.. does not give any results and just
hangs there

ldapsearch -x -D 'cn=Directory Manager' -W -s one -b
'cn=kerberos,dc=xyz,dc=com'
Enter LDAP Password:

Even a plain ldapsearch -x does not respond.
At this point, I am sure that slapd is the one causing issues.
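
To tell a hung server apart from a slow or failing one without leaving a terminal blocked, the search can be wrapped in a time limit. A minimal sketch follows, assuming coreutils `timeout` is available (it exits with status 124 when the limit expires); the helper name `probe` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run a command with a time limit and classify the result.
# Assumption: coreutils `timeout`, which returns 124 when the limit is hit.
probe() {
    limit=$1
    shift
    status=0
    timeout "$limit" "$@" > /dev/null 2>&1 || status=$?
    if [ "$status" -eq 124 ]; then
        echo "hung"
    elif [ "$status" -eq 0 ]; then
        echo "ok"
    else
        echo "failed($status)"
    fi
}

# Usage (not run here): query the root DSE anonymously with a 10 second limit.
#   probe 10 ldapsearch -x -s base -b '' '(objectClass=*)'
```

Run from cron or a loop, this gives a timestamped record of when the server stopped answering, which can then be matched against the access and errors logs.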

Running strace against the hung slapd itself seems to get stuck; it does not
proceed after saying "attaching to process".

In some other posts I read Thierry suggesting to increase the
nsslapd-threadnumber value.

It was set to 30, I think that might be too low.

I have raised it to  500

Now after restarting the service, ldapsearch starts responding.
But running the test to add a sudden high number of clients again left
ns-slapd in a hung state.

When I attempted adding the clients, the ns-slapd CPU usage shot up to
340%, and after that ns-slapd stopped responding.

So now at least I know what might be causing the issue, and I can now easily
reproduce it.

Is there a way I can make ns-slapd handle a sudden bump in incoming requests
from ipa-client-install?

Thanks
Rakesh






On Mon, Aug 29, 2016 at 11:18 PM, Rich Megginson 
wrote:

> On 08/29/2016 10:53 AM, Rakesh Rajasekharan wrote:
>
> Hi Thierry,
>
> My machine has 30GB RAM ..and  389-ds version is 1.3.4
>
> ldapsearch shows the values for nsslapd-cachememsize updated to 200MB.
>
> ldapsearch -LLL -o ldif-wrap=no -D "cn=directory manager" -w 'mypassword'
> -b 'cn=userRoot,cn=ldbm database,cn=plugins,cn=config'|grep
> nsslapd-cachememsize
> nsslapd-cachememsize: 209715200
>
>
> So, it seems to have updated though seeing that warning(WARNING: ipaca:
> entry cache size 10485760B is less than db size 11599872B) in the log
> confuses me a bit.
>
> Thers one more entry that I found from the ldapsearch to be bit low
>
> nsslapd-dncachememsize: 10485760
> maxdncachesize: 10485760
>
> Should I update these as well to a higher value
>
> At the time when the issue happened, the memory usage as well as the
> overall load of the system was very low .
> I will try reproducing the issue atleast in my QA env..probably by trying
> to mock  simultaneous parallel logins to a large number of hosts
>
>
> To monitor your cache sizes, please use the dbmon.sh tool provided with
> your distro.  If that is not available with your particular distro, see
> https://github.com/richm/scripts/wiki/dbmon.sh
>
>
>
>
> thanks
> Rakesh
>
>
>
>
> On Mon, Aug 29, 2016 at 8:16 PM, thierry bordaz 
> wrote:
>
>> Hi Rakesh,
>>
>> Those tuning may depend on the memory available on your machine.
>> nsslapd-cachememsize allows the entry cache to consume up to 200Mb but
>> its memory footprint is known to go above.
>> 200Mb both looks pretty good to me. How large is your machine ? What is
>> your version of 389-ds ?
>>
>> Those warnings do not change your settings. It just raise that entry
>> cache of 'ipaca' and 'retrocl' are small but it is fine. The size of the
>> entry cache is important mostly in userRoot.
>> You may double check the actual values, after restart, with ldapsearch on
>> 'cn=userRoot,cn=ldbm database,cn=plugins,cn=config' and 'cn=config,cn=ldbm
>> database,cn=plugins,cn=config'.
>>
>> A step is to know what will be response time of DS to know if it is
>> responsible of the hang or not.
>> The logs and possibly pstack during those intermittent hangs will help to
>> determine that.
>>
>> regards
>> thierry
>>
>>
>>
>>
>>
>> On 08/29/2016 04:25 PM, Rakesh Rajasekharan wrote:
>>
>> I tried increasing the nsslapd-dbcachesize and nsslapd-cachememsize in my
>> QA envs to 200MB.
>>
>> However, in my log files, I still see this message
>> [29/Aug/2016:04:34:37 +] - WARNING: ipaca: entry cache size 10485760B
>> is less than db size 11599872B; We recommend to increase the entry cache
>> size nsslapd-cachememsize.
>> [29/Aug/2016:04:34:37 +] - WARNING: changelog: entry cache size
>> 2097152B is less than db size 441647104B; We recommend to increase the
>> entry cache size nsslapd-cachememsize.
>>
>> these are my ldif files that i used to modify the values
>> modify entry cache 

Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-08-29 Thread Rich Megginson

On 08/29/2016 10:53 AM, Rakesh Rajasekharan wrote:

Hi Thierry,

My machine has 30GB RAM ..and  389-ds version is 1.3.4

ldapsearch shows the values for nsslapd-cachememsize updated to 200MB.

ldapsearch -LLL -o ldif-wrap=no -D "cn=directory manager" -w 
'mypassword' -b 'cn=userRoot,cn=ldbm 
database,cn=plugins,cn=config'|grep nsslapd-cachememsize

nsslapd-cachememsize: 209715200


So, it seems to have updated though seeing that warning(WARNING: 
ipaca: entry cache size 10485760B is less than db size 11599872B) in 
the log confuses me a bit.


There is one more entry that I found from the ldapsearch to be a bit low:

nsslapd-dncachememsize: 10485760
maxdncachesize: 10485760

Should I update these as well to a higher value?

At the time when the issue happened, the memory usage as well as the
overall load of the system was very low.
I will try reproducing the issue at least in my QA env, probably by
trying to mock simultaneous parallel logins to a large number of hosts.


To monitor your cache sizes, please use the dbmon.sh tool provided with 
your distro.  If that is not available with your particular distro, see 
https://github.com/richm/scripts/wiki/dbmon.sh





thanks
Rakesh




On Mon, Aug 29, 2016 at 8:16 PM, thierry bordaz wrote:


Hi Rakesh,

Those tunings may depend on the memory available on your machine.
nsslapd-cachememsize allows the entry cache to consume up to 200 MB,
but its memory footprint is known to go above that.
200 MB for both looks pretty good to me. How large is your machine?
What is your version of 389-ds?

Those warnings do not change your settings. They just point out that the
entry caches of 'ipaca' and 'retrocl' are small, but that is fine. The
size of the entry cache is important mostly in userRoot.
You may double-check the actual values, after restart, with
ldapsearch on 'cn=userRoot,cn=ldbm database,cn=plugins,cn=config'
and 'cn=config,cn=ldbm database,cn=plugins,cn=config'.

A next step is to measure the response time of DS, to know whether it
is responsible for the hang or not.
The logs and possibly pstack during those intermittent hangs will
help to determine that.

regards
thierry





On 08/29/2016 04:25 PM, Rakesh Rajasekharan wrote:

I tried increasing the nsslapd-dbcachesize and
nsslapd-cachememsize in my QA envs to 200MB.

However, in my log files, I still see this message
[29/Aug/2016:04:34:37 +] - WARNING: ipaca: entry cache size
10485760B is less than db size 11599872B; We recommend to
increase the entry cache size nsslapd-cachememsize.
[29/Aug/2016:04:34:37 +] - WARNING: changelog: entry cache
size 2097152B is less than db size 441647104B; We recommend to
increase the entry cache size nsslapd-cachememsize.
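
Each of these warnings names a specific backend (here 'ipaca' and 'changelog'), so changing nsslapd-cachememsize on userRoot would not be expected to silence them. A minimal sketch for listing which backends are flagged follows; the helper name `cache_warnings` is hypothetical, and it assumes each warning sits on a single line in the errors log (the lines above are wrapped by the mail client).

```shell
#!/bin/sh
# Hypothetical helper: read errors-log lines on stdin and print
# "<backend> <entry cache bytes> <db bytes>" for each cache-size warning.
cache_warnings() {
    sed -n 's/.*WARNING: \([^:]*\): entry cache size \([0-9]*\)B is less than db size \([0-9]*\)B.*/\1 \2 \3/p'
}

# Usage (not run here; slapd-INSTANCE is a placeholder for the instance name):
#   cache_warnings < /var/log/dirsrv/slapd-INSTANCE/errors | sort -u
```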

these are my ldif files that i used to modify the values
modify entry cache size
cat modify-cache-mem-size.ldif
dn: cn=userRoot,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-cachememsize
nsslapd-cachememsize: 209715200

modify db cache size
cat modfy-db-cache-size.ldif
dn: cn=config,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-dbcachesize
nsslapd-dbcachesize: 209715200

After modifying , i restarted IPA services

Is there anything else that  I need to take care of as the logs
suggest its still not getting the updated values

Thanks
Rakesh

On Mon, Aug 29, 2016 at 6:07 PM, Rakesh Rajasekharan
<rakesh.rajasekha...@gmail.com> wrote:

Hi Thierry,

Coz of the issues we had to revert back to earlier running
openldap in production.

I have now done a few TCP related changes in sysctl.conf and
have also increased the nsslapd-dbcachesize and
nsslapd-cachememsize to 200MB

I will again start migrating hosts back to IPA and see if I
face the earlier issue.

I will update back once I have something


Thanks,
Rakesh



On Thu, Aug 25, 2016 at 2:17 PM, thierry bordaz
<tbor...@redhat.com> wrote:



On 08/25/2016 10:15 AM, Rakesh Rajasekharan wrote:

All of the troubleshooting seems fine.


However, Running libconv.pl  gives me
this output

- Recommendations -

 1.  You have unindexed components, this can be caused
from a search on an unindexed attribute, or your
returned results exceeded the allidsthreshold. Unindexed
components are not recommended. To refuse unindexed
searches, switch 'nsslapd-require-index' to 'on' under
your database entry (e.g. cn=UserRoot,cn=ldbm
database,cn=plugins,cn=config).

 2.  You have a significant difference between binds and
unbinds.  You may want to investigate this difference.


   

Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-08-29 Thread Rakesh Rajasekharan
Hi Thierry,

My machine has 30GB RAM, and the 389-ds version is 1.3.4.

ldapsearch shows the values for nsslapd-cachememsize updated to 200MB.

ldapsearch -LLL -o ldif-wrap=no -D "cn=directory manager" -w 'mypassword'
-b 'cn=userRoot,cn=ldbm database,cn=plugins,cn=config'|grep
nsslapd-cachememsize
nsslapd-cachememsize: 209715200


So the value does seem to have been updated, though still seeing that
warning (WARNING: ipaca: entry cache size 10485760B is less than db size
11599872B) in the log confuses me a bit.
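For reference, a small sketch that converts the byte counts quoted in this thread into MiB, assuming they are plain byte counts. The helper name `to_mib` and the dictionary labels are illustrative only and are not part of 389-ds.

```python
# Convert byte counts quoted in this thread to MiB (1 MiB = 1024 * 1024 bytes).
# The helper name and the labels are illustrative, not 389-ds terminology.

def to_mib(size_bytes: int) -> float:
    """Return size_bytes expressed in MiB."""
    return size_bytes / (1024 * 1024)

sizes = {
    "nsslapd-cachememsize (userRoot)": 209715200,
    "ipaca entry cache (from the warning)": 10485760,
    "ipaca db size (from the warning)": 11599872,
    "nsslapd-dncachememsize": 10485760,
}

for label, value in sizes.items():
    print(f"{label}: {value} B = {to_mib(value):.2f} MiB")
```

Read this way, the 200MB setting and the 10MB figure in the warning are different numbers, which suggests they may not describe the same cache.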

There is one more entry in the ldapsearch output that looks a bit low:

nsslapd-dncachememsize: 10485760
maxdncachesize: 10485760

Should I update these to a higher value as well?

At the time the issue happened, the memory usage and the
overall load of the system were very low.
I will try reproducing the issue, at least in my QA env, probably by
simulating simultaneous parallel logins to a large number of hosts.


thanks
Rakesh




On Mon, Aug 29, 2016 at 8:16 PM, thierry bordaz  wrote:

> Hi Rakesh,
>
> Those tuning may depend on the memory available on your machine.
> nsslapd-cachememsize allows the entry cache to consume up to 200Mb but its
> memory footprint is known to go above.
> 200Mb both looks pretty good to me. How large is your machine ? What is
> your version of 389-ds ?
>
> Those warnings do not change your settings. It just raise that entry cache
> of 'ipaca' and 'retrocl' are small but it is fine. The size of the entry
> cache is important mostly in userRoot.
> You may double check the actual values, after restart, with ldapsearch on
> 'cn=userRoot,cn=ldbm database,cn=plugins,cn=config' and 'cn=config,cn=ldbm
> database,cn=plugins,cn=config'.
>
> A step is to know what will be response time of DS to know if it is
> responsible of the hang or not.
> The logs and possibly pstack during those intermittent hangs will help to
> determine that.
>
> regards
> thierry
>
>
>
>
>
> On 08/29/2016 04:25 PM, Rakesh Rajasekharan wrote:
>
> I tried increasing the nsslapd-dbcachesize and nsslapd-cachememsize in my
> QA envs to 200MB.
>
> However, in my log files, I still see this message
> [29/Aug/2016:04:34:37 +] - WARNING: ipaca: entry cache size 10485760B
> is less than db size 11599872B; We recommend to increase the entry cache
> size nsslapd-cachememsize.
> [29/Aug/2016:04:34:37 +] - WARNING: changelog: entry cache size
> 2097152B is less than db size 441647104B; We recommend to increase the
> entry cache size nsslapd-cachememsize.
>
> these are my ldif files that i used to modify the values
> modify entry cache size
> cat modify-cache-mem-size.ldif
> dn: cn=userRoot,cn=ldbm database,cn=plugins,cn=config
> changetype: modify
> replace: nsslapd-cachememsize
> nsslapd-cachememsize: 209715200
>
> modify db cache size
> cat modfy-db-cache-size.ldif
> dn: cn=config,cn=ldbm database,cn=plugins,cn=config
> changetype: modify
> replace: nsslapd-dbcachesize
> nsslapd-dbcachesize: 209715200
>
> After modifying , i restarted IPA services
>
> Is there anything else that  I need to take care of as the logs suggest
> its still not getting the updated values
>
> Thanks
> Rakesh
>
> On Mon, Aug 29, 2016 at 6:07 PM, Rakesh Rajasekharan <
> rakesh.rajasekha...@gmail.com> wrote:
>
>> Hi Thierry,
>>
>> Coz of the issues we had to revert back to earlier running openldap in
>> production.
>>
>> I have now done a few TCP related changes in sysctl.conf and have also
>> increased the nsslapd-dbcachesize and nsslapd-cachememsize to 200MB
>>
>> I will again start migrating hosts back to IPA and see if I face the
>> earlier issue.
>>
>> I will update back once I have something
>>
>>
>> Thanks,
>> Rakesh
>>
>>
>>
>> On Thu, Aug 25, 2016 at 2:17 PM, thierry bordaz < 
>> tbor...@redhat.com> wrote:
>>
>>>
>>>
>>> On 08/25/2016 10:15 AM, Rakesh Rajasekharan wrote:
>>>
>>> All of the troubleshooting seems fine.
>>>
>>>
>>> However, Running libconv.pl gives me this output
>>>
>>> - Recommendations -
>>>
>>>  1.  You have unindexed components, this can be caused from a search on
>>> an unindexed attribute, or your returned results exceeded the
>>> allidsthreshold.  Unindexed components are not recommended. To refuse
>>> unindexed searches, switch 'nsslapd-require-index' to 'on' under your
>>> database entry (e.g. cn=UserRoot,cn=ldbm database,cn=plugins,cn=config).
>>>
>>>  2.  You have a significant difference between binds and unbinds.  You
>>> may want to investigate this difference.
>>>
>>>
>>> I feel, this could be a pointer to things going slow.. and IPA hanging.
>>> I think i now have something that I can try and nail down this issue.
>>>
>>> On a sidenote, I was earlier running openldap and migrated over to
>>> Freeipa,
>>>
>>> Thanks
>>> Rakesh
>>>
>>>
>>>
>>> On Wed, Aug 24, 2016 at 12:38 PM, Petr Spacek < 
>>> pspa...@redhat.com> wrote:
>>>
 On 23.8.2016 18:44, Rakesh Rajasekharan wrote:
 > I think thers something seriously wrong with my system
 >
 > not able to run any  IPA commands
 >
 > klist

Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-08-29 Thread thierry bordaz

Hi Rakesh,

Those tunings may depend on the memory available on your machine.
nsslapd-cachememsize allows the entry cache to consume up to 200MB, but
its actual memory footprint is known to go above that.
200MB for both looks pretty good to me. How large is your machine? What is
your version of 389-ds?


Those warnings do not change your settings. They just point out that the
entry caches of 'ipaca' and 'retrocl' are small, but that is fine. The size
of the entry cache matters mostly for userRoot.
You can double-check the actual values, after restart, with ldapsearch
on 'cn=userRoot,cn=ldbm database,cn=plugins,cn=config' and
'cn=config,cn=ldbm database,cn=plugins,cn=config'.


A next step is to measure the response time of DS, to know whether it is
responsible for the hang or not.
The logs, and possibly pstack output taken during those intermittent hangs,
will help to determine that.
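One rough way to put a number on response time is to time a probe command from the outside. The sketch below is only an assumption about how that could be done: the helper `time_command` is hypothetical, and the demo times a trivial Python child process so that it runs anywhere.

```python
import subprocess
import sys
import time

def time_command(cmd, timeout=60):
    """Run cmd (a list of arguments) and return (elapsed_seconds, return_code).

    A return code of None means the command did not finish within `timeout`.
    """
    start = time.monotonic()
    try:
        completed = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        return_code = completed.returncode
    except subprocess.TimeoutExpired:
        return_code = None
    return time.monotonic() - start, return_code

# Self-contained demo: time a child process that does nothing.
elapsed, rc = time_command([sys.executable, "-c", "pass"])
print(f"rc={rc} elapsed={elapsed:.3f}s")

# Against the directory server, one could pass an ldapsearch argument list
# instead (hypothetical probe, adjust host and base to the deployment):
# ["ldapsearch", "-x", "-H", "ldap://localhost", "-b", "", "-s", "base"]
```

Running such a probe in a loop during a hang, and noting whether it slows down or times out, would show whether the directory server itself is the slow component.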


regards
thierry




On 08/29/2016 04:25 PM, Rakesh Rajasekharan wrote:
I tried increasing the nsslapd-dbcachesize and nsslapd-cachememsize in 
my QA envs to 200MB.


However, in my log files, I still see this message
[29/Aug/2016:04:34:37 +] - WARNING: ipaca: entry cache size 
10485760B is less than db size 11599872B; We recommend to increase the 
entry cache size nsslapd-cachememsize.
[29/Aug/2016:04:34:37 +] - WARNING: changelog: entry cache size 
2097152B is less than db size 441647104B; We recommend to increase the 
entry cache size nsslapd-cachememsize.


these are my ldif files that i used to modify the values
modify entry cache size
cat modify-cache-mem-size.ldif
dn: cn=userRoot,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-cachememsize
nsslapd-cachememsize: 209715200

modify db cache size
cat modfy-db-cache-size.ldif
dn: cn=config,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-dbcachesize
nsslapd-dbcachesize: 209715200

After modifying , i restarted IPA services

Is there anything else that  I need to take care of as the logs 
suggest its still not getting the updated values


Thanks
Rakesh

On Mon, Aug 29, 2016 at 6:07 PM, Rakesh Rajasekharan
<rakesh.rajasekha...@gmail.com> wrote:


Hi Thierry,

Coz of the issues we had to revert back to earlier running
openldap in production.

I have now done a few TCP related changes in sysctl.conf and have
also increased the nsslapd-dbcachesize and nsslapd-cachememsize to
200MB

I will again start migrating hosts back to IPA and see if I face
the earlier issue.

I will update back once I have something


Thanks,
Rakesh



On Thu, Aug 25, 2016 at 2:17 PM, thierry bordaz
<tbor...@redhat.com> wrote:



On 08/25/2016 10:15 AM, Rakesh Rajasekharan wrote:

All of the troubleshooting seems fine.


However, Running libconv.pl  gives me this
output

- Recommendations -

 1.  You have unindexed components, this can be caused from a
search on an unindexed attribute, or your returned results
exceeded the allidsthreshold. Unindexed components are not
recommended. To refuse unindexed searches, switch
'nsslapd-require-index' to 'on' under your database entry
(e.g. cn=UserRoot,cn=ldbm database,cn=plugins,cn=config).

 2.  You have a significant difference between binds and
unbinds.  You may want to investigate this difference.


I feel, this could be a pointer to things going slow.. and
IPA hanging. I think i now have something that I can try and
nail down this issue.

On a sidenote, I was earlier running openldap and migrated
over to Freeipa,

Thanks
Rakesh



On Wed, Aug 24, 2016 at 12:38 PM, Petr Spacek
<pspa...@redhat.com> wrote:

On 23.8.2016 18:44, Rakesh Rajasekharan wrote:
> I think thers something seriously wrong with my system
>
> not able to run any  IPA commands
>
> klist
> Ticket cache: KEYRING:persistent:0:0
> Default principal: ad...@xyz.com 
>
> Valid starting   Expires Service principal
> 2016-08-23T16:26:36 2016-08-24T16:26:22 
krbtgt/xyz@xyz.com 

>
>
> [root@prod-ipa-master-1a :~] ipactl status
> Directory Service: RUNNING
> krb5kdc Service: RUNNING
> kadmin Service: RUNNING
> ipa_memcached Service: RUNNING
> httpd Service: RUNNING
> pki-tomcatd Service: RUNNING
> ipa-otpd Service: RUNNING
> ipa: INFO: The ipactl command was successful
>
>
>
> [root@prod-ipa-master :~] ipa user-find p-testuser
> ipa: ERROR: Kerberos error: ('Unspecified GSS failure. 
Minor code may

  

Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-08-29 Thread Rakesh Rajasekharan
I tried increasing the nsslapd-dbcachesize and nsslapd-cachememsize in my
QA envs to 200MB.

However, in my log files, I still see this message
[29/Aug/2016:04:34:37 +] - WARNING: ipaca: entry cache size 10485760B
is less than db size 11599872B; We recommend to increase the entry cache
size nsslapd-cachememsize.
[29/Aug/2016:04:34:37 +] - WARNING: changelog: entry cache size
2097152B is less than db size 441647104B; We recommend to increase the
entry cache size nsslapd-cachememsize.
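To see which backend each warning refers to, the lines above can be picked apart with a small parser. This is a sketch only: the regular expression is inferred from the two lines quoted here, not from the 389-ds source, and it deliberately ignores the timestamp prefix.

```python
import re

# Matches the backend name and the two sizes in the warning text.
# The timestamp prefix is ignored on purpose, so its exact format does not matter.
WARNING_RE = re.compile(
    r"WARNING: (?P<backend>\S+): entry cache size (?P<cache>\d+)B "
    r"is less than db size (?P<db>\d+)B"
)

def parse_cache_warning(line):
    """Return (backend, cache_bytes, db_bytes), or None if the line does not match."""
    match = WARNING_RE.search(line)
    if match is None:
        return None
    return match["backend"], int(match["cache"]), int(match["db"])

# The two warnings from the log, with the wrapped lines joined.
log_lines = [
    "WARNING: ipaca: entry cache size 10485760B is less than db size 11599872B; "
    "We recommend to increase the entry cache size nsslapd-cachememsize.",
    "WARNING: changelog: entry cache size 2097152B is less than db size 441647104B; "
    "We recommend to increase the entry cache size nsslapd-cachememsize.",
]

for line in log_lines:
    backend, cache, db = parse_cache_warning(line)
    print(f"{backend}: cache={cache} B, db={db} B, shortfall={db - cache} B")
```

Both warnings name the 'ipaca' and 'changelog' backends; neither names userRoot, which is the backend whose nsslapd-cachememsize is changed by the LDIF below.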

These are the LDIF files that I used to modify the values.
Modify the entry cache size:
cat modify-cache-mem-size.ldif
dn: cn=userRoot,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-cachememsize
nsslapd-cachememsize: 209715200

modify db cache size
cat modfy-db-cache-size.ldif
dn: cn=config,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-dbcachesize
nsslapd-dbcachesize: 209715200
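As a cross-check of the byte values, the two LDIF records above can be regenerated from a size given in MB. This is a minimal sketch; the function name `cache_size_ldif` is made up for illustration, and 1 MB is assumed to mean 1024 * 1024 bytes, which matches 209715200 for 200MB.

```python
def cache_size_ldif(dn, attribute, size_mb):
    """Build an LDIF modify/replace record that sets `attribute` to size_mb MiB, in bytes."""
    size_bytes = size_mb * 1024 * 1024
    return (
        f"dn: {dn}\n"
        "changetype: modify\n"
        f"replace: {attribute}\n"
        f"{attribute}: {size_bytes}\n"
    )

entry_cache = cache_size_ldif(
    "cn=userRoot,cn=ldbm database,cn=plugins,cn=config",
    "nsslapd-cachememsize",
    200,
)
db_cache = cache_size_ldif(
    "cn=config,cn=ldbm database,cn=plugins,cn=config",
    "nsslapd-dbcachesize",
    200,
)

print(entry_cache)
print(db_cache)
```

Note that the first record targets only the userRoot backend, so other backends keep whatever entry cache size they already had.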

After modifying, I restarted the IPA services.

Is there anything else that I need to take care of? The logs suggest it is
still not picking up the updated values.

Thanks
Rakesh

On Mon, Aug 29, 2016 at 6:07 PM, Rakesh Rajasekharan <
rakesh.rajasekha...@gmail.com> wrote:

> Hi Thierry,
>
> Coz of the issues we had to revert back to earlier running openldap in
> production.
>
> I have now done a few TCP related changes in sysctl.conf and have also
> increased the nsslapd-dbcachesize and nsslapd-cachememsize to 200MB
>
> I will again start migrating hosts back to IPA and see if I face the
> earlier issue.
>
> I will update back once I have something
>
>
> Thanks,
> Rakesh
>
>
>
> On Thu, Aug 25, 2016 at 2:17 PM, thierry bordaz 
> wrote:
>
>>
>>
>> On 08/25/2016 10:15 AM, Rakesh Rajasekharan wrote:
>>
>> All of the troubleshooting seems fine.
>>
>>
>> However, Running libconv.pl gives me this output
>>
>> - Recommendations -
>>
>>  1.  You have unindexed components, this can be caused from a search on
>> an unindexed attribute, or your returned results exceeded the
>> allidsthreshold.  Unindexed components are not recommended. To refuse
>> unindexed searches, switch 'nsslapd-require-index' to 'on' under your
>> database entry (e.g. cn=UserRoot,cn=ldbm database,cn=plugins,cn=config).
>>
>>  2.  You have a significant difference between binds and unbinds.  You
>> may want to investigate this difference.
>>
>>
>> I feel, this could be a pointer to things going slow.. and IPA hanging. I
>> think i now have something that I can try and nail down this issue.
>>
>> On a sidenote, I was earlier running openldap and migrated over to
>> Freeipa,
>>
>> Thanks
>> Rakesh
>>
>>
>>
>> On Wed, Aug 24, 2016 at 12:38 PM, Petr Spacek  wrote:
>>
>>> On 23.8.2016 18:44, Rakesh Rajasekharan wrote:
>>> > I think thers something seriously wrong with my system
>>> >
>>> > not able to run any  IPA commands
>>> >
>>> > klist
>>> > Ticket cache: KEYRING:persistent:0:0
>>> > Default principal: ad...@xyz.com
>>> >
>>> > Valid starting   Expires  Service principal
>>> > 2016-08-23T16:26:36  2016-08-24T16:26:22  krbtgt/ 
>>> xyz@xyz.com
>>> >
>>> >
>>> > [root@prod-ipa-master-1a :~] ipactl status
>>> > Directory Service: RUNNING
>>> > krb5kdc Service: RUNNING
>>> > kadmin Service: RUNNING
>>> > ipa_memcached Service: RUNNING
>>> > httpd Service: RUNNING
>>> > pki-tomcatd Service: RUNNING
>>> > ipa-otpd Service: RUNNING
>>> > ipa: INFO: The ipactl command was successful
>>> >
>>> >
>>> >
>>> > [root@prod-ipa-master :~] ipa user-find p-testuser
>>> > ipa: ERROR: Kerberos error: ('Unspecified GSS failure.  Minor code may
>>> > provide more information', 851968)/("Cannot contact any KDC for realm '
>>> > XYZ.COM'", -1765328228)
>>>
>>
>> Hi Rakesh,
>>
>> Having a reproducible test case would you rerun the command above.
>> During its processing you may monitor DS process load (top). If it is
>> high, you may get some pstacks of it.
>> Also would you attach the part of DS access logs taken during the command.
>>
>> regards
>> thierry
>>
>> >
>>>
>>> This is weird because the server seems to be up.
>>>
>>> Please follow
>>> http://www.freeipa.org/page/Troubleshooting#Authentication.2FKerberos
>>>
>>> Petr^2 Spacek
>>>
>>> >
>>> >
>>> > Thanks
>>> >
>>> > Rakesh
>>> >
>>> > On Tue, Aug 23, 2016 at 10:01 PM, Rakesh Rajasekharan <
>>> > rakesh.rajasekha...@gmail.com> wrote:
>>> >
>>> >> i changed the loggin level to 4 . Modifying nsslapd-accesslog-level
>>> >>
>>> >> But, the hang is still there. though I dont see the sigfault now
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> On Tue, Aug 23, 2016 at 9:02 PM, Rakesh Rajasekharan <
>>> >> rakesh.rajasekha...@gmail.com> wrote:
>>> >>
>>> >>> My disk was getting filled too fast
>>> >>>
>>> >>> logs under /var/log/dirsrv was coming around 5 gb quickly filling up
>>> >>>
>>> >>> Is there a way to make the logging less verbose
>>> >>>
>>> >>>
>>> >>>
>>> >>> On Tue, Aug 23, 2016 at 6:41 PM, Petr Spacek 
>>> wrote:
>>> >>>
>>>  On 23.8.2016 15:07, Rakesh Rajasekharan wrote:
>>> >

Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-08-29 Thread Rakesh Rajasekharan
Hi Thierry,

Because of the issues, we had to revert to the OpenLDAP setup we were
running earlier in production.

I have now made a few TCP-related changes in sysctl.conf and have also
increased nsslapd-dbcachesize and nsslapd-cachememsize to 200MB.

I will start migrating hosts back to IPA again and see if I hit the
earlier issue.

I will report back once I have something.


Thanks,
Rakesh



On Thu, Aug 25, 2016 at 2:17 PM, thierry bordaz  wrote:

>
>
> On 08/25/2016 10:15 AM, Rakesh Rajasekharan wrote:
>
> All of the troubleshooting seems fine.
>
>
> However, Running libconv.pl gives me this output
>
> - Recommendations -
>
>  1.  You have unindexed components, this can be caused from a search on an
> unindexed attribute, or your returned results exceeded the
> allidsthreshold.  Unindexed components are not recommended. To refuse
> unindexed searches, switch 'nsslapd-require-index' to 'on' under your
> database entry (e.g. cn=UserRoot,cn=ldbm database,cn=plugins,cn=config).
>
>  2.  You have a significant difference between binds and unbinds.  You may
> want to investigate this difference.
>
>
> I feel, this could be a pointer to things going slow.. and IPA hanging. I
> think i now have something that I can try and nail down this issue.
>
> On a sidenote, I was earlier running openldap and migrated over to
> Freeipa,
>
> Thanks
> Rakesh
>
>
>
> On Wed, Aug 24, 2016 at 12:38 PM, Petr Spacek  wrote:
>
>> On 23.8.2016 18:44, Rakesh Rajasekharan wrote:
>> > I think thers something seriously wrong with my system
>> >
>> > not able to run any  IPA commands
>> >
>> > klist
>> > Ticket cache: KEYRING:persistent:0:0
>> > Default principal: ad...@xyz.com
>> >
>> > Valid starting   Expires  Service principal
>> > 2016-08-23T16:26:36  2016-08-24T16:26:22  krbtgt/ 
>> xyz@xyz.com
>> >
>> >
>> > [root@prod-ipa-master-1a :~] ipactl status
>> > Directory Service: RUNNING
>> > krb5kdc Service: RUNNING
>> > kadmin Service: RUNNING
>> > ipa_memcached Service: RUNNING
>> > httpd Service: RUNNING
>> > pki-tomcatd Service: RUNNING
>> > ipa-otpd Service: RUNNING
>> > ipa: INFO: The ipactl command was successful
>> >
>> >
>> >
>> > [root@prod-ipa-master :~] ipa user-find p-testuser
>> > ipa: ERROR: Kerberos error: ('Unspecified GSS failure.  Minor code may
>> > provide more information', 851968)/("Cannot contact any KDC for realm '
>> > XYZ.COM'", -1765328228)
>>
>
> Hi Rakesh,
>
> Having a reproducible test case would you rerun the command above.
> During its processing you may monitor DS process load (top). If it is
> high, you may get some pstacks of it.
> Also would you attach the part of DS access logs taken during the command.
>
> regards
> thierry
>
> >
>>
>> This is weird because the server seems to be up.
>>
>> Please follow
>> http://www.freeipa.org/page/Troubleshooting#Authentication.2FKerberos
>>
>> Petr^2 Spacek
>>
>> >
>> >
>> > Thanks
>> >
>> > Rakesh
>> >
>> > On Tue, Aug 23, 2016 at 10:01 PM, Rakesh Rajasekharan <
>> > rakesh.rajasekha...@gmail.com> wrote:
>> >
>> >> i changed the loggin level to 4 . Modifying nsslapd-accesslog-level
>> >>
>> >> But, the hang is still there. though I dont see the sigfault now
>> >>
>> >>
>> >>
>> >>
>> >> On Tue, Aug 23, 2016 at 9:02 PM, Rakesh Rajasekharan <
>> >> rakesh.rajasekha...@gmail.com> wrote:
>> >>
>> >>> My disk was getting filled too fast
>> >>>
>> >>> logs under /var/log/dirsrv was coming around 5 gb quickly filling up
>> >>>
>> >>> Is there a way to make the logging less verbose
>> >>>
>> >>>
>> >>>
>> >>> On Tue, Aug 23, 2016 at 6:41 PM, Petr Spacek 
>> wrote:
>> >>>
>>  On 23.8.2016 15:07, Rakesh Rajasekharan wrote:
>> > I was able to fix that may be temporarily... when i checked the
>>  network..
>> > there was another process that was running and consuming a lot of
>>  network (
>> > i have no idea who did that. I need to seriously start restricting
>>  people
>> > access to this machine )
>> >
>> > after killing that perfomance improved drastically
>> >
>> > But now, suddenly I started experiencing the same hang.
>> >
>> > This time , I gert the following error when checked dmesg
>> >
>> > [  301.236976] ns-slapd[3124]: segfault at 0 ip 7f1de416951c sp
>> > 7f1dee1dba70 error 4 in libcos-plugin.so[7f1de4166000+b000]
>> > [ 1116.248431] TCP: request_sock_TCP: Possible SYN flooding on port
>> 88.
>> > Sending cookies.  Check SNMP counters.
>> > [11831.397037] ns-slapd[22550]: segfault at 0 ip 7f533d82251c sp
>> > 7f5347894a70 error 4 in libcos-plugin.so[7f533d81f000+b000]
>> > [11832.727989] ns-slapd[22606]: segfault at 0 ip 7f6231eb951c sp
>> > 7f623bf2ba70 error 4 in libcos-plugin.so[7f6231eb6000+b00
>> 
>>  Okay, this one is serious. The LDAP server crashed.
>> 
>>  1. Make sure all your packages are up-to-date.
>> 
>>  Please see
>>  http://directory.fedoraproject.org/docs/389ds/FAQ/faq.html#d
>> 

Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-08-25 Thread thierry bordaz



On 08/25/2016 10:15 AM, Rakesh Rajasekharan wrote:

All of the troubleshooting seems fine.


However, Running libconv.pl  gives me this output

- Recommendations -

 1.  You have unindexed components, this can be caused from a search 
on an unindexed attribute, or your returned results exceeded the 
allidsthreshold.  Unindexed components are not recommended. To refuse 
unindexed searches, switch 'nsslapd-require-index' to 'on' under your 
database entry (e.g. cn=UserRoot,cn=ldbm database,cn=plugins,cn=config).


 2.  You have a significant difference between binds and unbinds.  You 
may want to investigate this difference.



I feel, this could be a pointer to things going slow.. and IPA 
hanging. I think i now have something that I can try and nail down 
this issue.


On a sidenote, I was earlier running openldap and migrated over to 
Freeipa,


Thanks
Rakesh



On Wed, Aug 24, 2016 at 12:38 PM, Petr Spacek wrote:


On 23.8.2016 18:44, Rakesh Rajasekharan wrote:
> I think thers something seriously wrong with my system
>
> not able to run any  IPA commands
>
> klist
> Ticket cache: KEYRING:persistent:0:0
> Default principal: ad...@xyz.com 
>
> Valid starting   Expires  Service principal
> 2016-08-23T16:26:36  2016-08-24T16:26:22  krbtgt/xyz@xyz.com

>
>
> [root@prod-ipa-master-1a :~] ipactl status
> Directory Service: RUNNING
> krb5kdc Service: RUNNING
> kadmin Service: RUNNING
> ipa_memcached Service: RUNNING
> httpd Service: RUNNING
> pki-tomcatd Service: RUNNING
> ipa-otpd Service: RUNNING
> ipa: INFO: The ipactl command was successful
>
>
>
> [root@prod-ipa-master :~] ipa user-find p-testuser
> ipa: ERROR: Kerberos error: ('Unspecified GSS failure.  Minor
code may
> provide more information', 851968)/("Cannot contact any KDC for
realm '
> XYZ.COM '", -1765328228)



Hi Rakesh,

   Since you have a reproducible test case, would you rerun the command above?
   While it is being processed, you may monitor the DS process load (top). If
   it is high, you may take some pstacks of it.
   Also, would you attach the part of the DS access logs taken during the
   command?

   regards
   thierry


>

This is weird because the server seems to be up.

Please follow
http://www.freeipa.org/page/Troubleshooting#Authentication.2FKerberos


Petr^2 Spacek

>
>
> Thanks
>
> Rakesh
>
> On Tue, Aug 23, 2016 at 10:01 PM, Rakesh Rajasekharan <
> rakesh.rajasekha...@gmail.com
> wrote:
>
>> i changed the loggin level to 4 . Modifying nsslapd-accesslog-level
>>
>> But, the hang is still there. though I dont see the sigfault now
>>
>>
>>
>>
>> On Tue, Aug 23, 2016 at 9:02 PM, Rakesh Rajasekharan <
>> rakesh.rajasekha...@gmail.com
> wrote:
>>
>>> My disk was getting filled too fast
>>>
>>> logs under /var/log/dirsrv was coming around 5 gb quickly
filling up
>>>
>>> Is there a way to make the logging less verbose
>>>
>>>
>>>
>>> On Tue, Aug 23, 2016 at 6:41 PM, Petr Spacek
>>> <pspa...@redhat.com> wrote:
>>>
 On 23.8.2016 15:07, Rakesh Rajasekharan wrote:
> I was able to fix that may be temporarily... when i checked the
 network..
> there was another process that was running and consuming a
lot of
 network (
> i have no idea who did that. I need to seriously start
restricting
 people
> access to this machine )
>
> after killing that perfomance improved drastically
>
> But now, suddenly I started experiencing the same hang.
>
> This time , I gert the following error when checked dmesg
>
> [  301.236976] ns-slapd[3124]: segfault at 0 ip
7f1de416951c sp
> 7f1dee1dba70 error 4 in libcos-plugin.so[7f1de4166000+b000]
> [ 1116.248431] TCP: request_sock_TCP: Possible SYN flooding
on port 88.
> Sending cookies.  Check SNMP counters.
> [11831.397037] ns-slapd[22550]: segfault at 0 ip
7f533d82251c sp
> 7f5347894a70 error 4 in libcos-plugin.so[7f533d81f000+b000]
> [11832.727989] ns-slapd[22606]: segfault at 0 ip
7f6231eb951c sp
> 7f623bf2ba70 error 4 in libcos-plugin.so[7f6231eb6000+b00

 Okay, this one is serious. The LDAP server crashed.

 1. Make sure all your packages are up-to-date.

 Please see
 http://directory.fedoraproject.org/docs/389ds/FAQ/faq.html#d


Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-08-25 Thread Rakesh Rajasekharan
All of the troubleshooting seems fine.


However, running logconv.pl gives me this output:

- Recommendations -

 1.  You have unindexed components, this can be caused from a search on an
unindexed attribute, or your returned results exceeded the
allidsthreshold.  Unindexed components are not recommended. To refuse
unindexed searches, switch 'nsslapd-require-index' to 'on' under your
database entry (e.g. cn=UserRoot,cn=ldbm database,cn=plugins,cn=config).

 2.  You have a significant difference between binds and unbinds.  You may
want to investigate this difference.
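The second recommendation can be looked at more closely by counting BIND and UNBIND operations per connection in the access log. The sketch below is an assumption about one way to do that, not a description of how logconv.pl computes its figure; the sample lines are hypothetical and only imitate the `conn=N op=M BIND ...` and `conn=N op=M UNBIND` shape of access log entries.

```python
import re
from collections import Counter

# Matches the connection number and the operation keyword of BIND/UNBIND lines.
OP_RE = re.compile(r"conn=(?P<conn>\d+) op=-?\d+ (?P<op>BIND|UNBIND)\b")

def bind_unbind_summary(lines):
    """Return (bind_count, unbind_count, connections that bound but never unbound)."""
    counts = Counter()
    bound, unbound = set(), set()
    for line in lines:
        match = OP_RE.search(line)
        if match is None:
            continue
        counts[match["op"]] += 1
        target = bound if match["op"] == "BIND" else unbound
        target.add(int(match["conn"]))
    return counts["BIND"], counts["UNBIND"], bound - unbound

# Hypothetical sample lines, for illustration only.
sample = [
    'conn=1 op=0 BIND dn="uid=alice,cn=users,cn=accounts,dc=example,dc=com" method=128 version=3',
    'conn=1 op=1 UNBIND',
    'conn=2 op=0 BIND dn="uid=bob,cn=users,cn=accounts,dc=example,dc=com" method=128 version=3',
    'conn=3 op=0 BIND dn="" method=128 version=3',
]

binds, unbinds, dangling = bind_unbind_summary(sample)
print(f"binds={binds} unbinds={unbinds} connections without UNBIND={sorted(dangling)}")
```

Connections that bind but never unbind are often clients that simply drop the connection, so the list of such connections is a starting point for finding out which clients behave that way.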


I feel this could be a pointer to why things are going slow and IPA is
hanging. I think I now have something that I can use to try and nail down
this issue.

On a side note, I was earlier running OpenLDAP and migrated over to FreeIPA.

Thanks
Rakesh



On Wed, Aug 24, 2016 at 12:38 PM, Petr Spacek  wrote:

> On 23.8.2016 18:44, Rakesh Rajasekharan wrote:
> > I think thers something seriously wrong with my system
> >
> > not able to run any  IPA commands
> >
> > klist
> > Ticket cache: KEYRING:persistent:0:0
> > Default principal: ad...@xyz.com
> >
> > Valid starting   Expires  Service principal
> > 2016-08-23T16:26:36  2016-08-24T16:26:22  krbtgt/xyz@xyz.com
> >
> >
> > [root@prod-ipa-master-1a :~] ipactl status
> > Directory Service: RUNNING
> > krb5kdc Service: RUNNING
> > kadmin Service: RUNNING
> > ipa_memcached Service: RUNNING
> > httpd Service: RUNNING
> > pki-tomcatd Service: RUNNING
> > ipa-otpd Service: RUNNING
> > ipa: INFO: The ipactl command was successful
> >
> >
> >
> > [root@prod-ipa-master :~] ipa user-find p-testuser
> > ipa: ERROR: Kerberos error: ('Unspecified GSS failure.  Minor code may
> > provide more information', 851968)/("Cannot contact any KDC for realm '
> > XYZ.COM'", -1765328228)
> >
>
> This is weird because the server seems to be up.
>
> Please follow
> http://www.freeipa.org/page/Troubleshooting#Authentication.2FKerberos
>
> Petr^2 Spacek
>
> >
> >
> > Thanks
> >
> > Rakesh
> >
> > On Tue, Aug 23, 2016 at 10:01 PM, Rakesh Rajasekharan <
> > rakesh.rajasekha...@gmail.com> wrote:
> >
> >> i changed the loggin level to 4 . Modifying nsslapd-accesslog-level
> >>
> >> But, the hang is still there. though I dont see the sigfault now
> >>
> >>
> >>
> >>
> >> On Tue, Aug 23, 2016 at 9:02 PM, Rakesh Rajasekharan <
> >> rakesh.rajasekha...@gmail.com> wrote:
> >>
> >>> My disk was getting filled too fast
> >>>
> >>> logs under /var/log/dirsrv was coming around 5 gb quickly filling up
> >>>
> >>> Is there a way to make the logging less verbose
> >>>
> >>>
> >>>
> >>> On Tue, Aug 23, 2016 at 6:41 PM, Petr Spacek 
> wrote:
> >>>
>  On 23.8.2016 15:07, Rakesh Rajasekharan wrote:
> > I was able to fix that may be temporarily... when i checked the
>  network..
> > there was another process that was running and consuming a lot of
>  network (
> > i have no idea who did that. I need to seriously start restricting
>  people
> > access to this machine )
> >
> > after killing that perfomance improved drastically
> >
> > But now, suddenly I started experiencing the same hang.
> >
> > This time , I gert the following error when checked dmesg
> >
> > [  301.236976] ns-slapd[3124]: segfault at 0 ip 7f1de416951c sp
> > 7f1dee1dba70 error 4 in libcos-plugin.so[7f1de4166000+b000]
> > [ 1116.248431] TCP: request_sock_TCP: Possible SYN flooding on port
> 88.
> > Sending cookies.  Check SNMP counters.
> > [11831.397037] ns-slapd[22550]: segfault at 0 ip 7f533d82251c sp
> > 7f5347894a70 error 4 in libcos-plugin.so[7f533d81f000+b000]
> > [11832.727989] ns-slapd[22606]: segfault at 0 ip 7f6231eb951c sp
> > 7f623bf2ba70 error 4 in libcos-plugin.so[7f6231eb6000+b00
> 
>  Okay, this one is serious. The LDAP server crashed.
> 
>  1. Make sure all your packages are up-to-date.
> 
>  Please see
>  http://directory.fedoraproject.org/docs/389ds/FAQ/faq.html#debugging-crashes
>  for further instructions on how to debug this.
> 
>  Petr^2 Spacek
> 
> >
> > and in /var/log/dirsrv/example-com/errors
> >
> > [23/Aug/2016:12:49:36 +] DSRetroclPlugin - delete_changerecord:
>  could
> > not delete change record 3291138 (rc: 32)
> > [23/Aug/2016:12:49:36 +] DSRetroclPlugin - delete_changerecord:
>  could
> > not delete change record 3291139 (rc: 32)
> > [23/Aug/2016:12:49:36 +] DSRetroclPlugin - delete_changerecord:
>  could
> > not delete change record 3291140 (rc: 32)
> > [23/Aug/2016:12:49:36 +] DSRetroclPlugin - delete_changerecord:
>  could
> > not delete change record 3291141 (rc: 32)
> > [23/Aug/2016:12:49:36 +] DSRetroclPlugin - delete_changerecord:
>  could
> > not delete change record 3291142 (rc: 32)
> > [23/Aug/2016:12:49:36 +] DSRetroclPlugin - delete_changerecord:
>  coul

Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-08-24 Thread Petr Spacek
On 23.8.2016 18:44, Rakesh Rajasekharan wrote:
> I think thers something seriously wrong with my system
> 
> not able to run any  IPA commands
> 
> klist
> Ticket cache: KEYRING:persistent:0:0
> Default principal: ad...@xyz.com
> 
> Valid starting   Expires  Service principal
> 2016-08-23T16:26:36  2016-08-24T16:26:22  krbtgt/xyz@xyz.com
> 
> 
> [root@prod-ipa-master-1a :~] ipactl status
> Directory Service: RUNNING
> krb5kdc Service: RUNNING
> kadmin Service: RUNNING
> ipa_memcached Service: RUNNING
> httpd Service: RUNNING
> pki-tomcatd Service: RUNNING
> ipa-otpd Service: RUNNING
> ipa: INFO: The ipactl command was successful
> 
> 
> 
> [root@prod-ipa-master :~] ipa user-find p-testuser
> ipa: ERROR: Kerberos error: ('Unspecified GSS failure.  Minor code may
> provide more information', 851968)/("Cannot contact any KDC for realm '
> XYZ.COM'", -1765328228)
> 

This is weird because the server seems to be up.

Please follow
http://www.freeipa.org/page/Troubleshooting#Authentication.2FKerberos

Petr^2 Spacek

> 
> 
> Thanks
> 
> Rakesh

Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-08-23 Thread Rakesh Rajasekharan
I think there's something seriously wrong with my system

I am not able to run any IPA commands

klist
Ticket cache: KEYRING:persistent:0:0
Default principal: ad...@xyz.com

Valid starting   Expires  Service principal
2016-08-23T16:26:36  2016-08-24T16:26:22  krbtgt/xyz@xyz.com


[root@prod-ipa-master-1a :~] ipactl status
Directory Service: RUNNING
krb5kdc Service: RUNNING
kadmin Service: RUNNING
ipa_memcached Service: RUNNING
httpd Service: RUNNING
pki-tomcatd Service: RUNNING
ipa-otpd Service: RUNNING
ipa: INFO: The ipactl command was successful



[root@prod-ipa-master :~] ipa user-find p-testuser
ipa: ERROR: Kerberos error: ('Unspecified GSS failure.  Minor code may
provide more information', 851968)/("Cannot contact any KDC for realm '
XYZ.COM'", -1765328228)



Thanks

Rakesh


Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-08-23 Thread Rakesh Rajasekharan
I changed the logging level to 4 by modifying nsslapd-accesslog-level.

But the hang is still there, though I don't see the segfault now.
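
A change like this can be written in the same LDIF form as the nsslapd-maxdescriptors change quoted in this thread. The sketch below only prints the LDIF; the function name is hypothetical, the value 4 is simply the one reported above, and the bind DN in the comment is an assumption about the installation:

```shell
# Print an LDIF "replace" operation for a single attribute on cn=config.
config_replace_ldif() {
    printf 'dn: cn=config\nchangetype: modify\nreplace: %s\n%s: %s\n' "$1" "$1" "$2"
}

config_replace_ldif nsslapd-accesslog-level 4

# To apply it (assumed bind DN; -W prompts for the password):
#   config_replace_ldif nsslapd-accesslog-level 4 | ldapmodify -x -D "cn=Directory Manager" -W
```

Generating the LDIF separately makes it easy to review the change before it is sent to the server.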





Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-08-23 Thread Rakesh Rajasekharan
My disk was filling up too fast.

The logs under /var/log/dirsrv grew to around 5 GB and were quickly filling the disk.

Is there a way to make the logging less verbose?




Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-08-23 Thread Petr Spacek
On 23.8.2016 15:07, Rakesh Rajasekharan wrote:
> I was able to fix that, maybe temporarily. When I checked the network,
> there was another process running and consuming a lot of bandwidth (I
> have no idea who started it; I need to seriously start restricting people's
> access to this machine).
> 
> After killing that, performance improved drastically.
> 
> But now, suddenly, I have started experiencing the same hang.
> 
> This time, I get the following error when I check dmesg:
> 
> [  301.236976] ns-slapd[3124]: segfault at 0 ip 7f1de416951c sp
> 7f1dee1dba70 error 4 in libcos-plugin.so[7f1de4166000+b000]
> [ 1116.248431] TCP: request_sock_TCP: Possible SYN flooding on port 88.
> Sending cookies.  Check SNMP counters.
> [11831.397037] ns-slapd[22550]: segfault at 0 ip 7f533d82251c sp
> 7f5347894a70 error 4 in libcos-plugin.so[7f533d81f000+b000]
> [11832.727989] ns-slapd[22606]: segfault at 0 ip 7f6231eb951c sp
> 7f623bf2ba70 error 4 in libcos-plugin.so[7f6231eb6000+b00

Okay, this one is serious. The LDAP server crashed.

1. Make sure all your packages are up-to-date.

Please see
http://directory.fedoraproject.org/docs/389ds/FAQ/faq.html#debugging-crashes
for further instructions on how to debug this.
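
As a side note on reading the dmesg lines: "segfault at 0" is the faulting address (a NULL pointer access), "ip" is the instruction pointer, and the bracketed value after the library name is the base address and size of the mapping. Subtracting the base from the ip gives the offset of the crashing instruction inside libcos-plugin.so. A small sketch of that arithmetic, with a function name made up for illustration:

```shell
# Offset of the faulting instruction inside the mapped library:
# instruction pointer minus mapping base (both hex, without the 0x prefix).
fault_offset() {
    printf '%x\n' $(( 0x$1 - 0x$2 ))
}

fault_offset 7f1de416951c 7f1de4166000   # ns-slapd[3124]
fault_offset 7f533d82251c 7f533d81f000   # ns-slapd[22550]
fault_offset 7f6231eb951c 7f6231eb6000   # ns-slapd[22606]
```

All three crashes resolve to the same offset, 351c, which suggests the same code path fails each time. With debug symbols that match the installed package, a tool such as addr2line could map that offset to a source line.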

Petr^2 Spacek


Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-08-23 Thread Rakesh Rajasekharan
I was able to fix that, maybe temporarily. When I checked the network,
there was another process running and consuming a lot of bandwidth (I
have no idea who started it; I need to seriously start restricting people's
access to this machine).

After killing that, performance improved drastically.

But now, suddenly, I have started experiencing the same hang.

This time, I get the following error when I check dmesg:

[  301.236976] ns-slapd[3124]: segfault at 0 ip 7f1de416951c sp
7f1dee1dba70 error 4 in libcos-plugin.so[7f1de4166000+b000]
[ 1116.248431] TCP: request_sock_TCP: Possible SYN flooding on port 88.
Sending cookies.  Check SNMP counters.
[11831.397037] ns-slapd[22550]: segfault at 0 ip 7f533d82251c sp
7f5347894a70 error 4 in libcos-plugin.so[7f533d81f000+b000]
[11832.727989] ns-slapd[22606]: segfault at 0 ip 7f6231eb951c sp
7f623bf2ba70 error 4 in libcos-plugin.so[7f6231eb6000+b00

And in /var/log/dirsrv/example-com/errors:

[23/Aug/2016:12:49:36 +] DSRetroclPlugin - delete_changerecord: could
not delete change record 3291138 (rc: 32)
[23/Aug/2016:12:49:36 +] DSRetroclPlugin - delete_changerecord: could
not delete change record 3291139 (rc: 32)
[23/Aug/2016:12:49:36 +] DSRetroclPlugin - delete_changerecord: could
not delete change record 3291140 (rc: 32)
[23/Aug/2016:12:49:36 +] DSRetroclPlugin - delete_changerecord: could
not delete change record 3291141 (rc: 32)
[23/Aug/2016:12:49:36 +] DSRetroclPlugin - delete_changerecord: could
not delete change record 3291142 (rc: 32)
[23/Aug/2016:12:49:36 +] DSRetroclPlugin - delete_changerecord: could
not delete change record 3291143 (rc: 32)
[23/Aug/2016:12:49:36 +] DSRetroclPlugin - delete_changerecord: could
not delete change record 3291144 (rc: 32)
[23/Aug/2016:12:49:36 +] DSRetroclPlugin - delete_changerecord: could
not delete change record 3291145 (rc: 32)
[23/Aug/2016:12:49:50 +] - Retry count exceeded in delete
[23/Aug/2016:12:49:50 +] DSRetroclPlugin - delete_changerecord: could
not delete change record 3292734 (rc: 51)
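
The rc values in these lines are standard LDAP result codes: 32 is noSuchObject and 51 is busy (RFC 4511). A rough sketch for tallying them from the errors log; both function names are hypothetical, and only the codes that appear in this thread are named:

```shell
# Map an LDAP result code to its RFC 4511 name (only the codes seen here).
ldap_rc_name() {
    case "$1" in
        0)  echo success ;;
        32) echo noSuchObject ;;
        51) echo busy ;;
        *)  echo "unknown($1)" ;;
    esac
}

# Count "(rc: N)" occurrences in log text read from stdin.
tally_rc() {
    grep -o '(rc: [0-9]*)' | tr -dc '0-9\n' | sort -n | uniq -c |
    while read -r count rc; do
        echo "$count x rc=$rc ($(ldap_rc_name "$rc"))"
    done
}

printf '%s\n' 'not delete change record 3291145 (rc: 32)' \
              'not delete change record 3292734 (rc: 51)' | tally_rc

# Against the real log: tally_rc < /var/log/dirsrv/example-com/errors
```

The pattern matches the "(rc: N)" suffix on its own, so it works even though the log lines are wrapped in this mail.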


Can I do something about this error? I tried restarting IPA a couple of
times, but that did not help.

Thanks
Rakesh


Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-08-22 Thread Petr Spacek
On 19.8.2016 19:32, Rakesh Rajasekharan wrote:
> I am running my setup on the AWS cloud, and entropy is low, at around 180.
> 
> I plan to increase it by installing haveged. But could low entropy by any
> chance cause this issue of intermittent hangs?
> Also, the hang is mostly observed when registering around 20 clients
> together.

Possibly, I'm not sure. If you want to dig into this, I would do this:
1. look at what process hangs on the client (using the pstree command or so)
$ pstree

2. look at what server and port the hanging client is connected to
$ lsof -p <PID>

3. jump to the server and see what process is bound to the target port
$ netstat -pn

4. see where the process is hanging
$ strace -p <PID>
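
The steps above can be sketched as a small helper plus a few commands. This is only an illustration: the function name and the CLIENT_PID and SERVER_PID variables are placeholders, not values from this thread, and the extra flags (-nP -a -i, -tl, -f) are additions to the basic commands listed above.

```shell
# Step 2 helper: read lsof output on stdin and print the remote "host:port"
# of every connection (lsof shows them as local->remote).
remote_endpoints() {
    awk '{ for (i = 1; i <= NF; i++)
               if ($i ~ /->/) { split($i, a, "->"); print a[2] } }' | sort -u
}

# Intended use (not executed here):
#   lsof -nP -a -i -p "$CLIENT_PID" | remote_endpoints   # on the client
#   netstat -tlnp                                        # on the server: listeners and their PIDs
#   strace -f -p "$SERVER_PID"                           # on the server: where the process blocks
```

Reducing the lsof output to unique remote endpoints makes it quick to see which server port to look up in step 3.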

I hope it helps.

Petr^2 Spacek

> On Fri, Aug 19, 2016 at 7:24 PM, Rakesh Rajasekharan <
> rakesh.rajasekha...@gmail.com> wrote:
> 
>> Yes, there seems to be something that's worrying. I have faced this today
>> as well.
>> There are a few hosts left, around 280, and when I try adding them to IPA,
>> the slowness begins.
>>
>> All the ipa commands, like ipa user-find, become very slow in
>> responding.
>>
>> The SYN_RECV connections are not many though, just around 80-90, and today
>> that was around 20 only.
>>
>>
>> I have for now increased tcp_max_syn_backlog to 5000.
>> For now the slowness seems to have gone, but I will try adding the
>> clients again tomorrow and see how it goes.
>>
>> Thanks
>> Rakesh
>>
>> The issues
>>
>> On Fri, Aug 19, 2016 at 12:58 PM, Petr Spacek  wrote:
>>
>>> On 18.8.2016 17:23, Rakesh Rajasekharan wrote:
>>>> Hi
>>>>
>>>> I am migrating to freeipa from openldap and have around 4000 clients
>>>>
>>>> I had opened another thread on that, but chose to start a new one here
>>>> as it's a separate issue
>>>>
>>>> I was able to change nsslapd-maxdescriptors by adding an ldif file
>>>>
>>>> cat nsslapd-modify.ldif
>>>> dn: cn=config
>>>> changetype: modify
>>>> replace: nsslapd-maxdescriptors
>>>> nsslapd-maxdescriptors: 17000
>>>>
>>>> and running the ldapmodify command
>>>>
>>>> I have now started moving clients running openldap to Freeipa and have
>>>> today moved close to 2000 clients
>>>>
>>>> However, I have noticed that IPA hangs intermittently.
>>>>
>>>> running a kinit admin returns the below error
>>>> kinit: Generic error (see e-text) while getting initial credentials
>>>>
>>>> from the /var/log/messages, I see this entry
>>>>
>>>>  prod-ipa-master-int kernel: [104090.315801] TCP: request_sock_TCP:
>>>> Possible SYN flooding on port 88. Sending cookies.  Check SNMP counters.
>>>
>>> I would be worried about this message. Maybe kernel/firewall is doing
>>> something fishy behind your back and blocking some connections or so.
>>>
>>> Petr^2 Spacek
>>>
>>>
>>>> Aug 18 13:00:01 prod-ipa-master-int systemd[1]: Started Session 4885 of
>>>> user root.
>>>> Aug 18 13:00:01 prod-ipa-master-int systemd[1]: Starting Session 4885 of
>>>> user root.
>>>> Aug 18 13:01:01 prod-ipa-master-int systemd[1]: Started Session 4886 of
>>>> user root.
>>>> Aug 18 13:01:01 prod-ipa-master-int systemd[1]: Starting Session 4886 of
>>>> user root.
>>>> Aug 18 13:02:40 prod-ipa-master-int python[28984]: ansible-command Invoked
>>>> with creates=None executable=None shell=True args= removes=None warn=True
>>>> chdir=None
>>>> Aug 18 13:04:37 prod-ipa-master-int sssd_be: GSSAPI Error: Unspecified GSS
>>>> failure.  Minor code may provide more information (KDC returned error
>>>> string: PROCESS_TGS)
>>>>
>>>> Could it be possible that its due to the initial load of adding the clients
>>>> or is there something else that I need to take care of.

-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-08-19 Thread Rakesh Rajasekharan
I am running my setup on the AWS cloud, and entropy is low, at around 180.

I plan to increase it by installing haveged. But could low entropy by any
chance cause this issue of intermittent hangs?
Also, the hang is mostly observed when registering around 20 clients
together.
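
For anyone wanting to check the same figure: on Linux the available entropy is exposed under /proc. A minimal sketch, where the function names are made up and the threshold of 1000 is an arbitrary illustration rather than a documented limit:

```shell
# Current size of the kernel entropy pool, in bits (Linux).
entropy_avail() {
    cat /proc/sys/kernel/random/entropy_avail
}

# Succeeds when the given value is below a threshold (default 1000, assumed).
is_low_entropy() {
    [ "$1" -lt "${2:-1000}" ]
}

if is_low_entropy 180; then
    echo "entropy 180 is below the example threshold"
fi

# On a live host: is_low_entropy "$(entropy_avail)" && echo "low entropy"
```

Running the check before and after installing haveged would show whether the pool actually fills up.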

On Fri, Aug 19, 2016 at 7:24 PM, Rakesh Rajasekharan <
rakesh.rajasekha...@gmail.com> wrote:

> Yes, there seems to be something that's worrying. I have faced this today
> as well.
> There are a few hosts left, around 280, and when I try adding them to IPA,
> the slowness begins.
>
> All the ipa commands, like ipa user-find, become very slow in
> responding.
>
> The SYN_RECV connections are not many though, just around 80-90, and today
> that was around 20 only.
>
>
> I have for now increased tcp_max_syn_backlog to 5000.
> For now the slowness seems to have gone, but I will try adding the
> clients again tomorrow and see how it goes.
>
> Thanks
> Rakesh
>
> The issues
>
> On Fri, Aug 19, 2016 at 12:58 PM, Petr Spacek  wrote:
>
>> On 18.8.2016 17:23, Rakesh Rajasekharan wrote:
>> > Hi
>> >
>> > I am migrating to freeipa from openldap and have around 4000 clients
>> >
>> > I had openned a another thread on that, but chose to start a new one
>> here
>> > as its a separate issue
>> >
>> > I was able to change the nssslapd-maxdescriptors adding an ldif file
>> >
>> > cat nsslapd-modify.ldif
>> > dn: cn=config
>> > changetype: modify
>> > replace: nsslapd-maxdescriptors
>> > nsslapd-maxdescriptors: 17000
>> >
>> > and running the ldapmodify command
>> >
>> > I have now started moving clients running an openldap to Freeipa and
>> have
>> > today moved close to 2000 clients
>> >
>> > However, I have noticed that IPA hangs intermittently.
>> >
>> > running a kinit admin returns the below error
>> > kinit: Generic error (see e-text) while getting initial credentials
>> >
>> > from the /var/log/messages, I see this entry
>> >
>> >  prod-ipa-master-int kernel: [104090.315801] TCP: request_sock_TCP:
>> > Possible SYN flooding on port 88. Sending cookies.  Check SNMP counters.
>>
>> I would be worried about this message. Maybe kernel/firewall is doing
>> something fishy behind your back and blocking some connections or so.
>>
>> Petr^2 Spacek
>>
>>
>> > Aug 18 13:00:01 prod-ipa-master-int systemd[1]: Started Session 4885 of
>> > user root.
>> > Aug 18 13:00:01 prod-ipa-master-int systemd[1]: Starting Session 4885 of
>> > user root.
>> > Aug 18 13:01:01 prod-ipa-master-int systemd[1]: Started Session 4886 of
>> > user root.
>> > Aug 18 13:01:01 prod-ipa-master-int systemd[1]: Starting Session 4886 of
>> > user root.
>> > Aug 18 13:02:40 prod-ipa-master-int python[28984]: ansible-command
>> Invoked
>> > with creates=None executable=None shell=True args= removes=None
>> warn=True
>> > chdir=None
>> > Aug 18 13:04:37 prod-ipa-master-int sssd_be: GSSAPI Error: Unspecified
>> GSS
>> > failure.  Minor code may provide more information (KDC returned error
>> > string: PROCESS_TGS)
>> >
>> > Could it be possible that its due to the initial load of adding the
>> clients
>> > or is there something else that I need to take care of.
>> >
>> > Thanks,
>> >
>> > Rakesh
>>
>> --
>> Manage your subscription for the Freeipa-users mailing list:
>> https://www.redhat.com/mailman/listinfo/freeipa-users
>> Go to http://freeipa.org for more info on the project
>>
>
>
-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-08-19 Thread Rakesh Rajasekharan
Yes, there does seem to be something worrying. I faced this again today.
There are around 280 hosts left, and when I try adding them to IPA,
the slowness begins.

All the ipa commands, such as ipa user-find, become very slow to
respond.

There are not many connections in SYN_RECV, though: usually around 80-90,
and today only around 20.


I have for now increased tcp_max_syn_backlog to 5000.
The slowness seems to have gone for now, but I will try adding the
clients again tomorrow and see how it goes.
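
One way to track the SYN_RECV figures quoted above is to count half-open connections per local port. The sketch below assumes the /proc/net/tcp text format (IPv4 only), where the fourth column holds the hex state code and 03 means SYN_RECV; the function name is made up for illustration.

```python
from collections import Counter

TCP_SYN_RECV = "03"  # hex state code for SYN_RECV in /proc/net/tcp


def count_syn_recv(proc_net_tcp_text: str) -> Counter:
    """Count SYN_RECV entries per local port in /proc/net/tcp-style text."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        local_address, state = fields[1], fields[3]
        if state == TCP_SYN_RECV:
            # local_address looks like "0100007F:0058"; the port is hex.
            port = int(local_address.rsplit(":", 1)[1], 16)
            counts[port] += 1
    return counts


if __name__ == "__main__":
    with open("/proc/net/tcp") as handle:
        for port, total in sorted(count_syn_recv(handle.read()).items()):
            print(f"port {port}: {total} connections in SYN_RECV")
```

Comparing the count on port 88 with the current value in /proc/sys/net/ipv4/tcp_max_syn_backlog gives a rough idea of how close the SYN queue is to its limit.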

Thanks
Rakesh


Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-08-19 Thread Petr Spacek
On 18.8.2016 17:23, Rakesh Rajasekharan wrote:
> Hi
> 
> I am migrating to FreeIPA from OpenLDAP and have around 4000 clients.
>
> I had opened another thread on that, but chose to start a new one here
> as it's a separate issue.
>
> I was able to change nsslapd-maxdescriptors by adding an LDIF file:
>
> cat nsslapd-modify.ldif
> dn: cn=config
> changetype: modify
> replace: nsslapd-maxdescriptors
> nsslapd-maxdescriptors: 17000
>
> and running the ldapmodify command.
>
> I have now started moving clients from OpenLDAP to FreeIPA and have
> moved close to 2000 clients today.
>
> However, I have noticed that IPA hangs intermittently.
>
> Running kinit admin returns the error below:
> kinit: Generic error (see e-text) while getting initial credentials
>
> In /var/log/messages, I see these entries:
> 
>  prod-ipa-master-int kernel: [104090.315801] TCP: request_sock_TCP:
> Possible SYN flooding on port 88. Sending cookies.  Check SNMP counters.

I would be worried about this message. Maybe the kernel or a firewall is
doing something fishy behind your back and blocking some connections.

Petr^2 Spacek


> Aug 18 13:00:01 prod-ipa-master-int systemd[1]: Started Session 4885 of
> user root.
> Aug 18 13:00:01 prod-ipa-master-int systemd[1]: Starting Session 4885 of
> user root.
> Aug 18 13:01:01 prod-ipa-master-int systemd[1]: Started Session 4886 of
> user root.
> Aug 18 13:01:01 prod-ipa-master-int systemd[1]: Starting Session 4886 of
> user root.
> Aug 18 13:02:40 prod-ipa-master-int python[28984]: ansible-command Invoked
> with creates=None executable=None shell=True args= removes=None warn=True
> chdir=None
> Aug 18 13:04:37 prod-ipa-master-int sssd_be: GSSAPI Error: Unspecified GSS
> failure.  Minor code may provide more information (KDC returned error
> string: PROCESS_TGS)
> 
> Could this be due to the initial load of adding the clients, or is
> there something else that I need to take care of?
> 
> Thanks,
> 
> Rakesh



[Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-08-18 Thread Rakesh Rajasekharan
Hi

I am migrating to FreeIPA from OpenLDAP and have around 4000 clients.

I had opened another thread on that, but chose to start a new one here
as it's a separate issue.

I was able to change nsslapd-maxdescriptors by adding an LDIF file:

cat nsslapd-modify.ldif
dn: cn=config
changetype: modify
replace: nsslapd-maxdescriptors
nsslapd-maxdescriptors: 17000

and running the ldapmodify command
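
The exact ldapmodify invocation is not shown in the message. As a sketch only, the LDIF above and a plausible command line could be generated as follows; the bind DN "cn=Directory Manager" and the choice of simple bind with a password prompt (-x, -W) are assumptions, not details confirmed in this thread.

```python
from typing import List


def build_maxdescriptors_ldif(value: int) -> str:
    """Return an LDIF modify record that replaces nsslapd-maxdescriptors."""
    lines = [
        "dn: cn=config",
        "changetype: modify",
        "replace: nsslapd-maxdescriptors",
        f"nsslapd-maxdescriptors: {value}",
    ]
    return "\n".join(lines) + "\n"


def ldapmodify_argv(ldif_path: str,
                    bind_dn: str = "cn=Directory Manager") -> List[str]:
    """Build an argument list for ldapmodify (assumed bind DN, simple bind)."""
    # -x: simple authentication, -D: bind DN, -W: prompt for the password,
    # -f: read the modifications from the given LDIF file.
    return ["ldapmodify", "-x", "-D", bind_dn, "-W", "-f", ldif_path]


if __name__ == "__main__":
    print(build_maxdescriptors_ldif(17000), end="")
    print(" ".join(ldapmodify_argv("nsslapd-modify.ldif")))
```

Building the argument list instead of a shell string keeps the bind DN, which contains a space, as a single argument if the command is later passed to subprocess.run.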

I have now started moving clients from OpenLDAP to FreeIPA and have
moved close to 2000 clients today.

However, I have noticed that IPA hangs intermittently.

Running kinit admin returns the error below:
kinit: Generic error (see e-text) while getting initial credentials

In /var/log/messages, I see these entries:

 prod-ipa-master-int kernel: [104090.315801] TCP: request_sock_TCP:
Possible SYN flooding on port 88. Sending cookies.  Check SNMP counters.
Aug 18 13:00:01 prod-ipa-master-int systemd[1]: Started Session 4885 of
user root.
Aug 18 13:00:01 prod-ipa-master-int systemd[1]: Starting Session 4885 of
user root.
Aug 18 13:01:01 prod-ipa-master-int systemd[1]: Started Session 4886 of
user root.
Aug 18 13:01:01 prod-ipa-master-int systemd[1]: Starting Session 4886 of
user root.
Aug 18 13:02:40 prod-ipa-master-int python[28984]: ansible-command Invoked
with creates=None executable=None shell=True args= removes=None warn=True
chdir=None
Aug 18 13:04:37 prod-ipa-master-int sssd_be: GSSAPI Error: Unspecified GSS
failure.  Minor code may provide more information (KDC returned error
string: PROCESS_TGS)

Could this be due to the initial load of adding the clients, or is
there something else that I need to take care of?
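
The kernel message above suggests checking the SNMP counters. A minimal sketch of doing so, assuming the usual /proc/net/netstat layout in which a "TcpExt:" header line of counter names is followed by a "TcpExt:" line of values; the function name is hypothetical.

```python
def parse_tcpext(netstat_text: str) -> dict:
    """Parse the TcpExt name/value line pair from /proc/net/netstat text."""
    rows = [line for line in netstat_text.splitlines()
            if line.startswith("TcpExt:")]
    if len(rows) < 2:
        return {}
    names = rows[0].split()[1:]   # first TcpExt line lists counter names
    values = rows[1].split()[1:]  # second TcpExt line lists their values
    return dict(zip(names, (int(value) for value in values)))


if __name__ == "__main__":
    with open("/proc/net/netstat") as handle:
        counters = parse_tcpext(handle.read())
    # Counters related to SYN cookies and a full listen queue.
    for name in ("SyncookiesSent", "SyncookiesRecv", "SyncookiesFailed",
                 "ListenOverflows", "ListenDrops"):
        print(f"{name}: {counters.get(name, 'n/a')}")
```

If ListenOverflows or ListenDrops keep growing while clients are being enrolled, that would point to a listen queue filling up on the server rather than to an actual attack.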

Thanks,

Rakesh