[Freeipa-users] Timeout (?) issues

2013-09-16 Thread KodaK
Yet another AIX related problem:

The AIX LDAP client is called secldapclntd (sure, they could make it more
awkward, but the budget ran out.)  I'm running into the issue detailed here:

http://www-01.ibm.com/support/docview.wss?uid=isg1IV11344

"If an LDAP server fails to answer an LDAP query, secldapclntd caches
the non-answered query negatively. This may happen if the LDAP server is
down for example.
After the LDAP server is back again secldapclntd will use the negative
cache entry and the application initiating the original query will still
fail until the cache entry expires."

IBM is working on porting the fix to our specific TL and SP levels.

What I'm concerned with here, though, is *why* is it timing out?  I don't
know what the current timeout values are (AIX sucks, etc.)

I don't see timeout issues on my Linux boxes, which leads me to believe
that either the sssd timeouts are longer or that sssd is just more robust
when dealing with timeouts.

I believe I'm seeing similar behavior with LDAP sudo on AIX as well,
because I occasionally have to re-run sudo commands because they initially
fail (and I know I'm using the right passwords.)  However, sudo doesn't
appear to have a cache (or it handles caching better.)

Does anyone have any troubleshooting suggestions?  Any general "speed
things up" suggestions on the IPA side?

Thanks,

--Jason

-- 
The government is going to read our mail anyway, might as well make it
tough for them.  GPG Public key ID:  B6A1A7C6
___
Freeipa-users mailing list
Freeipa-users@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-users

Re: [Freeipa-users] Timeout (?) issues

2013-09-16 Thread Dmitri Pal

Is the server FreeIPA?
Can you see in the server logs what is actually happening? Is it the
server that really takes time, or is there a network connectivity issue,
or is a firewall dropping packets?
I would really start with the server side logs.


-- 
Thank you,
Dmitri Pal

Sr. Engineering Manager for IdM portfolio
Red Hat Inc.



Re: [Freeipa-users] Timeout (?) issues

2013-09-17 Thread Rich Megginson

On 09/16/2013 07:57 PM, Dmitri Pal wrote:

> Is the server FreeIPA?
> Can you see in the server logs what is actually happening? Is it the
> server that really takes time, or is there a network connectivity issue,
> or is a firewall dropping packets?
> I would really start with the server side logs.


As far as 389 goes, run logconv.pl against the access logs in 
/var/log/dirsrv/slapd-DOMAIN-COM
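
For a quick first look before (or alongside) a full logconv.pl run, the result codes in the access logs can be tallied with standard tools. This is only a rough sketch: it assumes the usual access log layout in which every completed operation logs a line containing "RESULT err=<code>", and the function name tally_result_codes is made up for illustration. logconv.pl reports far more than this.

```shell
#!/bin/sh
# Rough tally of LDAP result codes in 389 DS access logs.
# Assumption: each completed operation logs "RESULT err=<code>" on one line.
tally_result_codes() {
    # $@: one or more access log files
    grep -h -o 'RESULT err=[0-9]*' "$@" \
        | sed 's/^RESULT //' \
        | sort \
        | uniq -c \
        | sort -rn
}

# Example call (the instance directory is the placeholder used in this thread):
# tally_result_codes /var/log/dirsrv/slapd-DOMAIN-COM/access*
```

The output is one line per result code, most frequent first, which makes an outlier such as a flood of err=32 easy to spot.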




Re: [Freeipa-users] Timeout (?) issues

2013-09-19 Thread KodaK
SRV records were missing for _ldaps._tcp.  I added them in for the IPA
servers and that knocked out some of the errors, but there are still a lot.
I suspect these boxes are overloaded with bad DNS queries (probably due to
something I've messed up.)

Any help would be appreciated, but I'm opening a RH ticket.

Thanks,

--Jason



Re: [Freeipa-users] Timeout (?) issues

2013-09-19 Thread KodaK
Well, this is awkward:

[root@slpidml01 slapd-UNIX-xxx-COM]# grep conn=170902 access* | wc -l
5453936
[root@slpidml01 slapd-UNIX-xxx-COM]#



Re: [Freeipa-users] Timeout (?) issues

2013-09-19 Thread KodaK
I didn't realize that DNS created one connection.  I thought it was one
connection spanning several days.


On Thu, Sep 19, 2013 at 2:51 PM, Rich Megginson  wrote:

>  On 09/19/2013 12:57 PM, KodaK wrote:
>
> Well, this is awkward:
>
>  [root@slpidml01 slapd-UNIX-xxx-COM]# grep conn=170902 access* | wc -l
> 5453936
> [root@slpidml01 slapd-UNIX-xxx-COM]#
>
>
> Why is it awkward?

Re: [Freeipa-users] Timeout (?) issues

2013-09-19 Thread Rich Megginson

On 09/19/2013 12:57 PM, KodaK wrote:
> Well, this is awkward:
>
> [root@slpidml01 slapd-UNIX-xxx-COM]# grep conn=170902 access* | wc -l
> 5453936
> [root@slpidml01 slapd-UNIX-xxx-COM]#

Why is it awkward?





Re: [Freeipa-users] Timeout (?) issues

2013-09-19 Thread KodaK
This is ridiculous, right?

IPA server 1:

# for f in access*; do printf '%s: ' "$f"; grep -c 'err=32' "$f"; done
access: 248478
access.20130916-043207: 302774
access.20130916-123642: 272572
access.20130916-201516: 294308
access.20130917-081053: 295060
access.20130917-144559: 284498
access.20130917-231435: 281035
access.20130918-091611: 291165
access.20130918-154945: 275792
access.20130919-014322: 296113

IPA server 2:

access: 4313
access.20130909-200216: 4023
access.20130910-200229: 4161
access.20130911-200239: 4182
access.20130912-200249: 5069
access.20130913-200258: 3833
access.20130914-200313: 4208
access.20130915-200323: 4702
access.20130916-200332: 4532


IPA server 3:

access: 802
access.20130910-080737: 3876
access.20130911-080748: 3902
access.20130912-080802: 3678
access.20130913-080810: 3765
access.20130914-080826: 3524
access.20130915-080907: 4142
access.20130916-080916: 4930
access.20130917-080926: 4769
access.20130918-081005: 2879

IPA server 4:

access: 2812
access.20130910-003051: 4095
access.20130911-003105: 3623
access.20130912-003113: 3606
access.20130913-003125: 3581
access.20130914-003135: 3758
access.20130915-003150: 3935
access.20130916-003159: 4184
access.20130917-003210: 3859
access.20130918-003221: 5110


The vast majority of the err=32 messages are DNS entries.

Here are some samples:

[19/Sep/2013:18:19:51 -0500] conn=9 op=169764 SRCH base="idnsName=xxx.com,idnsname=unix.xxx.com,cn=dns,dc=unix,dc=xxx,dc=com" scope=0 filter="(objectClass=idnsRecord)" attrs=ALL
[19/Sep/2013:18:19:51 -0500] conn=9 op=169764 RESULT err=32 tag=101 nentries=0 etime=0

[19/Sep/2013:18:19:51 -0500] conn=9 op=169774 SRCH base="idnsName=slpoxacl01.unix.xxx.com,idnsname=unix.xxx.com,cn=dns,dc=unix,dc=xxx,dc=com" scope=0 filter="(objectClass=idnsRecord)" attrs=ALL
[19/Sep/2013:18:19:51 -0500] conn=9 op=169774 RESULT err=32 tag=101 nentries=0 etime=0

[19/Sep/2013:18:19:51 -0500] conn=9 op=169770 SRCH base="idnsName=sla400q1.unix.xxx.com,idnsname=unix.xxx.com,cn=dns,dc=unix,dc=xxx,dc=com" scope=0 filter="(objectClass=idnsRecord)" attrs=ALL
[19/Sep/2013:18:19:51 -0500] conn=9 op=169770 RESULT err=32 tag=101 nentries=0 etime=0

[19/Sep/2013:18:19:51 -0500] conn=9 op=169772 SRCH base="idnsName=magellanhealth.com,idnsname=unix.magellanhealth.com,cn=dns,dc=unix,dc=magellanhealth,dc=com" scope=0 filter="(objectClass=idnsRecord)" attrs=ALL
[19/Sep/2013:18:19:51 -0500] conn=9 op=169772 RESULT err=32 tag=101 nentries=0 etime=0

So far today there are over half a million of these.  That can't be right.




Re: [Freeipa-users] Timeout (?) issues

2013-09-19 Thread KodaK
Thanks.  I've been running that against my logs, and this has to be
abnormal:

err=32   129274    No Such Object
err=0     10952    Successful Operations
err=14      536    SASL Bind in Progress
err=53       39    Unwilling To Perform
err=49        3    Invalid Credentials (Bad Password)

I'm still trying to figure out why there are so many error 32s.  Are there
any usual suspects I should know about?  (That's just the current access
log, btw.)



Re: [Freeipa-users] Timeout (?) issues

2013-09-20 Thread Petr Spacek

On 20.9.2013 01:24, KodaK wrote:

> This is ridiculous, right?
>
> The vast majority of the err=32 messages are DNS entries.


It depends on your setup. bind-dyndb-ldap does an LDAP search for each 
non-existent name to verify that the name wasn't added to LDAP in the meantime. 
If you have clients doing 1M queries for non-existent names per day, then you 
will see 1M LDAP queries with err=32 per day.


The next major version of bind-dyndb-ldap will have a reworked internal database 
and will support negative caching, so the number of err=32 results should drop 
significantly.
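
To illustrate what negative caching means in this context, here is a minimal Python sketch. It is not bind-dyndb-ldap's implementation; the names `NegativeCache`, `lookup` and `backend_search`, and the idea of a fixed TTL, are assumptions made for the example. The point is only that a remembered "no such name" answer lets repeated queries for the same missing name skip the LDAP search until the entry expires.

```python
import time


class NegativeCache:
    """Remember names that were recently found to be missing (hypothetical helper)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._expires = {}  # name -> time after which the entry is stale

    def is_known_missing(self, name):
        expiry = self._expires.get(name)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            # Entry is stale: forget it so the backend is asked again.
            del self._expires[name]
            return False
        return True

    def remember_missing(self, name):
        self._expires[name] = self.clock() + self.ttl


def lookup(name, cache, backend_search):
    """Return the record for name, or None; skip the backend on a cached miss."""
    if cache.is_known_missing(name):
        return None
    record = backend_search(name)  # stands in for the LDAP search
    if record is None:
        cache.remember_missing(name)
    return record
```

With a cache like this, a client that asks for the same non-existent name many times within the TTL would cause one backend search instead of one per query. The trade-off is that a name added to LDAP during the TTL stays invisible until the entry expires.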



Here are some samples:

[19/Sep/2013:18:19:51 -0500] conn=9 op=169764 SRCH base="idnsName=xxx.com,idnsname=unix.xxx.com,cn=dns,dc=unix,dc=xxx,dc=com" scope=0 filter="(objectClass=idnsRecord)" attrs=ALL
[19/Sep/2013:18:19:51 -0500] conn=9 op=169764 RESULT err=32 tag=101 nentries=0 etime=0


This is interesting, because this LDAP query is equivalent to a DNS query for 
"xxx.com.unix.xxx.com." Are your clients that crazy? :-)



[19/Sep/2013:18:19:51 -0500] conn=9 op=169774 SRCH base="idnsName=slpoxacl01.unix.xxx.com,idnsname=unix.xxx.com,cn=dns,dc=unix,dc=xxx,dc=com" scope=0 filter="(objectClass=idnsRecord)" attrs=ALL
[19/Sep/2013:18:19:51 -0500] conn=9 op=169774 RESULT err=32 tag=101 nentries=0 etime=0


This is equivalent to a DNS query for "slpoxacl01.unix.xxx.com.unix.xxx.com.".


[19/Sep/2013:18:19:51 -0500] conn=9 op=169770 SRCH base="idnsName=sla400q1.unix.xxx.com,idnsname=unix.xxx.com,cn=dns,dc=unix,dc=xxx,dc=com" scope=0 filter="(objectClass=idnsRecord)" attrs=ALL
[19/Sep/2013:18:19:51 -0500] conn=9 op=169770 RESULT err=32 tag=101 nentries=0 etime=0


And this is "sla400q1.unix.xxx.com.unix.xxx.com.".
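
The mapping from an access-log search base to the DNS name being asked for can be sketched in Python as below. This is only an illustration of the reading given above (record `idnsName` followed by zone `idnsName`); the function names and the regular expression are assumptions, and the DN is split naively on commas, which would mishandle escaped commas in a real DN.

```python
import re

# Matches the search base in a 389-ds style access log SRCH line.
BASE_RE = re.compile(r'SRCH base="([^"]*)"')


def dns_name_from_base(base_dn):
    """Join the idnsName RDN values of a search base into one FQDN."""
    labels = []
    for rdn in base_dn.split(","):  # naive split, assumes no escaped commas
        key, _, value = rdn.strip().partition("=")
        if key.lower() == "idnsname":
            labels.append(value.strip().rstrip("."))
    return ".".join(labels) + "." if labels else None


def dns_name_from_log_line(line):
    """Extract the queried DNS name from a SRCH log line, or None."""
    match = BASE_RE.search(line)
    return dns_name_from_base(match.group(1)) if match else None


line = ('[19/Sep/2013:18:19:51 -0500] conn=9 op=169770 SRCH '
        'base="idnsName=sla400q1.unix.xxx.com,idnsname=unix.xxx.com,'
        'cn=dns,dc=unix,dc=xxx,dc=com" scope=0 '
        'filter="(objectClass=idnsRecord)" attrs=ALL')
print(dns_name_from_log_line(line))  # sla400q1.unix.xxx.com.unix.xxx.com.
```

Feeding every SRCH line of an access log through `dns_name_from_log_line` and counting the results (for example with `collections.Counter`) would show which names account for most of the err=32 results.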


[19/Sep/2013:18:19:51 -0500] conn=9 op=169772 SRCH base="idnsName=magellanhealth.com,idnsname=unix.magellanhealth.com,cn=dns,dc=unix,dc=magellanhealth,dc=com" scope=0 filter="(objectClass=idnsRecord)" attrs=ALL
[19/Sep/2013:18:19:51 -0500] conn=9 op=169772 RESULT err=32 tag=101 nentries=0 etime=0

So far today there are over half a million of these.  That can't be right.


I would recommend using a network sniffer to check which clients send 
these crazy queries.


My guess is that your resolver library (libc?) causes this.

On my Linux system with glibc-2.17-14.fc19.x86_64 it behaves in this way:

client query = nonexistent.example.com.
(I used $ "ping nonexistent.example.com.")
search domain in /etc/resolv.conf = brq.redhat.com.

DNS query #1: nonexistent.example.com. => NXDOMAIN
DNS query #2: nonexistent.example.com.brq.redhat.com. => NXDOMAIN
DNS query #3: nonexistent.example.com.redhat.com. => NXDOMAIN
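
A simplified model of this search-list behaviour, written in Python, may help when reading the logs. It is a sketch under stated assumptions, not glibc's resolver logic: the function name `candidate_queries` is made up, the `ndots` rule follows the description in the resolv.conf man page, and the second search entry (`redhat.com`) is assumed here only to reproduce the three queries listed above. Real resolvers differ in details such as trailing-dot handling.

```python
def candidate_queries(name, search_domains, ndots=1):
    """List the FQDNs a stub resolver might try for name (simplified model)."""
    name = name.rstrip(".")
    absolute = name + "."
    with_suffix = [f"{name}.{domain.strip('.')}." for domain in search_domains]
    if name.count(".") >= ndots:
        # Enough dots: try the name as given first, then the search list.
        return [absolute] + with_suffix
    # Too few dots: try the search list first, the bare name last.
    return with_suffix + [absolute]


for query in candidate_queries("nonexistent.example.com",
                               ["brq.redhat.com", "redhat.com"]):
    print(query)
# nonexistent.example.com.
# nonexistent.example.com.brq.redhat.com.
# nonexistent.example.com.redhat.com.
```

Under the same model, a failed lookup of `slpoxacl01.unix.xxx.com` on a client whose search list contains `unix.xxx.com` would also produce a query for `slpoxacl01.unix.xxx.com.unix.xxx.com.`, which matches the pattern in the access log samples.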


On Thu, Sep 19, 2013 at 3:05 PM, KodaK  wrote:


I didn't realize that DNS created one connection.  I thought it was one
connection spanning several days.


In theory, there should be 2-4 LDAP connections from each DNS server, and those 
connections should live until the DNS or LDAP server restarts or crashes.


Petr^2 Spacek


On Thu, Sep 19, 2013 at 2:51 PM, Rich Megginson wrote:


  On 09/19/2013 12:57 PM, KodaK wrote:

Well, this is awkward:

  [root@slpidml01 slapd-UNIX-xxx-COM]# grep conn=170902 access* | wc -l
5453936
[root@slpidml01 slapd-UNIX-xxx-COM]#


Why is it awkward?




On Thu, Sep 19, 2013 at 1:48 PM, KodaK  wrote:


Thanks.  I've been running that against my logs, and this has to be
abnormal:

err=32   129274   No Such Object
err=0     10952   Successful Operations
err=14      536   SASL Bind in Progress
err=53       39   Unwilling To Perform
err=49        3   Invalid Credentials (Bad Password)

  I'm still trying to figure out why there are so many error 32s.  Are
there any usual suspects I should know about?  (That's just the current
access log, btw.)


On Tue, Sep 

Re: [Freeipa-users] Timeout (?) issues

2013-09-23 Thread KodaK
I'm pretty sure this is the root of my problem (not confirmed yet, but it's
AIX -- that's always the problem):

http://www-01.ibm.com/support/docview.wss?uid=swg21212940

The takeaway is this:

"The first query (184) is a normal IPV4 lookup for "ldap.austin.texas.com",
which returns "192.168.1.255". But then an IPV6 lookup is done for the same
name. Because there is no IPV6 address for ldap.austin.texas.com, it
continues searching every search domain in the resolv.conf file (
example.austin.texas.com austin.texas.com texas.com) trying to find one."



On Fri, Sep 20, 2013 at 3:07 AM, Petr Spacek  wrote:

> On 20.9.2013 01:24, KodaK wrote:
>
>> This is ridiculous, right?
>>
>> IPA server 1:
>>
>> # for i in $(ls access*); do echo -n  $i:\  ;grep err=32 $i | wc -l; done
>> access: 248478
>> access.20130916-043207: 302774
>> access.20130916-123642: 272572
>> access.20130916-201516: 294308
>> access.20130917-081053: 295060
>> access.20130917-144559: 284498
>> access.20130917-231435: 281035
>> access.20130918-091611: 291165
>> access.20130918-154945: 275792
>> access.20130919-014322: 296113
>>
>> IPA server 2:
>>
>> access: 4313
>> access.20130909-200216: 4023
>> access.20130910-200229: 4161
>> access.20130911-200239: 4182
>> access.20130912-200249: 5069
>> access.20130913-200258: 3833
>> access.20130914-200313: 4208
>> access.20130915-200323: 4702
>> access.20130916-200332: 4532
>>
>>
>> IPA server 3:
>>
>> access: 802
>> access.20130910-080737: 3876
>> access.20130911-080748: 3902
>> access.20130912-080802: 3678
>> access.20130913-080810: 3765
>> access.20130914-080826: 3524
>> access.20130915-080907: 4142
>> access.20130916-080916: 4930
>> access.20130917-080926: 4769
>> access.20130918-081005: 2879
>>
>> IPA server 4:
>>
>> access: 2812
>> access.20130910-003051: 4095
>> access.20130911-003105: 3623
>> access.20130912-003113: 3606
>> access.20130913-003125: 3581
>> access.20130914-003135: 3758
>> access.20130915-003150: 3935
>> access.20130916-003159: 4184
>> access.20130917-003210: 3859
>> access.20130918-003221: 5110
>>
>>
>> The vast majority of the err=32 messages are DNS entries.
>>
>
> It depends on your setup. Bind-dyndb-ldap does LDAP search for each
> non-existent name to verify that the name wasn't added to LDAP in
> meanwhile. If you have clients doing 1M queries for non-existing names per
> day, then you will see 1M LDAP queries with err=32 per day.
>
> Next major version of bind-dyndb-ldap will have reworked internal database
> and it will support negative caching, so number of err=32 should drop
> significantly.
>
>
>> Here are some samples:
>>
>> [19/Sep/2013:18:19:51 -0500] conn=9 op=169764 SRCH base="idnsName=xxx.com,idnsname=unix.xxx.com,cn=dns,dc=unix,dc=xxx,dc=com" scope=0 filter="(objectClass=idnsRecord)" attrs=ALL
>> [19/Sep/2013:18:19:51 -0500] conn=9 op=169764 RESULT err=32 tag=101 nentries=0 etime=0
>>
>
> This is interesting, because this LDAP query is equal to DNS query for
> "xxx.com.unix.xxx.com." Are your clients that crazy? :-)
>
>
>> [19/Sep/2013:18:19:51 -0500] conn=9 op=169774 SRCH base="idnsName=slpoxacl01.unix.xxx.com,idnsname=unix.xxx.com,cn=dns,dc=unix,dc=xxx,dc=com" scope=0 filter="(objectClass=idnsRecord)" attrs=ALL
>> [19/Sep/2013:18:19:51 -0500] conn=9 op=169774 RESULT err=32 tag=101 nentries=0 etime=0
>>
>
> This is equivalent to DNS query for "slpoxacl01.unix.xxx.com.unix.xxx.com.".
>
>
>> [19/Sep/2013:18:19:51 -0500] conn=9 op=169770 SRCH base="idnsName=sla400q1.unix.xxx.com,idnsname=unix.xxx.com,cn=dns,dc=unix,dc=xxx,dc=com" scope=0 filter="(objectClass=idnsRecord)" attrs=ALL
>> [19/Sep/2013:18:19:51 -0500] conn=9 op=169770 RESULT err=32 tag=101 nentries=0 etime=0
>>
>
> And this is "sla400q1.unix.xxx.com.unix.xxx.com.".
>
>
>> [19/Sep/2013:18:19:51 -0500] conn=9 op=169772 SRCH base="idnsName=magellanhealth.com,idnsname=unix.magellanhealth.com,cn=dns,dc=unix,dc=magellanhealth,dc=com" scope=0 filter="(objectClass=idnsRecord)" attrs=ALL
>> [19/Sep/2013:18:19:51 -0500] conn=9 op=169772 RESULT err=32 tag=101 nentries=0 etime=0
>>
>> So far today there are over half a million of these.  That can't be right.
>>
>
> I would recommend you to use network sniffer and check which clients sends
> these crazy queries.
>
> My guess is that your resolver library (libc?) causes this.
>
> On my Linux system with glibc-2.17-14.fc19.x86_64 it behaves in this way:
>
> client query = nonexistent.example.com.
> (I used $ "ping nonexistent.example.com.")
> search domain in /etc/resolv.conf = brq.redhat.com.
>
> DNS query #1: nonexistent.example.com. => NXDOMAIN
> DNS query #2: nonexistent.example.com.brq.redhat.com. => NXDOMAIN
> DNS query #3: nonexistent.example.com.redhat.com