Re: [Freeipa-users] 3.0.0-42 Replication issue after Centos6.5->6.6 upgrade

thierry bordaz Thu, 20 Nov 2014 02:05:35 -0800

Hello Will, Daniel,

Server1 successfully replicates to Server2, but Server2 fails to replicate to Server1.


Replication from Server2 to Server1 uses Kerberos (GSSAPI) authentication.

Server1 receives the replication session, successfully identifies the replication manager, and starts to receive the replication extop, but then suddenly closes the connection.



   [19/Nov/2014:14:21:39 +0100] conn=2980 fd=78 slot=78 connection from
   xxx to yyy
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=0 BIND dn="" method=sasl
   version=3 mech=GSSAPI
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=0 RESULT err=14 tag=97
   nentries=0 etime=0, SASL bind in progress
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=1 BIND dn="" method=sasl
   version=3 mech=GSSAPI
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=1 RESULT err=14 tag=97
   nentries=0 etime=0, SASL bind in progress
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=2 BIND dn="" method=sasl
   version=3 mech=GSSAPI
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=2 RESULT err=0 tag=97
   nentries=0 etime=0 dn="krbprincipalname=xxx"
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=3 SRCH base="" scope=0
   filter="(objectClass=*)" attrs="supportedControl supportedExtension"
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=3 RESULT err=0 tag=101
   nentries=1 etime=0
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=4 SRCH base="" scope=0
   filter="(objectClass=*)" attrs="supportedControl supportedExtension"
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=4 RESULT err=0 tag=101
   nentries=1 etime=0
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=5 EXT
   oid="2.16.840.1.113730.3.5.12" name="replication-multimaster-extop"
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=5 RESULT err=0 tag=120
   nentries=0 etime=0
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=6 SRCH base="cn=schema"
   scope=0 filter="(objectClass=*)" attrs="nsSchemaCSN"
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=6 RESULT err=0 tag=101
   nentries=1 etime=0
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=-1 fd=78 closed - I/O
   function error.

The reason for this closure is logged in the server1 error log: sasl_decode fails to decode a received PDU.


   [19/Nov/2014:14:21:39 +0100] - sasl_io_recv failed to decode packet
   for connection 2980
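
To correlate the two logs on a larger scale, a small shell sketch along these lines may help. It is only an illustration: the helper name is hypothetical, the log paths in the usage note are the usual 389-ds locations rather than something confirmed in this thread, and it assumes each log entry is on one line (not wrapped as in this mail). Connection numbers restart when the server restarts, so matches should be checked against the timestamps.

```shell
#!/bin/sh
# Sketch: print the access-log lines of every connection that the errors log
# reports as "sasl_io_recv failed to decode packet".
# find_sasl_decode_failures is a hypothetical helper name.
find_sasl_decode_failures() {
    errors_log=$1
    access_log=$2
    # Extract the connection numbers named in the errors log.
    sed -n 's/.*sasl_io_recv failed to decode packet for connection \([0-9][0-9]*\).*/\1/p' "$errors_log" |
    sort -u |
    while read -r conn; do
        # The trailing space keeps conn=2980 from also matching conn=29801.
        grep -F "conn=$conn " "$access_log"
    done
}
```

Possible usage, assuming the instance is named DOMAIN-COM: find_sasl_decode_failures /var/log/dirsrv/slapd-DOMAIN-COM/errors /var/log/dirsrv/slapd-DOMAIN-COM/access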

I do not know why it fails, but I wonder whether the received PDU is larger than the maximum configured value. The attribute nsslapd-maxsasliosize is set to 2 MB by default. Would it be possible to increase its value (for example to 5 MB) to see if it has an impact?
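
A minimal sketch of such a change, assuming the attribute lives on cn=config and takes a value in bytes (2097152 being the 2 MB default). The helper name, bind DN and host are placeholders for illustration, not values confirmed in this thread:

```shell
#!/bin/sh
# Sketch: build the LDIF that raises nsslapd-maxsasliosize on cn=config.
# maxsasliosize_ldif is a hypothetical helper; it takes a size in megabytes
# and writes the value in bytes.
maxsasliosize_ldif() {
    megabytes=$1
    bytes=$((megabytes * 1024 * 1024))
    cat <<EOF
dn: cn=config
changetype: modify
replace: nsslapd-maxsasliosize
nsslapd-maxsasliosize: $bytes
EOF
}

# Example (not run here; needs Directory Manager credentials):
#   maxsasliosize_ldif 5 | ldapmodify -x -D "cn=Directory Manager" -W -H ldap://server1.domain.com
```

Generating the LDIF separately makes it easy to review the change before piping it to ldapmodify on the server that logs the decode failure.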


Thanks
thierry

On 11/19/2014 09:49 AM, thierry bordaz wrote:

On 11/18/2014 07:44 PM, Will Sheldon wrote:
No, not resolved yet. I did test with GSSAPI (-Y) and, like you, it worked. :(
Hello,
Would it be possible to get the server1/server2 logs (error/access) and config (dse.ldif)? Turning on replication logging would also help (http://www.port389.org/docs/389ds/FAQ/faq.html#troubleshooting).
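
As a rough sketch of what turning on replication logging involves: 389 Directory Server documents error-log level 8192 as replication debugging, set through nsslapd-errorlog-level on cn=config. The helper name, bind DN and host below are placeholders for illustration only:

```shell
#!/bin/sh
# Sketch: build the LDIF that sets the 389-ds error log level.
# errorlog_level_ldif is a hypothetical helper name.
# 8192 is the documented level for replication debugging.
errorlog_level_ldif() {
    level=${1:-8192}
    cat <<EOF
dn: cn=config
changetype: modify
replace: nsslapd-errorlog-level
nsslapd-errorlog-level: $level
EOF
}

# Example (not run here; bind DN and host are placeholders):
#   errorlog_level_ldif 8192 | ldapmodify -x -D "cn=Directory Manager" -W -H ldap://server1.domain.com
```

Replication debugging is verbose, so it is worth restoring the previous level once the logs have been collected.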
In the log sample, there is a failure while ending a replication session. Was there no replication error before that? It looks as if server1 was suddenly no longer able to reach server2 (DNS or network issue?).
thanks
thierry
Will Sheldon
On November 18, 2014 at 8:37:10 AM, dbisc...@hrz.uni-kassel.de wrote:
Hi,

On Fri, 7 Nov 2014, Dmitri Pal wrote:

> On 11/07/2014 01:24 AM, Will Sheldon wrote:
>> On November 6, 2014 at 10:07:54 PM, Dmitri Pal (d...@redhat.com) wrote:
>>> On 11/07/2014 12:18 AM, Will Sheldon wrote:
>>>>
>>>> On the whole we are loving FreeIPA, Many thanks and much respect to
>>>> all involved, we've had a great 12-18 months hassle free use out of
>>>> it - it is a fantastically stable trouble free solution... however now
>>>> we've run into a small issue we (as mere mortals) are finding it hard
>>>> to resolve :-/
>>>>
>>>> We upgraded our ipa servers (3.0.0-42) to Centos 6.6. Everything
>>>> seems to go well, but one server is behaving oddly. It's likely not
>>>> an IPA issue, it also reset its hostname somehow after the upgrade
>>>> (it's an image in an openstack environment)
>>>>
>>>> If anyone has any pointers as to how to debug I'd be hugely
>>>> appreciative :)
>>>>
>>>> Two servers, server1.domain.com and server2.domain.com
>>>>
>>>> Server1 can't push data to server2, there are updates and new records
>>>> on server1 that do not exist on server2.
>>>>
>>>>
>>>> from the logs on server1:
>>>>
>>>> [07/Nov/2014:01:33:42 +0000] NSMMReplicationPlugin -
>>>> agmt="cn=meToserver2.domain.com" (server2:389): Warning: unable to send
>>>> endReplication extended operation (Can't contact LDAP server)
>>>> [07/Nov/2014:01:33:47 +0000] NSMMReplicationPlugin -
>>>> agmt="cn=meToserver2.domain.com" (server2:389): Replication bind with
>>>> GSSAPI auth resumed
>>>> [07/Nov/2014:01:33:48 +0000] NSMMReplicationPlugin -
>>>> agmt="cn=meToserver2.domain.com" (server2:389): Warning: unable to
>>>> replicate schema: rc=2
>>>> [07/Nov/2014:01:33:48 +0000] NSMMReplicationPlugin -
>>>> agmt="cn=meToserver2.domain.com" (server2:389): Consumer failed to replay
>>>> change (uniqueid (null), CSN (null)): Can't contact LDAP server(-1). Will
>>>> retry later.
>>>
>>> Try to see
>>> a) Server 1 properly resolves server 2
>>> b) You can connect from server 1 to server 2 using ldapsearch
>>> c) your firewall has proper ports open
>>> d) dirserver on server 2 is actually running
>>
>> All seems working:
>>
>> [root@server1 ~]# ldapsearch -x -H ldap://server2.domain.com -s base -b ''
>> namingContexts
>
> Can you try kinit admin and then use Kerberos GSSAPI to connect, i.e. the -Y
> switch?

Is this resolved? I observe it on my systems, too. Exact same symptoms.
ldapsearch with "-Y GSSAPI" works.

> Did you find anything in the server2 logs?

On my "server2", I see "sasl_io_recv failed to decode packet for
connection #".
Could there be something wrong with default buffer sizes as described in
https://bugzilla.redhat.com/show_bug.cgi?id=953653

I have nsslapd-sasl-max-buffer-size: 65536 on both machines, but my
database is rather small: ~30 users, <10 hosts and services.

>> # extended LDIF
>> #
>> # LDAPv3
>> # base <> with scope baseObject
>> # filter: (objectclass=*)
>> # requesting: namingContexts
>> #
>>
>> #
>> dn:
>> namingContexts: dc=domain,dc=com
>>
>> # search result
>> search: 2
>> result: 0 Success
>>
>> # numResponses: 2
>> # numEntries: 1
>> [root@server1 ~]#
>>
>> And:
>>
>> [root@server2 ~]# /etc/init.d/dirsrv status
>> dirsrv DOMAIN-COM (pid 1009) is running...
>> dirsrv PKI-IPA (pid 1083) is running...
>> [root@server2 ~]#
>>
>>>
>>> Check logs on server 2 to see whether it actually sees an attempt to
>>> connect, I suspect not, so it is most likely a DNS/FW issue or dir server
>>> is not running on 2.
>>>>
>>>>
>>>> and the servers:
>>>>
>>>> [root@server1 ~]# ipa-replica-manage list -v `hostname`
>>>> Directory Manager password:
>>>>
>>>> server2.domain.com: replica
>>>> last init status: None
>>>> last init ended: None
>>>> last update status: 0 Replica acquired successfully: Incremental update
>>>> started
>>>> last update ended: 2014-11-07 01:35:58+00:00
>>>> [root@server1 ~]#
>>>>
>>>>
>>>>
>>>> [root@server2 ~]# ipa-replica-manage list -v `hostname`
>>>> Directory Manager password:
>>>>
>>>> server1.domain.com: replica
>>>> last init status: None
>>>> last init ended: None
>>>> last update status: 0 Replica acquired successfully: Incremental update
>>>> succeeded
>>>> last update ended: 2014-11-07 01:35:43+00:00
>>>> [root@server2 ~]#


Mit freundlichen Gruessen/With best regards,

--Daniel.

--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go To http://freeipa.org for more info on the project
