Hello Will, Daniel,

Server1 successfully replicates to Server2, but Server2 fails to replicate to Server1.

The Server2->Server1 replication is done with Kerberos authentication.
Server1 receives the replication session, successfully identifies the replication manager, and starts to receive the replication extop, but then suddenly closes the connection.


   [19/Nov/2014:14:21:39 +0100] conn=2980 fd=78 slot=78 connection from
   xxx to yyy
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=0 BIND dn="" method=sasl
   version=3 mech=GSSAPI
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=0 RESULT err=14 tag=97
   nentries=0 etime=0, SASL bind in progress
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=1 BIND dn="" method=sasl
   version=3 mech=GSSAPI
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=1 RESULT err=14 tag=97
   nentries=0 etime=0, SASL bind in progress
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=2 BIND dn="" method=sasl
   version=3 mech=GSSAPI
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=2 RESULT err=0 tag=97
   nentries=0 etime=0 dn="krbprincipalname=xxx"
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=3 SRCH base="" scope=0
   filter="(objectClass=*)" attrs="supportedControl supportedExtension"
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=3 RESULT err=0 tag=101
   nentries=1 etime=0
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=4 SRCH base="" scope=0
   filter="(objectClass=*)" attrs="supportedControl supportedExtension"
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=4 RESULT err=0 tag=101
   nentries=1 etime=0
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=5 EXT
   oid="2.16.840.1.113730.3.5.12" name="replication-multimaster-extop"
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=5 RESULT err=0 tag=120
   nentries=0 etime=0
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=6 SRCH base="cn=schema"
   scope=0 filter="(objectClass=*)" attrs="nsSchemaCSN"
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=6 RESULT err=0 tag=101
   nentries=1 etime=0
   [19/Nov/2014:14:21:39 +0100] conn=2980 op=-1 fd=78 closed - I/O
   function error.

The reason for this closure is logged in the server1 error log: sasl_decode fails to decode a received PDU.

   [19/Nov/2014:14:21:39 +0100] - sasl_io_recv failed to decode packet
   for connection 2980
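
To line up the two logs for other occurrences, something like the following rough sketch could help. It takes the connection numbers reported by sasl_io_recv in the error log and prints the matching access log entries. The function name is made up for illustration, and both log paths are passed as arguments because their location depends on the instance.

```shell
#!/bin/sh
# Hypothetical helper: list access-log entries for every connection that the
# error log reports as failing in sasl_io_recv.
#   $1 = path to the error log, $2 = path to the access log
failed_sasl_conns() {
    errors_log="$1"
    access_log="$2"
    # Extract the connection number from each "failed to decode" message.
    sed -n 's/.*sasl_io_recv failed to decode packet for connection \([0-9][0-9]*\).*/\1/p' "$errors_log" |
    sort -u |
    while read -r conn; do
        # The trailing space keeps conn=298 from matching conn=2980.
        grep -F "conn=$conn " "$access_log"
    done
}
```

Usage would be along the lines of `failed_sasl_conns /path/to/errors /path/to/access`, with the paths adjusted to the instance.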

I do not know why it fails, but I wonder whether the received PDU is larger than the maximum configured value. The attribute nsslapd-maxsasliosize is set to 2 MB by default. Would it be possible to increase its value (to 5 MB) to see if it has an impact?
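
A minimal sketch of that change, assuming the attribute lives on cn=config and takes its value in bytes. The function name is made up for illustration, and the bind DN and host in the usage comment are placeholders to adapt. The function only prints the LDIF; nothing is applied until it is piped to ldapmodify.

```shell
#!/bin/sh
# Hypothetical helper: print an LDIF change that sets nsslapd-maxsasliosize
# on cn=config. The argument is a size in MB, converted to bytes here.
maxsasliosize_ldif() {
    mb="$1"
    bytes=$((mb * 1024 * 1024))
    cat <<EOF
dn: cn=config
changetype: modify
replace: nsslapd-maxsasliosize
nsslapd-maxsasliosize: $bytes
EOF
}

# Print the change for the 5 MB trial value.
maxsasliosize_ldif 5

# To apply it (placeholder bind DN and host, adjust to your deployment):
#   maxsasliosize_ldif 5 | ldapmodify -x -D "cn=Directory Manager" -W -H ldap://server1.domain.com
```

The 5 MB figure is only a trial value to see whether the size limit is involved. If it makes no difference, the attribute can be set back to its previous value the same way.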

Thanks
thierry

On 11/19/2014 09:49 AM, thierry bordaz wrote:
On 11/18/2014 07:44 PM, Will Sheldon wrote:

No, not resolved yet. I did test with GSSAPI (-Y) and, like you, it worked. :(

Hello,

Would it be possible to get the server1/server2 logs (error/access) and config (dse.ldif)? Turning on replication logging would help (see
http://www.port389.org/docs/389ds/FAQ/faq.html#troubleshooting).
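
For reference, a rough sketch of turning replication logging on, assuming the usual approach of setting nsslapd-errorlog-level on cn=config to 8192 (the replication debugging level). The function name is made up for illustration, and the bind DN and host in the usage comment are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print an LDIF change that sets the error-log level
# on cn=config. 8192 is the level used for replication debugging.
errorlog_level_ldif() {
    level="$1"
    cat <<EOF
dn: cn=config
changetype: modify
replace: nsslapd-errorlog-level
nsslapd-errorlog-level: $level
EOF
}

# Print the change that enables replication debugging.
errorlog_level_ldif 8192

# To apply it (placeholder bind DN and host, adjust to your deployment):
#   errorlog_level_ldif 8192 | ldapmodify -x -D "cn=Directory Manager" -W -H ldap://server1.domain.com
```

This level is verbose, so it is worth noting the current value first and restoring it once the logs have been collected.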

In the log sample, there is a failure while ending a replication session. Were there no replication errors before that? It looks as if server1 was suddenly no longer able to reach server2 (a DNS or network issue?).

thanks
thierry


Will Sheldon

On November 18, 2014 at 8:37:10 AM, dbisc...@hrz.uni-kassel.de wrote:

Hi,

On Fri, 7 Nov 2014, Dmitri Pal wrote:

> On 11/07/2014 01:24 AM, Will Sheldon wrote:
>> On November 6, 2014 at 10:07:54 PM, Dmitri Pal (d...@redhat.com) wrote:
>>> On 11/07/2014 12:18 AM, Will Sheldon wrote:
>>>>
>>>> On the whole we are loving FreeIPA, Many thanks and much respect to
>>>> all involved, we've had a great 12-18 months hassle free use out of
>>>> it - it is a fantastically stable trouble free solution... however now
>>>> we've run into a small issue we (as mere mortals) are finding it hard
>>>> to resolve :-/
>>>>
>>>> We upgraded our ipa servers (3.0.0-42) to CentOS 6.6. Everything
>>>> seemed to go well, but one server is behaving oddly. It's likely not
>>>> an IPA issue; it also reset its hostname somehow after the upgrade
>>>> (it's an image in an openstack environment)
>>>>
>>>> If anyone has any pointers as to how to debug I'd be hugely
>>>> appreciative :)
>>>>
>>>> Two servers, server1.domain.com and server2.domain.com
>>>>
>>>> Server1 can't push data to server2, there are updates and new records
>>>> on server1 that do not exist on server2.
>>>>
>>>>
>>>> from the logs on server1:
>>>>
>>>> [07/Nov/2014:01:33:42 +0000] NSMMReplicationPlugin -
>>>> agmt="cn=meToserver2.domain.com" (server2:389): Warning: unable to send
>>>> endReplication extended operation (Can't contact LDAP server)
>>>> [07/Nov/2014:01:33:47 +0000] NSMMReplicationPlugin -
>>>> agmt="cn=meToserver2.domain.com" (server2:389): Replication bind with
>>>> GSSAPI auth resumed
>>>> [07/Nov/2014:01:33:48 +0000] NSMMReplicationPlugin -
>>>> agmt="cn=meToserver2.domain.com" (server2:389): Warning: unable to
>>>> replicate schema: rc=2
>>>> [07/Nov/2014:01:33:48 +0000] NSMMReplicationPlugin -
>>>> agmt="cn=meToserver2.domain.com" (server2:389): Consumer failed to replay
>>>> change (uniqueid (null), CSN (null)): Can't contact LDAP server(-1). Will
>>>> retry later.
>>>
>>> Try to see
>>> a) Server 1 properly resolves server 2
>>> b) You can connect from server 1 to server 2 using ldapsearch
>>> c) your firewall has proper ports open
>>> d) dirserver on server 2 is actually running
>>
>> All seems working:
>>
>> [root@server1 ~]# ldapsearch -x -H ldap://server2.domain.com -s base -b ''
>> namingContexts
>
> Can you try kinit admin and then use kerberos GSSAPI to connect, i.e. -Y
> switch?

Is this resolved? I observe it on my systems, too, with exactly the same symptoms.
ldapsearch with "-Y GSSAPI" works.

> Did you find anything in the server2 logs?

On my "server2", I see "sasl_io_recv failed to decode packet for
connection #".

Could there be something wrong with the default buffer sizes, as described in
https://bugzilla.redhat.com/show_bug.cgi?id=953653 ?

I have nsslapd-sasl-max-buffer-size: 65536 on both machines, but my
database is rather small: ~30 users, <10 hosts and services.

>> # extended LDIF
>> #
>> # LDAPv3
>> # base <> with scope baseObject
>> # filter: (objectclass=*)
>> # requesting: namingContexts
>> #
>>
>> #
>> dn:
>> namingContexts: dc=domain,dc=com
>>
>> # search result
>> search: 2
>> result: 0 Success
>>
>> # numResponses: 2
>> # numEntries: 1
>> [root@server1 ~]#
>>
>> And:
>>
>> [root@server2 ~]# /etc/init.d/dirsrv status
>> dirsrv DOMAIN-COM (pid 1009) is running...
>> dirsrv PKI-IPA (pid 1083) is running...
>> [root@server2 ~]#
>>
>>>
>>> Check logs on server 2 to see whether it actually sees an attempt to
>>> connect, I suspect not, so it is most likely a DNS/FW issue or dir server
>>> is not running on 2.
>>>>
>>>>
>>>> and the servers:
>>>>
>>>> [root@server1 ~]# ipa-replica-manage list -v `hostname`
>>>> Directory Manager password:
>>>>
>>>> server2.domain.com: replica
>>>> last init status: None
>>>> last init ended: None
>>>> last update status: 0 Replica acquired successfully: Incremental update
>>>> started
>>>> last update ended: 2014-11-07 01:35:58+00:00
>>>> [root@server1 ~]#
>>>>
>>>>
>>>>
>>>> [root@server2 ~]# ipa-replica-manage list -v `hostname`
>>>> Directory Manager password:
>>>>
>>>> server1.domain.com: replica
>>>> last init status: None
>>>> last init ended: None
>>>> last update status: 0 Replica acquired successfully: Incremental update
>>>> succeeded
>>>> last update ended: 2014-11-07 01:35:43+00:00
>>>> [root@server2 ~]#


Mit freundlichen Gruessen/With best regards,

--Daniel.

--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go To http://freeipa.org for more info on the project





