Re: [389-users] MMR issue ...

2012-11-13 Thread Reinhard Nappert
The 3 servers do not crash.

I am not sure about the network, though. My first assumption was that the 
firewall (between A and B) might be causing the issue. The latest occurrence (the 
one I described) happened with the firewall removed. I see quite a few TCP 
retransmissions in the packet captures. Could that be the issue?

-Reinhard

From: Rich Megginson [mailto:rmegg...@redhat.com]
Sent: Tuesday, November 13, 2012 1:15 PM
To: General discussion list for the 389 Directory server project.
Cc: Reinhard Nappert
Subject: Re: [389-users] MMR issue ...

On 11/13/2012 11:02 AM, Reinhard Nappert wrote:
Rich,

Do you know what the cause of this issue is?

No, I don't know.


You would expect to see this issue in different deployments, but I have only 
seen it in one instance.

If it turns out that the issue I see is identical to the issue you mentioned, I’d 
like to know when it was fixed.

Upon further investigation, this does not appear to be the same as 
https://fedorahosted.org/389/ticket/374

I'm not sure what the problem is.  I've seen timeouts when servers crash or 
there are network issues.



Thanks,
-Reinhard

From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Reinhard Nappert
Sent: Tuesday, November 13, 2012 12:22 PM
To: Rich Megginson; General discussion list for the 389 Directory server 
project.
Subject: Re: [389-users] MMR issue ...

I use 1.2.8.2

From: Rich Megginson [mailto:rmegg...@redhat.com]
Sent: Tuesday, November 13, 2012 12:18 PM
To: General discussion list for the 389 Directory server project.
Cc: Reinhard Nappert
Subject: Re: [389-users] MMR issue ...

On 11/13/2012 09:24 AM, Reinhard Nappert wrote:
Hi,

I’ve encountered issues with an MMR setup that looks like the following:

 A --- B
  \   /
   \ /
    C

Replication works for approximately 24 hours. There are not many changes to 
the content anyway. After about one day, the “nsds5replicaLastUpdateStatus” 
attribute of the replication agreement entry (object class 
“nsDS5ReplicationAgreement”) changes to “1 Can't acquire busy replica”. I see 
this message on C for the agreement “C-to-B”. The start time of the last 
update is 01:08:33. When I check the status on B, it looks fine for “B-to-C” 
and “B-to-A”; however, the start time of the last update is stuck at 01:08:36 
for “B-to-C”, whereas A continues to be updated afterwards. I don’t have the 
values for A.
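
When polling agreements for this condition, the status value can be split into its numeric code and message. The following is a minimal sketch, assuming the value always has the "<code> <message>" shape quoted above; the function and type names are hypothetical, and other server versions may format the value differently.

```python
import re
from typing import NamedTuple


class UpdateStatus(NamedTuple):
    code: int
    message: str


def parse_last_update_status(value: str) -> UpdateStatus:
    """Split a status string of the assumed form '<code> <message>'."""
    match = re.match(r"\s*(-?\d+)\s+(.*\S)\s*$", value)
    if match is None:
        raise ValueError(f"unrecognised status value: {value!r}")
    return UpdateStatus(int(match.group(1)), match.group(2))


# The value reported on C for the agreement "C-to-B".
status = parse_last_update_status("1 Can't acquire busy replica ")
print(status.code, status.message)
```

A non-zero code on an agreement that stays unchanged across several polls is the pattern described above.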

When I check the errors and access logs on the boxes, I see the following:

Errors on A:
[10/Nov/2012:01:19:31 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Warning: unable to receive endReplication extended operation response (Timed 
out)
[10/Nov/2012:01:25:01 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Unable to receive the response for a startReplication extended operation to 
consumer (Can't contact LDAP server). Will retry later.
[10/Nov/2012:01:25:05 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Replication bind with SIMPLE auth resumed
[10/Nov/2012:02:26:29 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Unable to receive the response for a startReplication extended operation to 
consumer (Timed out). Will retry later.
[10/Nov/2012:02:31:55 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Unable to receive the response for a startReplication extended operation to 
consumer (Can't contact LDAP server). Will retry later.
[10/Nov/2012:02:31:59 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Replication bind with SIMPLE auth resumed
[10/Nov/2012:02:43:36 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Unable to receive the response for a startReplication extended operation to 
consumer (Timed out). Will retry later.
[10/Nov/2012:03:03:00 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Unable to receive the response for a startReplication extended operation to 
consumer (Timed out). Will retry later.
[10/Nov/2012:03:08:24 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Unable to receive the response for a startReplication extended operation to 
consumer (Can't contact LDAP server). Will retry later.
[10/Nov/2012:03:11:35 -0300] slapi_ldap_bind - Error: could not send bind 
request for id [cn=replication,cn=config] mech [SIMPLE]: error 91 (Can't 
connect to the LDAP server) -5961 (TCP connection reset by peer.) 115 
(Operation now in progress)
[10/Nov/2012:03:11:35 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Replication bind with SIMPLE auth failed: LDAP error 91 (Can't connect to the 
LDAP server) ((null))
[10/Nov/2012:03:14:45 -0300] slapi_ldap_bind - Error: could not send bind 
request for id [cn=replication,cn=config] mech [SIMPLE]: error 91 (Can't

Re: [389-users] MMR issue ...

2012-11-13 Thread Reinhard Nappert
I use 1.2.8.2

From: Rich Megginson [mailto:rmegg...@redhat.com]
Sent: Tuesday, November 13, 2012 12:18 PM
To: General discussion list for the 389 Directory server project.
Cc: Reinhard Nappert
Subject: Re: [389-users] MMR issue ...

On 11/13/2012 09:24 AM, Reinhard Nappert wrote:
Hi,

I’ve encountered issues with an MMR setup that looks like the following:

 A --- B
  \   /
   \ /
    C

Replication works for approximately 24 hours. There are not many changes to 
the content anyway. After about one day, the “nsds5replicaLastUpdateStatus” 
attribute of the replication agreement entry (object class 
“nsDS5ReplicationAgreement”) changes to “1 Can't acquire busy replica”. I see 
this message on C for the agreement “C-to-B”. The start time of the last 
update is 01:08:33. When I check the status on B, it looks fine for “B-to-C” 
and “B-to-A”; however, the start time of the last update is stuck at 01:08:36 
for “B-to-C”, whereas A continues to be updated afterwards. I don’t have the 
values for A.

When I check the errors and access logs on the boxes, I see the following:

Errors on A:
[10/Nov/2012:01:19:31 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Warning: unable to receive endReplication extended operation response (Timed 
out)
[10/Nov/2012:01:25:01 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Unable to receive the response for a startReplication extended operation to 
consumer (Can't contact LDAP server). Will retry later.
[10/Nov/2012:01:25:05 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Replication bind with SIMPLE auth resumed
[10/Nov/2012:02:26:29 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Unable to receive the response for a startReplication extended operation to 
consumer (Timed out). Will retry later.
[10/Nov/2012:02:31:55 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Unable to receive the response for a startReplication extended operation to 
consumer (Can't contact LDAP server). Will retry later.
[10/Nov/2012:02:31:59 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Replication bind with SIMPLE auth resumed
[10/Nov/2012:02:43:36 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Unable to receive the response for a startReplication extended operation to 
consumer (Timed out). Will retry later.
[10/Nov/2012:03:03:00 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Unable to receive the response for a startReplication extended operation to 
consumer (Timed out). Will retry later.
[10/Nov/2012:03:08:24 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Unable to receive the response for a startReplication extended operation to 
consumer (Can't contact LDAP server). Will retry later.
[10/Nov/2012:03:11:35 -0300] slapi_ldap_bind - Error: could not send bind 
request for id [cn=replication,cn=config] mech [SIMPLE]: error 91 (Can't 
connect to the LDAP server) -5961 (TCP connection reset by peer.) 115 
(Operation now in progress)
[10/Nov/2012:03:11:35 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Replication bind with SIMPLE auth failed: LDAP error 91 (Can't connect to the 
LDAP server) ((null))
[10/Nov/2012:03:14:45 -0300] slapi_ldap_bind - Error: could not send bind 
request for id [cn=replication,cn=config] mech [SIMPLE]: error 91 (Can't 
connect to the LDAP server) -5961 (TCP connection reset by peer.) 115 
(Operation now in progress)
[10/Nov/2012:03:14:52 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Replication bind with SIMPLE auth resumed
[10/Nov/2012:03:33:29 -0300] slapi_ldap_bind - Error: could not send bind 
request for id [cn=replication,cn=config] mech [SIMPLE]: error 91 (Can't 
connect to the LDAP server) -5961 (TCP connection reset by peer.) 115 
(Operation now in progress)
[10/Nov/2012:03:33:29 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Replication bind with SIMPLE auth failed: LDAP error 91 (Can't connect to the 
LDAP server) ((null))
[10/Nov/2012:03:43:29 -0300] slapi_ldap_bind - Error: timeout after [0.0] 
seconds reading bind response for [cn=replication,cn=config] mech [SIMPLE]
[10/Nov/2012:03:43:29 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Replication bind with SIMPLE auth failed: LDAP error 85 (Timed out) ((null))
[10/Nov/2012:03:46:39 -0300] slapi_ldap_bind - Error: could not send bind 
request for id [cn=replication,cn=config] mech [SIMPLE]: error 91 (Can't 
connect to the LDAP server) -5961 (TCP connection reset by peer.) 115 
(Operation now in progress)
[10/Nov/2012:03:46:39 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389): 
Replication bind with SIMPLE auth failed: LDAP error 91 (Can't connect to the 
LDAP server) ((null))
[10/Nov/2012:03:46:42 -0300]
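
To see how often each kind of failure occurs per agreement, the wrapped log entries can be re-joined and tallied. This is a rough sketch, assuming the errors log has been pasted with mail-style line wrapping as above; the function names are hypothetical, and only the two failure reasons that appear in this excerpt are matched.

```python
import re
from collections import Counter

# An entry starts with a bracketed timestamp such as [10/Nov/2012:01:19:31 -0300].
ENTRY_START = re.compile(r"^\[\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4}\] ")
AGMT = re.compile(r'agmt="cn=([^"]+)"')
REASON = re.compile(r"\((Timed out|Can't contact LDAP server)\)")


def join_wrapped(lines):
    """Re-join entries that were wrapped across several lines."""
    entry = None
    for line in lines:
        if ENTRY_START.match(line):
            if entry is not None:
                yield entry
            entry = line.strip()
        elif entry is not None and line.strip():
            entry = entry.rstrip() + " " + line.strip()
    if entry is not None:
        yield entry


def tally_failures(lines):
    """Count (agreement, reason) pairs across all entries."""
    counts = Counter()
    for entry in join_wrapped(lines):
        agmt = AGMT.search(entry)
        reason = REASON.search(entry)
        if agmt and reason:
            counts[(agmt.group(1), reason.group(1))] += 1
    return counts


sample = """\
[10/Nov/2012:01:19:31 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389):
Warning: unable to receive endReplication extended operation response (Timed
out)
[10/Nov/2012:01:25:01 -0300] NSMMReplicationPlugin - agmt="cn=A-to-B" (B:389):
Unable to receive the response for a startReplication extended operation to
consumer (Can't contact LDAP server). Will retry later.
""".splitlines()

for key, count in tally_failures(sample).items():
    print(key, count)
```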

Re: [389-users] Replication issue

2011-10-18 Thread Reinhard Nappert
Hi Rich,

actually just restarting srvA seems to have cleared the replication issue. It 
looks like replication is working fine now,
but I see now the following error log:
[18/Oct/2011:13:09:57 +] NSMMReplicationPlugin - agmt="cn=srvAtosrvB" 
(srvB:389): changelog iteration code returned a dummy entry with csn 
4e9d7bc20008, skipping ...
I think that we can ignore this message, right? But, how can I get rid of this 
message, since it is generated quite often?

Any ideas?

-Reinhard

From: Rich Megginson [mailto:rmegg...@redhat.com]
Sent: Wednesday, October 12, 2011 4:29 PM
To: Reinhard Nappert
Cc: General discussion list for the 389 Directory server project.; Marc Sauton
Subject: Re: [389-users] Replication issue

On 10/12/2011 02:16 PM, Reinhard Nappert wrote:
Good.

what about the different generation ID message? Is it possible that this could 
be caused by a re-initialize?
Yes.

But then, I thought a re-initialize would fix this error, if it occurs.
In this case, it should fix this problem.

-Reinhard


From: Rich Megginson [mailto:rmegg...@redhat.com]
Sent: Wednesday, October 12, 2011 4:11 PM
To: Reinhard Nappert
Cc: General discussion list for the 389 Directory server project.; Marc Sauton
Subject: Re: [389-users] Replication issue

On 10/12/2011 02:08 PM, Reinhard Nappert wrote:
Rich,

I was thinking about the "Replica has a different generation ID than the local 
data." error, because I have seen it before. If possible, I want to avoid 
having to go through each box and re-initialize it.

So, you suggest I take, let's say, D (or A) and re-initialize B with D's data. 
Then I would have to re-initialize F from B, right?
Right.

Let's go a bit further: If I had an agreement from A to F (and vice versa), I 
would not even have to re-initialize F from B. Is this correct?
Assuming the AtoF agreement is not complaining about "unable to find CSN" and 
"data reload", then yes.

-Reinhard


From: Rich Megginson [mailto:rmegg...@redhat.com]
Sent: Wednesday, October 12, 2011 4:00 PM
To: General discussion list for the 389 Directory server project.
Cc: Reinhard Nappert; Marc Sauton
Subject: Re: [389-users] Replication issue

On 10/11/2011 02:41 PM, Reinhard Nappert wrote:
How do I do this manually on server A?

The other question is: what impact does re-initializing server B have? To be 
more precise, my replication environment is more complex than just server A 
and server B. In fact, I have a setup like the following:

srv C <--> srv A <--> srv B <--> srv D <--> srv C
             /\         /\
             |          |
             \/         \/
           srv E      srv F

I don't want to end up re-initializing all boxes in my environment.
Assuming C and D are up to date and don't have any problems, reinitializing B 
should affect only B and F.
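
One way to reason about this answer is to ask which servers can still be reached from the healthy masters without passing through B. The sketch below is only a toy model of that idea, not how the server itself decides anything; the link list is an interpretation of the diagram (in particular the A-E and B-F links), and the function name is hypothetical.

```python
from collections import deque

# Assumed topology, read off the diagram in this thread.
LINKS = [("C", "A"), ("A", "B"), ("B", "D"), ("D", "C"), ("A", "E"), ("B", "F")]


def affected_by_reinit(links, target, healthy):
    """Return the target plus every server reachable only through it."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    # Breadth-first search from the healthy servers, never entering the target.
    reached = {h for h in healthy if h != target}
    queue = deque(reached)
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt != target and nxt not in reached:
                reached.add(nxt)
                queue.append(nxt)
    return set(neighbours) - reached


print(sorted(affected_by_reinit(LINKS, "B", {"C", "D"})))  # ['B', 'F']
```

With C and D taken as healthy, only B and F are cut off, which matches the answer above.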

Thanks,
-Reinhard


From: Marc Sauton [mailto:msau...@redhat.com]
Sent: Tuesday, October 11, 2011 4:36 PM
To: General discussion list for the 389 Directory server project.
Cc: Reinhard Nappert
Subject: Re: [389-users] Replication issue

On 10/11/2011 01:22 PM, Reinhard Nappert wrote:
Hi,

I encountered the following logs in the errors:

[06/Oct/2011:10:11:57 +] NSMMReplicationPlugin - changelog program - 
agmt="cn=srvAtosrvB" (srvB:389): CSN 4e8d804a000c not found, we aren't 
as up to date, or we purged
[06/Oct/2011:10:11:57 +] NSMMReplicationPlugin - agmt="cn=srvAtosrvB" 
(srvB:389): Data required to update replica has been purged. The replica must 
be reinitialized.


[06/Oct/2011:10:11:57 +] NSMMReplicationPlugin - agmt="cn=srvAtosrvB" 
(srvB:389): Incremental update failed and requires administrator action
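
When a CSN is reported as missing, decoding it shows when the change was made and on which replica. The following is a sketch, assuming the usual 20-hex-digit CSN layout (8 digits of timestamp, then 4 each for sequence number, replica ID and sub-sequence number). The CSN in the log above appears shortened in this archive, so the example uses a made-up value that only shares its first eight digits with it.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class CSN(NamedTuple):
    timestamp: datetime
    seq: int
    replica_id: int
    subseq: int


def parse_csn(text: str) -> CSN:
    """Decode a 20-hex-digit CSN string into its four fields."""
    if len(text) != 20:
        raise ValueError(f"expected 20 hex digits, got {len(text)}")
    seconds = int(text[0:8], 16)
    return CSN(
        timestamp=datetime.fromtimestamp(seconds, tz=timezone.utc),
        seq=int(text[8:12], 16),
        replica_id=int(text[12:16], 16),
        subseq=int(text[16:20], 16),
    )


# Hypothetical CSN; only the leading timestamp digits come from the log above.
csn = parse_csn("4e8d804a000c00010000")
print(csn.timestamp.isoformat(), csn.seq, csn.replica_id, csn.subseq)
```

Comparing the decoded timestamp with the age of the oldest change still held by the supplier gives a hint whether the change was purged or never arrived.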



Does anyone have an idea, what could have caused this and more importantly, how 
to fix this?



Thanks

-Reinhard


--
389 users mailing list
389-us...@lists.fedoraproject.org<mailto:389-us...@lists.fedoraproject.org>
https://admin.fedoraproject.org/mailman/listinfo/389-users

On server A, read the changelog to manually replay the changes on server B.
Consider increasing nsds5ReplicaPurgeDelay if such errors appear regularly.
Otherwise, as the errors log says, the change was purged/removed, and the 
replica needs a re-init.
M.
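
The suggestion to raise nsds5ReplicaPurgeDelay could be applied with an LDIF modify operation like the one generated below. This is a sketch only: the replica DN and the two-week value are placeholders, the function name is hypothetical, and the thread does not confirm that this attribute alone prevents the error above.

```python
def purge_delay_ldif(replica_dn: str, seconds: int) -> str:
    """Build an LDIF modify operation that sets nsds5ReplicaPurgeDelay."""
    return "\n".join([
        f"dn: {replica_dn}",
        "changetype: modify",
        "replace: nsds5ReplicaPurgeDelay",
        f"nsds5ReplicaPurgeDelay: {seconds}",
        "",
    ])


# Placeholder DN; use the cn=replica entry of your own suffix.
replica_dn = 'cn=replica,cn="dc=example,dc=com",cn=mapping tree,cn=config'
print(purge_delay_ldif(replica_dn, 14 * 24 * 3600))
```

The generated text can be fed to ldapmodify while bound as a user allowed to change cn=config.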



Re: [389-users] Replication issue

2011-10-12 Thread Reinhard Nappert
Good.

what about the different generation ID message? Is it possible that this could 
be caused by a re-initialize?

But then, I thought a re-initialize would fix this error, if it occurs.

-Reinhard


From: Rich Megginson [mailto:rmegg...@redhat.com]
Sent: Wednesday, October 12, 2011 4:11 PM
To: Reinhard Nappert
Cc: General discussion list for the 389 Directory server project.; Marc Sauton
Subject: Re: [389-users] Replication issue

On 10/12/2011 02:08 PM, Reinhard Nappert wrote:
Rich,

I was thinking about the "Replica has a different generation ID than the local 
data." error, because I have seen it before. If possible, I want to avoid 
having to go through each box and re-initialize it.

So, you suggest I take, let's say, D (or A) and re-initialize B with D's data. 
Then I would have to re-initialize F from B, right?
Right.

Let's go a bit further: If I had an agreement from A to F (and vice versa), I 
would not even have to re-initialize F from B. Is this correct?
Assuming the AtoF agreement is not complaining about "unable to find CSN" and 
"data reload", then yes.

-Reinhard


From: Rich Megginson [mailto:rmegg...@redhat.com]
Sent: Wednesday, October 12, 2011 4:00 PM
To: General discussion list for the 389 Directory server project.
Cc: Reinhard Nappert; Marc Sauton
Subject: Re: [389-users] Replication issue

On 10/11/2011 02:41 PM, Reinhard Nappert wrote:
How do I do this manually on server A?

The other question is: what impact does re-initializing server B have? To be 
more precise, my replication environment is more complex than just server A 
and server B. In fact, I have a setup like the following:

srv C <--> srv A <--> srv B <--> srv D <--> srv C
             /\         /\
             |          |
             \/         \/
           srv E      srv F

I don't want to end up re-initializing all boxes in my environment.
Assuming C and D are up to date and don't have any problems, reinitializing B 
should affect only B and F.

Thanks,
-Reinhard


From: Marc Sauton [mailto:msau...@redhat.com]
Sent: Tuesday, October 11, 2011 4:36 PM
To: General discussion list for the 389 Directory server project.
Cc: Reinhard Nappert
Subject: Re: [389-users] Replication issue

On 10/11/2011 01:22 PM, Reinhard Nappert wrote:
Hi,

I encountered the following logs in the errors:

[06/Oct/2011:10:11:57 +] NSMMReplicationPlugin - changelog program - 
agmt="cn=srvAtosrvB" (srvB:389): CSN 4e8d804a000c not found, we aren't 
as up to date, or we purged
[06/Oct/2011:10:11:57 +] NSMMReplicationPlugin - agmt="cn=srvAtosrvB" 
(srvB:389): Data required to update replica has been purged. The replica must 
be reinitialized.


[06/Oct/2011:10:11:57 +] NSMMReplicationPlugin - agmt="cn=srvAtosrvB" 
(srvB:389): Incremental update failed and requires administrator action



Does anyone have an idea, what could have caused this and more importantly, how 
to fix this?



Thanks

-Reinhard



On server A, read the changelog to manually replay the changes on server B.
Consider increasing nsds5ReplicaPurgeDelay if such errors appear regularly.
Otherwise, as the errors log says, the change was purged/removed, and the 
replica needs a re-init.
M.



Re: [389-users] Replication issue

2011-10-12 Thread Reinhard Nappert
Rich,

I was thinking about the "Replica has a different generation ID than the local 
data." error, because I have seen it before. If possible, I want to avoid 
having to go through each box and re-initialize it.

So, you suggest I take, let's say, D (or A) and re-initialize B with D's data. 
Then I would have to re-initialize F from B, right?

Let's go a bit further: If I had an agreement from A to F (and vice versa), I 
would not even have to re-initialize F from B. Is this correct?

-Reinhard


From: Rich Megginson [mailto:rmegg...@redhat.com]
Sent: Wednesday, October 12, 2011 4:00 PM
To: General discussion list for the 389 Directory server project.
Cc: Reinhard Nappert; Marc Sauton
Subject: Re: [389-users] Replication issue

On 10/11/2011 02:41 PM, Reinhard Nappert wrote:
How do I do this manually on server A?

The other question is: what impact does re-initializing server B have? To be 
more precise, my replication environment is more complex than just server A 
and server B. In fact, I have a setup like the following:

srv C <--> srv A <--> srv B <--> srv D <--> srv C
             /\         /\
             |          |
             \/         \/
           srv E      srv F

I don't want to end up re-initializing all boxes in my environment.
Assuming C and D are up to date and don't have any problems, reinitializing B 
should affect only B and F.

Thanks,
-Reinhard


From: Marc Sauton [mailto:msau...@redhat.com]
Sent: Tuesday, October 11, 2011 4:36 PM
To: General discussion list for the 389 Directory server project.
Cc: Reinhard Nappert
Subject: Re: [389-users] Replication issue

On 10/11/2011 01:22 PM, Reinhard Nappert wrote:
Hi,

I encountered the following logs in the errors:

[06/Oct/2011:10:11:57 +] NSMMReplicationPlugin - changelog program - 
agmt="cn=srvAtosrvB" (srvB:389): CSN 4e8d804a000c not found, we aren't 
as up to date, or we purged
[06/Oct/2011:10:11:57 +] NSMMReplicationPlugin - agmt="cn=srvAtosrvB" 
(srvB:389): Data required to update replica has been purged. The replica must 
be reinitialized.


[06/Oct/2011:10:11:57 +] NSMMReplicationPlugin - agmt="cn=srvAtosrvB" 
(srvB:389): Incremental update failed and requires administrator action



Does anyone have an idea, what could have caused this and more importantly, how 
to fix this?



Thanks

-Reinhard



On server A, read the changelog to manually replay the changes on server B.
Consider increasing nsds5ReplicaPurgeDelay if such errors appear regularly.
Otherwise, as the errors log says, the change was purged/removed, and the 
replica needs a re-init.
M.



Re: [389-users] Replication questions.

2011-09-19 Thread Reinhard Nappert
Thanks for the quick response, Rich.

Good to know that there is a procedure for cleaning up the old ruvs. This 
probably helps, when I want to troubleshoot replication issues. Getting rid of 
old data is helpful.

-Reinhard


From: Rich Megginson [mailto:rmegg...@redhat.com]
Sent: Monday, September 19, 2011 2:54 PM
To: General discussion list for the 389 Directory server project.
Cc: Reinhard Nappert
Subject: Re: [389-users] Replication questions.

On 09/19/2011 12:46 PM, Reinhard Nappert wrote:
Hi, I have a multi-master setup with three masters. All of them are running 
1.2.8.2. I had created and deleted the replication environment at least 3 times.

The current setup looks like the following.

Srv A has replica ID 1, Srv B has replica ID 3 and Srv C has ID 4. Replication 
seems to work.

When I look into the dse.ldif files, I see the following:

Srv A:
dn: cn=replica,cn=o\3base,cn=mapping tree,cn=config
nsDS5ReplicaRoot: o=base
nsDS5ReplicaId: 1
nsDS5Flags: 1
nsDS5ReplicaType: 3
objectClass: top
objectClass: nsDS5Replica
cn: replica
...
nsDS5ReplicaReferral: ldap://SrvB:389/o=base
nsDS5ReplicaReferral: ldap://SrvC:389/o=base
numSubordinates: 2
dn: cn=SrvA2SrvB,cn=replica,cn=o\3base,cn=mapping tree,cn=config
objectClass: top
objectClass: nsDS5ReplicationAgreement
...
nsDS5ReplicaRoot: o=base
nsds50ruv: {replicageneration} 4e5ab5090001
nsds50ruv: {replica 3 ldap://SrvB:389} 4e775b740003 4e77634c0003
 
nsds50ruv: {replica 4 ldap://SrvC:389} 4e77621d0004 4e7763490
 004
nsds50ruv: {replica 1 ldap://SrvA:389} 4e5ab5170001 4e77633d000
 1
nsds50ruv: {replica 2 ldap://SrvC:389} 4e775c9c0002 4e775e4c0
 002
nsruvReplicaLastModified: {replica 3 ldap://SrvB:389} 
nsruvReplicaLastModified: {replica 4 ldap://SrvC:389} 
nsruvReplicaLastModified: {replica 1 ldap://SrvA:389} 
nsruvReplicaLastModified: {replica 2 ldap://SrvC:389} 

dn: cn=SrvA2SrvC,cn=replica,cn=o\3DUMC,cn=mapping tree,cn=config
...
nsds50ruv: {replicageneration} 4e5ab5090001
nsds50ruv: {replica 4 ldap://SrvC:389} 4e77621d0004 4e7763490
 004
nsds50ruv: {replica 3 ldap://SrvB:389} 4e775b740003 4e77634c0003
 
nsds50ruv: {replica 1 ldap://SrvA:389} 4e5ab5170001 4e77634e000
 1
nsds50ruv: {replica 2 ldap://SrvC:389} 4e775c9c0002 4e775e4c0
 002
nsruvReplicaLastModified: {replica 4 ldap://SrvC:389} 
nsruvReplicaLastModified: {replica 3 ldap://SrvB:389} 
nsruvReplicaLastModified: {replica 1 ldap://SrvA:389} 
nsruvReplicaLastModified: {replica 2 ldap://SrvC:389} 

I expected the replicageneration to be the same for all of the agreements (not 
only locally but also on the other servers).

When I look at SrvB and SrvC, I do not see any replicageneration: there are no 
nsds50ruv and nsruvReplicaLastModified values!

Srv B:
dn: cn=replica,cn=o\3base,cn=mapping tree,cn=config
nsDS5ReplicaRoot: o=base
nsDS5ReplicaId: 2
nsDS5Flags: 1
nsDS5ReplicaType: 3
objectClass: top
objectClass: nsDS5Replica
cn: replica
...
nsDS5ReplicaReferral: ldap://SrvA:389/o=base
nsDS5ReplicaReferral: ldap://SrvC:389/o=base
numSubordinates: 2


dn: cn=SrvB2SrvA,cn=replica,cn=o\3base,cn=mapping tree,cn=config
objectClass: top
objectClass: nsDS5ReplicationAgreement
...
nsDS5ReplicaRoot: o=base

dn: cn=SrvB2SrvC,cn=replica,cn=o\3DUMC,cn=mapping tree,cn=config
...

SrvC entries look like:

Srv C:
dn: cn=replica,cn=o\3base,cn=mapping tree,cn=config
nsDS5ReplicaRoot: o=base
nsDS5ReplicaId: 3
nsDS5Flags: 1
nsDS5ReplicaType: 3
objectClass: top
objectClass: nsDS5Replica
cn: replica
...
nsDS5ReplicaReferral: ldap://SrvB:389/o=base
nsDS5ReplicaReferral: ldap://SrvA:389/o=base
numSubordinates: 2


dn: cn=SrvC2SrvA,cn=replica,cn=o\3base,cn=mapping tree,cn=config
objectClass: top
objectClass: nsDS5ReplicationAgreement
...
nsDS5ReplicaRoot: o=base

dn: cn=SrvC2SrvB,cn=replica,cn=o\3DUMC,cn=mapping tree,cn=config
...

So, my questions are:
Why are the two attributes nsds50ruv and nsruvReplicaLastModified missing in 
the agreement objects on SrvB and SrvC?

Not sure.  But the main ones are the ones in the database in the nsTombstone 
entry (see below).

Secondly, why do I still see those old values for nsds50ruv and 
nsruvReplicaLastModified on SrvA? This looks strange to me.
Deleting a replica does not clean up these values. For dse.ldif, you'll have to 
shut down the server, edit that entry in dse.ldif to remove the old values, and 
restart the server.
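
The manual edit described here can be sketched as a small filter over the lines of an agreement entry. This is an illustration only, assuming the server is stopped and you work on a backup copy of dse.ldif; the function name, the sample DN and the CSN-like values are made up, and the stale replica IDs have to be chosen by you.

```python
import re

# Matches the first line of an RUV-related value and captures its replica ID.
RUV_LINE = re.compile(
    r"^(nsds50ruv|nsruvReplicaLastModified):\s*\{replica (\d+) ", re.IGNORECASE
)


def drop_stale_ruv(lines, stale_ids):
    """Remove RUV values (and their folded continuation lines) for stale IDs."""
    kept = []
    dropping = False
    for line in lines:
        if line.startswith(" "):
            # LDIF continuation line: it shares the fate of the value it extends.
            if not dropping:
                kept.append(line)
            continue
        match = RUV_LINE.match(line)
        dropping = bool(match) and int(match.group(2)) in stale_ids
        if not dropping:
            kept.append(line)
    return kept


entry = [
    "dn: cn=SrvA2SrvB,cn=replica,cn=example,cn=mapping tree,cn=config",
    "nsds50ruv: {replica 1 ldap://SrvA:389} aaaa bbbb",
    "nsds50ruv: {replica 2 ldap://SrvC:389} cccc dd",
    " dd",
    "nsruvReplicaLastModified: {replica 2 ldap://SrvC:389} 00000000",
    "nsruvReplicaLastModified: {replica 1 ldap://SrvA:389} 00000000",
]
print("\n".join(drop_stale_ruv(entry, {2})))
```

The {replicageneration} value is left untouched because it carries no replica ID.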

Even more confusing is that I see those attributes if I read the dn: 
nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,o=base entry. Why are those 
nsds50ruv values with old replicagenerations still there? At least the 
replicageneration is identical on all three boxes.
If replicageneration is not the same across all servers, replication should not 
work.

Note that deleting a replica does not clean up this ruv metadata

[389-users] Replication questions.

2011-09-19 Thread Reinhard Nappert
Hi, I have a multi-master setup with three masters. All of them are running 
1.2.8.2. I had created and deleted the replication environment at least 3 times.

The current setup looks like the following.

Srv A has replica ID 1, Srv B has replica ID 3 and Srv C has ID 4. Replication 
seems to work.

When I look into the dse.ldif files, I see the following:

Srv A:
dn: cn=replica,cn=o\3base,cn=mapping tree,cn=config
nsDS5ReplicaRoot: o=base
nsDS5ReplicaId: 1
nsDS5Flags: 1
nsDS5ReplicaType: 3
objectClass: top
objectClass: nsDS5Replica
cn: replica
...
nsDS5ReplicaReferral: ldap://SrvB:389/o=base
nsDS5ReplicaReferral: ldap://SrvC:389/o=base
numSubordinates: 2
dn: cn=SrvA2SrvB,cn=replica,cn=o\3base,cn=mapping tree,cn=config
objectClass: top
objectClass: nsDS5ReplicationAgreement
...
nsDS5ReplicaRoot: o=base
nsds50ruv: {replicageneration} 4e5ab5090001
nsds50ruv: {replica 3 ldap://SrvB:389} 4e775b740003 4e77634c0003
 
nsds50ruv: {replica 4 ldap://SrvC:389} 4e77621d0004 4e7763490
 004
nsds50ruv: {replica 1 ldap://SrvA:389} 4e5ab5170001 4e77633d000
 1
nsds50ruv: {replica 2 ldap://SrvC:389} 4e775c9c0002 4e775e4c0
 002
nsruvReplicaLastModified: {replica 3 ldap://SrvB:389} 
nsruvReplicaLastModified: {replica 4 ldap://SrvC:389} 
nsruvReplicaLastModified: {replica 1 ldap://SrvA:389} 
nsruvReplicaLastModified: {replica 2 ldap://SrvC:389} 

dn: cn=SrvA2SrvC,cn=replica,cn=o\3DUMC,cn=mapping tree,cn=config
...
nsds50ruv: {replicageneration} 4e5ab5090001
nsds50ruv: {replica 4 ldap://SrvC:389} 4e77621d0004 4e7763490
 004
nsds50ruv: {replica 3 ldap://SrvB:389} 4e775b740003 4e77634c0003
 
nsds50ruv: {replica 1 ldap://SrvA:389} 4e5ab5170001 4e77634e000
 1
nsds50ruv: {replica 2 ldap://SrvC:389} 4e775c9c0002 4e775e4c0
 002
nsruvReplicaLastModified: {replica 4 ldap://SrvC:389} 
nsruvReplicaLastModified: {replica 3 ldap://SrvB:389} 
nsruvReplicaLastModified: {replica 1 ldap://SrvA:389} 
nsruvReplicaLastModified: {replica 2 ldap://SrvC:389} 

I expected the replicageneration to be the same for all of the agreements (not 
only locally but also on the other servers).

When I look at SrvB and SrvC, I do not see any replicageneration: there are no 
nsds50ruv and nsruvReplicaLastModified values!

Srv B:
dn: cn=replica,cn=o\3base,cn=mapping tree,cn=config
nsDS5ReplicaRoot: o=base
nsDS5ReplicaId: 2
nsDS5Flags: 1
nsDS5ReplicaType: 3
objectClass: top
objectClass: nsDS5Replica
cn: replica
...
nsDS5ReplicaReferral: ldap://SrvA:389/o=base
nsDS5ReplicaReferral: ldap://SrvC:389/o=base
numSubordinates: 2


dn: cn=SrvB2SrvA,cn=replica,cn=o\3base,cn=mapping tree,cn=config
objectClass: top
objectClass: nsDS5ReplicationAgreement
...
nsDS5ReplicaRoot: o=base

dn: cn=SrvB2SrvC,cn=replica,cn=o\3DUMC,cn=mapping tree,cn=config
...

SrvC entries look like:

Srv C:
dn: cn=replica,cn=o\3base,cn=mapping tree,cn=config
nsDS5ReplicaRoot: o=base
nsDS5ReplicaId: 3
nsDS5Flags: 1
nsDS5ReplicaType: 3
objectClass: top
objectClass: nsDS5Replica
cn: replica
...
nsDS5ReplicaReferral: ldap://SrvB:389/o=base
nsDS5ReplicaReferral: ldap://SrvA:389/o=base
numSubordinates: 2


dn: cn=SrvC2SrvA,cn=replica,cn=o\3base,cn=mapping tree,cn=config
objectClass: top
objectClass: nsDS5ReplicationAgreement
...
nsDS5ReplicaRoot: o=base

dn: cn=SrvC2SrvB,cn=replica,cn=o\3DUMC,cn=mapping tree,cn=config
...

So, my questions are:
Why are the two attributes nsds50ruv and nsruvReplicaLastModified missing in 
the agreement objects on SrvB and SrvC? Secondly, why do I still see those old 
values for nsds50ruv and nsruvReplicaLastModified on SrvA? This looks strange 
to me.

Even more confusing is that I see those attributes if I read the dn: 
nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,o=base entry. Why are those 
nsds50ruv values with old replicagenerations still there? At least the 
replicageneration is identical on all three boxes.

SrvA:
dn: nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,o=base
objectClass: top
objectClass: nsTombstone
objectClass: extensibleobject
nsds50ruv: {replicageneration} 4e5ab5090001
nsds50ruv: {replica 4 ldap://SrvC:389} 4e77621d0004 4e7770f20
 004
nsds50ruv: {replica 3 ldap://SrvB:389} 4e775b740003 4e77860b0003
 
nsds50ruv: {replica 1 ldap://SrvA:389} 4e5ab5170001 4e77859d000
 1
nsds50ruv: {replica 2 ldap://SrvC:389} 4e775c9c0002 4e775e4c0
 002
o: umc
nsruvReplicaLastModified: {replica 4 ldap://SrvC:389} 4e7770f1
nsruvReplicaLastModified: {replica 3 ldap://SrvB:389} 4e77860a
nsruvReplicaLastModified: {replica 1 ldap://SrvA:389} 4e77859c
nsruvReplicaLastModified: {replica 2 ldap://SrvC:389} 

SrvB:
dn: nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,o=base
objectClass: top
objectClass: nsTombstone
objectClass: extensibleobject
nsds50ruv: {replicageneration} 4e5ab5090001
nsds5

Re: [389-users] Replication trouble when promoting dedicated Consumer to Multiple master

2011-07-14 Thread Reinhard Nappert
Do a   ldapsearch -b 
'nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,dc=mydomain,dc=com' -D 
<bind-dn> -w <password> -s base objectclass=nstombstone

This lists every replication ID that has been configured (including the 
history). The following is the output in my setup.

dn: nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,o=base
objectClass: top
objectClass: nsTombstone
objectClass: extensibleobject
nsds50ruv: {replicageneration} 4df7a1070001
nsds50ruv: {replica 1 ldap://yale:389} 4df7a3960001 4e19ad95000100
 00
nsds50ruv: {replica 3 ldap://norquay:389} 4df7a39d0003 4e160565000
 3
nsds50ruv: {replica 4 ldap://mustrum:389} 4df7a3a4 4dfb9365000
 4
nsds50ruv: {replica 2 ldap://louise:389} 4df7a39a0002 4e171a070002
 
o: base
nsruvReplicaLastModified: {replica 1 ldap://yale:389} 
nsruvReplicaLastModified: {replica 3 ldap://norquay:389} 
nsruvReplicaLastModified: {replica 4 ldap://mustrum:389} 
nsruvReplicaLastModified: {replica 2 ldap://louise:389} 
  /\
   |
 replication-id

I am pretty sure you have a duplicate of 3 somewhere in there.
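
Spotting leftover replica IDs by eye gets tedious with longer RUVs, so the nsds50ruv values can be grouped by host instead. This is a minimal sketch with hypothetical function names; the sample values are placeholders in the same shape as the search output above, and a host that shows up with more than one ID is only a hint that an old ID was left behind.

```python
import re
from collections import defaultdict

# Captures replica ID, host and port from "{replica N ldap://host:port}".
RUV_ELEMENT = re.compile(r"\{replica (\d+) ldap://([^:}]+):(\d+)\}")


def replica_ids_by_host(ruv_values):
    """Map each host to the set of replica IDs recorded for it in the RUV."""
    ids = defaultdict(set)
    for value in ruv_values:
        match = RUV_ELEMENT.search(value)
        if match:
            ids[match.group(2)].add(int(match.group(1)))
    return dict(ids)


def hosts_with_several_ids(ruv_values):
    """Return hosts that appear under more than one replica ID."""
    return {
        host: sorted(ids)
        for host, ids in replica_ids_by_host(ruv_values).items()
        if len(ids) > 1
    }


ruv = [
    "{replicageneration} 0000",
    "{replica 4 ldap://SrvC:389} 0001 0002",
    "{replica 3 ldap://SrvB:389} 0001 0002",
    "{replica 1 ldap://SrvA:389} 0001 0002",
    "{replica 2 ldap://SrvC:389} 0001 0002",
]
print(hosts_with_several_ids(ruv))  # {'SrvC': [2, 4]}
```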

-Reinhard



From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Roland Schwingel
Sent: Thursday, July 14, 2011 7:39 AM
To: General discussion list for the 389 Directory server project.
Subject: Re: [389-users] Replication trouble when promoting dedicated Consumer 
to Multiple master

Hi Penedo...

Thanks for your reply
Yes.. I did that already a couple of times. I am always coming to the  same 
error message...

Roland



From: Penedo 
To: "General discussion list for the 389 Directory server project." 
<389-us...@lists.fedoraproject.org>
Date: 14.07.2011 13:08
Subject: Re: [389-users] Replication trouble when promoting dedicated 
Consumer to Multiple master
Sent by: 389-users-boun...@lists.fedoraproject.org




I'm far from being an expert but have you considered clearing the existing 
replication agreements and following the instructions for setting up 
multi-master from scratch?

On Jul 13, 2011 5:44 PM, "Roland Schwingel" 
<roland.schwin...@onevision.com> wrote:
> Hi...
>
> Since yesterday I got some replication trouble.
>
> My Scenario
>
> server A <- server B <-> server C -> server D
>
> server A: dedicated consumer
> server B: multiple master, replica ID 1
> server C: multiple master, replica ID 2
> server D: dedicated consumer
>
> The arrows are depicting the replication directions.
> In that scenario everything is fine.
>
> But I want to promote server D to a multiple Master replicating from/to
> server C.
>
> On server D I enabled changelog and changed the Replica Role of userRoot
> to Multiple Master (now with replica id 3)
> I created a replication agreement to server C.
>
> When enabling that I see these error messages in error log on server D. At
> 08:51:53 I enabled the replication agreement from server D to server C:
>
> [13/Jul/2011:08:49:41 +0200] - 389-Directory/1.2.5 B2010.120.1414 starting
> up
> [13/Jul/2011:08:49:41 +0200] - slapd started. Listening on All Interfaces
> port 389 for LDAP requests
> [13/Jul/2011:08:49:41 +0200] - Listening on All Interfaces port 636 for
> LDAPS requests
> [13/Jul/2011:08:51:53 +0200] NSMMReplicationPlugin -
> repl_set_mtn_referrals: could not set referrals for replica
> dc=mydomain,dc=com: 32
> [13/Jul/2011:08:51:53 +0200] NSMMReplicationPlugin -
> multimaster_be_state_change: replica dc=mydomain,dc=com is going offline;
> disabling replication
> [13/Jul/2011:08:51:53 +0200] - WARNING: Import is running with
> nsslapd-db-private-import-mem on; No other process is allowed to access
> the database
> [13/Jul/2011:08:51:56 +0200] - import userRoot: Workers finished; cleaning
> up...
> [13/Jul/2011:08:51:57 +0200] - import userRoot: Workers cleaned up.
> [13/Jul/2011:08:51:57 +0200] - import userRoot: Indexing complete.
> Post-processing...
> [13/Jul/2011:08:51:57 +0200] - import userRoot: Flushing caches...
> [13/Jul/2011:08:51:57 +0200] - import userRoot: Closing files...
> [13/Jul/2011:08:51:59 +0200] - import userRoot: Import complete. Processed
> 772 entries in 5 seconds. (154.40 entries/sec)
> [13/Jul/2011:08:51:59 +0200] NSMMReplicationPlugin -
> multimaster_be_state_change: replica dc=mydomain,dc=com is coming online;
> enabling replication
> [13/Jul/2011:09:11:00 +0200] NSMMReplicationPlugin -
> agmt="cn=server-d_to_server-c" (server-c:389): Unable to aquire replica:
> the replica has the same Replica ID as this one. Replication is aborting.
> [13/Jul/2011:09:11:00 +0200] NSMMReplicationPlugin -
> agmt="cn=server-d_to_server-c" (server-c:389): Incremental update failed
> and requires administrator action
>
> It says that it has the same replica id, but this is not true. I 

Re: [389-users] db import failure, when setting replication up

2011-05-25 Thread Reinhard Nappert
So, you are saying that the server would start up when I just replace the 
Berkeley Database libraries.

I doubt it, but I will try ..

I think I have to rebuild the source with the new lib. When I upgrade ds-base 
with the newly built package, does the server still understand the db files?

-Reinhard


From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Noriko Hosoi
Sent: Wednesday, May 25, 2011 1:50 PM
To: General discussion list for the 389 Directory server project.
Subject: Re: [389-users] db import failure, when setting replication up

On 05/24/2011 06:27 AM, Reinhard Nappert wrote:
I do that.

Now, I have two questions:

So, what db version do you recommend?
Hi Reinhard,

Which OS are you running?

If it's RHEL5 (BDB4.3.29) or RHEL6 (BDB4.7.25), they are patched.  But RHEL4 
(BDB4.2.52) was rejected.

More importantly, is there a migration path or do I have to reload the existing 
data? I could see issues migrating replicated environments.
There's no data change needed.  The bug was just in the data verification code.

This bug has more detailed info.
Bug 472131<https://bugzilla.redhat.com/show_bug.cgi?id=472131> - dbverify: when 
a duplicate is large enough to have internal page(s), dbverify issues bogus 
out-of-order key errors

Thanks,
--noriko

Thanks,
-Reinhard


From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Noriko Hosoi
Sent: Monday, May 23, 2011 1:42 PM
To: 389-us...@lists.fedoraproject.org
Subject: Re: [389-users] db import failure, when setting replication up

Hi Reinhard,

That was unfortunate...  I was hoping you were using a newer version. :)
You hit this bug.

Bug 472131<https://bugzilla.redhat.com/show_bug.cgi?id=472131> - dbverify: when 
a duplicate is large enough to have internal page(s), dbverify issues bogus 
out-of-order key errors

The bug was fixed by Sleepycat on db4.8.  And we ported the fix back to 4.3, 
but no chance to do so to 4.2.  So, we cannot use dbverify to check if the 
index file is healthy or not...  Could it be possible to reindex the ancestorid 
index and see if the error goes away?  (Or you could reinitialize the consumer? 
 That would be the cleanest)

Thanks,
--noriko

Reinhard Nappert wrote:
Hi Noriko,

I run it on a CentOS 4.4 box (Linux 2.6.24). I use the db 4.2 libs with all the 
patches.

Oh, yes dbverify does complain a lot. I see for all of the db files messages 
like:

[20/May/2011:11:03:05 -0400] DB verify - verify failed(-30976): 
/var/lib/dirsrv/slapd-ID/db/userRoot/cn.db4
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 2
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 5
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 8
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 10
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 13
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 16
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 19
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 21
[20/May/2011:11:03:07 -0400] DB verify - verify failed(-30976): 
/var/lib/dirsrv/slapd-ID/db/userRoot/parentid.db4
DB verify: Passed
That said, I guess I should re-index the entire db. Any idea why this is 
happening?
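
A rough sketch of how the index files flagged by dbverify could be pulled out of such a log. It assumes the "verify failed" wording shown above and that the path may have been wrapped onto the next line; the function name failed_indexes is hypothetical.

```python
import re

# A failure looks like "DB verify - verify failed(-30976):" followed by the
# file path, which the mail client may have wrapped onto the next line.
FAILED_RE = re.compile(r"DB verify - verify failed\((-?\d+)\):\s*(\S+)")


def failed_indexes(log_text):
    """Return (index file name, error code) for each failed verification."""
    results = []
    for code, path in FAILED_RE.findall(log_text):
        results.append((path.rsplit("/", 1)[-1], int(code)))
    return results


LOG = """\
[20/May/2011:11:03:05 -0400] DB verify - verify failed(-30976):
/var/lib/dirsrv/slapd-ID/db/userRoot/cn.db4
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 2
[20/May/2011:11:03:07 -0400] DB verify - verify failed(-30976):
/var/lib/dirsrv/slapd-ID/db/userRoot/parentid.db4
"""

for name, code in failed_indexes(LOG):
    print(name, code)
```

Given Bug 472131 discussed in this thread, a file on this list is only a candidate for re-indexing, not proof of corruption.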

Right now, I have a two-master MMR setup, where both masters also have a 
replication agreement to a third box, which is a dedicated consumer. I run 
tests in which I simultaneously perform adds and deletes (not on the same 
object) on all three boxes. I just want to verify how replication behaves in 
1.2.8.

-Reinhard


From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Noriko Hosoi
Sent: Thursday, May 19, 2011 5:33 PM
To: 389-us...@lists.fedoraproject.org
Subject: Re: [389-users] db import failure, when setting replication up

Hi Reinhard,

Could you tell me the OS version and Berkeley DB version (rpm -q db4)?

Could you run "/usr/lib[64]/dirsrv/slapd-ID/dbverify"?  Does it complain 
about anything, especially the ancestorid index?  If it does, you may want to 
re-create the corrupted index...
--noriko

Reinhard Nappert wrote:
Noriko,

I observed one more item, which does not bother me right now, but you may want 
to see:

I am not sure why and how it happened,  but I see the following message on the 
supplier:

[18/May/2011:13:59:50 -0400] NSMMReplicationPlugin - 
agmt="cn=supplier2consumer" (consumer:389): Consumer failed to replay change 
(uniq

Re: [389-users] db import failure, when setting replication up

2011-05-24 Thread Reinhard Nappert
I do that.

Now, I have two questions:

So, what db version do you recommend?

More importantly, is there a migration path or do I have to reload the existing 
data? I could see issues migrating replicated environments.

Thanks,
-Reinhard


From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Noriko Hosoi
Sent: Monday, May 23, 2011 1:42 PM
To: 389-us...@lists.fedoraproject.org
Subject: Re: [389-users] db import failure, when setting replication up

Hi Reinhard,

That was unfortunate...  I was hoping you were using a newer version. :)
You hit this bug.

Bug 472131<https://bugzilla.redhat.com/show_bug.cgi?id=472131> - dbverify: when 
a duplicate is large enough to have internal page(s), dbverify issues bogus 
out-of-order key errors

The bug was fixed by Sleepycat on db4.8.  And we ported the fix back to 4.3, 
but no chance to do so to 4.2.  So, we cannot use dbverify to check if the 
index file is healthy or not...  Could it be possible to reindex the ancestorid 
index and see if the error goes away?  (Or you could reinitialize the consumer? 
 That would be the cleanest)

Thanks,
--noriko

Reinhard Nappert wrote:
Hi Noriko,

I run it on a CentOS 4.4 box (Linux 2.6.24). I use the db 4.2 libs with all the 
patches.

Oh, yes dbverify does complain a lot. I see for all of the db files messages 
like:

[20/May/2011:11:03:05 -0400] DB verify - verify failed(-30976): 
/var/lib/dirsrv/slapd-ID/db/userRoot/cn.db4
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 2
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 5
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 8
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 10
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 13
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 16
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 19
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 21
[20/May/2011:11:03:07 -0400] DB verify - verify failed(-30976): 
/var/lib/dirsrv/slapd-ID/db/userRoot/parentid.db4
DB verify: Passed
That said, I guess I should re-index the entire db. Any idea why this is 
happening?

Right now, I have a two-master MMR setup, where both masters also have a 
replication agreement to a third box, which is a dedicated consumer. I run 
tests in which I simultaneously perform adds and deletes (not on the same 
object) on all three boxes. I just want to verify how replication behaves in 
1.2.8.

-Reinhard


From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Noriko Hosoi
Sent: Thursday, May 19, 2011 5:33 PM
To: 389-us...@lists.fedoraproject.org
Subject: Re: [389-users] db import failure, when setting replication up

Hi Reinhard,

Could you tell me the OS version and Berkeley DB version (rpm -q db4)?

Could you run "/usr/lib[64]/dirsrv/slapd-ID/dbverify"?  Does it complain 
about anything, especially the ancestorid index?  If it does, you may want to 
re-create the corrupted index...
--noriko

Reinhard Nappert wrote:
Noriko,

I observed one more item, which does not bother me right now, but you may want 
to see:

I am not sure why and how it happened,  but I see the following message on the 
supplier:

[18/May/2011:13:59:50 -0400] NSMMReplicationPlugin - 
agmt="cn=supplier2consumer" (consumer:389): Consumer failed to replay change 
(uniqueid aea3731d-808711e0-83d5fdc8-f32b8f3c, CSN 4dd4085b00480004): 
Operations error. Will retry later.

And I see the following on the consumer:
[18/May/2011:13:59:29 -0400] - idl_new.c BAD 22, err=-30988 DB_PAGE_NOTFOUND: 
Requested page not found
[18/May/2011:13:59:29 -0400] - ancestorid BAD 13120, err=-30988 
DB_PAGE_NOTFOUND: Requested page not found

Any idea what happened there?
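
When lining up supplier and consumer logs like these, it can help to know when the failing change was originally made. A small sketch, assuming (as is the case for 389 DS change sequence numbers) that the first eight hex digits of a CSN are a Unix timestamp in seconds. The CSN is used exactly as it appears in the message above; only its leading timestamp is decoded, and csn_time is a hypothetical helper name.

```python
from datetime import datetime, timezone


def csn_time(csn):
    """Decode the leading 8 hex digits of a CSN as a UTC timestamp."""
    seconds = int(csn[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


print(csn_time("4dd4085b00480004").isoformat())
```

For this CSN the result is 2011-05-18 17:56:43 UTC, roughly three minutes before the supplier logged the replay failure at 13:59:50 -0400.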

Thanks,
-Reinhard




From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Noriko Hosoi
Sent: Tuesday, May 17, 2011 4:02 PM
To: General discussion list for the 389 Directory server project.
Subject: Re: [389-users] db import failure, when setting replication up

Hi Reinhard,

Reinhard Nappert wrote:
Hi Noriko,

I have to correct myself. The box which had the import issue was on a 1.2.7.5 
system. The other box was running 1.2.8.2.

So, it looks like you have fixed the issue with 1.2.8.2.
*relieved*  Thanks for testing it on 1.2.8.2!
--noriko

Thanks,
-Reinhard


From: 
389-users-boun...@lists.fedoraproject.org<mailto:389-users-boun...@lists.fedoraproject.org>
 [mailto:389-users-boun

Re: [389-users] db import failure, when setting replication up

2011-05-17 Thread Reinhard Nappert
1.2.8.2

-Reinhard


From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Noriko Hosoi
Sent: Tuesday, May 17, 2011 2:16 PM
To: General discussion list for the 389 Directory server project.
Subject: Re: [389-users] db import failure, when setting replication up

It looks to me like you have hit this bug...  Which version of 389-ds-base are 
you running?
Bug 684996<https://bugzilla.redhat.com/show_bug.cgi?id=684996> - Exported 
tombstone cannot be imported correctly.
The patch should be in the version 1.2.8.2.
Thanks,
--noriko

On 05/17/2011 11:03 AM, Reinhard Nappert wrote:
Hi,

I have seen the following:

I set 2 systems up in MMR. Replication worked. For some reason, I needed to 
take one of the boxes out of the replication and disabled replication. Later 
on, I enabled it again and created the shadowing agreement to the other box. 
Now, I saw the following errors during the import of the db:

[17/May/2011:11:46:04 -0400] NSMMReplicationPlugin - 
multimaster_be_state_change: replica o=base is going offline; disabling 
replication
[17/May/2011:11:46:07 -0400] - WARNING: Import is running with 
nsslapd-db-private-import-mem on; No other process is allowed to access the 
database
[17/May/2011:11:46:08 -0400] - import userRoot: WARNING: Skipping entry 
"nsuniqueid=06869502-7fe011e0-8f589300-7e7b2163,ou=sample,o=base" which has no 
parent, ending at line 0 of file "(bulk import)"
[17/May/2011:11:46:08 -0400] - import userRoot: WARNING: bad entry: ID 453
.

Any idea, what is going on there?

Thanks,
-Reinhard


--
389 users mailing list
389-us...@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users

[389-users] db import failure, when setting replication up

2011-05-17 Thread Reinhard Nappert
Hi,

I have seen the following:

I set 2 systems up in MMR. Replication worked. For some reason, I needed to 
take one of the boxes out of the replication and disabled replication. Later 
on, I enabled it again and created the shadowing agreement to the other box. 
Now, I saw the following errors during the import of the db:

[17/May/2011:11:46:04 -0400] NSMMReplicationPlugin - 
multimaster_be_state_change: replica o=base is going offline; disabling 
replication
[17/May/2011:11:46:07 -0400] - WARNING: Import is running with 
nsslapd-db-private-import-mem on; No other process is allowed to access the 
database
[17/May/2011:11:46:08 -0400] - import userRoot: WARNING: Skipping entry 
"nsuniqueid=06869502-7fe011e0-8f589300-7e7b2163,ou=sample,o=base" which has no 
parent, ending at line 0 of file "(bulk import)"
[17/May/2011:11:46:08 -0400] - import userRoot: WARNING: bad entry: ID 453
.

Any idea, what is going on there?

Thanks,
-Reinhard

Re: [389-users] MMR issue, when deleting the replica setup.

2011-05-10 Thread Reinhard Nappert
Rich,

it seems that the issue does not exist in 1.2.8.2. I just compiled and 
installed it.

If 1.2.8.2 is stable, I won't bother opening a bug for it.

-Reinhard


From: Rich Megginson [mailto:rmegg...@redhat.com]
Sent: Monday, May 09, 2011 9:37 PM
To: General discussion list for the 389 Directory server project.
Cc: Reinhard Nappert
Subject: Re: [389-users] MMR issue, when deleting the replica setup.

On 05/09/2011 02:06 PM, Reinhard Nappert wrote:
Hi,

I noticed an issue with 389 DS 1.2.7.5, which I have not seen before. Here is 
what I do:
1. I create a two-master multi-master setup.
2. I don't perform any changes on the directory.
3. I delete the replica setup on both systems -- everything is fine.
4. I create a two-master multi-master setup again.
5. I perform changes on both systems.
6. The modifications get replicated.
7. I delete the replica setup. Now I get the following error logs:

[09/May/2011:15:43:18 -0400] - import userRoot: Import complete.  Processed 446 
entries in 4 seconds. (111.50 entries/sec)
[09/May/2011:15:43:18 -0400] NSMMReplicationPlugin - 
multimaster_be_state_change: replica o=base is coming online; enabling 
replication
...
[09/May/2011:15:45:21 -0400] NSMMReplicationPlugin - agmt_delete: begin
[09/May/2011:15:45:22 -0400] NSMMReplicationPlugin - replica_config_delete: 
Warning: The changelog for replica o=BASE is no longer valid since the replica 
config is being deleted.  Removing the changelog.
[09/May/2011:15:45:22 -0400] NSMMReplicationPlugin - changelog program - 
_cl5Add Thread: invalid changelog state - 2 <== This is good!
[09/May/2011:15:45:27 -0400] - libdb: /changelogdb/7773fd02-7a7411e0-ac71f4b1-0fb2d026_4dc840d30002.db4: 
unable to flush: No such file or directory
[09/May/2011:15:45:27 -0400] - libdb: txn_checkpoint: failed to flush the 
buffer cache No such file or directory
[09/May/2011:15:45:27 -0400] - Serious Error---Failed to checkpoint database, 
err=2 (No such file or directory)
Of course, the changelog directory was gone. It looks to me as if the server 
somehow still keeps it in memory.

I enabled audit logging. This is what I see there:

time: 20110509154521
dn: cn=changelog5,cn=config
changetype: delete
modifiersname: 
time: 20110509154522
dn: cn=agreement1,cn=replica,cn=o\3dbase,cn=mapping tree,cn=config
changetype: delete
modifiersname: 

time: 20110509154522
dn: cn=replica,cn=o\3dbase,cn=mapping tree,cn=config
changetype: delete
modifiersname: 
time: 20110509154522
dn: cn=o\3dbase,cn=mapping tree,cn=config
changetype: modify
replace: nsslapd-state
nsslapd-state: backend
-
replace: nsslapd-referral
-
replace: modifiersname
modifiersname: -
replace: modifytimestamp
-
replace: nsslapd-referral
-
replace: modifiersname
modifiersname: 
-
replace: modifytimestamp
modifytimestamp: 20110509194522Z
-

time: 20110509154605
dn: cn=uniqueid generator,cn=config
changetype: modify
replace: nsState
nsState:: AM+94nR64AH0sQ+y0CZxbAEA
-
replace: modifiersname
modifiersname: cn=server,cn=plugins,cn=config
-
replace: modifytimestamp
modifytimestamp: 20110509194605Z
-
Has anybody seen this before?
No, please file a bug.

Thanks,
-Reinhard


[389-users] MMR issue, when deleting the replica setup.

2011-05-09 Thread Reinhard Nappert
Hi,

I noticed an issue with 389 DS 1.2.7.5, which I have not seen before. Here is 
what I do:
1. I create a two-master multi-master setup.
2. I don't perform any changes on the directory.
3. I delete the replica setup on both systems -- everything is fine.
4. I create a two-master multi-master setup again.
5. I perform changes on both systems.
6. The modifications get replicated.
7. I delete the replica setup. Now I get the following error logs:

[09/May/2011:15:43:18 -0400] - import userRoot: Import complete.  Processed 446 
entries in 4 seconds. (111.50 entries/sec)
[09/May/2011:15:43:18 -0400] NSMMReplicationPlugin - 
multimaster_be_state_change: replica o=base is coming online; enabling 
replication
...
[09/May/2011:15:45:21 -0400] NSMMReplicationPlugin - agmt_delete: begin
[09/May/2011:15:45:22 -0400] NSMMReplicationPlugin - replica_config_delete: 
Warning: The changelog for replica o=BASE is no longer valid since the replica 
config is being deleted.  Removing the changelog.
[09/May/2011:15:45:22 -0400] NSMMReplicationPlugin - changelog program - 
_cl5Add Thread: invalid changelog state - 2 <== This is good!
[09/May/2011:15:45:27 -0400] - libdb: /changelogdb/7773fd02-7a7411e0-ac71f4b1-0fb2d026_4dc840d30002.db4: 
unable to flush: No such file or directory
[09/May/2011:15:45:27 -0400] - libdb: txn_checkpoint: failed to flush the 
buffer cache No such file or directory
[09/May/2011:15:45:27 -0400] - Serious Error---Failed to checkpoint database, 
err=2 (No such file or directory)
Of course, the changelog directory was gone. It looks to me as if the server 
somehow still keeps it in memory.

I enabled audit logging. This is what I see there:

time: 20110509154521
dn: cn=changelog5,cn=config
changetype: delete
modifiersname: 
time: 20110509154522
dn: cn=agreement1,cn=replica,cn=o\3dbase,cn=mapping tree,cn=config
changetype: delete
modifiersname: 

time: 20110509154522
dn: cn=replica,cn=o\3dbase,cn=mapping tree,cn=config
changetype: delete
modifiersname: 
time: 20110509154522
dn: cn=o\3dbase,cn=mapping tree,cn=config
changetype: modify
replace: nsslapd-state
nsslapd-state: backend
-
replace: nsslapd-referral
-
replace: modifiersname
modifiersname: -
replace: modifytimestamp
-
replace: nsslapd-referral
-
replace: modifiersname
modifiersname: 
-
replace: modifytimestamp
modifytimestamp: 20110509194522Z
-

time: 20110509154605
dn: cn=uniqueid generator,cn=config
changetype: modify
replace: nsState
nsState:: AM+94nR64AH0sQ+y0CZxbAEA
-
replace: modifiersname
modifiersname: cn=server,cn=plugins,cn=config
-
replace: modifytimestamp
modifytimestamp: 20110509194605Z
-
Has anybody seen this before?

Thanks,
-Reinhard

Re: [389-users] Referral errors ....

2011-05-05 Thread Reinhard Nappert
If I find out, I will let you know.

-Original Message-
From: Rich Megginson [mailto:rmegg...@redhat.com] 
Sent: Thursday, May 05, 2011 10:36 AM
To: Reinhard Nappert
Cc: General discussion list for the 389 Directory server project.
Subject: Re: [389-users] Referral errors 

On 05/05/2011 06:41 AM, Reinhard Nappert wrote:
> This is actually what I thought, too. The logs looked fine to me as well.
>
> Guess what, a restart of the LDAP server did get rid of the issue!
>
> For sure it would be nice to figure out how the system can get into this 
> state!
Yes, this is an odd problem.
> -Reinhard
>
> -Original Message-
> From: Rich Megginson [mailto:rmegg...@redhat.com]
> Sent: Wednesday, May 04, 2011 6:28 PM
> To: General discussion list for the 389 Directory server project.
> Cc: Reinhard Nappert
> Subject: Re: [389-users] Referral errors 
>
> On 05/04/2011 03:59 PM, Reinhard Nappert wrote:
>> I actually tried even a bit more 1+2+4+65536=65543.
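
The error log level is a sum of power-of-two flags, so a combined value such as 65543 can be broken back into its parts. A small sketch; the labels are written from memory of the nsslapd-errorlog-level documentation and should be treated as assumptions, and split_level is a hypothetical helper name.

```python
# Labels as recalled from the nsslapd-errorlog-level documentation;
# treat them as assumptions and check the docs for your version.
LEVELS = {
    1: "trace function calls",
    2: "debug packet handling",
    4: "heavy trace output",
    8192: "replication debugging",
    65536: "plug-in debugging",
}


def split_level(value):
    """Break a combined error log level into its power-of-two components."""
    parts = []
    bit = 1
    while bit <= value:
        if value & bit:
            parts.append((bit, LEVELS.get(bit, "unlabelled here")))
        bit <<= 1
    return parts


for bit, label in split_level(65543):
    print(bit, label)
```

For 65543 this yields the components 1, 2, 4 and 65536, matching the sum quoted above.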
>>
>> I tried to add the object uid=stibbons,ou=admins,o=operators,o=UMC.
>>
>> This is the attempt as it appears in the access log:
>> [04/May/2011:17:40:32 -0400] conn=83 op=51 SRCH 
>> base="uid=stibbons,ou=admins,o=operators,o=UMC" scope=0 
>> filter="(objectClass=*)" attrs=ALL
>> [04/May/2011:17:40:32 -0400] conn=83 op=51 RESULT err=32 tag=101 
>> nentries=0 etime=0
>> [04/May/2011:17:40:32 -0400] conn=83 op=52 ADD 
>> dn="uid=stibbons,ou=admins,o=operators,o=UMC"
>> [04/May/2011:17:40:32 -0400] conn=83 op=52 RESULT err=1 tag=105 
>> nentries=0 etime=0
>>
>> Here is what you see in errors:
>>
>> [04/May/2011:17:40:32 -0400] - mapping tree release backend : 
>> userRoot
>> [04/May/2011:17:40:32 -0400] - do_search
>> [04/May/2011:17:40:32 -0400] - SRCH
>> base="uid=stibbons,ou=admins,o=operators,o=UMC" scope=0 deref=3 
>> sizelimit=0 timelimit=0 attrsonly=0 filter="(objectClass=*)" attrs=ALL
>> [04/May/2011:17:40:32 -0400] - =>   get_ldapmessage_controls
>> [04/May/2011:17:40:32 -0400] -<= get_ldapmessage_controls no controls
>> [04/May/2011:17:40:32 -0400] - =>   slapi_control_present (looking for
>> 2.16.840.1.113730.3.4.3)
>> [04/May/2011:17:40:32 -0400] -<= slapi_control_present 0 (NO 
>> CONTROLS)
>> [04/May/2011:17:40:32 -0400] - mtn_lock : lock count : 1
>> [04/May/2011:17:40:32 -0400] - mapping tree selected backend :
>> userRoot
>> [04/May/2011:17:40:32 -0400] - mtn_unlock : lock count : 0
>> [04/May/2011:17:40:32 -0400] - =>   slapi_reslimit_get_integer_limit()
>> conn=0xfd42c678, handle=2
>> [04/May/2011:17:40:32 -0400] -<= slapi_reslimit_get_integer_limit()
>> returning NO VALUE
>> [04/May/2011:17:40:32 -0400] - =>   slapi_reslimit_get_integer_limit()
>> conn=0xfd42c678, handle=1
>> [04/May/2011:17:40:32 -0400] -<= slapi_reslimit_get_integer_limit()
>> returning NO VALUE
>> [04/May/2011:17:40:32 -0400] - =>   compute_limits: sizelimit=2000,
>> timelimit=3600
>> [04/May/2011:17:40:32 -0400] - Calling plugin 'ACL preoperation' #1 
>> type 403
>> [04/May/2011:17:40:32 -0400] - =>   slapi_control_present (looking for
>> 2.16.840.1.113730.3.4.12)
>> [04/May/2011:17:40:32 -0400] -<= slapi_control_present 0 (NO CONTROLS)
>> [04/May/2011:17:40:32 -0400] - =>   slapi_control_present (looking for
>> 2.16.840.1.113730.3.4.18)
>> [04/May/2011:17:40:32 -0400] -<= slapi_control_present 0 (NO 
>> CONTROLS)
>> [04/May/2011:17:40:32 -0400] - Calling plugin 'Legacy replication 
>> preoperation plugin' #3 type 403
>> [04/May/2011:17:40:32 -0400] - Calling plugin 'Multimaster 
>> replication preoperation plugin' #4 type 403
>> [04/May/2011:17:40:32 -0400] - =>   slapi_reslimit_get_integer_limit()
>> conn=0xfd42c678, handle=0
>> [04/May/2011:17:40:32 -0400] -<= slapi_reslimit_get_integer_limit()
>> returning NO VALUE
>> [04/May/2011:17:40:32 -0400] - =>   find_entry_internal
>> (dn=uid=stibbons,ou=admins,o=operators,o=umc) lock 0
>> [04/May/2011:17:40:32 -0400] - =>   dn2entry 
>> "uid=stibbons,ou=admins,o=operators,o=umc"
>> [04/May/2011:17:40:32 -0400] - =>   index_read( "entrydn" = 
>> "uid=stibbons,ou=admins,o=operators,o=umc" )
>> [04/May/2011:17:40:32 -0400] -indextype: "eq" indexmask: 0x2
>> [04/May/2011:17:40:32 -0400] -<= index_read 0 candidates
>> [04/May/2011:17:40:32 -0400] -<= dn2entry 0
>> [04/May/2011:17:40:32 -0400] - =>   dn2ancestor 
>> "uid=stibbons,ou=admins,o=operators,o=umc"
>

[389-users] Referral errors ....

2011-04-29 Thread Reinhard Nappert
Hi,

I have the following setup:

I have a two-master multi-master replication setup, where both masters also 
have a number of shadowing agreements to other consumers. The data gets 
replicated to all boxes and there are no issues. When I try to perform an 
update on the slaves, it works on all but one. That is, the server sends back 
err=10 with the referral to one of the masters, and the client automatically 
follows the referral. Unfortunately, this does not work on one box:

When there is an attempt to write to the db, the server returns an error-code 
1, with the following message:
javax.naming.NamingException: [LDAP: error code 1 - Mapping tree node for 
o=base is set to return a referral, but no referral is configured for it];

This can also be seen in the access file:
[26/Apr/2011:05:35:45 -0300] conn=3418 op=13256 ADD dn="ou=test,o=base"
[26/Apr/2011:05:35:45 -0300] conn=3418 op=13256 RESULT err=1 tag=105 nentries=0 
etime=0
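
A short sketch for picking the non-successful RESULT lines out of an access log and naming the result code. The code names come from RFC 4511 and cover only the codes mentioned in this thread; the function name failed_results is hypothetical.

```python
import re

# Result code names from RFC 4511; only the codes seen in this thread.
RESULT_NAMES = {
    0: "success",
    1: "operationsError",
    10: "referral",
    32: "noSuchObject",
}

RESULT_RE = re.compile(r"conn=(\d+) op=(\d+) RESULT err=(\d+)")


def failed_results(access_log):
    """Yield (conn, op, err, name) for RESULT lines with a non-zero err."""
    for conn, op, err in RESULT_RE.findall(access_log):
        code = int(err)
        if code != 0:
            yield int(conn), int(op), code, RESULT_NAMES.get(code, "other")


LOG = """\
[26/Apr/2011:05:35:45 -0300] conn=3418 op=13256 ADD dn="ou=test,o=base"
[26/Apr/2011:05:35:45 -0300] conn=3418 op=13256 RESULT err=1 tag=105 nentries=0 etime=0
"""

print(list(failed_results(LOG)))
```

On a consumer that behaves as expected, the same ADD would show up with err=10 (referral) rather than err=1.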
When I have a look at the configuration, it looks exactly like the others:

dn: cn="o=Base",cn=mapping tree,cn=config
objectClass: top
objectClass: extensibleObject
objectClass: nsMappingTree
cn: "o=Base"
nsslapd-state: referral on update
nsslapd-backend: userRoot
modifiersName: cn=server,cn=plugins,cn=config
modifyTimestamp: 20100721202730Z
nsslapd-referral: ldap://master-ld01:389/o=Base
nsslapd-referral: ldap://master-ld02:389/o=Base
numSubordinates: 1

dn: cn=replica,cn="o=Base",cn=mapping tree,cn=config
nsDS5ReplicaBindDN: cn=replication,cn=config
nsDS5ReplicaRoot: o=Base
nsDS5Flags: 0
nsDS5ReplicaType: 2
nsds5ReplicaPurgeDelay: 43200
objectClass: top
objectClass: nsDS5Replica
cn: replica
modifiersName: cn=Multimaster Replication Plugin,cn=plugins,cn=config
modifyTimestamp: 20110421052744Z
nsDS5ReplicaId: 65535
nsState:: //8AAADLv69NLSoIAA==
nsDS5ReplicaName: 59480b7e-94fb11df-9df8eeea-774385c0
nsDS5ReplicaReferral: ldap://master-ld01:389/o=Base
nsDS5ReplicaReferral: ldap://master-ld02:389/o=Base
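
The error text says the mapping tree node is set to return a referral but has none configured. A minimal sketch of that check applied to the stored entry, assuming an already unfolded LDIF entry without base64 (::) values; parse_entry and referral_problem are hypothetical helper names, not 389 DS functions.

```python
def parse_entry(ldif_entry):
    """Parse one unfolded LDIF entry into {lowercased attr: [values]}."""
    attrs = {}
    for line in ldif_entry.strip().splitlines():
        name, _, value = line.partition(":")
        attrs.setdefault(name.strip().lower(), []).append(value.strip())
    return attrs


def referral_problem(entry):
    """Return a description of a referral misconfiguration, or None."""
    state = entry.get("nsslapd-state", [""])[0].lower()
    referrals = [v for v in entry.get("nsslapd-referral", []) if v]
    if state in ("referral", "referral on update") and not referrals:
        return "state is %r but no nsslapd-referral value is set" % state
    return None


ENTRY = """
dn: cn="o=Base",cn=mapping tree,cn=config
nsslapd-state: referral on update
nsslapd-backend: userRoot
nsslapd-referral: ldap://master-ld01:389/o=Base
nsslapd-referral: ldap://master-ld02:389/o=Base
"""

print(referral_problem(parse_entry(ENTRY)))
```

For the entry shown above the check finds nothing wrong, which fits the observation that the stored configuration looks the same as on the working boxes.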



I was wondering if someone has seen this kind of issue. Everything looks fine 
to me and I can not explain this behavior.



Right now, I can not reproduce this issue. I only see it in this one setup.



Thanks,

-Reinhard

[389-users] Is there a way to upgrade to 1.2.7 without the subtree-rename feature switched on

2011-02-23 Thread Reinhard Nappert
Hi,

I want to upgrade from 1.1.2 to 1.2.7.5, but I am not interested in using the 
subtree-rename feature. Question: can I call sbin/setup-ds.pl -u with a 
parameter indicating that I do not want to have this feature?

Thanks,
-Reinhard

[389-users] Export/import with 389 DS 1.2.7.5

2011-02-01 Thread Reinhard Nappert
Hi,

I have a working MM setup and I exported my db with db2ldif.pl with the -r 
option:

db2ldif.pl -D 'cn=Directory Manager' -w password -n userRoot -r -a 
/tmp/db_replica.ldif

The errors log does not indicate an issue:
[01/Feb/2011:09:23:59 -0500] - Beginning export of 'userRoot'
[01/Feb/2011:09:23:59 -0500] - export userRoot: Processed 1000 entries (10%).
[01/Feb/2011:09:23:59 -0500] - export userRoot: Processed 2000 entries (21%).
[01/Feb/2011:09:23:59 -0500] - export userRoot: Processed 3000 entries (32%).
[01/Feb/2011:09:24:00 -0500] - export userRoot: Processed 4000 entries (43%).
[01/Feb/2011:09:24:00 -0500] - export userRoot: Processed 5000 entries (54%).
[01/Feb/2011:09:24:00 -0500] - export userRoot: Processed 6000 entries (65%).
[01/Feb/2011:09:24:00 -0500] - export userRoot: Processed 7000 entries (76%).
[01/Feb/2011:09:24:00 -0500] - export userRoot: Processed 8000 entries (87%).
[01/Feb/2011:09:24:00 -0500] - export userRoot: Processed 9000 entries (98%).
[01/Feb/2011:09:24:00 -0500] - export userRoot: Processed 9160 entries (100%).
[01/Feb/2011:09:24:00 -0500] - Export finished.
and the ldif file itself looks fine to me as well.

Then, I tried to import the ldif file with

ldif2db.pl -D 'cn=Directory Manager' -w password -n userRoot -i 
/tmp/db_replica.ldif

This fails with the following errors log:
[01/Feb/2011:09:29:45 -0500] - Bringing userRoot offline...
[01/Feb/2011:09:29:45 -0500] NSMMReplicationPlugin - 
multimaster_be_state_change: replica o=umc is going offline; disabling 
replication
[01/Feb/2011:09:29:46 -0500] - entrycache_clear_int: there are still 1 entries 
in the entry cache. :/
[01/Feb/2011:09:29:49 -0500] - WARNING: Import is running with 
nsslapd-db-private-import-mem on; No other process is allowed to access the 
database
[01/Feb/2011:09:29:49 -0500] - import userRoot: Beginning import job...
[01/Feb/2011:09:29:49 -0500] - import userRoot: Index buffering is disabled.
[01/Feb/2011:09:29:49 -0500] - import userRoot: Processing file 
"/tmp/db_replica.ldif"
[01/Feb/2011:09:29:49 -0500] - BAD CACHE ASSERTION at 
../ldap/servers/slapd/back-ldbm/cache.c/883: e->ep_refcnt > 0

Any idea what is going on there?

Thanks,
-Reinhard



Re: [389-users] Replication with 1.2.7.5

2011-01-31 Thread Reinhard Nappert
Hi Rich,

I think I figured out what was going on. It looks like the 
share/dirsrv/data/template-dse.ldif file was incomplete after the build 
process. I did not go through the entire plugin list, but the cn=Multimaster 
Replication Plugin,cn=plugins,cn=config object was missing from the template 
file. I have no clue how this can happen, but I also saw it at least once with 
the 1.2.6.1 code. The strange thing is that I used exactly the same 
configuration parameters (I use a shell script for that).

-Reinhard
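
A quick post-build check for this situation could look like the sketch below. It scans LDIF text for a given DN, joining folded lines first, so it can be pointed at the contents of template-dse.ldif. The helper names are hypothetical, the DN comparison is deliberately naive (case-insensitive, whitespace around commas ignored, no handling of base64 dn:: lines), and it is not part of any 389 DS tooling.

```python
def unfold(text):
    """Join LDIF continuation lines (lines starting with one space) to their parent."""
    lines = []
    for raw in text.splitlines():
        if raw.startswith(" ") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines


def normalize_dn(dn):
    """Naive normalization: lowercase and strip spaces around each RDN."""
    return ",".join(part.strip().lower() for part in dn.split(","))


def ldif_has_dn(text, dn):
    """Return True if an entry with the given DN appears in the LDIF text."""
    wanted = normalize_dn(dn)
    for line in unfold(text):
        lowered = line.lower()
        # Skip base64-encoded DNs ("dn::"); this sketch does not decode them.
        if lowered.startswith("dn:") and not lowered.startswith("dn::"):
            if normalize_dn(line[3:]) == wanted:
                return True
    return False
```

Called with the template contents and "cn=Multimaster Replication Plugin,cn=plugins,cn=config", a False result would flag the incomplete template described above before an instance is created from it.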

-Original Message-
From: Rich Megginson [mailto:rmegg...@redhat.com] 
Sent: Monday, January 10, 2011 1:26 PM
To: Reinhard Nappert
Cc: 389-us...@lists.fedoraproject.org
Subject: Re: Replication with 1.2.7.5

On 01/10/2011 11:16 AM, Reinhard Nappert wrote:
> After I did set it to start and did do a ldapsearch and  
> nsds5beginreplicarefresh was still set to start. None of the other 
> replication attributes was set. It looks to me that the server did not do any 
> replication related operations.
Is this related to deleting then recreating replica configuration and/or a 
replication agreement?  I believe you reported a bug related to that.
> For now, I suggest to not "waste" any time on it, since I've got it working 
> with 1.2.6. Again, is there a compelling reason to switch to 1.2.7.5?
There were a few bugs that we fixed in 1.2.7.x that didn't make it into 
1.2.6.x.  But if 1.2.6 is working for you, then there is probably no compelling 
reason to switch.
> Once, I am done with my 1.2.x testing tasks, I will re-compile and build the 
> 1.2.7.5 code and you know.
>
> -Reinhard
>
> -Original Message-
> From: Rich Megginson [mailto:rmegg...@redhat.com]
> Sent: Monday, January 10, 2011 1:10 PM
> To: Reinhard Nappert
> Cc: 389-us...@lists.fedoraproject.org
> Subject: Re: Replication with 1.2.7.5
>
> On 01/10/2011 08:18 AM, Reinhard Nappert wrote:
>> Rich,
>>
>> I had log level set to 8192 and still there was nothing in errors.
> I've tried to reproduce the problem with the latest epel released
> 1.2.7.5 on RHEL 5, and with 1.2.7.5 built from source on RHEL 6 - in both 
> cases, I created the replication agreement, and did an ldapmodify to set 
> nsds5beginreplicarefresh: start - in both cases, the repl. init works.
>
> After doing the ldapmodify to set nsds5beginreplicarefresh: start, if you do 
> an ldapsearch of that entry, do you see that attribute?  What about the other 
> replication status attributes?
>> I did compile, build and install 1.2.6. With that, it seems to work.
>>
>> I need to do some tests with 1.2.6, before I can re-build 1.2.7.5 and try to 
>> re-produce.
>> Are there some compelling reasons to use 1.2.7.5, instead of going with 
>> 1.2.6?
>>
>> -Reinhard
>>
>> -Original Message-
>> From: Rich Megginson [mailto:rmegg...@redhat.com]
>> Sent: Friday, January 07, 2011 4:00 PM
>> To: Reinhard Nappert
>> Cc: 389-us...@lists.fedoraproject.org
>> Subject: Re: Replication with 1.2.7.5
>>
>> On 01/07/2011 01:52 PM, Reinhard Nappert wrote:
>>> No, it does not.
>> And no errors from ldapmodify?  What does it say in the directory server 
>> access log for the operation and result?  With log level 8192, is there 
>> anything in the errors log?
>>> -----Original Message-
>>> From: Rich Megginson [mailto:rmegg...@redhat.com]
>>> Sent: Friday, January 07, 2011 3:47 PM
>>> To: Reinhard Nappert
>>> Cc: 389-us...@lists.fedoraproject.org
>>> Subject: Re: Replication with 1.2.7.5
>>>
>>> On 01/07/2011 01:39 PM, Reinhard Nappert wrote:
>>>> Rich,
>>>>
>>>> I am not sure if I tested it with any 1.2.x release. I think, I did it, 
>>>> but this would have been some time back.
>>>>
>>>> It is really weird that I do not see anything in errors at all. Anyway, 
>>>> here are the ops from the access file:
>>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=1 ADD 
>>>> dn="cn=replica,cn=o\3DUMC,cn=mapping tree,cn=config"
>>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=1 RESULT err=0 tag=105
>>>> nentries=0 etime=0
>>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=2 ADD dn="cn=changelog5,cn=config"
>>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=2 RESULT err=0 tag=105
>>>> nentries=0 etime=0
>>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=3 MOD 
>>>> dn="cn=replica,cn=o\3DUMC,cn=mapping tree,cn=config"
>>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=3 RESULT err=0 tag=103
>>>> nentries=0 etime=0
>>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=4

Re: [389-users] Replication with 1.2.7.5

2011-01-13 Thread Reinhard Nappert
Rich,

I did configure and compile the 1.2.7.5 source from 
http://port389.org/sources/389-ds-base-1.2.7.5.tar.bz2 with the same options 
and libraries as the source of 1.2.6.1 
(http://port389.org/sources/389-ds-base-1.2.6.1.tar.bz2).

While replication works as designed with 1.2.6.1 when I use my 
administration application, the 1.2.7.5 DS does not react to the application at 
all.
Once I have a bit more time, I will provide you with more info.

-Reinhard

-Original Message-
From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Reinhard Nappert
Sent: Monday, January 10, 2011 1:32 PM
To: Rich Megginson
Cc: 389-us...@lists.fedoraproject.org
Subject: Re: [389-users] Replication with 1.2.7.5

Yes, I did report a bug regarding the deletion of the replica configuration, 
but my tests are not related to this. I want to re-create the situation where 
the slapd process "freezes" with a 1.2.x release. Remember, you analyzed some 
coredumps of 1.1.2. I want to produce some cores with 1.2.x for that. I still 
need to get to the bottom of this one.

-Reinhard

-Original Message-
From: Rich Megginson [mailto:rmegg...@redhat.com]
Sent: Monday, January 10, 2011 1:26 PM
To: Reinhard Nappert
Cc: 389-us...@lists.fedoraproject.org
Subject: Re: Replication with 1.2.7.5

On 01/10/2011 11:16 AM, Reinhard Nappert wrote:
> After I did set it to start and did do a ldapsearch and  
> nsds5beginreplicarefresh was still set to start. None of the other 
> replication attributes was set. It looks to me that the server did not do any 
> replication related operations.
Is this related to deleting then recreating replica configuration and/or a 
replication agreement?  I believe you reported a bug related to that.
> For now, I suggest to not "waste" any time on it, since I've got it working 
> with 1.2.6. Again, is there a compelling reason to switch to 1.2.7.5?
There were a few bugs that we fixed in 1.2.7.x that didn't make it into 
1.2.6.x.  But if 1.2.6 is working for you, then there is probably no compelling 
reason to switch.
> Once, I am done with my 1.2.x testing tasks, I will re-compile and build the 
> 1.2.7.5 code and you know.
>
> -Reinhard
>
> -Original Message-
> From: Rich Megginson [mailto:rmegg...@redhat.com]
> Sent: Monday, January 10, 2011 1:10 PM
> To: Reinhard Nappert
> Cc: 389-us...@lists.fedoraproject.org
> Subject: Re: Replication with 1.2.7.5
>
> On 01/10/2011 08:18 AM, Reinhard Nappert wrote:
>> Rich,
>>
>> I had log level set to 8192 and still there was nothing in errors.
> I've tried to reproduce the problem with the latest epel released
> 1.2.7.5 on RHEL 5, and with 1.2.7.5 built from source on RHEL 6 - in both 
> cases, I created the replication agreement, and did an ldapmodify to set 
> nsds5beginreplicarefresh: start - in both cases, the repl. init works.
>
> After doing the ldapmodify to set nsds5beginreplicarefresh: start, if you do 
> an ldapsearch of that entry, do you see that attribute?  What about the other 
> replication status attributes?
>> I did compile, build and install 1.2.6. With that, it seems to work.
>>
>> I need to do some tests with 1.2.6, before I can re-build 1.2.7.5 and try to 
>> re-produce.
>> Are there some compelling reasons to use 1.2.7.5, instead of going with 
>> 1.2.6?
>>
>> -Reinhard
>>
>> -Original Message-
>> From: Rich Megginson [mailto:rmegg...@redhat.com]
>> Sent: Friday, January 07, 2011 4:00 PM
>> To: Reinhard Nappert
>> Cc: 389-us...@lists.fedoraproject.org
>> Subject: Re: Replication with 1.2.7.5
>>
>> On 01/07/2011 01:52 PM, Reinhard Nappert wrote:
>>> No, it does not.
>> And no errors from ldapmodify?  What does it say in the directory server 
>> access log for the operation and result?  With log level 8192, is there 
>> anything in the errors log?
>>> -----Original Message-
>>> From: Rich Megginson [mailto:rmegg...@redhat.com]
>>> Sent: Friday, January 07, 2011 3:47 PM
>>> To: Reinhard Nappert
>>> Cc: 389-us...@lists.fedoraproject.org
>>> Subject: Re: Replication with 1.2.7.5
>>>
>>> On 01/07/2011 01:39 PM, Reinhard Nappert wrote:
>>>> Rich,
>>>>
>>>> I am not sure if I tested it with any 1.2.x release. I think, I did it, 
>>>> but this would have been some time back.
>>>>
>>>> It is really weird that I do not see anything in errors at all. Anyway, 
>>>> here are the ops from the access file:
>>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=1 ADD 
>>>> dn="cn=replica,cn=o\3DUMC,cn=mapping tree,cn=config"
>>>>

Re: [389-users] Replication with 1.2.7.5

2011-01-10 Thread Reinhard Nappert
Yes, I did report a bug regarding the deletion of the replica configuration, 
but my tests are not related to this. I want to re-create the situation where 
the slapd process "freezes" with a 1.2.x release. Remember, you analyzed some 
coredumps of 1.1.2. I want to produce some cores with 1.2.x for that. I still 
need to get to the bottom of this one.

-Reinhard

-Original Message-
From: Rich Megginson [mailto:rmegg...@redhat.com] 
Sent: Monday, January 10, 2011 1:26 PM
To: Reinhard Nappert
Cc: 389-us...@lists.fedoraproject.org
Subject: Re: Replication with 1.2.7.5

On 01/10/2011 11:16 AM, Reinhard Nappert wrote:
> After I did set it to start and did do a ldapsearch and  
> nsds5beginreplicarefresh was still set to start. None of the other 
> replication attributes was set. It looks to me that the server did not do any 
> replication related operations.
Is this related to deleting then recreating replica configuration and/or a 
replication agreement?  I believe you reported a bug related to that.
> For now, I suggest to not "waste" any time on it, since I've got it working 
> with 1.2.6. Again, is there a compelling reason to switch to 1.2.7.5?
There were a few bugs that we fixed in 1.2.7.x that didn't make it into 
1.2.6.x.  But if 1.2.6 is working for you, then there is probably no compelling 
reason to switch.
> Once, I am done with my 1.2.x testing tasks, I will re-compile and build the 
> 1.2.7.5 code and you know.
>
> -Reinhard
>
> -Original Message-
> From: Rich Megginson [mailto:rmegg...@redhat.com]
> Sent: Monday, January 10, 2011 1:10 PM
> To: Reinhard Nappert
> Cc: 389-us...@lists.fedoraproject.org
> Subject: Re: Replication with 1.2.7.5
>
> On 01/10/2011 08:18 AM, Reinhard Nappert wrote:
>> Rich,
>>
>> I had log level set to 8192 and still there was nothing in errors.
> I've tried to reproduce the problem with the latest epel released
> 1.2.7.5 on RHEL 5, and with 1.2.7.5 built from source on RHEL 6 - in both 
> cases, I created the replication agreement, and did an ldapmodify to set 
> nsds5beginreplicarefresh: start - in both cases, the repl. init works.
>
> After doing the ldapmodify to set nsds5beginreplicarefresh: start, if you do 
> an ldapsearch of that entry, do you see that attribute?  What about the other 
> replication status attributes?
>> I did compile, build and install 1.2.6. With that, it seems to work.
>>
>> I need to do some tests with 1.2.6, before I can re-build 1.2.7.5 and try to 
>> re-produce.
>> Are there some compelling reasons to use 1.2.7.5, instead of going with 
>> 1.2.6?
>>
>> -Reinhard
>>
>> -Original Message-
>> From: Rich Megginson [mailto:rmegg...@redhat.com]
>> Sent: Friday, January 07, 2011 4:00 PM
>> To: Reinhard Nappert
>> Cc: 389-us...@lists.fedoraproject.org
>> Subject: Re: Replication with 1.2.7.5
>>
>> On 01/07/2011 01:52 PM, Reinhard Nappert wrote:
>>> No, it does not.
>> And no errors from ldapmodify?  What does it say in the directory server 
>> access log for the operation and result?  With log level 8192, is there 
>> anything in the errors log?
>>> -----Original Message-
>>> From: Rich Megginson [mailto:rmegg...@redhat.com]
>>> Sent: Friday, January 07, 2011 3:47 PM
>>> To: Reinhard Nappert
>>> Cc: 389-us...@lists.fedoraproject.org
>>> Subject: Re: Replication with 1.2.7.5
>>>
>>> On 01/07/2011 01:39 PM, Reinhard Nappert wrote:
>>>> Rich,
>>>>
>>>> I am not sure if I tested it with any 1.2.x release. I think, I did it, 
>>>> but this would have been some time back.
>>>>
>>>> It is really weird that I do not see anything in errors at all. Anyway, 
>>>> here are the ops from the access file:
>>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=1 ADD 
>>>> dn="cn=replica,cn=o\3DUMC,cn=mapping tree,cn=config"
>>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=1 RESULT err=0 tag=105
>>>> nentries=0 etime=0
>>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=2 ADD dn="cn=changelog5,cn=config"
>>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=2 RESULT err=0 tag=105
>>>> nentries=0 etime=0
>>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=3 MOD 
>>>> dn="cn=replica,cn=o\3DUMC,cn=mapping tree,cn=config"
>>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=3 RESULT err=0 tag=103
>>>> nentries=0 etime=0
>>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=4 ADD 
>>>> dn="cn=c4000-12c4000-2,cn=replica,cn=o\3DUMC,cn=mapping tree,cn=config"

Re: [389-users] Replication with 1.2.7.5

2011-01-10 Thread Reinhard Nappert
After I set it to start and did an ldapsearch, nsds5beginreplicarefresh was 
still set to start. None of the other replication attributes were set. It looks 
to me as if the server did not perform any replication-related operations. For 
now, I suggest not "wasting" any time on it, since I've got it working with 
1.2.6. Again, is there a compelling reason to switch to 1.2.7.5?

Once I am done with my 1.2.x testing tasks, I will re-compile and build the 
1.2.7.5 code and let you know.

-Reinhard

-Original Message-
From: Rich Megginson [mailto:rmegg...@redhat.com] 
Sent: Monday, January 10, 2011 1:10 PM
To: Reinhard Nappert
Cc: 389-us...@lists.fedoraproject.org
Subject: Re: Replication with 1.2.7.5

On 01/10/2011 08:18 AM, Reinhard Nappert wrote:
> Rich,
>
> I had log level set to 8192 and still there was nothing in errors.
I've tried to reproduce the problem with the latest epel released
1.2.7.5 on RHEL 5, and with 1.2.7.5 built from source on RHEL 6 - in both 
cases, I created the replication agreement, and did an ldapmodify to set 
nsds5beginreplicarefresh: start - in both cases, the repl. init works.

After doing the ldapmodify to set nsds5beginreplicarefresh: start, if you do an 
ldapsearch of that entry, do you see that attribute?  What about the other 
replication status attributes?
> I did compile, build and install 1.2.6. With that, it seems to work.
>
> I need to do some tests with 1.2.6, before I can re-build 1.2.7.5 and try to 
> re-produce.
> Are there some compelling reasons to use 1.2.7.5, instead of going with 1.2.6?
>
> -Reinhard
>
> -Original Message-
> From: Rich Megginson [mailto:rmegg...@redhat.com]
> Sent: Friday, January 07, 2011 4:00 PM
> To: Reinhard Nappert
> Cc: 389-us...@lists.fedoraproject.org
> Subject: Re: Replication with 1.2.7.5
>
> On 01/07/2011 01:52 PM, Reinhard Nappert wrote:
>> No, it does not.
> And no errors from ldapmodify?  What does it say in the directory server 
> access log for the operation and result?  With log level 8192, is there 
> anything in the errors log?
>> -Original Message-
>> From: Rich Megginson [mailto:rmegg...@redhat.com]
>> Sent: Friday, January 07, 2011 3:47 PM
>> To: Reinhard Nappert
>> Cc: 389-us...@lists.fedoraproject.org
>> Subject: Re: Replication with 1.2.7.5
>>
>> On 01/07/2011 01:39 PM, Reinhard Nappert wrote:
>>> Rich,
>>>
>>> I am not sure if I tested it with any 1.2.x release. I think, I did it, but 
>>> this would have been some time back.
>>>
>>> It is really weird that I do not see anything in errors at all. Anyway, 
>>> here are the ops from the access file:
>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=1 ADD 
>>> dn="cn=replica,cn=o\3DUMC,cn=mapping tree,cn=config"
>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=1 RESULT err=0 tag=105
>>> nentries=0 etime=0
>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=2 ADD dn="cn=changelog5,cn=config"
>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=2 RESULT err=0 tag=105
>>> nentries=0 etime=0
>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=3 MOD 
>>> dn="cn=replica,cn=o\3DUMC,cn=mapping tree,cn=config"
>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=3 RESULT err=0 tag=103
>>> nentries=0 etime=0
>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=4 ADD 
>>> dn="cn=c4000-12c4000-2,cn=replica,cn=o\3DUMC,cn=mapping tree,cn=config"
>>> [07/Jan/2011:15:17:13 -0500] conn=74 op=4 RESULT err=0 tag=105
>>> nentries=0 etime=0
>>>
>>> You see that the operations succeeded. Here is the result of the operations:
>>> dn: cn=o\3Dumc,cn=mapping tree,cn=config
>>> objectClass: top
>>> objectClass: extensibleObject
>>> objectClass: nsMappingTree
>>> cn: o=umc
>>> cn: "o=umc"
>>> nsslapd-state: backend
>>> nsslapd-backend: userRoot
>>>
>>> dn: cn=replica,cn=o\3DUMC,cn=mapping tree,cn=config
>>> nsDS5ReplicaBindDN: cn=replAdmin,cn=config
>>> nsDS5ReplicaRoot: o=UMC
>>> nsDS5ReplicaId: 4
>>> nsDS5Flags: 1
>>> nsDS5ReplicaType: 3
>>> nsds5ReplicaPurgeDelay: 43200
>>> objectClass: top
>>> objectClass: nsDS5Replica
>>> cn: replica
>>> nsDS5ReplicaReferral: ldap://c4000-2:389/o=UMC
>>>
>>> dn: cn=c4000-12c4000-2,cn=replica,cn=o\3DUMC,cn=mapping
>>> tree,cn=config
>>> nsDS5ReplicaBindDN: cn=replAdmin,cn=config
>>> nsDS5ReplicaTransportInfo: LDAP
>>> nsDS5ReplicaHost: c4000-2
>>> nsDS5ReplicaPort: 389
>>> objectClass: top
>>> objectClass: nsDS5ReplicationAgreement

Re: [389-users] slapd not responding

2010-11-22 Thread Reinhard Nappert
OK. I need to do a bit more testing, but disabling access logging may not solve 
the issue. I will keep you updated.


From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich Megginson
Sent: Monday, November 22, 2010 5:08 PM
To: 389-us...@lists.fedoraproject.org
Subject: Re: [389-users] slapd not responding

On 11/22/2010 03:00 PM, Reinhard Nappert wrote:
Should I open a bug for it?
Sure, but unless you can reproduce it with the latest code (1.2.6 or 1.2.7), 
it's going to be very difficult for us to fix it.

-Reinhard


From: 
389-users-boun...@lists.fedoraproject.org<mailto:389-users-boun...@lists.fedoraproject.org>
 [mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich Megginson
Sent: Monday, November 22, 2010 4:49 PM
To: 389-us...@lists.fedoraproject.org<mailto:389-us...@lists.fedoraproject.org>
Subject: Re: [389-users] slapd not responding

On 11/22/2010 09:38 AM, Reinhard Nappert wrote:
Hi,

I have a 389 DS 1.1.2 server in Multi-Master mode. It happens that the server 
stops responding in some circumstances. When the server was in that state, I 
did a kill -11 on the pid in order to generate a coredump.

I got the following out of the core, by using gdb.

Any idea what is going on on the server side? BTW, the server does not log 
anything during this time in either access or errors.
Looks like the server is deadlocked in the access logging code.  I suppose you 
could try disabling access logging.

Thanks,
-Reinhard



Re: [389-users] slapd not responding

2010-11-22 Thread Reinhard Nappert
Should I open a bug for it?

-Reinhard


From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich Megginson
Sent: Monday, November 22, 2010 4:49 PM
To: 389-us...@lists.fedoraproject.org
Subject: Re: [389-users] slapd not responding

On 11/22/2010 09:38 AM, Reinhard Nappert wrote:
Hi,

I have a 389 DS 1.1.2 server in Multi-Master mode. It happens that the server 
stops responding in some circumstances. When the server was in that state, I 
did a kill -11 on the pid in order to generate a coredump.

I got the following out of the core, by using gdb.

Any idea what is going on on the server side? BTW, the server does not log 
anything during this time in either access or errors.
Looks like the server is deadlocked in the access logging code.  I suppose you 
could try disabling access logging.

Thanks,
-Reinhard



Re: [389-users] slapd not responding

2010-11-22 Thread Reinhard Nappert
OK, I will try disabling access logging. Let's see if I can reproduce it.

Thanks,
-Reinhard


From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich Megginson
Sent: Monday, November 22, 2010 4:49 PM
To: 389-us...@lists.fedoraproject.org
Subject: Re: [389-users] slapd not responding

On 11/22/2010 09:38 AM, Reinhard Nappert wrote:
Hi,

I have a 389 DS 1.1.2 server in Multi-Master mode. It happens that the server 
stops responding in some circumstances. When the server was in that state, I 
did a kill -11 on the pid in order to generate a coredump.

I got the following out of the core, by using gdb.

Any idea what is going on on the server side? BTW, the server does not log 
anything during this time in either access or errors.
Looks like the server is deadlocked in the access logging code.  I suppose you 
could try disabling access logging.

Thanks,
-Reinhard



Re: [389-users] Multi-Master Replication

2010-10-04 Thread Reinhard Nappert
Thanks for the clarification. 

-Original Message-
From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich Megginson
Sent: Monday, October 04, 2010 11:04 AM
To: General discussion list for the 389 Directory server project.
Subject: Re: [389-users] Multi-Master Replication

Reinhard Nappert wrote:
> Rich, you mentioned in one of your answers regarding the limit of 
> Masters in a replicated environment , quote
>
> "There really isn't a limit. The limit was only for the old Red Hat 
> Directory Server, and only so far as customer support goes. The only 
> real hard limit is 65534 masters."
>
> I was wondering when this limit was gone. More specifically, does 
> Fedora Directory Server 1.1.2 already work without that limitation.
>
Yes.  It always did.  There never was a real technical limit of 4 masters - 
that was just for Red Hat Support.  MMR has always supported
65534 masters.
>
> Thanks,
>
> -Reinhard
>


Re: [389-users] 389 DS 1.2.6. and certificates

2010-09-28 Thread Reinhard Nappert
Yes, I built it myself on 4.4.

No, it does not make a difference when I change the files to read-only before 
I restart the server.
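
For anyone repeating this test, the read-only step can be scripted roughly as follows. This is only a sketch under assumptions: the function name is invented, the certificate directory is passed in by the caller, and the three file names are the ones mentioned in the quoted suggestion below (cert8.db, key3.db, secmod.db). It did not resolve the "bad database" error in my case.

```python
import os
import stat

# File names taken from the suggestion in this thread.
NSS_DB_FILES = ("cert8.db", "key3.db", "secmod.db")


def make_nss_db_read_only(cert_dir):
    """Set mode 0400 on the NSS database files found in cert_dir.

    Returns the list of file names whose mode was changed.
    """
    changed = []
    for name in NSS_DB_FILES:
        path = os.path.join(cert_dir, name)
        if os.path.isfile(path):
            os.chmod(path, stat.S_IRUSR)  # owner read only, i.e. 0400
            changed.append(name)
    return changed
```

The function has to run as a user allowed to change the files, and before the directory server is started again.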

 

-Original Message-
From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich Megginson
Sent: Tuesday, September 28, 2010 11:05 AM
To: General discussion list for the 389 Directory server project.
Subject: Re: [389-users] 389 DS 1.2.6. and certificates

Reinhard Nappert wrote:
> Hi,
> I built and installed the 389 Directory Server 1.2.6 on CentOS 4.4.
Do you mean 5.5?  Or did you build it yourself?
> The server works fine.
> Then, I generated the certs (using certutil) and imported them in the 
> cert-store. The certs are generated basically generated by the 
> setupssl2.sh script. When I list the certs afterwards, everything 
> looks fine:
>  
> certutil -L -d /etc/dirsrv/
> CA certificate   CTu,u,u
>  u,u,u
> However, when I restart the server, I get the following error and the 
> server does not come up anymore:
> [28/Sep/2010:10:45:40 -0400] - SSL alert: Security Initialization: NSS 
> initialization failed (Netscape Portable Runtime error -8174 - 
> security library: bad database.): certdir: /etc/dirsrv/
>  
> Not surprisingly, the certutil -L -d  comes up with the same error:
> certutil: function failed: security library: bad database.
>  
> Any idea, what goes wrong there?
Not sure.  After running the script to generate the certs, can you change the 
cert8.db, key3.db, and secmod.db files to be read only (mode 0400), before 
starting the directory server?  Does that help?
>  
> Thanks,
> -Reinhard
>


[389-users] 389 DS 1.2.6. and certificates

2010-09-28 Thread Reinhard Nappert
Hi,
I built and installed the 389 Directory Server 1.2.6 on CentOS 4.4. The server 
works fine.
Then, I generated the certs (using certutil) and imported them into the 
cert-store. The certs are basically generated by the setupssl2.sh script. When 
I list the certs afterwards, everything looks fine:

certutil -L -d /etc/dirsrv/
CA certificate   CTu,u,u
 u,u,u
However, when I restart the server, I get the following error and the server 
does not come up anymore:
[28/Sep/2010:10:45:40 -0400] - SSL alert: Security Initialization: NSS 
initialization failed (Netscape Portable Runtime error -8174 - security 
library: bad database.): certdir: /etc/dirsrv/

Not surprisingly, the certutil -L -d  comes up with the same error:
certutil: function failed: security library: bad database.

Any idea what goes wrong there?

Thanks,
-Reinhard


Re: [389-users] Multi-Master setup

2010-08-12 Thread Reinhard Nappert
Rich,

I did some additional tests regarding replica IDs.

Let's say I have just two masters, A <--> B. I start by configuring the replica 
and agreement on A and assign ID 1. Then I do the same for B with ID 2. 
Everything is fine. Then I disable replication on both boxes. Next, I set the 
same thing up again, but this time I start with B and assign it ID 1, and A 
gets ID 2. Now the replication fails with the message: "Unable to acquire 
replica: error: duplicate replica ID detected"

I am pretty sure that it has to do with the RUV entry 
"nsuniqueid=---,dc=your,dc=suffix", because it still shows:

dn: nsuniqueid=---, dc=your,dc=suffix
objectClass: top
objectClass: nsTombstone
objectClass: extensibleobject
nsds50ruv: {replicageneration} 4c6445e40001
nsds50ruv: {replica 1 ldap://A:389}
nsds50ruv: {replica 2 ldap://B:389}
nsruvReplicaLastModified: {replica 1 ldap://A:389} 
nsruvReplicaLastModified: {replica 2 ldap://B:389} 

My replica configuration objects use the correct IDs (1 for B and 2 for A).
All this said, I believe the server should internally delete the RUV entry 
once the replica configuration object is deleted.

-Reinhard
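
To make the mismatch visible before re-enabling replication, the nsds50ruv values can be compared with the IDs about to be assigned. The sketch below is only an illustration of that comparison: the function names are made up, the input is the list of nsds50ruv values as returned by ldapsearch, and whether a stale RUV really is what triggers the "duplicate replica ID detected" error is my assumption, not something confirmed in this thread.

```python
import re

# Matches values such as "{replica 1 ldap://A:389}"; the
# "{replicageneration} ..." value does not match and is skipped.
RUV_RE = re.compile(r"\{replica\s+(\d+)\s+(ldaps?://[^}\s]+)\}")


def parse_ruv(values):
    """Map replica ID -> LDAP URL from a list of nsds50ruv values."""
    ruv = {}
    for value in values:
        match = RUV_RE.search(value)
        if match:
            ruv[int(match.group(1))] = match.group(2)
    return ruv


def host_of(url):
    """Extract the lowercased host name from an ldap:// or ldaps:// URL."""
    return url.split("://", 1)[1].split(":", 1)[0].lower()


def find_stale_ids(ruv, configured):
    """Return (id, host in RUV, host now configured) for every mismatch.

    configured maps host name -> replica ID that is about to be assigned.
    """
    conflicts = []
    for host, rid in sorted(configured.items()):
        old_url = ruv.get(rid)
        if old_url is not None and host_of(old_url) != host.lower():
            conflicts.append((rid, host_of(old_url), host.lower()))
    return conflicts
```

With the RUV shown above and the new assignment (B gets 1, A gets 2), both IDs are reported as still belonging to the other host. With the original assignment nothing is reported.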

-Original Message-
From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Reinhard Nappert
Sent: Thursday, August 12, 2010 2:56 PM
To: General discussion list for the 389 Directory server project.
Subject: Re: [389-users] Multi-Master setup

One more question:
Can I read/modify the 
nsuniqueid=---,dc=your,dc=suffix object 
without being logged in as "Directory Manager"? If so, what kind of ACIs do I 
need to set?

Thanks,
-Reinhard

-Original Message-
From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich Megginson
Sent: Wednesday, August 11, 2010 5:41 PM
To: General discussion list for the 389 Directory server project.
Subject: Re: [389-users] Multi-Master setup

Reinhard Nappert wrote:
> Actually I tried this.
>
> First, I just deleted the attributes  nsds50ruv, which was fine, from an ldap 
> operational point of view, but when I wanted to set replication up again, the 
> server complained with an operation error (nds50ruv attribute missing).
Right, you cannot just delete the attribute, you have to delete the entire 
entry.
> Then I though I just delete the entire entry 
> (nsuniqueid=---,dc=your,dc=suffix).
> This crashes the ldap server!
>
> Why does it not help, if I don't get rid of it? Since the info stays in 
> there, even after replication was disabled, I can not use this entry in order 
> to determine whether the server was already initialized, when I enable 
> replication for a second time.
>
> I think, I found a solution to this by using some objects from my internal 
> framework.
>
> However, the ldap server should not crash, when I try to delete this
> entry
>
I think we fixed that crashing bug a while ago.  Can you post a stack trace?
> -Reinhard
>
> -Original Message-
> From: 389-users-boun...@lists.fedoraproject.org
> [mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich
> Megginson
> Sent: Wednesday, August 11, 2010 4:44 PM
> To: General discussion list for the 389 Directory server project.
> Subject: Re: [389-users] Multi-Master setup
>
> Reinhard Nappert wrote:
>
>> Thanks Rich.
>>
>> When does the server delete the RUV entry?
>> After I set the server back to "standalone", by removing the changelog 
>> entry, all agreements and the replica entry, I still see the RUV entry:
>> ldapsearch -D  -w  -b dc=your,dc=suffix
>> -x -LLL "(&(nsuniqueid=*)(objectclass=nsTombstone))" nsds50ruv
>> dn: nsuniqueid=---,dc=your,dc=suffix
>> nsds50ruv: {replicageneration} 4c61bf2e0001
>> nsds50ruv: {replica 7 ldap://yale:389}
>> nsds50ruv: {replica 6 ldap://mustrum:389}
>> nsds50ruv: {replica 1 ldap://louise:389} 4c62a60c0001
>> 4c62a60c0001
>> nsds50ruv: {replica 4 ldap://nix:389} 4c61c1720004
>> 4c62c59c0004
>> nsds50ruv: {replica 3 ldap://yale:389} 4c62a5c50003
>> 4c62a5f10003
>> nsds50ruv: {replica 2 ldap://mustrum:389}
>> nsds50ruv: {replica 8 ldap://nix:389} 4c62d0290008
>> 4c62efcc0008
>> nsds50ruv: {replica 5 ldap://louise:389} 4c62d1b90005
>> 4c62d1b90005
>>
>> I would expect that this would have been deleted by the server.
>>
> No, but you should be able to manually delete it.  It doesn't do anything i

Re: [389-users] Multi-Master setup

2010-08-12 Thread Reinhard Nappert
One more question:
Can I read/modify the 
nsuniqueid=---,dc=your,dc=suffix object 
without being logged in as "Directory Manager"? If so, what kind of ACIs do I 
need to set?

Thanks,
-Reinhard

-Original Message-
From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich Megginson
Sent: Wednesday, August 11, 2010 5:41 PM
To: General discussion list for the 389 Directory server project.
Subject: Re: [389-users] Multi-Master setup

Reinhard Nappert wrote:
> Actually I tried this.
>
> First, I just deleted the attributes  nsds50ruv, which was fine, from an ldap 
> operational point of view, but when I wanted to set replication up again, the 
> server complained with an operation error (nds50ruv attribute missing).
Right, you cannot just delete the attribute, you have to delete the entire 
entry.
> Then I thought I would just delete the entire entry 
> (nsuniqueid=---,dc=your,dc=suffix).
> This crashes the ldap server!
>
> Why does it not help, if I don't get rid of it? Since the info stays in 
> there, even after replication was disabled, I can not use this entry in order 
> to determine whether the server was already initialized, when I enable 
> replication for a second time.
>
> I think, I found a solution to this by using some objects from my internal 
> framework.
>
> However, the ldap server should not crash, when I try to delete this
> entry
>
I think we fixed that crashing bug a while ago.  Can you post a stack trace?
> -Reinhard
>
> -Original Message-
> From: 389-users-boun...@lists.fedoraproject.org
> [mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich
> Megginson
> Sent: Wednesday, August 11, 2010 4:44 PM
> To: General discussion list for the 389 Directory server project.
> Subject: Re: [389-users] Multi-Master setup
>
> Reinhard Nappert wrote:
>
>> Thanks Rich.
>>
>> When does the server delete the RUV entry?
>> After I set the server back to "standalone", by removing the changelog 
>> entry, all agreements and the replica entry, I still see the RUV entry:
>> ldapsearch -D  -w  -b dc=your,dc=suffix
>> -x -LLL "(&(nsuniqueid=*)(objectclass=nsTombstone))" nsds50ruv
>> dn: nsuniqueid=---,dc=your,dc=suffix
>> nsds50ruv: {replicageneration} 4c61bf2e0001
>> nsds50ruv: {replica 7 ldap://yale:389}
>> nsds50ruv: {replica 6 ldap://mustrum:389}
>> nsds50ruv: {replica 1 ldap://louise:389} 4c62a60c0001
>> 4c62a60c0001
>> nsds50ruv: {replica 4 ldap://nix:389} 4c61c1720004
>> 4c62c59c0004
>> nsds50ruv: {replica 3 ldap://yale:389} 4c62a5c50003
>> 4c62a5f10003
>> nsds50ruv: {replica 2 ldap://mustrum:389}
>> nsds50ruv: {replica 8 ldap://nix:389} 4c62d0290008
>> 4c62efcc0008
>> nsds50ruv: {replica 5 ldap://louise:389} 4c62d1b90005
>> 4c62d1b90005
>>
>> I would expect that this would have been deleted by the server.
>>
> No, but you should be able to manually delete it.  It doesn't do anything if 
> you're using replication.
>
>> If not, this approach does not help.
>>
>>
> Why?
>
>> -Reinhard
>>
>> -Original Message-
>> From: 389-users-boun...@lists.fedoraproject.org
>> [mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich
>> Megginson
>> Sent: Wednesday, August 11, 2010 12:32 PM
>> To: General discussion list for the 389 Directory server project.
>> Subject: Re: [389-users] Multi-Master setup
>>
>> Reinhard Nappert wrote:
>>
>>
>>> So,
>>> Is there a way to find out if a server was used for the initialization of 
>>> other servers?
>>>
>>>
>>>
>> You can query the RUV entry in the server:
>> ldapsearch -s one -b "dc=your,dc=suffix"
>> "(&(objectclass=nsTombstone)(nsuniqueid=---))"
>> The generation is a CSN.  The first 8 bytes are the timestamp.  The
>> next
>> 4 bytes are the sequence number.  The next 4 bytes are the replica ID of the 
>> original master.
>> If there is no RUV, or the generation is missing, the server has either not 
>> been configured for replication, or has not been initialized.
>>
>>
>>> I am still not convinced that this is the cause, because when I add another 
>>> server as a consumer (E) to A and I do a initReplication(E, A) I run into 
>>> the same issue.
>>>
>>>
>>>
>> I

Re: [389-users] Multi-Master setup

2010-08-11 Thread Reinhard Nappert
Thanks Rich.

When does the server delete the RUV entry?
After I set the server back to "standalone", by removing the changelog entry, 
all agreements and the replica entry, I still see the RUV entry:
ldapsearch -D  -w  -b dc=your,dc=suffix -x -LLL 
"(&(nsuniqueid=*)(objectclass=nsTombstone))" nsds50ruv
dn: nsuniqueid=---,dc=your,dc=suffix
nsds50ruv: {replicageneration} 4c61bf2e0001
nsds50ruv: {replica 7 ldap://yale:389}
nsds50ruv: {replica 6 ldap://mustrum:389}
nsds50ruv: {replica 1 ldap://louise:389} 4c62a60c0001 4c62a60c0001
nsds50ruv: {replica 4 ldap://nix:389} 4c61c1720004 4c62c59c0004
nsds50ruv: {replica 3 ldap://yale:389} 4c62a5c50003 4c62a5f10003
nsds50ruv: {replica 2 ldap://mustrum:389}
nsds50ruv: {replica 8 ldap://nix:389} 4c62d0290008 4c62efcc0008
nsds50ruv: {replica 5 ldap://louise:389} 4c62d1b90005 4c62d1b90005

I would expect that this would have been deleted by the server. If not, this 
approach does not help.
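
To make the leftover RUV easier to read, the following is a minimal sketch that parses nsds50ruv values like the ones listed above and groups the replica IDs by host. The function names are made up for illustration, and the CSN values are copied exactly as they appear in this message. Reading a host that shows up under two replica IDs as a trace of an earlier replication setup is an assumption, not something the server reports.

```python
import re
from collections import defaultdict

# Matches values such as "{replica 3 ldap://yale:389} 4c62a5c50003 4c62a5f10003".
# The two trailing CSNs (min and max) are optional.
RUV_ELEMENT = re.compile(
    r"^\{replica (?P<rid>\d+) (?P<url>\S+)\}"
    r"(?: (?P<min_csn>[0-9a-fA-F]+) (?P<max_csn>[0-9a-fA-F]+))?$"
)


def parse_ruv(values):
    """Return (generation, elements) from a list of nsds50ruv values."""
    generation = None
    elements = []
    for value in values:
        value = value.strip()
        if value.startswith("{replicageneration}"):
            generation = value.split("}", 1)[1].strip()
            continue
        match = RUV_ELEMENT.match(value)
        if match:
            elements.append({
                "rid": int(match.group("rid")),
                "url": match.group("url"),
                "min_csn": match.group("min_csn"),
                "max_csn": match.group("max_csn"),
            })
    return generation, elements


def replica_ids_by_url(elements):
    """Group replica IDs by supplier URL to spot hosts listed more than once."""
    grouped = defaultdict(list)
    for element in elements:
        grouped[element["url"]].append(element["rid"])
    return {url: sorted(rids) for url, rids in grouped.items()}


# Values copied from the ldapsearch output in this message.
ruv_values = [
    "{replicageneration} 4c61bf2e0001",
    "{replica 7 ldap://yale:389}",
    "{replica 6 ldap://mustrum:389}",
    "{replica 1 ldap://louise:389} 4c62a60c0001 4c62a60c0001",
    "{replica 4 ldap://nix:389} 4c61c1720004 4c62c59c0004",
    "{replica 3 ldap://yale:389} 4c62a5c50003 4c62a5f10003",
    "{replica 2 ldap://mustrum:389}",
    "{replica 8 ldap://nix:389} 4c62d0290008 4c62efcc0008",
    "{replica 5 ldap://louise:389} 4c62d1b90005 4c62d1b90005",
]

generation, elements = parse_ruv(ruv_values)
for url, rids in sorted(replica_ids_by_url(elements).items()):
    print(url, rids)
```

In this listing every host appears under two replica IDs, which is the kind of detail that is easy to miss when reading the raw attribute values.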

-Reinhard

-Original Message-
From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich Megginson
Sent: Wednesday, August 11, 2010 12:32 PM
To: General discussion list for the 389 Directory server project.
Subject: Re: [389-users] Multi-Master setup

Reinhard Nappert wrote:
> So,
> Is there a way to find out if a server was used for the initialization of 
> other servers? 
>   
You can query the RUV entry in the server:
ldapsearch -s one -b "dc=your,dc=suffix" 
"(&(objectclass=nsTombstone)(nsuniqueid=---))"
The generation is a CSN.  The first 8 bytes are the timestamp.  The next
4 bytes are the sequence number.  The next 4 bytes are the replica ID of the 
original master.
If there is no RUV, or the generation is missing, the server has either not 
been configured for replication, or has not been initialized.
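
The CSN layout described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each field width is counted in hexadecimal characters (8 for the timestamp, 4 for the sequence number, 4 for the replica ID), the function name is made up, and the example value is hypothetical apart from its timestamp prefix. The CSN values shown elsewhere in this thread are only 12 characters long, so the sketch rejects them rather than guessing how they were shortened.

```python
from datetime import datetime, timezone


def parse_csn(csn):
    """Split a CSN string into timestamp, sequence number and replica ID.

    Assumes the layout described in the thread, counted in hex characters:
    8 for the timestamp, 4 for the sequence number, 4 for the replica ID.
    Anything after those 16 characters is returned untouched as "rest".
    """
    if len(csn) < 16:
        raise ValueError("expected at least 16 hex characters, got %d" % len(csn))
    seconds = int(csn[0:8], 16)
    return {
        "timestamp": datetime.fromtimestamp(seconds, tz=timezone.utc),
        "sequence": int(csn[8:12], 16),
        "replica_id": int(csn[12:16], 16),
        "rest": csn[16:],
    }


# Hypothetical CSN: the timestamp prefix 4c61bf2e is taken from the RUV
# shown in this thread, the remaining digits are made up for illustration.
example = parse_csn("4c61bf2e000000010000")
print(example["timestamp"].isoformat(), example["sequence"], example["replica_id"])
```

Decoding the timestamp is a quick way to tell when the replica generation was created, which helps to decide whether a server was initialized before or after a given setup step.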
> I am still not convinced that this is the cause, because when I add another 
> server as a consumer (E) to A and I do a initReplication(E, A) I run into the 
> same issue.
>   
If you initReplication(A, D), then initReplication(E, A) you may run into the 
issue.
> -Reinhard
>
> -Original Message-
> From: 389-users-boun...@lists.fedoraproject.org 
> [mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich 
> Megginson
> Sent: Wednesday, August 11, 2010 11:12 AM
> To: General discussion list for the 389 Directory server project.
> Subject: Re: [389-users] Multi-Master setup
>
> Reinhard Nappert wrote:
>   
>> Rich,
>>
>> Are you saying that, once I have replicated the data from A to B and from B 
>> to C and from C to D, I don't replicate it from D to A? If so, can you 
>> explain why? Anyway, this step works!
>>   
>> 
> If you replace the word "replicated" with "initialized", then yes, you don't 
> initialize from D to A.  Although it may work, I think it may introduce 
> subtle errors, such as the ones you see.
>   
>> So, 15 and 18 are up-to-date at this stage. Since the entire setup is done 
>> kind of automatically, the setting of nsds5BeginReplicaRefresh to start is 
>> always done, if the corresponding agreement exists on the remote box. Is 
>> there a way to find out when I have to set  nsds5BeginReplicaRefresh  to 
>> start? 
>>
>> In any case, this does not explain that I fix the issue by resetting 
>> nsds5BeginReplicaRefresh to start, once I run into this issue.
>>   
>> 
> I'm not exactly sure why you are seeing the errors you are seeing, nor 
> why you can fix the issue with start refresh.  I do know that you 
> should not re-initialize a server that has been used to initialize other 
> servers.
>   
>> -Reinhard
>>
>> -Original Message-
>> From: 389-users-boun...@lists.fedoraproject.org 
>> [mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich 
>> Megginson
>> Sent: Wednesday, August 11, 2010 10:37 AM
>> To: General discussion list for the 389 Directory server project.
>> Subject: Re: [389-users] Multi-Master setup
>>
>> Reinhard Nappert wrote:
>>   
>> 
>>> To explain it a bit easier, I define two "methods":
>>> 1. createAgreement(): <-- creates locally replication 
>>> agreement for remote ldap
>>> nsDS5ReplicaType=3
>>> nsDS5Flags=1
>>> nsDS5ReplicaId=
>>> nsDS5ReplicaHost=
>>> nsDS5ReplicaTransportInfo=LDAP
>>> nsDS5ReplicaPort=389
>>> nsDS5ReplicaBindDN=
>>> nsDS5ReplicaBindMethod=SIMPLE

Re: [389-users] Multi-Master setup

2010-08-11 Thread Reinhard Nappert
So,
Is there a way to find out if a server was used for the initialization of other 
servers? 

I am still not convinced that this is the cause, because when I add another 
server as a consumer (E) to A and I do a initReplication(E, A) I run into the 
same issue.

-Reinhard

-Original Message-
From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich Megginson
Sent: Wednesday, August 11, 2010 11:12 AM
To: General discussion list for the 389 Directory server project.
Subject: Re: [389-users] Multi-Master setup

Reinhard Nappert wrote:
> Rich,
>
> Are you saying that, once I have replicated the data from A to B and from B 
> to C and from C to D, I don't replicate it from D to A? If so, can you 
> explain why? Anyway, this step works!
>   
If you replace the word "replicated" with "initialized", then yes, you don't 
initialize from D to A.  Although it may work, I think it may introduce subtle 
errors, such as the ones you see.
> So, 15 and 18 are up-to-date at this stage. Since the entire setup is done 
> kind of automatically, the setting of nsds5BeginReplicaRefresh to start is 
> always done, if the corresponding agreement exists on the remote box. Is 
> there a way to find out when I have to set  nsds5BeginReplicaRefresh  to 
> start? 
>
> In any case, this does not explain that I fix the issue by resetting 
> nsds5BeginReplicaRefresh to start, once I run into this issue.
>   
I'm not exactly sure why you are seeing the errors you are seeing, nor 
why you can fix the issue with start refresh.  I do know that you should 
not re-initialize a server that has been used to initialize other servers.
> -Reinhard
>
> -Original Message-
> From: 389-users-boun...@lists.fedoraproject.org 
> [mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich Megginson
> Sent: Wednesday, August 11, 2010 10:37 AM
> To: General discussion list for the 389 Directory server project.
> Subject: Re: [389-users] Multi-Master setup
>
> Reinhard Nappert wrote:
>   
>> To explain it a bit easier, I define two "methods":
>> 1. createAgreement(): <-- creates locally replication 
>> agreement for remote ldap
>>  nsDS5ReplicaType=3
>>  nsDS5Flags=1
>>  nsDS5ReplicaId=
>>  nsDS5ReplicaHost=
>>  nsDS5ReplicaTransportInfo=LDAP
>>  nsDS5ReplicaPort=389
>>  nsDS5ReplicaBindDN=
>>  nsDS5ReplicaBindMethod=SIMPLE
>>  nsDS5ReplicaCredentials=
>>
>> 2. initReplication(, ): <-- modifies the 
>> existing remote replication agreement for the local ldap
>>  nsds5BeginReplicaRefresh=start
>>
>> So, the order is the following:
>> 1. On A: createAgreement(B)
>> 2. On B: createAgreement(A)
>> 3. On B: initReplication(B, A)
>> 4. On B: createAgreement(C)
>> 5. On C: createAgreement(B)
>> 6. On C: initReplication(C, B)
>> 7. On C: createAgreement(D)
>> 8. On D: createAgreement(C)
>> 9. On D: initReplication(D, C)
>> 10. On D: createAgreement(A)
>> 11. On A: createAgreement(D)
>> 12. On A: initReplication(A, D)
>>   
>> 
> 12 is a problem - you don't initialize the master (A) you started from
>   
>> Now, I have the ring A<-->B<-->C<-->D<-->A. All of this works fine!
>> Then, I want to create the cross-references from A to C and B to D:
>> 13. On A: createAgreement(C)
>> 14. On C: createAgreement(A)
>> 15. On C: initReplication(C, A)
>>   
>> 
> 15 is a problem - C has already been initialized
>   
>> After step 15, I run into this issue. The same thing happens, when I set B 
>> and D up.
>> 16. On B: createAgreement(D)
>> 17. On D: createAgreement(B)
>> 18. On D: initReplication(D, B)
>>   
>> 
> 18 is a problem - D has already been initialized
>   
>> -Reinhard
>>
>>
>> -Original Message-
>> From: 389-users-boun...@lists.fedoraproject.org 
>> [mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich 
>> Megginson
>> Sent: Wednesday, August 11, 2010 10:09 AM
>> To: General discussion list for the 389 Directory server project.
>> Subject: Re: [389-users] Multi-Master setup
>>
>> Reinhard Nappert wrote:
>>   
>> 
>>> At first I create (besides the changelog and replica entry with 
>>> nsDS5ReplicaType=3, nsDS5Flags=1 and an unique nsDS5ReplicaId) the 
>>> shadowing agreement with nsDS5ReplicaHost=, 
>>> nsDS5ReplicaTransportInfo=LDAP, nsDS5ReplicaPort=389, 
>>> nsDS5ReplicaBindDN=, nsDS5ReplicaBindMethod=SIMPLE, 
>>> nsD

Re: [389-users] Multi-Master setup

2010-08-11 Thread Reinhard Nappert
Rich,

Are you saying that, once I have replicated the data from A to B and from B to 
C and from C to D, I don't replicate it from D to A? If so, can you explain 
why? Anyway, this step works!

So, 15 and 18 are up-to-date at this stage. Since the entire setup is done kind 
of automatically, the setting of nsds5BeginReplicaRefresh to start is always 
done, if the corresponding agreement exists on the remote box. Is there a way 
to find out when I have to set  nsds5BeginReplicaRefresh  to start? 

In any case, this does not explain why I can fix the issue by resetting 
nsds5BeginReplicaRefresh to start, once I run into this issue.
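
For reference, the modification that sets nsds5BeginReplicaRefresh to start can be written as a small LDIF change. The sketch below only builds that text. The helper name is hypothetical, the agreement name and suffix are the ones that show up in the logs of this thread, and the DN layout (agreement below cn=replica below the mapping tree node) is an assumption about a default 389 configuration rather than something confirmed here.

```python
def begin_refresh_ldif(agreement_cn, suffix):
    """Build an LDIF modify that requests a total update for one agreement.

    The DN layout (agreement under cn=replica under the mapping tree node
    of the suffix) is an assumption about a default 389 configuration.
    """
    dn = 'cn=%s,cn=replica,cn="%s",cn=mapping tree,cn=config' % (agreement_cn, suffix)
    return "\n".join([
        "dn: " + dn,
        "changetype: modify",
        "replace: nsds5BeginReplicaRefresh",
        "nsds5BeginReplicaRefresh: start",
        "",
    ])


# Agreement name and suffix as they appear in the error log of this thread.
ldif_text = begin_refresh_ldif("nix2mustrum", "o=base")
print(ldif_text)
```

The resulting text would be applied on the supplier that owns the agreement, for example with ldapmodify. Whether it should be applied at all depends on whether the consumer already holds data.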

-Reinhard

-Original Message-
From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich Megginson
Sent: Wednesday, August 11, 2010 10:37 AM
To: General discussion list for the 389 Directory server project.
Subject: Re: [389-users] Multi-Master setup

Reinhard Nappert wrote:
> To explain it a bit easier, I define two "methods":
> 1. createAgreement(): <-- creates locally replication 
> agreement for remote ldap
>   nsDS5ReplicaType=3
>   nsDS5Flags=1
>   nsDS5ReplicaId=
>   nsDS5ReplicaHost=
>   nsDS5ReplicaTransportInfo=LDAP
>   nsDS5ReplicaPort=389
>   nsDS5ReplicaBindDN=
>   nsDS5ReplicaBindMethod=SIMPLE
>   nsDS5ReplicaCredentials=
>
> 2. initReplication(, ):  <-- modifies the 
> existing remote replication agreement for the local ldap
>   nsds5BeginReplicaRefresh=start
>
> So, the order is the following:
> 1. On A: createAgreement(B)
> 2. On B: createAgreement(A)
> 3. On B: initReplication(B, A)
> 4. On B: createAgreement(C)
> 5. On C: createAgreement(B)
> 6. On C: initReplication(C, B)
> 7. On C: createAgreement(D)
> 8. On D: createAgreement(C)
> 9. On D: initReplication(D, C)
> 10. On D: createAgreement(A)
> 11. On A: createAgreement(D)
> 12. On A: initReplication(A, D)
>   
12 is a problem - you don't initialize the master (A) you started from
> Now, I have the ring A<-->B<-->C<-->D<-->A. All of this works fine!
> Then, I want to create the cross-references from A to C and B to D:
> 13. On A: createAgreement(C)
> 14. On C: createAgreement(A)
> 15. On C: initReplication(C, A)
>   
15 is a problem - C has already been initialized
> After step 15, I run into this issue. The same thing happens, when I set B 
> and D up.
> 16. On B: createAgreement(D)
> 17. On D: createAgreement(B)
> 18. On D: initReplication(D, B)
>   
18 is a problem - D has already been initialized
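
The rule behind the three comments above (do not initialize a server that already holds the data) can be checked mechanically before an automated setup runs. The following is a minimal sketch, not part of 389 DS: the function name is made up, and it assumes that initReplication(X, Y) overwrites X with the contents of Y, as the step list suggests.

```python
def find_problem_inits(init_steps):
    """Return the step numbers that re-initialize an already-populated server.

    init_steps is a list of (step_number, consumer, supplier) tuples, where
    the consumer is the server whose database gets overwritten.
    """
    if not init_steps:
        return []
    # The supplier of the first initialization is the master everything
    # started from, so it already holds the data.
    populated = {init_steps[0][2]}
    problems = []
    for step, consumer, supplier in init_steps:
        if consumer in populated:
            problems.append(step)
        populated.add(consumer)
    return problems


# The initReplication() calls from the step list, as (step, consumer, supplier).
steps = [
    (3, "B", "A"),
    (6, "C", "B"),
    (9, "D", "C"),
    (12, "A", "D"),
    (15, "C", "A"),
    (18, "D", "B"),
]
print(find_problem_inits(steps))
```

A setup script could run such a check first and create the remaining agreements without triggering a refresh for the flagged steps.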
>
> -Reinhard
>
>
> -Original Message-
> From: 389-users-boun...@lists.fedoraproject.org 
> [mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich 
> Megginson
> Sent: Wednesday, August 11, 2010 10:09 AM
> To: General discussion list for the 389 Directory server project.
> Subject: Re: [389-users] Multi-Master setup
>
> Reinhard Nappert wrote:
>   
>> At first I create (besides the changelog and replica entry with 
>> nsDS5ReplicaType=3, nsDS5Flags=1 and an unique nsDS5ReplicaId) the shadowing 
>> agreement with nsDS5ReplicaHost=, 
>> nsDS5ReplicaTransportInfo=LDAP, nsDS5ReplicaPort=389, 
>> nsDS5ReplicaBindDN=, nsDS5ReplicaBindMethod=SIMPLE, 
>> nsDS5ReplicaCredentials= on both sides, let's say A 
>> and D (A first and then D).
>> Then, I do initiate the replication by setting nsds5BeginReplicaRefresh to 
>> start on A.
>>   
>> 
> And you do that for A -> B, A -> C?  How do you initialize D?
>   
>> -Reinhard
>>
>> -Original Message-
>> From: 389-users-boun...@lists.fedoraproject.org
>> [mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich 
>> Megginson
>> Sent: Tuesday, August 10, 2010 5:57 PM
>> To: General discussion list for the 389 Directory server project.
>> Subject: Re: [389-users] Multi-Master setup
>>
>> Reinhard Nappert wrote:
>>   
>> 
>>> Rich,
>>>
>>> I have a setup like:
>>>
>>> A <---> B
>>> ^ \   / ^
>>> |  \ /  |
>>> |   X   |
>>> |  / \  |
>>> v /   \ v
>>> D <---> C
>>>
>>> At first, I set up the agreements for the ring A to B to C to D to A. 
>>> This works. Then, I try to set up the cross agreements from A to C and B 
>>> to D. This is where I run into this issue.
>>>
>>> Let's have a look how I do those cross agreements. First I add an 
>>> agreement on A for C. This is fine. Then, I do the same on

Re: [389-users] Multi-Master setup

2010-08-10 Thread Reinhard Nappert
Rich,

I have a setup like:

A <---> B
^ \   / ^
|  \ /  |
|   X   |
|  / \  |
v /   \ v
D <---> C

At first, I set up the agreements for the ring A to B to C to D to A. This 
works. Then, I try to set up the cross agreements from A to C and B to D. This 
is where I run into this issue.

Let's have a look at how I set up those cross agreements. First I add an 
agreement on A for C. This is fine. Then I do the same on C (for A) and get 
this message on C:
NSMMReplicationPlugin - agmt="cn=nix2mustrum" (mustrum:389): Received error 89: 
NULL for total update operation 
On A I get:
[10/Aug/2010:17:12:37 -0400] - somehow, there are still 16 entries in the entry 
cache. :/
[10/Aug/2010:17:12:38 -0400] - WARNING: Import is running with 
nsslapd-db-private-import-mem on; No other process is allowed to access the 
database
[10/Aug/2010:17:12:38 -0400] - BAD CACHE ASSERTION at 
../ldap/servers/slapd/back-ldbm/cache.c/765: e->ep_refcnt > 0


Hope this helps.

Thanks,
-Reinhard
-Original Message-
From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Reinhard Nappert
Sent: Tuesday, August 10, 2010 2:42 PM
To: General discussion list for the 389 Directory server project.
Subject: Re: [389-users] Multi-Master setup

Rich, on the consumer, I see the following messages:

NSMMReplicationPlugin - agmt="cn=nix2mustrum" (mustrum:389): Received error 89: 
NULL for total update operation 

-Reinhard

-Original Message-
From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich Megginson
Sent: Tuesday, August 10, 2010 12:41 PM
To: General discussion list for the 389 Directory server project.
Subject: Re: [389-users] Multi-Master setup

Reinhard Nappert wrote:
> Hi,
>  
> I have seen the following message in the errors log file, when I set 
> MMR agreements up:
>  
> [10/Aug/2010:11:46:44 -0400] NSMMReplicationPlugin -
> repl_set_mtn_referrals: could not set referrals for replica o=base: 1
> [10/Aug/2010:11:46:44 -0400] NSMMReplicationPlugin -
> multimaster_be_state_change: replica o=base is going offline; 
> disabling replication
> [10/Aug/2010:11:46:46 -0400] - somehow, there are still 20 entries in 
> the entrycache. :/
> [10/Aug/2010:11:46:46 -0400] - WARNING: Import is running with 
> nsslapd-db-private-import-mem on; No other process is allowed to 
> access the database
> [10/Aug/2010:11:46:48 -0400] - BAD CACHE ASSERTION at
> ../ldap/servers/slapd/back-ldbm/cache.c/765: e->ep_refcnt > 0
> [10/Aug/2010:11:46:52 -0400] - Fedora-Directory/1.1.2 B2009.090.1643 
> starting up
> [10/Aug/2010:11:46:52 -0400] - Detected Disorderly Shutdown last time 
> DirectoryServer was running, recovering database.
>  
> After I re-initialize the database from the supplier (setting 
> attribute nsds5BeginReplicaRefresh to start of the agreement object), 
> the database gets correctly imported.
>  
> Any idea, what is going on?
No, not sure.  But if you can develop a reproducible test case, that would be 
helpful.
> Thanks,
> -Reinhard
>
> --
> --
>
> --
> 389 users mailing list
> 389-us...@lists.fedoraproject.org
> https://admin.fedoraproject.org/mailman/listinfo/389-users

--
389 users mailing list
389-us...@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users


[389-users] Referral not working...

2010-06-04 Thread Reinhard Nappert
Hi,

I configured a master-slave setup where the userRoot db has a referral to the 
master configured. See the dse.ldif entry:

dn: cn="o=BASE",cn=mapping tree,cn=config
objectClass: top
objectClass: extensibleObject
objectClass: nsMappingTree
cn: "o=BASE"
nsslapd-state: referral on update
nsslapd-backend: userRoot
modifiersName: cn=server,cn=plugins,cn=config
modifyTimestamp: 20100604203934Z
nsslapd-referral: ldap://master:389/o=UMC
numSubordinates: 1
So, when I access the slave and try to add an object, I get the following error:

javax.naming.NamingException: [LDAP: error code 1 - Mapping tree node for 
o=base is set to return a referral, but no referral is configured for it].

This is weird, because you clearly see that the referral is configured.

The access file says:

[04/Jun/2010:16:40:18 -0400] conn=16 op=3 ADD dn="ou=test,o=base"
[04/Jun/2010:16:40:18 -0400] conn=16 op=3 RESULT err=10 tag=105 nentries=0 
etime=0

This is standard ldap stuff and I know that it worked before.

Any idea?

Thanks,

-Reinhard





Re: [389-users] Skipped request ...

2010-05-13 Thread Reinhard Nappert
How can I make ns-slapd produce a core?

I got it in the same state again, and I did a 
gdb /opt/UMC/jdb/sbin/ns-slapd 16712 

0x2b8e908889a2 in poll () from /lib64/tls/libc.so.6
(gdb) where
#0  0x2b8e908889a2 in poll () from /lib64/tls/libc.so.6
#1  0x2b8e906b6a5f in PR_Poll () from /opt/UMC/jdb/lib/dirsrv/libnspr4.so
#2  0x00415ae7 in slapd_daemon (ports=0x7fff1b44cf50)
at ../ldap/servers/slapd/daemon.c:662
#3  0x0041c0b3 in main (argc=7, argv=0x7fff1b44d098)
at ../ldap/servers/slapd/main.c:1162

Does this help?

-Reinhard

-Original Message-
From: 389-users-boun...@lists.fedoraproject.org 
[mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich Megginson
Sent: Thursday, May 13, 2010 6:04 PM
To: General discussion list for the 389 Directory server project.
Subject: Re: [389-users] Skipped request ...

Reinhard Nappert wrote:
> Hi Rich,
>
> I attached access and error file with debug level 8. The server does not 
> respond to any requests anymore. If you kill the client, it responds 
> afterwards.
>
> Let me know, what you see.
>   
I don't see anything obvious.  One thing I do know is that this code has been 
improved since 1.1.2 (especially the debugging, which not very usefully prints 
the file descriptor addresses in int format :P)  I don't suppose you could try 
to reproduce this with 1.2.5?
> Thanks,
> -Reinhard
>
> -Original Message-
> From: 389-users-boun...@lists.fedoraproject.org 
> [mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich 
> Megginson
> Sent: Thursday, May 13, 2010 1:10 PM
> To: General discussion list for the 389 Directory server project.
> Subject: Re: [389-users] Skipped request ...
>
> Reinhard Nappert wrote:
>   
>> Rich, which debugging level do you suggest? Apparently, I tried too much, 
>> because it would crash the server constantly.
>> 
> Debugging levels should not crash the server - can you provide more information 
> about the crash?
>   
>> For now, I go just with 8 (Connection Management). Seeing the problem, what 
>> would you enable?
>>   
>> 
> Yes, start with 8.
>   
>> Thanks,
>> -Reinhard
>>
>> -Original Message-
>> From: 389-users-boun...@lists.fedoraproject.org
>> [mailto:389-users-boun...@lists.fedoraproject.org] On Behalf Of Rich 
>> Megginson
>> Sent: Wednesday, May 12, 2010 6:50 PM
>> To: General discussion list for the 389 Directory server project.
>> Subject: Re: [389-users] Skipped request ...
>>
>> Reinhard Nappert wrote:
>>   
>> 
>>> Hi Rich,
>>>
>>> I ran some further tests. This entire thing looks kind of weird. I have a 
>>> kind of monitoring tool, I use to figure out if the server still responds 
>>> in a timely manner. This tool performs an anonymous bind and reads a 
>>> specific object, every 30 seconds.
>>> 
>>>   
>> Does it perform an unbind operation?  Does it disconnect the socket?
>>   
>> 
>>>  What I see is that the server responds to the incoming request and it 
>>> performs about 500 requests within those 30 seconds. Then, I see, when the 
>>> next monitoring connection request comes in, but I never see the bind. 
>>> Since this times out, the monitoring tool restarts the server after a while 
>>> (about 10 seconds).
>>>
>>> Here are the logs in access:
>>> [11/May/2010:22:12:20 -0400] conn=94 fd=83 slot=83 connection from
>>> 127.0.0.1 to 127.0.0.1
>>> [11/May/2010:22:13:24 -0400] conn=0 fd=64 slot=64 SSL connection 
>>> from
>>> 10.227.6.45 to 10.227.6.53
>>>
>>> So, you see the server does not respond to any requests after 
>>> [11/May/2010:22:12:20 -0400] conn=94 fd=83 slot=83 connection from
>>> 127.0.0.1 to 127.0.0.1
>>>
>>> And start responding, once it was restarted:
>>> [11/May/2010:22:13:24 -0400] conn=0 fd=64 slot=64 SSL connection 
>>> from
>>> 10.227.6.45 to 10.227.6.53
>>>
>>> I was wondering , if we could get somehow some debugging out of ns-slapd, 
>>> once it is in this state (truss or something else).
>>>   
>>> 
>>>   
>> http://directory.fedoraproject.org/wiki/FAQ#Troubleshooting
>> If that produces too much error log output, or kills the performance, 
>> you can also try replacing the error log with a named pipe+script - 
>> http://directory.fedoraproject.org/wiki/Named_Pipe_Log_Script
>> man ds-logpipe.py
>>   
>> 
>>> Any help is appreciated.
>>>
>>> Thanks,
>>> -Reinhard

[389-users] Skipped request ...

2010-05-11 Thread Reinhard Nappert
Hi all,

I have seen a weird behavior of my DS (1.1.2). It has a very small database 
(only about 2300 objects). A client performed a one-level search retrieving the 
children. The server found 114 objects, but the search was very slow:

[06/May/2010:12:23:11 +] conn=127 op=149 SRCH base= scope=1 
filter="(&(&(objectClass=)(=value))(!(=TRUE)))"

yes, the filter is a bit complex, but both attribute types  and  
are indexed. This search usually is fast. It looks to me that the server is 
already in a funny state.
...
[06/May/2010:12:23:17 +] conn=127 op=149 RESULT err=3 tag=101 nentries=114 
etime=7

When the client gets the results, it iterates over those and gets its children, 
like:

[06/May/2010:12:23:17 +] conn=127 op=150 SRCH base= scope=1 
filter="(&(&(objectClass=)(=*))(!(=TRUE)))" attrs=ALL.
Those searches are quick:
[06/May/2010:12:23:17 +] conn=127 op=150 RESULT err=0 tag=101 nentries=1 
etime=0

but somehow the server does not process one of the requests, when the client 
iterates over the results:

[06/May/2010:12:23:18 +] conn=127 op=263 SRCH base= scope=1 
filter="(&(&(objectClass=)(=*))(!(=TRUE)))" attrs=ALL.
[06/May/2010:12:23:18 +] conn=127 op=263 RESULT err=0 tag=101 nentries=1 
etime=0
[06/May/2010:12:23:26 +] conn=127 op=265 SRCH base= scope=1 
filter="(&(&(objectClass=)(=*))(!(=TRUE)))" attrs=ALL.
[06/May/2010:12:23:26 +] conn=127 op=265 RESULT err=0 tag=101 nentries=0 
etime=0
You can see that the server skipped op=264. It looks to me as if the request 
came in, but the server somehow choked before it could log the request in the 
access log.
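
Gaps like the missing op=264 are easy to overlook in a busy access log. The following is a minimal sketch that scans log lines for operation numbers that never show up on a connection. The function name is made up, the sample lines are abridged from the excerpts above, and the sketch assumes that every processed operation is logged with a `conn=N op=M` pair.

```python
import re
from collections import defaultdict

# Picks the connection and operation number out of an access log line.
OP_PATTERN = re.compile(r"conn=(\d+) op=(\d+) ")


def find_missing_ops(lines):
    """Return {conn: [missing op numbers]} for gaps in the logged op numbers."""
    seen = defaultdict(set)
    for line in lines:
        match = OP_PATTERN.search(line)
        if match:
            seen[int(match.group(1))].add(int(match.group(2)))
    missing = {}
    for conn, ops in seen.items():
        # Every number between the lowest and highest op seen should be present.
        gaps = sorted(set(range(min(ops), max(ops) + 1)) - ops)
        if gaps:
            missing[conn] = gaps
    return missing


# Abridged from the access log excerpts in this message.
sample = [
    "[06/May/2010:12:23:18 +] conn=127 op=263 SRCH base= scope=1",
    "[06/May/2010:12:23:18 +] conn=127 op=263 RESULT err=0 tag=101 nentries=1 etime=0",
    "[06/May/2010:12:23:26 +] conn=127 op=265 SRCH base= scope=1",
    "[06/May/2010:12:23:26 +] conn=127 op=265 RESULT err=0 tag=101 nentries=0 etime=0",
]
print(find_missing_ops(sample))
```

Run over a whole access log, this shows whether the skipped operation is a one-off or a recurring pattern, which would be useful to include in a bug report.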

Has anybody seen such a behavior before?

Thanks,
-Reinhard

