Re: [ceph-users] Please help: change IP address of a cluster

2019-07-23 Thread Manuel Lausch
Hi,

I had to change the IPs of my cluster some time ago. The process was
quite easy.

I don't understand what you mean by configuring and deleting static
routes. The easiest way is to have the router allow (at least for the
duration of the change) all traffic between the old and the new network.

These are the steps I took:

1. Add the new IP network, space separated, to the "public network" line
in your ceph.conf (see the sketch after step 4).

2. OSDs: stop your OSDs on the first node, reconfigure the host network
and start the OSDs again. Repeat this for all hosts, one by one.

3. MON: stop and remove one mon from the cluster, delete all of its data
in /var/lib/ceph/mon/, reconfigure the host network and create the new
mon instance (don't forget to update the "mon host" entries in your
ceph.conf and on your clients as well); see the command sketch after
step 4. Of course this requires at least 3 mons in your cluster!
After 2 of the 5 mons in my cluster were moved, I added the new mon
addresses to my clients and restarted them.

4. MGR: stop the mgr daemon, reconfigure the host network and start the
mgr daemon again. Do this one host at a time as well.
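
To make steps 1 and 3 concrete, roughly (a sketch; mon id "a", paths and
networks are examples):

# step 1: ceph.conf during the migration
[global]
public network = 10.0.7.0/24 10.0.18.0/23

# step 3: per-mon replacement cycle
systemctl stop ceph-mon@a
ceph mon remove a
rm -rf /var/lib/ceph/mon/ceph-a
# ... change the host's IP address ...
ceph mon getmap -o /tmp/monmap
ceph auth get mon. -o /tmp/mon.keyring
ceph-mon -i a --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring
systemctl start ceph-mon@a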


I wouldn't recommend the "messy way" of reconfiguring your mons. Removing
and re-adding mons to the cluster is quite easy and in my opinion the
safest approach.

The complete IP change in our cluster worked without an outage while the
cluster was in production.

I hope I could help you.

Regards
Manuel



On Fri, 19 Jul 2019 10:22:37 +
"ST Wong (ITSC)"  wrote:

> Hi all,
> 
> Our cluster has to change to new IP range in same VLAN:  10.0.7.0/24
> -> 10.0.18.0/23, while IP address on private network for OSDs
> remains unchanged. I wonder if we can do that in either one following
> ways:
> 
> =
> 
> 1.
> 
> a.   Define static route for 10.0.18.0/23 on each node
> 
> b.   Do it one by one:
> 
> For each monitor/mgr:
> 
> -  remove from cluster
> 
> -  change IP address
> 
> -  add static route to original IP range 10.0.7.0/24
> 
> -  delete static route for 10.0.18.0/23
> 
> -  add back to cluster
> 
> For each OSD:
> 
> -  stop OSD daemons
> 
> -   change IP address
> 
> -  add static route to original IP range 10.0.7.0/24
> 
> -  delete static route for 10.0.18.0/23
> 
> -  start OSD daemons
> 
> c.   Clean up all static routes defined.
> 
> 
> 
> 2.
> 
> a.   Export and update monmap using the messy way as described in
> http://docs.ceph.com/docs/mimic/rados/operations/add-or-rm-mons/
> 
> 
> 
> ceph mon getmap -o {tmp}/{filename}
> 
> monmaptool --rm node1 --rm node2 ... --rm nodeN {tmp}/{filename}
> 
> monmaptool --add node1 v2:10.0.18.1:3300,v1:10.0.18.1:6789 --add node2
> v2:10.0.18.2:3300,v1:10.0.18.2:6789 ... --add nodeN
> v2:10.0.18.N:3300,v1:10.0.18.N:6789  {tmp}/{filename}
> 
> 
> 
> b.   stop entire cluster daemons and change IP addresses
> 
> 
> c.   For each mon node:  ceph-mon -i {mon-id} --inject-monmap
> {tmp}/{filename}
> 
> 
> 
> d.   Restart cluster daemons.
> 
> 
> 
> 3.   Or any better method...
> =
> 
> Would anyone please help?   Thanks a lot.
> Rgds
> /st wong
> 



-- 
Manuel Lausch

Systemadministrator
Storage Services

1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 |
76135 Karlsruhe | Germany Phone: +49 721 91374-1847
E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de

Hauptsitz Montabaur, Amtsgericht Montabaur, HRB 5452

Geschäftsführer: Alexander Charles, Thomas Ludwig, Jan Oetjen, Sascha
Vollmer


Member of United Internet

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] chown -R on every osd activating

2019-03-05 Thread Manuel Lausch
On Tue, 5 Mar 2019 11:04:16 +0100
Paul Emmerich  wrote:

> On Tue, Mar 5, 2019 at 10:51 AM Manuel Lausch
>  wrote:
> > Now after rebooting a host I see there is a chown -R ceph:ceph
> > running on each OSD before the OSD daemon starts.
> >
> > This takes a lot of time (-> millions of objects per OSD) and I
> > think this is unnecessary on each startup. In my opinion the chown
> > was only needed for the update from hammer to jewel.
> >
> > I found this commit:
> > https://github.com/ceph/ceph/commit/100f2613a4659b3bd4e550250a41593860118010
> >
> > Is this intentional or is there a check missing if the chown is
> > really necessary?
> 
> This is clearly a bug; it should either have the recursive=False
> parameter set or explicitly chown the necessary files.
> I think it should be the former.
> 
> Please open an issue at http://tracker.ceph.com/
> 

Thanks. There is the ticket: http://tracker.ceph.com/issues/38581
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] chown -R on every osd activating

2019-03-05 Thread Manuel Lausch
Hi,

we recently updated to ceph luminous 12.2.11 after running into this bug:
http://tracker.ceph.com/issues/37784. But that is another story.

Now after rebooting a host I see there is a chown -R ceph:ceph running
on each OSD before the OSD daemon starts.

This takes a lot of time (-> millions of objects per OSD) and I think
this is unnecessary on each startup. In my opinion the chown was only
needed for the update from hammer to jewel.

I found this commit:
https://github.com/ceph/ceph/commit/100f2613a4659b3bd4e550250a41593860118010

Is this intentional or is there a check missing if the chown is
really necessary?
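
A sketch of the kind of check I mean (hypothetical, not the actual
ceph-volume code): only recurse when the top-level ownership is wrong.

OSD_DIR=/var/lib/ceph/osd/ceph-42        # example OSD path
if [ "$(stat -c %U:%G "$OSD_DIR")" != "ceph:ceph" ]; then
    chown -R ceph:ceph "$OSD_DIR"
fi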

Regards
Manuel
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] migrate ceph-disk to ceph-volume fails with dmcrypt

2019-01-24 Thread Manuel Lausch



On Wed, 23 Jan 2019 16:32:08 +0100
Manuel Lausch  wrote:


> > 
> > The key api for encryption is *very* odd and a lot of its quirks are
> > undocumented. For example, ceph-volume is stuck supporting naming
> > files and keys 'lockbox'
> > (for backwards compatibility) but there is no real lockbox anymore.
> > Another quirk is that when storing the secret in the monitor, it is
> > done using the following convention:
> > 
> > dm-crypt/osd/{OSD FSID}/luks
> > 
> > The 'luks' part there doesn't indicate anything about the type of
> > encryption (!!) so regardless of the type of encryption (luks or
> > plain) the key would still go there.
> > 
> > If you manage to get the keys into the monitors you still wouldn't
> > be able to scan OSDs to produce the JSON files, but you would be
> > able to create the JSON file with the
> > metadata that ceph-volume needs to run the OSD.  
> 
> I think it is not a problem to create the json files myself.
> Moving the keys to the monitors and creating appropriate auth keys
> should be more or less easy as well.
> 
> The problem I see is that there are individual keys for the journal
> and data partition, while the new process uses only one key for both
> partitions.
> 
> Maybe I can recreate the journal partition with the other key. But is
> this possible? Is there important data remaining in the journal after
> cleanly stopping the OSD which I cannot throw away without trashing the
> whole OSD?
> 

OK, with a new empty journal the OSD will not start. I have now rescued
the data with dd, re-encrypted it with the other key and copied the
data back. This worked so far.

Now I encoded the key with base64 and put it into the key-value store.
I also created the necessary auth keys. Creating the json file by hand
was quite easy.
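
Roughly, the key upload looks like this (a sketch; the key name follows
the dm-crypt/osd/{OSD FSID}/luks convention quoted above, OSD_FSID and
paths are placeholders):

base64 -w0 /etc/ceph/dmcrypt-keys/$OSD_FSID > /tmp/$OSD_FSID.b64
ceph config-key set dm-crypt/osd/$OSD_FSID/luks -i /tmp/$OSD_FSID.b64
# plus auth keys that are allowed to read that config-key path back,
# as mentioned above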

But now there is one problem.
ceph-disk opens the crypt like this:
cryptsetup --key-file /etc/ceph/dmcrypt-keys/foobar ...
while ceph-volume pipes the key via stdin like this:
cat foobar | cryptsetup --key-file - ...

The big problem: if the key is given via stdin, cryptsetup hashes this
key by default with some hash. Only if I set --hash plain does it work.
I think this is a bug in ceph-volume.

Can someone confirm this?

Here is the related code I mean in ceph-volume:
https://github.com/ceph/ceph/blob/v12.2.10/src/ceph-volume/ceph_volume/util/encryption.py#L59
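
For completeness, the two invocations side by side with the workaround
(a sketch; device, mapping name and key path are examples, plain dmcrypt
assumed):

# ceph-disk style: key file passed by path
cryptsetup --key-file /etc/ceph/dmcrypt-keys/$PART_UUID create $PART_UUID /dev/sdX1

# ceph-volume style: key piped via stdin; without --hash plain the key
# read from stdin gets hashed first, so the mapping no longer matches
cat /etc/ceph/dmcrypt-keys/$PART_UUID | \
    cryptsetup --key-file - --hash plain create $PART_UUID /dev/sdX1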

Regards
Manuel
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] migrate ceph-disk to ceph-volume fails with dmcrypt

2019-01-23 Thread Manuel Lausch
On Wed, 23 Jan 2019 08:11:31 -0500
Alfredo Deza  wrote:


> I don't know how that would look like, but I think it is worth a try
> if re-deploying OSDs is not feasible for you.

Yes, if there is a working way to migrate this, I will give it a try.

> 
> The key api for encryption is *very* odd and a lot of its quirks are
> undocumented. For example, ceph-volume is stuck supporting naming
> files and keys 'lockbox'
> (for backwards compatibility) but there is no real lockbox anymore.
> Another quirk is that when storing the secret in the monitor, it is
> done using the following convention:
> 
> dm-crypt/osd/{OSD FSID}/luks
> 
> The 'luks' part there doesn't indicate anything about the type of
> encryption (!!) so regardless of the type of encryption (luks or
> plain) the key would still go there.
> 
> If you manage to get the keys into the monitors you still wouldn't be
> able to scan OSDs to produce the JSON files, but you would be able to
> create the JSON file with the
> metadata that ceph-volume needs to run the OSD.

I think it is not a problem to create the json files myself.
Moving the keys to the monitors and creating appropriate auth keys
should be more or less easy as well.

The problem I see is that there are individual keys for the journal
and data partition, while the new process uses only one key for both
partitions.

Maybe I can recreate the journal partition with the other key. But is
this possible? Is there important data remaining in the journal after
cleanly stopping the OSD which I cannot throw away without trashing the
whole OSD?



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] migrate ceph-disk to ceph-volume fails with dmcrypt

2019-01-23 Thread Manuel Lausch
On Wed, 23 Jan 2019 14:25:00 +0100
Jan Fajerski  wrote:


> I might be wrong on this, since its been a while since I played with
> that. But iirc you can't migrate a subset of ceph-disk OSDs to
> ceph-volume on one host. Once you run ceph-volume simple activate,
> the ceph-disk systemd units and udev profiles will be disabled. While
> the remaining ceph-disk OSDs will continue to run, they won't come up
> after a reboot. I'm sure there's a way to get them running again, but
> I imagine you'd rather not manually deal with that.


Yes, you are right. The activate disables ceph-disk system wide.
This is done by symlinking /etc/systemd/system/ceph-disk@.service
to /dev/null.
After deleting this symlink my OSDs started again after a reboot.
The startup processes from ceph-volume and ceph-disk might conflict
with each other, but on a QA system this did work.
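
For the archives, what this looks like on such a host (a sketch):

# what "ceph-volume simple activate" leaves behind:
ls -l /etc/systemd/system/ceph-disk@.service   # symlink to /dev/null
# removing the mask lets the remaining ceph-disk OSDs start again on boot:
rm /etc/systemd/system/ceph-disk@.service
systemctl daemon-reload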

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] migrate ceph-disk to ceph-volume fails with dmcrypt

2019-01-23 Thread Manuel Lausch
Hi,

That's bad news.

Roughly 5000 OSDs are affected by this issue. Redeploying these OSDs is
not really an option.

Is it possible to migrate the local keys to the monitors?
I see that the OSDs with the "lockbox feature" have only one key for the
data and journal partition, while the older OSDs have individual keys for
journal and data. Might this be a problem?

And another question:
Is it a good idea to mix ceph-disk and ceph-volume managed OSDs on one
host? Then I could migrate only the newer OSDs to ceph-volume and deploy
new ones (after disk replacements) with ceph-volume until hopefully there
is a solution.

Regards
Manuel


On Tue, 22 Jan 2019 07:44:02 -0500
Alfredo Deza  wrote:


> This is one case we didn't anticipate :/ We supported the wonky
> lockbox setup and thought we wouldn't need to go further back,
> although we did add support for both
> plain and luks keys.
> 
> Looking through the code, it is very tightly couple to
> storing/retrieving keys from the monitors, and I don't know what
> workarounds might be possible here other than throwing away the OSD
> and deploying a new one (I take it this is not an option for you at
> all)
> 
> 
Manuel Lausch

Systemadministrator
Storage Services

1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 |
76135 Karlsruhe | Germany Phone: +49 721 91374-1847
E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de

Hauptsitz Montabaur, Amtsgericht Montabaur, HRB 5452

Geschäftsführer: Thomas Ludwig, Jan Oetjen, Sascha Vollmer


Member of United Internet

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] migrate ceph-disk to ceph-volume fails with dmcrypt

2019-01-22 Thread Manuel Lausch
turned non-zero exit status: 32



ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94)
luminous (stable)


Regards
Manuel




-- 
Manuel Lausch

Systemadministrator
Storage Services

1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 | 76135 
Karlsruhe | Germany
Phone: +49 721 91374-1847
E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de

Hauptsitz Montabaur, Amtsgericht Montabaur, HRB 5452

Geschäftsführer: Thomas Ludwig, Jan Oetjen, Sascha Vollmer


Member of United Internet

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] tunable question

2017-10-02 Thread Manuel Lausch
Hi, 

We have similar issues.
After upgrading from hammer to jewel the tunable "chooseleaf_stable"
was introduced. If we activate it, nearly all data will be moved. The
cluster has 2400 OSDs on 40 nodes across two datacenters and is filled
with 2.5 PB of data.

We tried to enable it, but the backfill traffic is too high to be
handled without impacting other services on the network.

Does someone know if it is necessary to enable this tunable? And could
it be a problem in the future if we want to upgrade to newer versions
without it enabled?
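
For reference, the relevant commands (a sketch; on jewel the "optimal"
profile enables chooseleaf_stable and triggers exactly the remapping
discussed here):

ceph osd crush show-tunables      # shows chooseleaf_stable among the current values
ceph osd crush tunables optimal   # switches the whole profile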

Regards,
Manuel Lausch

Am Thu, 28 Sep 2017 10:29:58 +0200
schrieb Dan van der Ster :

> Hi,
> 
> How big is your cluster and what is your use case?
> 
> For us, we'll likely never enable the recent tunables that need to
> remap *all* PGs -- it would simply be too disruptive for marginal
> benefit.
> 
> Cheers, Dan
> 
> 
> On Thu, Sep 28, 2017 at 9:21 AM, mj  wrote:
> > Hi,
> >
> > We have completed the upgrade to jewel, and we set tunables to
> > hammer. Cluster again HEALTH_OK. :-)
> >
> > But now, we would like to proceed in the direction of luminous and
> > bluestore OSDs, and we would like to ask for some feedback first.
> >
> > From the jewel ceph docs on tunables: "Changing tunable to
> > "optimal" on an existing cluster will result in a very large amount
> > of data movement as almost every PG mapping is likely to change."
> >
> > Given the above, and the fact that we would like to proceed to
> > luminous/bluestore in the not too far away future: What is cleverer:
> >
> > 1 - keep the cluster at tunable hammer now, upgrade to luminous in
> > a little while, change OSDs to bluestore, and then set tunables to
> > optimal
> >
> > or
> >
> > 2 - set tunable to optimal now, take the impact of "almost all PG
> > remapping", and when that is finished, upgrade to luminous,
> > bluestore etc.
> >
> > Which route is the preferred one?
> >
> > Or is there a third (or fourth?) option..? :-)
> >
> > MJ
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Manuel Lausch

Systemadministrator
Cloud Services

1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 |
76135 Karlsruhe | Germany Phone: +49 721 91374-1847
E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de

Amtsgericht Montabaur, HRB 5452

Geschäftsführer: Thomas Ludwig, Jan Oetjen


Member of United Internet

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Very slow start of osds after reboot

2017-09-20 Thread Manuel Lausch
Hi, 

I have the same issue with Ceph Jewel (10.2.9), RedHat 7 and dmcrypt.
Is there any fix or at least a workaround available?

Regards,
Manuel 


Am Thu, 31 Aug 2017 16:24:10 +0200
schrieb Piotr Dzionek :

> Hi,
> 
> For a last 3 weeks I have been running latest LTS Luminous Ceph
> release on CentOS7. It started with 4th RC and now I have Stable
> Release. Cluster runs fine, however I noticed that if I do a reboot
> of one the nodes, it takes a really long time for cluster to be in ok
> status. Osds are starting up, but not as soon as the server is up.
> They are up one by one during a period of 5 minutes. I checked the
> logs and all osds have following errors.
> 
> 
> 
> As you can see the xfs volume(the part with meta-data) is not mounted 
> yet. My question here, what mounts it and why it takes so long ?
> Maybe there is a setting that randomizes the start up process of osds
> running on the same node?
> 
> Kind regards,
> Piotr Dzionek
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Manuel Lausch

Systemadministrator
Cloud Services

1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 |
76135 Karlsruhe | Germany Phone: +49 721 91374-1847
E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de

Amtsgericht Montabaur, HRB 5452

Geschäftsführer: Thomas Ludwig, Jan Oetjen


Member of United Internet

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-osd restartd via systemd in case of disk error

2017-09-19 Thread Manuel Lausch
Am Tue, 19 Sep 2017 08:24:48 +
schrieb Adrian Saul :

> > I understand what you mean and it's indeed dangerous, but see:
> > https://github.com/ceph/ceph/blob/master/systemd/ceph-osd%40.service
> >
> > Looking at the systemd docs it's difficult though:
> > https://www.freedesktop.org/software/systemd/man/systemd.service.ht
> > ml
> >
> > If the OSD crashes due to another bug you do want it to restart.
> >
> > But for systemd it's not possible to see if the crash was due to a
> > disk I/O- error or a bug in the OSD itself or maybe the OOM-killer
> > or something.
> 
> Perhaps using something like RestartPreventExitStatus and defining a
> specific exit code for the OSD to exit on when it is exiting due to
> an IO error.

Another idea: the OSD daemon keeps running in a defined error state
and only shuts down its connections to the other OSDs and the clients.
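
A sketch of what the RestartPreventExitStatus idea could look like as a
systemd drop-in (the exit code is hypothetical; the OSD would have to
actually exit with it on fatal disk errors):

# /etc/systemd/system/ceph-osd@.service.d/override.conf
[Service]
RestartPreventExitStatus=100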


-- 
Manuel Lausch

Systemadministrator
Cloud Services

1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 |
76135 Karlsruhe | Germany Phone: +49 721 91374-1847
E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de

Amtsgericht Montabaur, HRB 5452

Geschäftsführer: Thomas Ludwig, Jan Oetjen


Member of United Internet

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph-osd restartd via systemd in case of disk error

2017-09-19 Thread Manuel Lausch
Hi,

I see an issue with systemd's restart behaviour and disk I/O errors.
If a disk fails with I/O errors, ceph-osd stops running. Systemd detects
this and starts the daemon again. In our cluster I saw some loops of
OSD crashes caused by disk failures and restarts triggered by systemd,
every time with peering impact and timeouts in our application,
until systemd gave up.

Obviously ceph needs the restart feature (at least with dmcrypt) to
avoid race conditions in the startup process. But in the case of
disk-related failures this is counterproductive.

What do you think about this? Is this a bug which should be fixed?

We use ceph jewel (10.2.9)


Regards
Manuel 


-- 
Manuel Lausch

Systemadministrator
Cloud Services

1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 |
76135 Karlsruhe | Germany Phone: +49 721 91374-1847
E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de

Amtsgericht Montabaur, HRB 5452

Geschäftsführer: Thomas Ludwig, Jan Oetjen


Member of United Internet

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Blocked requests problem

2017-08-23 Thread Manuel Lausch
Hi,

Sometimes we have the same issue on our 10.2.9 Cluster. (24 Nodes á 60
OSDs)

I think there is some race condition or something like that
which results in this state. The blocked requests start exactly at
the time the PG begins to scrub.

You can try the following. The OSD will automatically recover and the
blocked requests will disappear.

ceph osd down 31 


In my opinion this is a bug, but I have not investigated it so far. Maybe
some developer can say something about this issue.


Regards,
Manuel


Am Tue, 22 Aug 2017 16:20:14 +0300
schrieb Ramazan Terzi :

> Hello,
> 
> I have a Ceph Cluster with specifications below:
> 3 x Monitor node
> 6 x Storage Node (6 disk per Storage Node, 6TB SATA Disks, all disks
> have SSD journals) Distributed public and private networks. All NICs
> are 10Gbit/s osd pool default size = 3
> osd pool default min size = 2
> 
> Ceph version is Jewel 10.2.6.
> 
> My cluster is active and a lot of virtual machines running on it
> (Linux and Windows VM's, database clusters, web servers etc).
> 
> During normal use, cluster slowly went into a state of blocked
> requests. Blocked requests periodically incrementing. All OSD's seems
> healthy. Benchmark, iowait, network tests, all of them succeed.
> 
> Yesterday, 08:00:
> $ ceph health detail
> HEALTH_WARN 3 requests are blocked > 32 sec; 3 osds have slow requests
> 1 ops are blocked > 134218 sec on osd.31
> 1 ops are blocked > 134218 sec on osd.3
> 1 ops are blocked > 8388.61 sec on osd.29
> 3 osds have slow requests
> 
> Today, 16:05:
> $ ceph health detail
> HEALTH_WARN 32 requests are blocked > 32 sec; 3 osds have slow
> requests 1 ops are blocked > 134218 sec on osd.31
> 1 ops are blocked > 134218 sec on osd.3
> 16 ops are blocked > 134218 sec on osd.29
> 11 ops are blocked > 67108.9 sec on osd.29
> 2 ops are blocked > 16777.2 sec on osd.29
> 1 ops are blocked > 8388.61 sec on osd.29
> 3 osds have slow requests
> 
> $ ceph pg dump | grep scrub
> dumped all in format plain
> pg_stat   objects mip degrmisp
> unf   bytes   log disklog state
> state_stamp   v   reportedup
> up_primaryacting  acting_primary
> last_scrubscrub_stamp last_deep_scrub
> deep_scrub_stamp 20.1e25183   0   0   0
> 0 98332537930 30663066
> active+clean+scrubbing2017-08-21 04:55:13.354379
> 6930'23908781 6930:20905696   [29,31,3]   29
> [29,31,3] 29  6712'22950171   2017-08-20
> 04:46:59.208792   6712'22950171   2017-08-20 04:46:59.208792
> 
> Active scrub does not finish (about 24 hours). I did not restart any
> OSD meanwhile. I'm thinking set noscrub, noscrub-deep, norebalance,
> nobackfill, and norecover flags and restart 3,29,31th OSDs. Is this
> solve my problem? Or anyone has suggestion about this problem?
> 
> Thanks,
> Ramazan
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Manuel Lausch

Systemadministrator
Cloud Services

1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 |
76135 Karlsruhe | Germany Phone: +49 721 91374-1847
E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de

Amtsgericht Montabaur, HRB 5452

Geschäftsführer: Thomas Ludwig, Jan Oetjen


Member of United Internet

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] purpose of ceph-mgr daemon

2017-06-14 Thread Manuel Lausch
Hi,

we decided to test a bit the upcoming ceph release (luminous). It seems
that I need to install this ceph-mgr daemon as well. But I don't
understand exactly why I need this service and what I can do with it.

The ceph cluster is working well without any manager daemon installed.
However, the ceph status output shows a health error ("no active
mgr").

My questions:
Can we run the cluster without this additional daemon? If yes, is it
possible to suppress this health error?

What exactly is the purpose of this service? The documentation contains
very little information about it.
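
(For reference, bringing one up seems to be quick; a minimal sketch per
the luminous docs, with the mgr name and paths assumed:)

mkdir -p /var/lib/ceph/mgr/ceph-$NAME
ceph auth get-or-create mgr.$NAME mon 'allow profile mgr' osd 'allow *' mds 'allow *' \
    -o /var/lib/ceph/mgr/ceph-$NAME/keyring
systemctl start ceph-mgr@$NAME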


Regards
Manuel 





-- 
Manuel Lausch

Systemadministrator
Cloud Services

1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 |
76135 Karlsruhe | Germany Phone: +49 721 91374-1847
E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de

Amtsgericht Montabaur, HRB 5452

Geschäftsführer: Frank Einhellinger, Thomas Ludwig, Jan Oetjen


Member of United Internet


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] releasedate for 10.2.8?

2017-05-30 Thread Manuel Lausch
Hi,

is there a release date for the next Jewel release (10.2.8)? I have been
waiting for it for a few weeks because it includes some fixes related to
snapshot deletion and snap trim sleep.

Thanks
Manuel

-- 
Manuel Lausch

Systemadministrator
Cloud Services

1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 |
76135 Karlsruhe | Germany Phone: +49 721 91374-1847
E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de

Amtsgericht Montabaur, HRB 5452

Geschäftsführer: Frank Einhellinger, Thomas Ludwig, Jan Oetjen


Member of United Internet


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] memory usage ceph jewel OSDs

2017-03-24 Thread Manuel Lausch
Hello,

over the last few days I have been trying to figure out why my OSDs need
a huge amount of RAM (1.2 - 4 GB). With this, my system memory is at its
limit. At the beginning I thought it was because of the huge amount of
backfilling (some disks died). But for a few days now everything has been
fine again, yet the memory stays at its level. Restarting the OSDs did
not change this behaviour.

I am running Ceph Jewel (10.2.6) on RedHat 7. The cluster has 8 hosts with
36 4TB OSDs each and 4 hosts with 15 4TB OSDs.

I tried to profile the used memory as documented here:
http://docs.ceph.com/docs/jewel/rados/troubleshooting/memory-profiling/

But the output of these commands didn't help me, and I am confused about
the reported memory usage.

From ceph tell osd.98 heap dump I get the following output:
# ceph tell osd.98 heap dump
osd.98 dumping heap profile now.

MALLOC: 1290458456 ( 1230.7 MiB) Bytes in use by application
MALLOC: +0 (0.0 MiB) Bytes in page heap freelist
MALLOC: + 63583000 (   60.6 MiB) Bytes in central cache freelist
MALLOC: +  5896704 (5.6 MiB) Bytes in transfer cache freelist
MALLOC: +102784400 (   98.0 MiB) Bytes in thread cache freelists
MALLOC: + 11350176 (   10.8 MiB) Bytes in malloc metadata
MALLOC: ------------
MALLOC: =   1474072736 ( 1405.8 MiB) Actual memory used (physical + swap)
MALLOC: +    129064960 (  123.1 MiB) Bytes released to OS (aka unmapped)
MALLOC: ------------
MALLOC: =   1603137696 ( 1528.9 MiB) Virtual address space used
MALLOC:
MALLOC:  88305  Spans in use
MALLOC:   1627  Thread heaps in use
MALLOC:   8192  Tcmalloc page size

Call ReleaseFreeMemory() to release freelist memory to the OS (via
madvise()). Bytes released to the OS take up virtual address space but
no physical memory.


I would say the application needs 1230.7 MiB of RAM. But if I analyse
the corresponding dump with pprof, only a few megabytes show up.
Following are the first few lines of the pprof output:

# pprof --text /usr/bin/ceph-osd osd.98.profile.0002.heap 
Using local file /usr/bin/ceph-osd.
Using local file osd.98.profile.0002.heap.
Total: 8.9 MB
 3.3  36.7%  36.7%  3.3  36.7% ceph::log::Log::create_entry
 2.3  25.5%  62.2%  2.3  25.5% ceph::buffer::list::append@a1f280
 1.1  12.1%  74.3%  2.0  23.1% SimpleMessenger::add_accept_pipe
 0.9  10.4%  84.7%  0.9  10.5% Pipe::Pipe
 0.2   2.8%  87.5%  0.2   2.8% std::map::operator[]
 0.2   2.2%  89.7%  0.2   2.2% std::vector::_M_default_append
 0.2   1.8%  91.5%  0.2   1.8% std::_Rb_tree::_M_copy
 0.1   0.8%  92.4%  0.1   0.8% ceph::buffer::create_aligned
 0.1   0.8%  93.2%  0.1   0.8% std::string::_Rep::_S_create


Is this normal? Am I doing something wrong? Is there a bug? Why do my
OSDs need so much RAM?
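
(For reference, the related heap commands from the profiling page linked
above, a sketch with osd.98 as in the dump:)

ceph tell osd.98 heap stats      # short summary like the dump above
ceph tell osd.98 heap release    # ask tcmalloc to return freelist memory to the OS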

Thanks for your help

Regards,
Manuel

-- 
Manuel Lausch

Systemadministrator
Cloud Services

1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 |
76135 Karlsruhe | Germany Phone: +49 721 91374-1847
E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de

Amtsgericht Montabaur, HRB 5452

Geschäftsführer: Frank Einhellinger, Thomas Ludwig, Jan Oetjen


Member of United Internet


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] osd down detection broken in jewel?

2016-11-30 Thread Manuel Lausch
Yes. This parameter is used in the condition described here:
http://docs.ceph.com/docs/jewel/rados/configuration/mon-osd-interaction/#osds-report-their-status
and it works. I think the default timeout of 900s is quite large.

The documentation also describes another mechanism which checks the health
of OSDs and reports them down:
http://docs.ceph.com/docs/jewel/rados/configuration/mon-osd-interaction/#osds-report-down-osds

As far as I can see in the source code, this documentation is not valid
anymore! I found this commit:
https://github.com/ceph/ceph/commit/bcb8f362ec6ac47c4908118e7860dec7971d001f#diff-0a5db46a44ae9900e226289a810f10e8

"mon_osd_min_down_reporters" is now the threshold for how many distinct
"mon_osd_reporter_subtree_level" subtrees have to report a down OSD; in
Hammer it was how many other OSDs had to report it. In Hammer there was
also the parameter "mon_osd_min_down_reports", which set how often another
OSD had to report an OSD. In Jewel this parameter doesn't exist anymore.


With this "knowlege" I adjusted my configuration.  And will now test it.


BTW:
While reading the source code I may have found another bug. Can you confirm this?
In the function "OSDMonitor::check_failure" in src/mon/OSDMonitor.cc,
the code which counts the "reporters_by_subtree" is inside the
"if (g_conf->mon_osd_adjust_heartbeat_grace) {" block. So if I disable
adjust_heartbeat_grace, the reporters_by_subtree functionality will not
work at all.



Regards,
Manuel


Am 30.11.2016 um 15:24 schrieb John Petrini:

It's right there in your config.

mon osd report timeout = 900

See: 
http://docs.ceph.com/docs/jewel/rados/configuration/mon-osd-interaction/





On Wed, Nov 30, 2016 at 6:39 AM, Manuel Lausch wrote:


Hi,

In a test with ceph jewel we tested how long the cluster needs to
detect and mark down OSDs after they are killed (with kill -9).
The result -> 900 seconds.

In Hammer this took about 20 - 30 seconds.

In the Logfile from the leader monitor are a lot of messeages like
2016-11-30 11:32:20.966567 7f158f5ab700  0 log_channel(cluster)
log [DBG] : osd.7 10.78.43.141:8120/106673 reported failed by osd.272
10.78.43.145:8106/117053
A deeper look at this. A lot of OSDs reported this exactly one
time. In Hammer The OSDs reported a down OSD a few more times.

Finaly there is the following and the osd is marked down.
2016-11-30 11:36:22.633253 7f158fdac700  0 log_channel(cluster)
log [INF] : osd.7 marked down after no pg stats for 900.982893seconds

In my ceph.conf I have the following lines in the global section
mon osd min down reporters = 10
mon osd min down reports = 3
mon osd report timeout = 900

It seems the parameter "mon osd min down reports" is removed in
jewel but the documentation is not updated ->
http://docs.ceph.com/docs/jewel/rados/configuration/mon-osd-interaction/


Can someone tell me how ceph jewel detects down OSDs and mark them
down in a appropriated time?


The Cluster:
ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)
24 hosts á 60 OSDs -> 1440 OSDs
2 pool with replication factor 4
65536 PGs
5 Mons


[ceph-users] osd down detection broken in jewel?

2016-11-30 Thread Manuel Lausch

Hi,

In a test with ceph jewel we tested how long the cluster needs to detect 
and mark down OSDs after they are killed (with kill -9). The result -> 
900 seconds.


In Hammer this took about 20 - 30 seconds.

In the log file of the leader monitor there are a lot of messages like
2016-11-30 11:32:20.966567 7f158f5ab700  0 log_channel(cluster) log 
[DBG] : osd.7 10.78.43.141:8120/106673 reported failed by osd.272 
10.78.43.145:8106/117053
A deeper look at this: a lot of OSDs reported this exactly one time. In 
Hammer the OSDs reported a down OSD a few more times.


Finally there is the following and the OSD is marked down.
2016-11-30 11:36:22.633253 7f158fdac700  0 log_channel(cluster) log 
[INF] : osd.7 marked down after no pg stats for 900.982893seconds


In my ceph.conf I have the following lines in the global section
mon osd min down reporters = 10
mon osd min down reports = 3
mon osd report timeout = 900

It seems the parameter "mon osd min down reports" is removed in jewel 
but the documentation is not updated -> 
http://docs.ceph.com/docs/jewel/rados/configuration/mon-osd-interaction/



Can someone tell me how ceph jewel detects down OSDs and marks them down
in an appropriate time?



The Cluster:
ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)
24 hosts á 60 OSDs -> 1440 OSDs
2 pool with replication factor 4
65536 PGs
5 Mons

--
Manuel Lausch

Systemadministrator
Cloud Services

1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 | 76135 
Karlsruhe | Germany
Phone: +49 721 91374-1847
E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de

Amtsgericht Montabaur, HRB 5452

Geschäftsführer: Frank Einhellinger, Thomas Ludwig, Jan Oetjen


Member of United Internet



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] resolve split brain situation in ceph cluster

2016-10-17 Thread Manuel Lausch

Hi Gregory,

Each datacenter has its own IP subnet, which is routed. We simultaneously
created iptables rules on each host which drop all packets to and from the
other datacenter. After this our application wrote to DC A, where 3 of the
5 monitor nodes are.
Now we modified the monmap in B (removed all mon nodes from DC A, so
there are now 2 of 2 mons active). The monmap in A was untouched. The
cluster part in B was now active as well, and the applications in B could
write to it. So we definitely wrote data in both cluster parts.
After this we shut down the mon nodes in A. The part in A was then
unavailable.


Some hours later we removed the iptables rules and tried to rejoin the
two parts. We rejoined the three mon nodes from A as new nodes; the old
mon data from these nodes was destroyed.



Do you need further information?

Regards,
Manuel


Am 14.10.2016 um 17:58 schrieb Gregory Farnum:

On Fri, Oct 14, 2016 at 7:27 AM, Manuel Lausch  wrote:

Hi,

I need some help to fix a broken cluster. I think we broke the cluster, but
I want to know your opinion and if you see a possibility to recover it.

Let me explain what happend.

We have a cluster (Version 0.94.9) in two datacenters (A and B). In each 12
nodes á 60 ODSs. In A we have 3 monitor nodes and in B  2. The crushrule and
replication factor forces two replicas in each datacenter.

We write objects via librados in the cluster. The objects are immutable, so
they are either present or absent.

In this cluster we tested what happens if datacenter A will fail and we need
to bring up the cluster in B by creating a monitor quorum in B. We did this
by cut off the network connection betwenn the two datacenters. The OSDs from
DC B went down like expected. Now we removed the mon Nodes from the monmap
in B (by extracting it offline and edit it). Our clients wrote now data in
both independent clusterparts before we stopped the mons in A. (YES I know.
This is a really bad thing).

This story line seems to be missing some points. How did you cut off
the network connection? What leads you to believe the OSDs accepted
writes on both sides of the split? Did you edit the monmap in both
data centers, or just DC A (that you wanted to remain alive)? What
monitor counts do you have in each DC?
-Greg


Now we try to join the two sides again. But so far without success.

Only the OSDs in B are running. The OSDs in A started but the OSDs stay
down. In the mon log we see a lot of „...(leader).pg v3513957 ignoring stats
from non-active osd“ alerts.

We see, that the current osdmap epoch in the running cluster is „28873“. In
the OSDs in A the epoch is „29003“. We assume that this is the reason why
the OSDs won't to jump in.


BTW: This is only a testcluster, so no important data are harmed.


Regards
Manuel


--
Manuel Lausch

Systemadministrator
Cloud Services

1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 | 76135
Karlsruhe | Germany
Phone: +49 721 91374-1847
E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de

Amtsgericht Montabaur, HRB 5452

Geschäftsführer: Frank Einhellinger, Thomas Ludwig, Jan Oetjen


Member of United Internet



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



--
Manuel Lausch

Systemadministrator
Cloud Services

1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 | 76135 
Karlsruhe | Germany
Phone: +49 721 91374-1847
E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de

Amtsgericht Montabaur, HRB 5452

Geschäftsführer: Frank Einhellinger, Thomas Ludwig, Jan Oetjen


Member of United Internet


[ceph-users] resolve split brain situation in ceph cluster

2016-10-14 Thread Manuel Lausch

Hi,

I need some help to fix a broken cluster. I think we broke the cluster, 
but I want to know your opinion and if you see a possibility to recover it.


Let me explain what happend.

We have a cluster (version 0.94.9) in two datacenters (A and B), with 12
nodes of 60 OSDs each in every datacenter. In A we have 3 monitor nodes and
in B 2. The crush rule and replication factor force two replicas in each
datacenter.


We write objects via librados in the cluster. The objects are immutable, 
so they are either present or absent.


In this cluster we tested what happens if datacenter A fails and we need
to bring up the cluster in B by creating a monitor quorum there. We did
this by cutting off the network connection between the two datacenters.
The OSDs from DC B went down as expected. Then we removed the DC A mon
nodes from the monmap in B (by extracting it offline and editing it). Our
clients then wrote data to both independent cluster parts before we
stopped the mons in A. (YES, I know. This is a really bad thing.)


Now we are trying to join the two sides again, but so far without success.

Only the OSDs in B are running. The OSDs in A start, but they stay
down. In the mon log we see a lot of "...(leader).pg v3513957 ignoring
stats from non-active osd" messages.

We see that the current osdmap epoch in the running cluster is "28873",
while on the OSDs in A the epoch is "29003". We assume that this is the
reason why the OSDs won't join.



BTW: This is only a testcluster, so no important data are harmed.


Regards
Manuel


--
Manuel Lausch

Systemadministrator
Cloud Services

1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 | 76135 
Karlsruhe | Germany
Phone: +49 721 91374-1847
E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de

Amtsgericht Montabaur, HRB 5452

Geschäftsführer: Frank Einhellinger, Thomas Ludwig, Jan Oetjen


Member of United Internet


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Try to install ceph hammer on CentOS7

2016-07-25 Thread Manuel Lausch

Hi,
Thanks for your help.
I found the problem: via puppet I had configured a versionlock, and it
contained a wrong version epoch.
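
In case someone else hits this, a quick way to check for such a lock
(a sketch, assuming the yum versionlock plugin):

yum versionlock list | grep -i ceph   # look for an entry with a wrong epoch/version
yum versionlock clear                 # or delete just the offending entry
yum clean all
yum install ceph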


Regards,
Manuel

Am 23.07.2016 um 05:11 schrieb Brad Hubbard:

On Sat, Jul 23, 2016 at 1:41 AM, Ruben Kerkhof  wrote:

Please keep the mailing list on the CC.

On Fri, Jul 22, 2016 at 3:40 PM, Manuel Lausch  wrote:

Oh, this was a copy&paste mistake.
Of course I checked my config again. Some other variations of the
configuration didn't help either.

Finally I put the ceph-0.94.7-0.el7.x86_64.rpm in a directory and created
the necessary repository index files with createrepo. With this as a
repository, the ceph package is not visible either. Other packages in the
repository work fine.

If I try to install the package with yum install
~/ceph-0.94.7-0.el7.x86_64.rpm, the installation including the
dependencies is successful.

My knowledge of rpm and yum is not as deep as it should be, so I don't know
how to debug further.

What does yum repolist show?

This is good advice.

I'd also advise running "yum clean all" before proceeding once you
have confirmed everything is configured correctly.

HTH,
Brad


It looks like the ceph-noarch repo is ok, the ceph repo isn't.


Regards,
Manuel

Regards,

Ruben
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


--
Manuel Lausch

Systemadministrator
Cloud Services

1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 | 76135 
Karlsruhe | Germany
Phone: +49 721 91374-1847
E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de

Amtsgericht Montabaur, HRB 5452

Geschäftsführer: Frank Einhellinger, Thomas Ludwig, Jan Oetjen


Member of United Internet


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Try to install ceph hammer on CentOS7

2016-07-21 Thread Manuel Lausch

Hi,

I am trying to install ceph hammer on CentOS 7, but something with the RPM
repository seems to be wrong.


In my yum.repos.d/ceph.repo file I have the following configuration:

[ceph]
name=Ceph packages for $basearch
baseurl=http://download.ceph.com/rpm-hammer/el7/$basearch
enabled=1
priority=2
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc

[ceph-noarch]
name=Ceph noarch packages
baseurl=http://download.ceph.com/rpm-hammer/el7/noarch
enabled=1
priority=2
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc


Now with "yum search ceph" I can find some packages from this
repository, but the "ceph" package itself is missing. Is there something
wrong with my configuration or is there some issue with the repository
itself?


This is the output from the yum search ceph:
# yum search ceph
Loaded plugins: fastestmirror, priorities, versionlock
Loading mirror speeds from cached hostfile
112 packages excluded due to repository priority protections
=========================== N/S matched: ceph ===========================
centos-release-ceph-hammer.noarch : Ceph Hammer packages from the CentOS 
Storage SIG repository

ceph-common.x86_64 : Ceph Common
ceph-dash.noarch : ceph dashboard
ceph-debuginfo.x86_64 : Debug information for package ceph
ceph-deploy.noarch : Admin and deploy tool for Ceph
ceph-devel-compat.x86_64 : Compatibility package for Ceph headers
ceph-fuse.x86_64 : Ceph fuse-based client
ceph-libs-compat.x86_64 : Meta package to include ceph libraries
ceph-test.x86_64 : Ceph benchmarks and test tools
cephfs-java.x86_64 : Java libraries for the Ceph File System
collectd-ceph.x86_64 : Ceph plugin for collectd
libcephfs1.x86_64 : Ceph distributed file system client library
libcephfs1-devel.x86_64 : Ceph distributed file system headers
libcephfs_jni1.x86_64 : Java Native Interface library for CephFS Java 
bindings
libcephfs_jni1-devel.x86_64 : Development files for CephFS Java Native 
Interface library

python-ceph-compat.x86_64 : Compatibility package for Cephs python libraries
python-cephfs.x86_64 : Python libraries for Ceph distributed file system
ceph-radosgw.x86_64 : Rados REST gateway
rbd-fuse.x86_64 : Ceph fuse-based client

  Name and summary matches only, use "search all" for everything.



Regards,
Manuel


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Question about how to start ceph OSDs with systemd

2016-07-08 Thread Manuel Lausch

hi,

In the last few days I have been playing around with ceph jewel on Debian
Jessie and CentOS 7. Now I have a question about systemd on these systems.

I installed ceph jewel (ceph version 10.2.2
(45107e21c568dd033c2f0a3107dec8f0b0e58374)) on Debian Jessie and
prepared some OSDs. While playing around I decided to reinstall my
operating system (of course without deleting the OSD devices). After
reinstalling ceph and putting the old ceph.conf back in place, I thought
the previously prepared OSDs would simply start and all would be fine.

With Debian Wheezy and ceph firefly this worked well, but with the new
versions and systemd it doesn't work at all. What do I have to do to
get the OSDs running again?


The following command didn't work and I didn't get any output from it.
  systemctl start ceph-osd.target

And this is the output from systemctl status ceph-osd.target
● ceph-osd.target - ceph target allowing to start/stop all 
ceph-osd@.service instances at once
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd.target; enabled; 
vendor preset: enabled)

   Active: active since Fri 2016-07-08 17:19:29 CEST; 36min ago

Jul 08 17:19:29 cs-dellbrick01.server.lan systemd[1]: Reached target 
ceph target allowing to start/stop all ceph-osd@.service instances at once.
Jul 08 17:19:29 cs-dellbrick01.server.lan systemd[1]: Starting ceph 
target allowing to start/stop all ceph-osd@.service instances at once.
Jul 08 17:31:15 cs-dellbrick01.server.lan systemd[1]: Reached target 
ceph target allowing to start/stop all ceph-osd@.service instances at once.




thanks,
Manuel

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Need help with synchronizing ceph mons

2015-05-13 Thread Manuel Lausch

Hi,

I have some problems with the ceph monitor nodes in my cluster.

I had 5 mons in the cluster. On all 5 nodes the leveldb store grew to
about 80 – 90 GB in size. To get rid of this I triggered a compaction with
the following command on one node.


  ceph tell mon.d compact

The monitor compacted its data to about 5 GB. After this the mon tried
to synchronize its data with the other mons, and that is where my problems
started.

After a short time (20 – 30 sec) of streaming data from another mon node,
the stream breaks and the sender then apparently reads all the other
data in its store at maximum speed. At this point the cluster loses its
leader and tries to elect a new one. The leader election only succeeds
once the reading of the data is done.


I tried to remove the mon from the cluster completely and rejoin it as a
new one, but while syncing I experienced the same issue. So currently the
cluster has only 4 mons.

During further investigation and testing I lost another mon, which showed
the same behaviour when it wanted to sync data after starting.

It seems the node which streams data while syncing is stressed by
reading and sending data. I tried to limit the network bandwidth of the
joining node to reduce the load. I also tried to set ionice -c3 on the
process which does all the disk I/O while reading. But nothing helped.


Because the cluster is in production, I don't want to experiment any
further without knowing what's going on.


Does anyone have any ideas what's going on and how I can try to fix this?

I am using ceph version 0.67.11 (bc8b67bef6309a32361be76cd11fb56b057ea9d2)
5 Monitor Nodes with SSD as leveldb store
24 OSD Hosts with 1416 OSDs

Thank you
Manuel

--
Manuel Lausch

Systemadministrator
Cloud Backend Services

1&1 Mail & Media Development  & Technology GmbH | Brauerstraße 48 | 76135 
Karlsruhe | Germany
Phone: +49 721 91374-1847
E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de

Amtsgericht Montabaur, HRB 5452

Geschäftsführer: Frank Einhellinger, Hans-Henning Kettler, Jan Oetjen


Member of United Internet


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com