Re: [ceph-users] Ceph Scientific Computing User Group

2019-07-05 Thread Kevin Hrpcek
We've had some positive feedback and will be moving forward with this user 
group. The first virtual user group meeting is planned for July 24th at 4:30pm 
central European time/10:30am American eastern time. We will keep it to an hour 
in length. The plan is to use the ceph bluejeans video conferencing and it will 
be put on the ceph community calendar. I will send out links when it is closer 
to the 24th.

The goal of this user group is to promote conversations and sharing ideas for 
how ceph is used in the scientific/hpc/htc communities. Please be willing 
to discuss your use cases, cluster configs, problems you've had, shortcomings 
in ceph, etc... Not everyone pays attention to the ceph lists so feel free to 
share the meeting information with others you know that may be interested in 
joining in.

Contact me if you have questions, comments, suggestions, or want to volunteer a 
topic for meetings. I will be brainstorming some conversation starters but it 
would also be interesting to have people give a deep dive into their use of 
ceph and what they have built around it to support the science being done at 
their facility.

Kevin



On 6/17/19 10:43 AM, Kevin Hrpcek wrote:
Hey all,

At cephalocon some of us who work in scientific computing got together for a 
BoF and had a good conversation. There was some interest in finding a way to 
continue the conversation focused on ceph in scientific computing and htc/hpc 
environments. We are considering putting together a monthly video conference 
user group meeting to facilitate sharing thoughts and ideas for this part of the 
ceph community. At cephalocon we mostly had teams present from the EU so I'm 
interested in hearing how much community interest there is in a 
ceph+science/HPC/HTC user group meeting. It will be impossible to pick a time 
that works well for everyone but initially we considered something later in the 
work day for EU countries.

Reply to me if you're interested and please include your timezone.

Kevin



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




Re: [ceph-users] To backport or not to backport

2019-07-05 Thread Robert LeBlanc
On Thu, Jul 4, 2019 at 8:00 AM Stefan Kooman  wrote:

> Hi,
>
> Now the release cadence has been set, it's time for another discussion
> :-).
>
> During Ceph day NL we had a panel q/a [1]. One of the things that was
> discussed were backports. Occasionally users will ask for backports of
> functionality in newer releases to older releases (that are still in
> support).
>
> Ceph is quite a unique project in the sense that new functionality gets
> backported to older releases. Sometimes functionality even gets changed
> during the lifetime of a release. I can recall the "ceph-volume" change to
> LVM at the beginning of the Luminous release. While backports can enrich the
> user experience of a ceph operator, it's not without risks. There have
> been several issues with "incomplete" backports and/or unforeseen
> circumstances that had the reverse effect: downtime of (part of) ceph
> services. The ones that come to my mind are:
>
> - MDS (cephfs damaged)  mimic backport (13.2.2)
> - RADOS (pg log hard limit) luminous / mimic backport (12.2.8 / 13.2.2)
>
> I would like to define a simple rule of when to backport:
>
> - Only backport fixes that do not introduce new functionality, but
> address
>   (impaired) functionality already present in the release.
>
> Example of, IMHO, a backport that matches the backport criteria was the
> "bitmap_allocator" fix. It fixed a real problem, not some corner case.
> Don't get me wrong here, it is important to catch corner cases, but it
> should not put the majority of clusters at risk.
>
> The time and effort that might be saved with this approach can indeed be
> spent on one of the new focus areas Sage mentioned during his keynote
> talk at Cephalocon Barcelona: quality. Quality of the backports that are
> needed, and improved testing, especially for upgrades to newer releases. If
> upgrades are seamless, people are more willing to upgrade, because hey,
> it just works(tm). Upgrades should be boring.
>
> How many clusters (not nautilus ;-)) are running with "bitmap_allocator" or
> with the pglog_hardlimit enabled? If a new feature is not enabled by
> default and it's unclear how "stable" it is to use, operators tend to not
> enable it, defeating the purpose of the backport.
>
> Backporting fixes to older releases can be considered a "business
> opportunity" for the likes of Red Hat, SUSE, Fujitsu, etc. Especially
> for users that want a system that "keeps on running forever" and never
> needs "dangerous" updates.
>
> This is my view on the matter, please let me know what you think of
> this.
>
> Gr. Stefan
>
> P.s. Just to make things clear: this thread is in _no way_ intended to
> pick on
> anybody.
>
>
> [1]: https://pad.ceph.com/p/ceph-day-nl-2019-panel


I prefer a released version to be fairly static and not have new features
introduced, only bug fixes. For one, I'd prefer not to have to read the
release notes to figure out how dangerous a "bug-fix" release might be.
The fixes in a released version should be tested extremely well so it "Just
Works".

By not backporting new features, I think it gives more time to bake the
features into the new version and frees up the developers to focus on the
forward direction of the product. If I want a new feature, then the burden
is on me to test a new version and verify that it works in my environment
(or my vendor's), not on the developers.

I wholeheartedly support only bug fixes and security fixes going into
released versions.

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cannot add fuse options to ceph-fuse command

2019-07-05 Thread Robert LeBlanc
Is this a Ceph-specific option? If so, you may need to prefix it with
"ceph."; at least I had to for FUSE to pass it through to the Ceph
module/code portion.

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Thu, Jul 4, 2019 at 7:35 AM songz.gucas 
wrote:

> Hi,
>
>
> I try to add some fuse options when mount cephfs using ceph-fuse tool, but
> it errored:
>
>
> ceph-fuse -m 10.128.5.1,10.128.5.2,10.128.5.3 -r /test1 /cephfs/test1 -o
> entry_timeout=5
>
> ceph-fuse[3857515]: starting ceph client2019-07-04 21:55:37.767
> 7fc1d9cbdbc0 -1 init, newargv = 0x555d6f847490 newargc=9
>
>
> fuse: unknown option `entry_timeout=5'
>
> ceph-fuse[3857515]: fuse failed to start
>
> 2019-07-04 21:55:37.796 7fc1d9cbdbc0 -1 fuse_lowlevel_new failed
>
>
>
> How can I pass options to fuse?
>
>
> Thank you for your precious help !
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Understanding incomplete PGs

2019-07-05 Thread Kyle
On Friday, July 5, 2019 11:50:44 AM CDT Paul Emmerich wrote:
> * There are virtually no use cases for ec pools with m=1, this is a bad
> configuration as you can't have both availability and durability

I'll have to look into this more. The cluster only has 4 hosts, so it might be 
worth switching to osd failure domain for the EC pools and using k=5,m=2.
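
Something along these lines is what I have in mind (a rough sketch only, with placeholder profile and pool names); since the profile of an existing EC pool can't be changed, that would mean creating a new pool and migrating the data into it:

# new profile with failure domain osd instead of host (names are examples)
ceph osd erasure-code-profile set ec-k5m2 k=5 m=2 crush-failure-domain=osd
# new pool using that profile (pg counts would still need sizing for the cluster)
ceph osd pool create ecpool-k5m2 64 64 erasure ec-k5m2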

> 
> * Due to weird internal restrictions ec pools below their min size can't
> recover, you'll probably have to reduce min_size temporarily to recover it

Lowering min_size to 2 did allow it to recover.

> 
> * Depending on your version it might be necessary to restart some of the
> OSDs due to a bug (fixed by now) that caused it to mark some objects as
> degraded if you remove or restart an OSD while you have remapped objects
> 
> * run "ceph osd safe-to-destroy X" to check if it's safe to destroy a given
> OSD

Excellent, thanks!

> 
> > Hello,
> > 
> > I'm working with a small ceph cluster (about 10TB, 7-9 OSDs, all Bluestore
> > on
> > lvm) and recently ran into a problem with 17 pgs marked as incomplete
> > after
> > adding/removing OSDs.
> > 
> > Here's the sequence of events:
> > 1. 7 osds in the cluster, health is OK, all pgs are active+clean
> > 2. 3 new osds on a new host are added, lots of backfilling in progress
> > 3. osd 6 needs to be removed, so we do "ceph osd crush reweight osd.6 0"
> > 4. after a few hours we see "min osd.6 with 0 pgs" from "ceph osd
> > utilization"
> > 5. ceph osd out 6
> > 6. systemctl stop ceph-osd@6
> > 7. the drive backing osd 6 is pulled and wiped
> > 8. backfilling has now finished all pgs are active+clean except for 17
> > incomplete pgs
> > 
> > From reading the docs, it sounds like there has been unrecoverable data
> > loss
> > in those 17 pgs. That raises some questions for me:
> > 
> > Was "ceph osd utilization" only showing a goal of 0 pgs allocated instead
> > of
> > the current actual allocation?
> > 
> > Why is there data loss from a single osd being removed? Shouldn't that be
> > recoverable?
> > All pools in the cluster are either replicated 3 or erasure-coded k=2,m=1
> > with
> > default "host" failure domain. They shouldn't suffer data loss with a
> > single
> > osd being removed even if there were no reweighting beforehand. Does the
> > backfilling temporarily reduce data durability in some way?
> > 
> > Is there a way to see which pgs actually have data on a given osd?
> > 
> > I attached an example of one of the incomplete pgs.
> > 
> > Thanks for any help,
> > 
> > Kyle___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph performance IOPS

2019-07-05 Thread solarflow99
Just use one or more SSDs for the BlueStore DB; as long as you're within the
4% rule, I think it should be enough.
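
As a rough sketch of what that works out to (device and volume names below are just examples): with 3 TB spinners, the 4% guideline is roughly 120 GB of SSD space per OSD for the BlueStore DB, so one or two SSDs per node carved into 9 DB volumes should cover it. An OSD with its DB on the SSD can then be built with ceph-volume, e.g.:

# ~4% of a 3 TB HDD is ~120 GB of SSD space for that OSD's DB
ceph-volume lvm create --bluestore --data /dev/sdb --block.db ssd-vg/db-sdb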


On Fri, Jul 5, 2019 at 7:15 AM Davis Mendoza Paco 
wrote:

> Hi all,
> I have installed ceph luminous with 5 nodes (45 OSDs); each OSD server
> supports up to 16 HDDs and I'm only using 9.
>
> I wanted to ask for help improving IOPS performance, since I have about
> 350 virtual machines of approximately 15 GB each and I/O is very slow.
> What would you recommend?
>
> The ceph documentation recommends using SSDs for the journal, so my
> question is: how many SSDs do I have to add per server so that the
> journals of the 9 OSDs can be moved onto SSDs?
>
> I currently use ceph with OpenStack, on 11 servers running Debian Stretch:
> * 3 controller
> * 3 compute
> * 5 ceph-osd
>   network: bond lacp 10GB
>   RAM: 96GB
>   HD: 9 disk SATA-3TB (bluestore)
>
> --
> *Davis Mendoza P.*
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph zabbix monitoring

2019-07-05 Thread Heðin Ejdesgaard Møller

Check the following:
#1 zabbix_sender must be installed.
#2 Firewall is configured to allow traffic to the Zabbix server.
#3 Verify that the settings shown by "ceph zabbix config-show" are correct.
#4 Verify that you created the host on the Zabbix server, with your mgr IPs
as its "Agent interfaces".

Regards
Heðin Ejdesgaard
Synack sp/f


On hós, 2019-06-27 at 13:12 +0430, Majid Varzideh wrote:
>  Hi friends,
> I have installed ceph mimic with zabbix 3.0. I configured everything to
> monitor my cluster with zabbix and I could get data from the zabbix frontend,
> but the "ceph -s" command says "Failed to send data to Zabbix".
> Why does this happen?
> my ceph version :ceph version 13.2.6
> (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)
> zabbix 3.0.14
> thanks,
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Understanding incomplete PGs

2019-07-05 Thread Kyle
On Friday, July 5, 2019 11:28:32 AM CDT Caspar Smit wrote:
> Kyle,
> 
> Was the cluster still backfilling when you removed osd 6 or did you only
> check its utilization?

Yes, still backfilling.

> 
> Running an EC pool with m=1 is a bad idea. EC pool min_size = k+1 so losing
> a single OSD results in inaccessible data.
> Your incomplete PG's are probably all EC pool pgs, please verify.

Yes, also correct.

> 
> If the above statement is true, you could *temporarily* set min_size to 2
> (on your EC pools) to get back access to your data again but this is a very
> dangerous action. Losing another OSD during this period results in actual
> data loss.

This resolved the issue. I had seen reducing min_size mentioned elsewhere, but 
for some reason I thought that applied only to replicated pools. Thank you!

> 
> Kind regards,
> Caspar Smit
> 
> Op vr 5 jul. 2019 om 01:17 schreef Kyle :
> > Hello,
> > 
> > I'm working with a small ceph cluster (about 10TB, 7-9 OSDs, all Bluestore
> > on
> > lvm) and recently ran into a problem with 17 pgs marked as incomplete
> > after
> > adding/removing OSDs.
> > 
> > Here's the sequence of events:
> > 1. 7 osds in the cluster, health is OK, all pgs are active+clean
> > 2. 3 new osds on a new host are added, lots of backfilling in progress
> > 3. osd 6 needs to be removed, so we do "ceph osd crush reweight osd.6 0"
> > 4. after a few hours we see "min osd.6 with 0 pgs" from "ceph osd
> > utilization"
> > 5. ceph osd out 6
> > 6. systemctl stop ceph-osd@6
> > 7. the drive backing osd 6 is pulled and wiped
> > 8. backfilling has now finished all pgs are active+clean except for 17
> > incomplete pgs
> > 
> > From reading the docs, it sounds like there has been unrecoverable data
> > loss
> > in those 17 pgs. That raises some questions for me:
> > 
> > Was "ceph osd utilization" only showing a goal of 0 pgs allocated instead
> > of
> > the current actual allocation?
> > 
> > Why is there data loss from a single osd being removed? Shouldn't that be
> > recoverable?
> > All pools in the cluster are either replicated 3 or erasure-coded k=2,m=1
> > with
> > default "host" failure domain. They shouldn't suffer data loss with a
> > single
> > osd being removed even if there were no reweighting beforehand. Does the
> > backfilling temporarily reduce data durability in some way?
> > 
> > Is there a way to see which pgs actually have data on a given osd?
> > 
> > I attached an example of one of the incomplete pgs.
> > 
> > Thanks for any help,
> > 
> > Kyle___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD's won't start - thread abort

2019-07-05 Thread Gregory Farnum
On Wed, Jul 3, 2019 at 11:09 AM Austin Workman  wrote:
> Decided that if all the data was going to move, I should adjust my jerasure 
> ec profile from k=4, m=1 -> k=5, m=1 with force (is this even recommended vs. 
> just creating new pools?)
>
> Initially it unset crush-device-class=hdd to be blank
> Re-set crush-device-class
> Couldn't determine if this had any effect on the move operations.
> Changed back to k=4

You can't change the EC parameters on existing pools; Ceph has no way
of dealing with that. If it's possible to change the profile and break
the pool (which given the striping mismatch you cite later seems to be
what happened), we need to fix that.
Can you describe the exact commands you ran in that timeline?
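
To make that easier, it would also help to include the current pool and profile state; something like the following (pool and profile names are placeholders):

ceph osd erasure-code-profile ls                    # list all profiles
ceph osd erasure-code-profile get <profile-name>    # show k, m, plugin, crush settings
ceph osd pool get <pool-name> erasure_code_profile  # which profile the pool references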
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph pool EC with overwrite enabled

2019-07-05 Thread Fabio Abreu
Hi,

I see, this was a newbie problem on my side; I got a little confused when I
read about ec_pool in the community documentation.

Thanks.
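
For the archives, the combination huang jun suggests below keeps the image metadata in a replicated pool (backup2 here) and only puts the data objects in the erasure-coded pool, which needs overwrites enabled first. A sketch of the sequence (EC overwrites require BlueStore OSDs):

ceph osd pool set ec_pool allow_ec_overwrites true      # enable overwrites on the EC pool
rbd create backup2/teste --size 5T --data-pool ec_pool  # metadata in backup2, data in ec_pool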

On Fri, Jul 5, 2019 at 1:08 AM huang jun  wrote:

> try: rbd create backup2/teste --size 5T --data-pool ec_pool
>
> Fabio Abreu wrote on Fri, Jul 5, 2019 at 1:49 AM:
> >
> > Hi Everybody,
> >
> > I have a doubt about the usability of rbd with an EC pool. I tried to use
> this in my CentOS lab, but I just receive some errors when I try to create
> an rbd image inside this pool.
> >
> > Is this feature supported in a Luminous environment?
> >
> >
> http://docs.ceph.com/docs/mimic/rados/operations/erasure-code/#erasure-coding-with-overwrites
> >
> > ceph osd pool set ec_pool allow_ec_overwrites true
> >
> >
> > This error bellow happened when I try to create the RBD image :
> >
> >
> > [root@mon1 ceph-key]# rbd create backup2/teste --size 5T --data-pool
> backup2
> >
> > ...
> >
> > warning: line 9: 'osd_pool_default_crush_rule' in section 'global'
> redefined
> >
> > 2019-07-03 17:27:33.721593 7f12c3fff700 -1 librbd::image::CreateRequest:
> 0x560f2f0db0a0 handle_add_image_to_directory: error adding image to
> directory: (95) Operation not supported
> >
> > rbd: create error: (95) Operation not supported
> >
> >
> > Regards,
> > Fabio Abreu Reis
> > http://fajlinux.com.br
> > Tel : +55 21 98244-0161
> > Skype : fabioabreureis
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>


-- 
Atenciosamente,
Fabio Abreu Reis
http://fajlinux.com.br
*Tel : *+55 21 98244-0161
*Skype : *fabioabreureis
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph performance IOPS

2019-07-05 Thread Davis Mendoza Paco
Hi all,
I have installed ceph luminous with 5 nodes (45 OSDs); each OSD server
supports up to 16 HDDs and I'm only using 9.

I wanted to ask for help improving IOPS performance, since I have about 350
virtual machines of approximately 15 GB each and I/O is very slow.
What would you recommend?

The ceph documentation recommends using SSDs for the journal, so my question
is: how many SSDs do I have to add per server so that the journals of the 9
OSDs can be moved onto SSDs?

I currently use ceph with OpenStack, on 11 servers running Debian Stretch:
* 3 controller
* 3 compute
* 5 ceph-osd
  network: bond lacp 10GB
  RAM: 96GB
  HD: 9 disk SATA-3TB (bluestore)

-- 
*Davis Mendoza P.*
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-volume failed after replacing disk

2019-07-05 Thread Erik McCormick
If you create the OSD without specifying an ID it will grab the lowest
available one. Unless you have other gaps somewhere, that ID would probably
be the one you just removed.

-Erik

On Fri, Jul 5, 2019, 9:19 AM Paul Emmerich  wrote:

>
> On Fri, Jul 5, 2019 at 2:17 PM Alfredo Deza  wrote:
>
>> On Fri, Jul 5, 2019 at 6:23 AM ST Wong (ITSC) 
>> wrote:
>> >
>> > Hi,
>> >
>> >
>> >
>> > I intended to just run destroy and re-use the ID as stated in the manual,
>> but it doesn't seem to work.
>> >
>> > Am I unable to re-use the ID?
>>
>> The OSD replacement guide does not mention anything about crush and
>> auth commands. I believe you are now in a situation where the ID is no
>> longer able to be re-used, and ceph-volume
>> will not create one for you when specifying it in the CLI.
>>
>> I don't know why there is so much attachment to these ID numbers, why
>> is it desirable to have that 71 number back again?
>>
>
> it avoids unnecessary rebalances
>
>
>> >
>> >
>> >
>> > Thanks.
>> >
>> > /stwong
>> >
>> >
>> >
>> >
>> >
>> > From: Paul Emmerich 
>> > Sent: Friday, July 5, 2019 5:54 PM
>> > To: ST Wong (ITSC) 
>> > Cc: Eugen Block ; ceph-users@lists.ceph.com
>> > Subject: Re: [ceph-users] ceph-volume failed after replacing disk
>> >
>> >
>> >
>> >
>> >
>> > On Fri, Jul 5, 2019 at 11:25 AM ST Wong (ITSC) 
>> wrote:
>> >
>> > Hi,
>> >
>> > Yes, I run the commands before:
>> >
>> > # ceph osd crush remove osd.71
>> > device 'osd.71' does not appear in the crush map
>> > # ceph auth del osd.71
>> > entity osd.71 does not exist
>> >
>> >
>> >
>> > which is probably the reason why you couldn't recycle the OSD ID.
>> >
>> >
>> >
>> > Either run just destroy and re-use the ID or run purge and not re-use
>> the ID.
>> >
>> > Manually deleting auth and crush entries is no longer needed since
>> purge was introduced.
>> >
>> >
>> >
>> >
>> >
>> > Paul
>> >
>> >
>> > --
>> > Paul Emmerich
>> >
>> > Looking for help with your Ceph cluster? Contact us at https://croit.io
>> >
>> > croit GmbH
>> > Freseniusstr. 31h
>> > 81247 München
>> > www.croit.io
>> > Tel: +49 89 1896585 90
>> >
>> >
>> >
>> >
>> > Thanks.
>> > /stwong
>> >
>> > -Original Message-
>> > From: ceph-users  On Behalf Of
>> Eugen Block
>> > Sent: Friday, July 5, 2019 4:54 PM
>> > To: ceph-users@lists.ceph.com
>> > Subject: Re: [ceph-users] ceph-volume failed after replacing disk
>> >
>> > Hi,
>> >
>> > did you also remove that OSD from crush and also from auth before
>> recreating it?
>> >
>> > ceph osd crush remove osd.71
>> > ceph auth del osd.71
>> >
>> > Regards,
>> > Eugen
>> >
>> >
>> > Zitat von "ST Wong (ITSC)" :
>> >
>> > > Hi all,
>> > >
>> > > We replaced a faulty disk out of N OSD and tried to follow steps
>> > > according to "Replacing and OSD" in
>> > > http://docs.ceph.com/docs/nautilus/rados/operations/add-or-rm-osds/,
>> > > but got error:
>> > >
>> > > # ceph osd destroy 71--yes-i-really-mean-it # ceph-volume lvm create
>> > > --bluestore --data /dev/data/lv01 --osd-id
>> > > 71 --block.db /dev/db/lv01
>> > > Running command: /bin/ceph-authtool --gen-print-key Running command:
>> > > /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring
>> > > /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
>> > > -->  RuntimeError: The osd ID 71 is already in use or does not exist.
>> > >
>> > > ceph -s still shows  N OSDS.   I then remove with "ceph osd rm 71".
>> > >  Now "ceph -s" shows N-1 OSDS and id 71 doesn't appear in "ceph osd
>> > > ls".
>> > >
>> > > However, repeating the ceph-volume command still gets same error.
>> > > We're running CEPH 14.2.1.   I must have some steps missed.Would
>> > > anyone please help? Thanks a lot.
>> > >
>> > > Rgds,
>> > > /stwong
>> >
>> >
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-volume failed after replacing disk

2019-07-05 Thread Paul Emmerich
On Fri, Jul 5, 2019 at 2:17 PM Alfredo Deza  wrote:

> On Fri, Jul 5, 2019 at 6:23 AM ST Wong (ITSC)  wrote:
> >
> > Hi,
> >
> >
> >
> > I intended to just run destroy and re-use the ID as stated in the manual,
> but it doesn't seem to work.
> >
> > Am I unable to re-use the ID?
>
> The OSD replacement guide does not mention anything about crush and
> auth commands. I believe you are now in a situation where the ID is no
> longer able to be re-used, and ceph-volume
> will not create one for you when specifying it in the CLI.
>
> I don't know why there is so much attachment to these ID numbers, why
> is it desirable to have that 71 number back again?
>

it avoids unnecessary rebalances


> >
> >
> >
> > Thanks.
> >
> > /stwong
> >
> >
> >
> >
> >
> > From: Paul Emmerich 
> > Sent: Friday, July 5, 2019 5:54 PM
> > To: ST Wong (ITSC) 
> > Cc: Eugen Block ; ceph-users@lists.ceph.com
> > Subject: Re: [ceph-users] ceph-volume failed after replacing disk
> >
> >
> >
> >
> >
> > On Fri, Jul 5, 2019 at 11:25 AM ST Wong (ITSC) 
> wrote:
> >
> > Hi,
> >
> > Yes, I run the commands before:
> >
> > # ceph osd crush remove osd.71
> > device 'osd.71' does not appear in the crush map
> > # ceph auth del osd.71
> > entity osd.71 does not exist
> >
> >
> >
> > which is probably the reason why you couldn't recycle the OSD ID.
> >
> >
> >
> > Either run just destroy and re-use the ID or run purge and not re-use
> the ID.
> >
> > Manually deleting auth and crush entries is no longer needed since purge
> was introduced.
> >
> >
> >
> >
> >
> > Paul
> >
> >
> > --
> > Paul Emmerich
> >
> > Looking for help with your Ceph cluster? Contact us at https://croit.io
> >
> > croit GmbH
> > Freseniusstr. 31h
> > 81247 München
> > www.croit.io
> > Tel: +49 89 1896585 90
> >
> >
> >
> >
> > Thanks.
> > /stwong
> >
> > -Original Message-
> > From: ceph-users  On Behalf Of Eugen
> Block
> > Sent: Friday, July 5, 2019 4:54 PM
> > To: ceph-users@lists.ceph.com
> > Subject: Re: [ceph-users] ceph-volume failed after replacing disk
> >
> > Hi,
> >
> > did you also remove that OSD from crush and also from auth before
> recreating it?
> >
> > ceph osd crush remove osd.71
> > ceph auth del osd.71
> >
> > Regards,
> > Eugen
> >
> >
> > Zitat von "ST Wong (ITSC)" :
> >
> > > Hi all,
> > >
> > > We replaced a faulty disk out of N OSD and tried to follow steps
> > > according to "Replacing and OSD" in
> > > http://docs.ceph.com/docs/nautilus/rados/operations/add-or-rm-osds/,
> > > but got error:
> > >
> > > # ceph osd destroy 71--yes-i-really-mean-it # ceph-volume lvm create
> > > --bluestore --data /dev/data/lv01 --osd-id
> > > 71 --block.db /dev/db/lv01
> > > Running command: /bin/ceph-authtool --gen-print-key Running command:
> > > /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring
> > > /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
> > > -->  RuntimeError: The osd ID 71 is already in use or does not exist.
> > >
> > > ceph -s still shows  N OSDS.   I then remove with "ceph osd rm 71".
> > >  Now "ceph -s" shows N-1 OSDS and id 71 doesn't appear in "ceph osd
> > > ls".
> > >
> > > However, repeating the ceph-volume command still gets same error.
> > > We're running CEPH 14.2.1.   I must have some steps missed.Would
> > > anyone please help? Thanks a lot.
> > >
> > > Rgds,
> > > /stwong
> >
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] set_mon_vals failed to set cluster_network Configuration option 'cluster_network' may not be modified at runtime

2019-07-05 Thread Vandeir Eduardo
I found a workaround for this problem.

As described by Manuel Rios in https://tracker.ceph.com/issues/40282,
the workaround is to include these settings:

public_network = x.x.x.x
cluster_network = y.y.y.y

also into ceph.conf on ceph client machines.
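
Concretely, that means something like this in /etc/ceph/ceph.conf on each client (the subnets below are the ones from my environment; adjust to yours):

[global]
public_network = 10.1.1.0/24
cluster_network = 10.1.2.0/24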

On Tue, Jul 2, 2019 at 11:28 AM Vandeir Eduardo
 wrote:
>
> Hi,
>
> on client machines, when I use the command rbd, for example, rbd ls
> poolname, this message is always displayed:
>
> 2019-07-02 11:18:10.613 7fb2eaffd700 -1 set_mon_vals failed to set
> cluster_network = 10.1.2.0/24: Configuration option 'cluster_network'
> may not be modified at runtime
> 2019-07-02 11:18:10.613 7fb2eaffd700 -1 set_mon_vals failed to set
> public_network = 10.1.1.0/24: Configuration option 'public_network'
> may not be modified at runtime
> 2019-07-02 11:18:10.621 7fb2ea7fc700 -1 set_mon_vals failed to set
> cluster_network = 10.1.2.0/24: Configuration option 'cluster_network'
> may not be modified at runtime
> 2019-07-02 11:18:10.621 7fb2ea7fc700 -1 set_mon_vals failed to set
> public_network = 10.1.1.0/24: Configuration option 'public_network'
> may not be modified at runtime
>
> After this, rbd image names are displayed normally.
>
> If I run this command on a ceph node, these "warning/information???"
> messages are not displayed. Is there a way to get rid of this? It's
> really annoying.
>
> The only thread I found about something similar was this:
> https://www.spinics.net/lists/ceph-devel/msg42657.html
>
> I already tryied the commands "ceph config rm global cluster_network"
> and "ceph config rm global public_network", but the messages still
> persist.
>
> Any ideas?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-volume failed after replacing disk

2019-07-05 Thread Alfredo Deza
On Fri, Jul 5, 2019 at 6:23 AM ST Wong (ITSC)  wrote:
>
> Hi,
>
>
>
> I intended to just run destroy and re-use the ID as stated in the manual,
> but it doesn't seem to work.
>
> Am I unable to re-use the ID?

The OSD replacement guide does not mention anything about crush and
auth commands. I believe you are now in a situation where the ID is no
longer able to be re-used, and ceph-volume
will not create one for you when specifying it in the CLI.

I don't know why there is so much attachment to these ID numbers, why
is it desirable to have that 71 number back again?
>
>
>
> Thanks.
>
> /stwong
>
>
>
>
>
> From: Paul Emmerich 
> Sent: Friday, July 5, 2019 5:54 PM
> To: ST Wong (ITSC) 
> Cc: Eugen Block ; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] ceph-volume failed after replacing disk
>
>
>
>
>
> On Fri, Jul 5, 2019 at 11:25 AM ST Wong (ITSC)  wrote:
>
> Hi,
>
> Yes, I run the commands before:
>
> # ceph osd crush remove osd.71
> device 'osd.71' does not appear in the crush map
> # ceph auth del osd.71
> entity osd.71 does not exist
>
>
>
> which is probably the reason why you couldn't recycle the OSD ID.
>
>
>
> Either run just destroy and re-use the ID or run purge and not re-use the ID.
>
> Manually deleting auth and crush entries is no longer needed since purge was 
> introduced.
>
>
>
>
>
> Paul
>
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
>
>
>
> Thanks.
> /stwong
>
> -Original Message-
> From: ceph-users  On Behalf Of Eugen Block
> Sent: Friday, July 5, 2019 4:54 PM
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] ceph-volume failed after replacing disk
>
> Hi,
>
> did you also remove that OSD from crush and also from auth before recreating 
> it?
>
> ceph osd crush remove osd.71
> ceph auth del osd.71
>
> Regards,
> Eugen
>
>
> Zitat von "ST Wong (ITSC)" :
>
> > Hi all,
> >
> > We replaced a faulty disk out of N OSD and tried to follow steps
> > according to "Replacing and OSD" in
> > http://docs.ceph.com/docs/nautilus/rados/operations/add-or-rm-osds/,
> > but got error:
> >
> > # ceph osd destroy 71--yes-i-really-mean-it # ceph-volume lvm create
> > --bluestore --data /dev/data/lv01 --osd-id
> > 71 --block.db /dev/db/lv01
> > Running command: /bin/ceph-authtool --gen-print-key Running command:
> > /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring
> > /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
> > -->  RuntimeError: The osd ID 71 is already in use or does not exist.
> >
> > ceph -s still shows  N OSDS.   I then remove with "ceph osd rm 71".
> >  Now "ceph -s" shows N-1 OSDS and id 71 doesn't appear in "ceph osd
> > ls".
> >
> > However, repeating the ceph-volume command still gets same error.
> > We're running CEPH 14.2.1.   I must have some steps missed.Would
> > anyone please help? Thanks a lot.
> >
> > Rgds,
> > /stwong
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] details about cloning objects using librados

2019-07-05 Thread nokia ceph
Thank you Greg, we will try this out.

Thanks,
Muthu

On Wed, Jul 3, 2019 at 11:12 PM Gregory Farnum  wrote:

> Well, the RADOS interface doesn't have a great deal of documentation
> so I don't know if I can point you at much.
>
> But if you look at Objecter.h, you see that the ObjectOperation has
> this function:
> void copy_from(object_t src, snapid_t snapid, object_locator_t
> src_oloc, version_t src_version, unsigned flags, unsigned
> src_fadvise_flags)
>
> src: the object to copy from
> snapid: if you want to copy a specific snap instead of HEAD
> src_oloc: the object locator for the object
> src_version: the version of the object to copy from (helps identify if
> it was updated in the meantime)
> flags: probably don't want to set these, but see
> PrimaryLogPG::_copy_some for the choices
> src_fadvise_flags: these are the fadvise flags we have in various
> places that let you specify things like not to cache the data.
> Probably leave them unset.
>
> -Greg
>
>
>
> On Wed, Jul 3, 2019 at 2:47 AM nokia ceph 
> wrote:
> >
> > Hi Greg,
> >
> > Can you please share the api details  for COPY_FROM or any reference
> document?
> >
> > Thanks ,
> > Muthu
> >
> > On Wed, Jul 3, 2019 at 4:12 AM Brad Hubbard  wrote:
> >>
> >> On Wed, Jul 3, 2019 at 4:25 AM Gregory Farnum 
> wrote:
> >> >
> >> > I'm not sure how or why you'd get an object class involved in doing
> >> > this in the normal course of affairs.
> >> >
> >> > There's a copy_from op that a client can send and which copies an
> >> > object from another OSD into the target object. That's probably the
> >> > primitive you want to build on. Note that the OSD doesn't do much
> >>
> >> Argh! yes, good idea. We really should document that!
> >>
> >> > consistency checking (it validates that the object version matches an
> >> > input, but if they don't it just returns an error) so the client
> >> > application is responsible for any locking needed.
> >> > -Greg
> >> >
> >> > On Tue, Jul 2, 2019 at 3:49 AM Brad Hubbard 
> wrote:
> >> > >
> >> > > Yes, this should be possible using an object class which is also a
> >> > > RADOS client (via the RADOS API). You'll still have some client
> >> > > traffic as the machine running the object class will still need to
> >> > > connect to the relevant primary osd and send the write (presumably
> in
> >> > > some situations though this will be the same machine).
> >> > >
> >> > > On Tue, Jul 2, 2019 at 4:08 PM nokia ceph 
> wrote:
> >> > > >
> >> > > > Hi Brett,
> >> > > >
> >> > > > I think I was wrong here in the requirement description. It is
> not about data replication , we need same content stored in different
> object/name.
> >> > > > We store video contents inside the ceph cluster. And our new
> requirement is we need to store same content for different users , hence
> need same content in different object name . if client sends write request
> for object x and sets number of copies as 100, then cluster has to clone
> 100 copies of object x and store it as object x1, objectx2,etc. Currently
> this is done in the client side where objectx1, object x2...objectx100 are
> cloned inside the client and write request sent for all 100 objects which
> we want to avoid to reduce network consumption.
> >> > > >
> >> > > > Similar usecases are rbd snapshot , radosgw copy .
> >> > > >
> >> > > > Is this possible in object class ?
> >> > > >
> >> > > > thanks,
> >> > > > Muthu
> >> > > >
> >> > > >
> >> > > > On Mon, Jul 1, 2019 at 7:58 PM Brett Chancellor <
> bchancel...@salesforce.com> wrote:
> >> > > >>
> >> > > >> Ceph already does this by default. For each replicated pool, you
> can set the 'size' which is the number of copies you want Ceph to maintain.
> The accepted norm for replicas is 3, but you can set it higher if you want
> to incur the performance penalty.
> >> > > >>
> >> > > >> On Mon, Jul 1, 2019, 6:01 AM nokia ceph <
> nokiacephus...@gmail.com> wrote:
> >> > > >>>
> >> > > >>> Hi Brad,
> >> > > >>>
> >> > > >>> Thank you for your response , and we will check this video as
> well.
> >> > > >>> Our requirement is that, while writing an object into the
> cluster, if we can provide the number of copies to be made, the network
> consumption between client and cluster will be only one object write.
> However, the cluster will clone/copy the object multiple times and store
> the copies inside the cluster.
> >> > > >>>
> >> > > >>> Thanks,
> >> > > >>> Muthu
> >> > > >>>
> >> > > >>> On Fri, Jun 28, 2019 at 9:23 AM Brad Hubbard <
> bhubb...@redhat.com> wrote:
> >> > > 
> >> > >  On Thu, Jun 27, 2019 at 8:58 PM nokia ceph <
> nokiacephus...@gmail.com> wrote:
> >> > >  >
> >> > >  > Hi Team,
> >> > >  >
> >> > >  > We have a requirement to create multiple copies of an object,
> and currently we are handling it on the client side by writing them as
> separate objects, which causes huge network traffic between client and
> cluster.
> >> > >  > Is there a possibility of cloning an object into multiple copies
> using the librados api?
> >> > 

Re: [ceph-users] ceph-volume failed after replacing disk

2019-07-05 Thread ST Wong (ITSC)
Hi,

I intended to just run destroy and re-use the ID as stated in the manual, but
it doesn't seem to work.
Am I unable to re-use the ID?

Thanks.
/stwong


From: Paul Emmerich 
Sent: Friday, July 5, 2019 5:54 PM
To: ST Wong (ITSC) 
Cc: Eugen Block ; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] ceph-volume failed after replacing disk


On Fri, Jul 5, 2019 at 11:25 AM ST Wong (ITSC)  wrote:
Hi,

Yes, I run the commands before:

# ceph osd crush remove osd.71
device 'osd.71' does not appear in the crush map
# ceph auth del osd.71
entity osd.71 does not exist

which is probably the reason why you couldn't recycle the OSD ID.

Either run just destroy and re-use the ID or run purge and not re-use the ID.
Manually deleting auth and crush entries is no longer needed since purge was 
introduced.


Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


Thanks.
/stwong

-Original Message-
From: ceph-users  On Behalf Of Eugen Block
Sent: Friday, July 5, 2019 4:54 PM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] ceph-volume failed after replacing disk

Hi,

did you also remove that OSD from crush and also from auth before recreating it?

ceph osd crush remove osd.71
ceph auth del osd.71

Regards,
Eugen


Zitat von "ST Wong (ITSC)" 
mailto:s...@itsc.cuhk.edu.hk>>:

> Hi all,
>
> We replaced a faulty disk out of N OSD and tried to follow steps
> according to "Replacing and OSD" in
> http://docs.ceph.com/docs/nautilus/rados/operations/add-or-rm-osds/,
> but got error:
>
> # ceph osd destroy 71--yes-i-really-mean-it # ceph-volume lvm create
> --bluestore --data /dev/data/lv01 --osd-id
> 71 --block.db /dev/db/lv01
> Running command: /bin/ceph-authtool --gen-print-key Running command:
> /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring
> /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
> -->  RuntimeError: The osd ID 71 is already in use or does not exist.
>
> ceph -s still shows  N OSDS.   I then remove with "ceph osd rm 71".
>  Now "ceph -s" shows N-1 OSDS and id 71 doesn't appear in "ceph osd
> ls".
>
> However, repeating the ceph-volume command still gets same error.
> We're running CEPH 14.2.1.   I must have some steps missed.Would
> anyone please help? Thanks a lot.
>
> Rgds,
> /stwong



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-volume failed after replacing disk

2019-07-05 Thread Paul Emmerich
On Fri, Jul 5, 2019 at 11:25 AM ST Wong (ITSC)  wrote:

> Hi,
>
> Yes, I run the commands before:
>
> # ceph osd crush remove osd.71
> device 'osd.71' does not appear in the crush map
> # ceph auth del osd.71
> entity osd.71 does not exist
>

which is probably the reason why you couldn't recycle the OSD ID.

Either run just destroy and re-use the ID or run purge and not re-use the
ID.
Manually deleting auth and crush entries is no longer needed since purge
was introduced.
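
As a sketch, using OSD id 71 and the device paths from the original post, the two clean paths look roughly like this:

# keep the ID (avoids rebalancing): destroy, then recreate with the same ID
ceph osd destroy 71 --yes-i-really-mean-it
ceph-volume lvm create --bluestore --data /dev/data/lv01 --block.db /dev/db/lv01 --osd-id 71

# or give up the ID: purge also removes the crush and auth entries
ceph osd purge 71 --yes-i-really-mean-it
ceph-volume lvm create --bluestore --data /dev/data/lv01 --block.db /dev/db/lv01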


Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


>
> Thanks.
> /stwong
>
> -Original Message-
> From: ceph-users  On Behalf Of Eugen
> Block
> Sent: Friday, July 5, 2019 4:54 PM
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] ceph-volume failed after replacing disk
>
> Hi,
>
> did you also remove that OSD from crush and also from auth before
> recreating it?
>
> ceph osd crush remove osd.71
> ceph auth del osd.71
>
> Regards,
> Eugen
>
>
> Zitat von "ST Wong (ITSC)" :
>
> > Hi all,
> >
> > We replaced a faulty disk out of N OSD and tried to follow steps
> > according to "Replacing and OSD" in
> > http://docs.ceph.com/docs/nautilus/rados/operations/add-or-rm-osds/,
> > but got error:
> >
> > # ceph osd destroy 71--yes-i-really-mean-it # ceph-volume lvm create
> > --bluestore --data /dev/data/lv01 --osd-id
> > 71 --block.db /dev/db/lv01
> > Running command: /bin/ceph-authtool --gen-print-key Running command:
> > /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring
> > /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
> > -->  RuntimeError: The osd ID 71 is already in use or does not exist.
> >
> > ceph -s still shows  N OSDS.   I then remove with "ceph osd rm 71".
> >  Now "ceph -s" shows N-1 OSDS and id 71 doesn't appear in "ceph osd
> > ls".
> >
> > However, repeating the ceph-volume command still gets same error.
> > We're running CEPH 14.2.1.   I must have some steps missed.Would
> > anyone please help? Thanks a lot.
> >
> > Rgds,
> > /stwong
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Understanding incomplete PGs

2019-07-05 Thread Paul Emmerich
* There are virtually no use cases for ec pools with m=1, this is a bad
configuration as you can't have both availability and durability

* Due to weird internal restrictions ec pools below their min size can't
recover, you'll probably have to reduce min_size temporarily to recover it

* Depending on your version it might be necessary to restart some of the
OSDs due to a bug (fixed by now) that caused it to mark some objects as
degraded if you remove or restart an OSD while you have remapped objects

* run "ceph osd safe-to-destroy X" to check if it's safe to destroy a given
OSD
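
For the osd.6 case in question, a quick sketch of that check (run it before pulling the disk, and only proceed once it reports the OSD is safe to destroy):

ceph osd safe-to-destroy osd.6   # reports whether osd.6 can be destroyed without reducing data safety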




-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


On Fri, Jul 5, 2019 at 1:17 AM Kyle  wrote:

> Hello,
>
> I'm working with a small ceph cluster (about 10TB, 7-9 OSDs, all Bluestore
> on
> lvm) and recently ran into a problem with 17 pgs marked as incomplete
> after
> adding/removing OSDs.
>
> Here's the sequence of events:
> 1. 7 osds in the cluster, health is OK, all pgs are active+clean
> 2. 3 new osds on a new host are added, lots of backfilling in progress
> 3. osd 6 needs to be removed, so we do "ceph osd crush reweight osd.6 0"
> 4. after a few hours we see "min osd.6 with 0 pgs" from "ceph osd
> utilization"
> 5. ceph osd out 6
> 6. systemctl stop ceph-osd@6
> 7. the drive backing osd 6 is pulled and wiped
> 8. backfilling has now finished all pgs are active+clean except for 17
> incomplete pgs
>
> From reading the docs, it sounds like there has been unrecoverable data
> loss
> in those 17 pgs. That raises some questions for me:
>
> Was "ceph osd utilization" only showing a goal of 0 pgs allocated instead
> of
> the current actual allocation?
>
> Why is there data loss from a single osd being removed? Shouldn't that be
> recoverable?
> All pools in the cluster are either replicated 3 or erasure-coded k=2,m=1
> with
> default "host" failure domain. They shouldn't suffer data loss with a
> single
> osd being removed even if there were no reweighting beforehand. Does the
> backfilling temporarily reduce data durability in some way?
>
> Is there a way to see which pgs actually have data on a given osd?
>
> I attached an example of one of the incomplete pgs.
>
> Thanks for any help,
>
> Kyle___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Understanding incomplete PGs

2019-07-05 Thread Caspar Smit
Kyle,

Was the cluster still backfilling when you removed osd 6 or did you only
check its utilization?

Running an EC pool with m=1 is a bad idea. EC pool min_size = k+1 so losing
a single OSD results in inaccessible data.
Your incomplete PG's are probably all EC pool pgs, please verify.

If the above statement is true, you could *temporarily* set min_size to 2
(on your EC pools) to get back access to your data again but this is a very
dangerous action. Losing another OSD during this period results in actual
data loss.
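
If you do go down that road, it is just a pool setting; a sketch, with ec_pool standing in for the affected pool's name (and remember to raise it back to k+1, i.e. 3 here, once recovery has finished):

ceph osd pool set ec_pool min_size 2   # temporary, only while recovering
# ... wait for recovery to finish ...
ceph osd pool set ec_pool min_size 3   # back to k+1 for a k=2,m=1 pool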

Kind regards,
Caspar Smit

Op vr 5 jul. 2019 om 01:17 schreef Kyle :

> Hello,
>
> I'm working with a small ceph cluster (about 10TB, 7-9 OSDs, all Bluestore
> on
> lvm) and recently ran into a problem with 17 pgs marked as incomplete
> after
> adding/removing OSDs.
>
> Here's the sequence of events:
> 1. 7 osds in the cluster, health is OK, all pgs are active+clean
> 2. 3 new osds on a new host are added, lots of backfilling in progress
> 3. osd 6 needs to be removed, so we do "ceph osd crush reweight osd.6 0"
> 4. after a few hours we see "min osd.6 with 0 pgs" from "ceph osd
> utilization"
> 5. ceph osd out 6
> 6. systemctl stop ceph-osd@6
> 7. the drive backing osd 6 is pulled and wiped
> 8. backfilling has now finished all pgs are active+clean except for 17
> incomplete pgs
>
> From reading the docs, it sounds like there has been unrecoverable data
> loss
> in those 17 pgs. That raises some questions for me:
>
> Was "ceph osd utilization" only showing a goal of 0 pgs allocated instead
> of
> the current actual allocation?
>
> Why is there data loss from a single osd being removed? Shouldn't that be
> recoverable?
> All pools in the cluster are either replicated 3 or erasure-coded k=2,m=1
> with
> default "host" failure domain. They shouldn't suffer data loss with a
> single
> osd being removed even if there were no reweighting beforehand. Does the
> backfilling temporarily reduce data durability in some way?
>
> Is there a way to see which pgs actually have data on a given osd?
>
> I attached an example of one of the incomplete pgs.
>
> Thanks for any help,
>
> Kyle___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-volume failed after replacing disk

2019-07-05 Thread ST Wong (ITSC)
Hi,

Yes, I ran these commands before:

# ceph osd crush remove osd.71
device 'osd.71' does not appear in the crush map
# ceph auth del osd.71
entity osd.71 does not exist

Thanks.
/stwong

-Original Message-
From: ceph-users  On Behalf Of Eugen Block
Sent: Friday, July 5, 2019 4:54 PM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] ceph-volume failed after replacing disk

Hi,

did you also remove that OSD from crush and also from auth before recreating it?

ceph osd crush remove osd.71
ceph auth del osd.71

Regards,
Eugen


Zitat von "ST Wong (ITSC)" :

> Hi all,
>
> We replaced a faulty disk out of N OSD and tried to follow steps 
> according to "Replacing and OSD" in 
> http://docs.ceph.com/docs/nautilus/rados/operations/add-or-rm-osds/,
> but got error:
>
> # ceph osd destroy 71--yes-i-really-mean-it # ceph-volume lvm create 
> --bluestore --data /dev/data/lv01 --osd-id
> 71 --block.db /dev/db/lv01
> Running command: /bin/ceph-authtool --gen-print-key Running command: 
> /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring 
> /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
> -->  RuntimeError: The osd ID 71 is already in use or does not exist.
>
> ceph -s still shows  N OSDS.   I then remove with "ceph osd rm 71".   
>  Now "ceph -s" shows N-1 OSDS and id 71 doesn't appear in "ceph osd 
> ls".
>
> However, repeating the ceph-volume command still gets same error.
> We're running CEPH 14.2.1.   I must have some steps missed.Would  
> anyone please help? Thanks a lot.
>
> Rgds,
> /stwong



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-volume failed after replacing disk

2019-07-05 Thread Eugen Block

Hi,

did you also remove that OSD from crush and also from auth before  
recreating it?


ceph osd crush remove osd.71
ceph auth del osd.71

Regards,
Eugen


Zitat von "ST Wong (ITSC)" :


Hi all,

We replaced a faulty disk in one of our N OSDs and tried to follow the steps
in "Replacing an OSD" at
http://docs.ceph.com/docs/nautilus/rados/operations/add-or-rm-osds/,
but got an error:


# ceph osd destroy 71 --yes-i-really-mean-it
# ceph-volume lvm create --bluestore --data /dev/data/lv01 --osd-id  
71 --block.db /dev/db/lv01

Running command: /bin/ceph-authtool --gen-print-key
Running command: /bin/ceph --cluster ceph --name  
client.bootstrap-osd --keyring  
/var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json

-->  RuntimeError: The osd ID 71 is already in use or does not exist.

ceph -s still shows N OSDs. I then removed it with "ceph osd rm 71".
Now "ceph -s" shows N-1 OSDs and id 71 doesn't appear in "ceph osd
ls".


However, repeating the ceph-volume command still gives the same error.
We're running Ceph 14.2.1. I must have missed some steps. Would
anyone please help? Thanks a lot.


Rgds,
/stwong




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Invalid metric type, prometheus module with rbd mirroring

2019-07-05 Thread Michel Raabe

Hi Brett!

FYI, it was fixed last month:

https://github.com/ceph/ceph/commit/425c5358fed9376939cff8a922c3ce1186d6b9e2


HTH,
Michel

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com