Re: [ceph-users] Consumer-grade SSD in Ceph

2019-12-20 Thread Matthew H
Hi Sinan,

I would not recommend using 860 EVO or Crucial MX500 SSD's in a Ceph cluster, 
as those are consumer grade solutions and not enterprise ones.

Performance and durability will be issues. If feasible, I would simply go with 
NVMe, as it sounds like you will be using this disk to store the journal or DB 
partition.


From: ceph-users  on behalf of Antoine 
Lecrux 
Sent: Thursday, December 19, 2019 4:02 PM
To: Udo Lembke ; Sinan Polat 
Cc: ceph-users@lists.ceph.com 
Subject: Re: [ceph-users] Consumer-grade SSD in Ceph

Hi,

If you're looking for a consumer grade SSD, make sure it has capacitors to 
protect you from data corruption in case of a power outage on the entire Ceph 
Cluster.
That's the most important technical specification to look for.

- Antoine

-Original Message-
From: ceph-users  On Behalf Of Udo Lembke
Sent: Thursday, December 19, 2019 3:22 PM
To: Sinan Polat 
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Consumer-grade SSD in Ceph

Hi,
if you add an SSD with a short lifetime on more than one server, you can run 
into real trouble (data loss)!
Even if all the other SSDs are enterprise grade.
Ceph mixes all data into PGs, which are spread over many disks - if one disk fails, 
no problem, but if the next two fail after that due to high I/O
(recovery), you will have data loss.
But if you have only one node with consumer SSDs, the whole node can go down 
without trouble...

I tried consumer SSDs as journals a long time ago - it was a bad idea!
But these SSDs are cheap - buy one and do the I/O test.
If you monitor the lifetime/wearout, it may well be workable for your setup.
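
For reference, the kind of I/O test usually meant here is a single-threaded sync
write, plus keeping an eye on the SMART wear attributes - a sketch only, and it
overwrites the device, so run it against an empty disk:

fio --name=journal-test --filename=/dev/sdX --direct=1 --sync=1 --rw=write \
    --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting
smartctl -A /dev/sdX    # watch the wear-leveling / media-wearout attribute over time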

Udo


Am 19.12.19 um 20:20 schrieb Sinan Polat:
> Hi all,
>
> Thanks for the replies. I am not worried about their lifetime. We will be 
> adding only 1 SSD disk per physical server. All SSDs are enterprise drives. 
> If the added consumer-grade disk fails, no problem.
>
> I am more curious regarding their I/O performance. I do not want to have a 50% 
> drop in performance.
>
> So anyone any experience with 860 EVO or Crucial MX500 in a Ceph setup?
>
> Thanks!
>
>> On 19 Dec. 2019 at 19:18, Mark Nelson  wrote the 
>> following:
>>
>> The way I try to look at this is:
>>
>>
>> 1) How much more do the enterprise grade drives cost?
>>
>> 2) What are the benefits? (Faster performance, longer life, etc)
>>
>> 3) How much does it cost to deal with downtime, diagnose issues, and replace 
>> malfunctioning hardware?
>>
>>
>> My personal take is that enterprise drives are usually worth it. There may 
>> be consumer grade drives that may be worth considering in very specific 
>> scenarios if they still have power loss protection and high write 
>> durability.  Even when I was in academia years ago with very limited 
>> budgets, we got burned with consumer grade SSDs to the point where we had to 
>> replace them all.  You have to be very careful and know exactly what you are 
>> buying.
>>
>>
>> Mark
>>
>>
>>> On 12/19/19 12:04 PM, jes...@krogh.cc wrote:
>>> I don't think “usually” is good enough in a production setup.
>>>
>>>
>>>
>>> Sent from myMail for iOS
>>>
>>>
>>> Thursday, 19 December 2019, 12.09 +0100 from Виталий Филиппов 
>>> :
>>>
>>>Usually it doesn't, it only harms performance and probably SSD
>>>lifetime
>>>too
>>>
>>>> I would not be running Ceph on SSDs without power-loss protection. It
>>>> delivers a potential data loss scenario
>>>
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rebalancing ceph cluster

2019-06-25 Thread Matthew H
If you are running Luminous or newer, you can simply enable the balancer module 
[1].

[1]
http://docs.ceph.com/docs/luminous/mgr/balancer/
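
For example (a sketch; mode names per the Luminous docs above):

ceph mgr module enable balancer
ceph balancer mode crush-compat    # or 'upmap' if every client is Luminous or newer
ceph balancer on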



From: ceph-users  on behalf of Robert 
LeBlanc 
Sent: Tuesday, June 25, 2019 5:22 PM
To: jinguk.k...@ungleich.ch
Cc: ceph-users
Subject: Re: [ceph-users] rebalancing ceph cluster

The placement of PGs is random in the cluster and takes into account any CRUSH 
rules which may also skew the distribution. Having more PGs will help give more 
options for placing PGs, but it still may not be adequate. It is recommended to 
have between 100-150 PGs per OSD, and you are pretty close. If you aren't 
planning to add any more pools, then splitting the PGs for pools that have a 
lot of data can help.

To get things to be more balanced, you can reweight the high-utilization OSDs 
down to cause CRUSH to migrate some PGs off them. This doesn't mean they will get 
moved to the lowest-utilized OSDs (they might wind up on another one that is 
pretty full), so it may take several iterations to get things balanced. Just 
be sure that if you reweight one down and it ends up with much lower usage than the 
others, you reweight it back up to attract some PGs back to it.

```ceph osd reweight {osd-num} {weight}```
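
For example, to nudge an overfull OSD down a bit and later restore it (numbers illustrative):

```ceph osd reweight 45 0.95```
```ceph osd reweight 45 1.0```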

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Mon, Jun 24, 2019 at 2:25 AM 
jinguk.k...@ungleich.ch 
mailto:jinguk.k...@ungleich.ch>> wrote:
Hello everyone,

We have some OSDs in our Ceph cluster.
One OSD's usage is more than 77% while another OSD's usage is 39% on the same 
host.

I wonder why the OSD usage differs so much (the difference is large), and how can 
I fix it?

ID  CLASS  WEIGHT    REWEIGHT  SIZE     USE      AVAIL    %USE   VAR   PGS  TYPE NAME
-2         93.26010         -  93.3TiB  52.3TiB  41.0TiB  56.04  0.98    -  host serverA
…...
33  HDD     9.09511       1.0  9.10TiB  3.55TiB  5.54TiB  39.08  0.68   66      osd.4
45  HDD     7.27675       1.0  7.28TiB  5.64TiB  1.64TiB  77.53  1.36   81      osd.7
…...

-5         79.99017         -  80.0TiB  47.7TiB  32.3TiB  59.62  1.04    -  host serverB
 1  HDD     9.09511       1.0  9.10TiB  4.79TiB  4.31TiB  52.63  0.92   87      osd.1
 6  HDD     9.09511       1.0  9.10TiB  6.62TiB  2.48TiB  72.75  1.27   99      osd.6
…...

Thank you
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw sync falling behind regularly

2019-03-05 Thread Matthew H
Hi Christian,

To be on the safe side and to future-proof yourself, you will want to go ahead and set 
the following in your ceph.conf file, and then restart your RGW 
instances.

rgw_dynamic_resharding = false

There are a number of issues with dynamic resharding, multisite RGW problems 
being just one of them. However, I thought it was disabled automatically when 
multisite RGW is used (but I will have to double-check the code on that). What 
version of Ceph did you initially install the cluster with? Prior to v12.2.2 
this feature was enabled by default for all RGW use cases.
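
For example (a sketch; use whatever section name your RGW instances actually run under):

# ceph.conf on each RGW host
[client.rgw.dc11-ceph-rgw1]
rgw_dynamic_resharding = false

sudo systemctl restart ceph-radosgw@rgw.dc11-ceph-rgw1.service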

Thanks,


From: Christian Rice 
Sent: Tuesday, March 5, 2019 2:07 PM
To: Matthew H; ceph-users
Subject: Re: radosgw sync falling behind regularly


Matthew, first of all, let me say we very much appreciate your help!



So I don’t think we turned dynamic resharding on, nor did we manually reshard 
buckets.  Seems like it defaults to on for luminous but the mimic docs say it’s 
not supported in multisite.  So do we need to disable it manually via tell and 
ceph.conf?



Also, after running the command you suggested, all the stale instances are 
gone… the ones from my examples were in the output:

"bucket_instance": 
"sysad_task/sysad-task:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18467.303",

"bucket_instance": 
"sysad_task/sysad-task:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.299",

"bucket_instance": 
"sysad_task/sysad-task:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.301",



Though we still get lots of log messages like so in rgw:



2019-03-05 11:01:09.526120 7f64120ae700  0 ERROR: failed to get bucket instance 
info for 
.bucket.meta.sysad_task:sysad-task:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.299

2019-03-05 11:01:09.528664 7f63e5016700  1 civetweb: 0x55976f1c2000: 
172.17.136.17 - - [05/Mar/2019:10:54:06 -0800] "GET 
/admin/metadata/bucket.instance/sysad_task/sysad-task:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.299?key=sysad_task%2Fsysad-task%3A1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.299&rgwx-zonegroup=de6af748-1a2f-44a1-9d44-30799cf1313e
 HTTP/1.1" 404 0 - -

2019-03-05 11:01:09.529648 7f64130b0700  0 meta sync: ERROR: can't remove key: 
bucket.instance:sysad_task/sysad-task:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.299
 ret=-2

2019-03-05 11:01:09.530324 7f64138b1700  0 ERROR: failed to get bucket instance 
info for 
.bucket.meta.sysad_task:sysad-task:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.299

2019-03-05 11:01:09.530345 7f6405094700  0 data sync: ERROR: failed to retrieve 
bucket info for 
bucket=sysad_task/sysad-task:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.299

2019-03-05 11:01:09.531774 7f6405094700  0 data sync: WARNING: skipping data 
log entry for missing bucket 
sysad_task/sysad-task:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.299

2019-03-05 11:01:09.571680 7f6405094700  0 data sync: ERROR: init sync on 
sysad_task/sysad-task:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.302 failed, 
retcode=-2

2019-03-05 11:01:09.573179 7f6405094700  0 data sync: WARNING: skipping data 
log entry for missing bucket 
sysad_task/sysad-task:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.302

2019-03-05 11:01:13.504308 7f63f903e700  1 civetweb: 0x55976f0f2000: 
10.105.18.20 - - [05/Mar/2019:11:00:57 -0800] "GET 
/admin/metadata/bucket.instance/sysad_task/sysad-task:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.299?key=sysad_task%2Fsysad-task%3A1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.299&rgwx-zonegroup=de6af748-1a2f-44a1-9d44-30799cf1313e
 HTTP/1.1" 404 0 - -



From: Matthew H 
Date: Tuesday, March 5, 2019 at 10:03 AM
To: Christian Rice , ceph-users 
Subject: Re: radosgw sync falling behind regularly



Hi Christian,



You have stale bucket instances that need to be cleaned up, which is what 
'radosgw-admin reshard stale-instances list' is showing you. Have you been, or were 
you, manually resharding your buckets? The errors you are seeing in the logs are 
related to these stale instances being kept around.



In v12.2.11 this command, along with 'radosgw-admin reshard stale-instances rm', 
was introduced [1].



Hopefully this helps.



[1]

https://ceph.com/releases/v12-2-11-luminous-released/<https://urldefense.proofpoint.com/v2/url?u=https-3A__ceph.com_releases_v12-2D2-2D11-2Dluminous-2Dreleased_&d=DwMF-g&c=gFTBenQ7Vj71sUi1A4CkFnmPzqwDo07QsHw-JRepxyw&r=NE1NbWtVhgG-K7YvLdoLZigfFx8zGPwOGk6HWpYK04I&m=vdtYIn6lEKaWD9wW297aHjQLpmQdHZrOVpOhmCBqkqo&s=nGCpS4p5jnaSpPUFlziSi3Y3pFijhVDy6e3867jA9BE&e=>



"There have been fixes to RGW dynamic and manual resharding, which no longer
leaves behind stale bucket instances to be removed manually. For finding and
cleaning up older instances from a reshard a radosgw-admin command reshard
stale-instances list and reshard stale-instances rm should do the necessary
cleanup."
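
In other words, something along these lines (run on the cluster holding the stale instances):

radosgw-admin reshard stale-instances list
radosgw-admin reshard stale-instances rm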






Re: [ceph-users] radosgw sync falling behind regularly

2019-03-04 Thread Matthew H
Christian,

Can you provide your zonegroup and zones configurations for all 3 rgw sites? 
(run the commands for each site please)

Thanks,


From: Christian Rice 
Sent: Monday, March 4, 2019 5:34 PM
To: Matthew H; ceph-users
Subject: Re: radosgw sync falling behind regularly


So we upgraded everything from 12.2.8 to 12.2.11, and things have gone to hell. 
 Lots of sync errors, like so:



sudo radosgw-admin sync error list

[

{

"shard_id": 0,

"entries": [

{

"id": "1_1549348245.870945_5163821.1",

"section": "data",

"name": 
"dora/catalogmaker-redis:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18467.470/56fbc9685d609b4c8cdbd11dd60bf03bedcb613b438c663c9899d930b25f0405",

"timestamp": "2019-02-05 06:30:45.870945Z",

"info": {

"source_zone": "1e27bf9c-3a2f-4845-85b6-33a24bbe1c04",

"error_code": 5,

"message": "failed to sync object(5) Input/output error"

}

},

…



radosgw logs are full of:

2019-03-04 14:32:58.039467 7f90e81eb700  0 data sync: ERROR: failed to read 
remote data log info: ret=-2

2019-03-04 14:32:58.041296 7f90e81eb700  0 data sync: ERROR: init sync on 
escarpment/escarpment:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18467.146 failed, 
retcode=-2

2019-03-04 14:32:58.041662 7f90e81eb700  0 meta sync: ERROR: 
RGWBackoffControlCR called coroutine returned -2

2019-03-04 14:32:58.042949 7f90e81eb700  0 data sync: WARNING: skipping data 
log entry for missing bucket 
escarpment/escarpment:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18467.146

2019-03-04 14:32:58.823501 7f90e81eb700  0 data sync: ERROR: failed to read 
remote data log info: ret=-2

2019-03-04 14:32:58.825243 7f90e81eb700  0 meta sync: ERROR: 
RGWBackoffControlCR called coroutine returned -2



dc11-ceph-rgw2:~$ sudo radosgw-admin sync status

  realm b3e2afe7-2254-494a-9a34-ce50358779fd (savagebucket)

  zonegroup de6af748-1a2f-44a1-9d44-30799cf1313e (us)

   zone 107d29a0-b732-4bf1-a26e-1f64f820e839 (dc11-prod)

2019-03-04 14:26:21.351372 7ff7ae042e40  0 meta sync: ERROR: failed to fetch 
mdlog info

  metadata sync syncing

full sync: 0/64 shards

failed to fetch local sync status: (5) Input/output error

^C



Any advice?  All three clusters on 12.2.11, Debian stretch.



From: Christian Rice 
Date: Thursday, February 28, 2019 at 9:06 AM
To: Matthew H , ceph-users 

Subject: Re: radosgw sync falling behind regularly



Yeah my bad on the typo, not running 12.8.8 ☺  It’s 12.2.8.  We can upgrade and 
will attempt to do so asap.  Thanks for that, I need to read my release notes 
more carefully, I guess!



From: Matthew H 
Date: Wednesday, February 27, 2019 at 8:33 PM
To: Christian Rice , ceph-users 
Subject: Re: radosgw sync falling behind regularly



Hey Christian,



I'm making a wild guess, but I'm assuming this is 12.2.8. If so, is it possible 
that you can upgrade to 12.2.11? There have been rgw multisite bug fixes for 
metadata syncing and data syncing (both separate issues) that you could be 
hitting.



Thanks,



From: ceph-users  on behalf of Christian 
Rice 
Sent: Wednesday, February 27, 2019 7:05 PM
To: ceph-users
Subject: [ceph-users] radosgw sync falling behind regularly



Debian 9; ceph 12.8.8-bpo90+1; no rbd or cephfs, just radosgw; three clusters 
in one zonegroup.



Often we find either metadata or data sync behind, and it doesn’t look to ever 
recover until…we restart the endpoint radosgw target service.



eg at 15:45:40:



dc11-ceph-rgw1:/var/log/ceph# radosgw-admin sync status

  realm b3e2afe7-2254-494a-9a34-ce50358779fd (savagebucket)

  zonegroup de6af748-1a2f-44a1-9d44-30799cf1313e (us)

   zone 107d29a0-b732-4bf1-a26e-1f64f820e839 (dc11-prod)

  metadata sync syncing

full sync: 0/64 shards

incremental sync: 64/64 shards

metadata is behind on 2 shards

behind shards: [19,41]

oldest incremental change not applied: 2019-02-27 
14:42:24.0.408263s

  data sync source: 1e27bf9c-3a2f-4845-85b6-33a24bbe1c04 (sv5-corp)

syncing

full sync: 0/128 shards

incremental sync: 128/128 shards

data is caught up with source

source: 331d3f1e-1b72-4c56-bb5a-d1d0fcf6d0b8 (sv3-prod)

syncing

full sync: 0/128 shards

incremental sync: 128/128 shards

data is caught up with source





so at 15:46:07:



dc11-ceph-rgw1:/var/log/ceph# sudo syste

Re: [ceph-users] Problems creating a balancer plan

2019-03-02 Thread Matthew H
Hi Massimo!

What version of Ceph is in use?

Thanks,


From: ceph-users  on behalf of Massimo 
Sgaravatto 
Sent: Friday, March 1, 2019 1:24 PM
To: Ceph Users
Subject: [ceph-users] Problems creating a balancer plan

Hi

I already used the balancer in my ceph luminous cluster a while ago when all 
the OSDs were using filestore.

Now, after having added some bluestore OSDs, if I try to create a plan:


[root@ceph-mon-01 ~]# ceph balancer status
{
"active": false,
"plans": [],
"mode": "crush-compat"
}

[root@ceph-mon-01 ~]# ceph balancer eval
current cluster score 0.051599 (lower is better)

[root@ceph-mon-01 ~]# ceph balancer optimize 01-march-2019
Error EINVAL: Traceback (most recent call last):
  File "/usr/lib64/ceph/mgr/balancer/module.py", line 340, in handle_command
r, detail = self.optimize(plan)
  File "/usr/lib64/ceph/mgr/balancer/module.py", line 670, in optimize
return self.do_crush_compat(plan)
  File "/usr/lib64/ceph/mgr/balancer/module.py", line 814, in do_crush_compat
weight = best_ws[osd]
KeyError: (64,)

This is what I see in the mgr log:




2019-03-01 19:15:20.310116 7faeff76c700  0 log_channel(audit) log [DBG] : 
from='client.194721456 
192.168.61.206:0/585546872' 
entity='client.admin' cmd=[{"prefix": "balancer optimize", "plan": 
"01-march-2019", "target": ["mgr", ""]}]: dispatch
2019-03-01 19:15:20.310162 7faeff76c700  1 mgr.server handle_command 
pyc_prefix: 'balancer status'
2019-03-01 19:15:20.310171 7faeff76c700  1 mgr.server handle_command 
pyc_prefix: 'balancer mode'
2019-03-01 19:15:20.310179 7faeff76c700  1 mgr.server handle_command 
pyc_prefix: 'balancer on'
2019-03-01 19:15:20.310186 7faeff76c700  1 mgr.server handle_command 
pyc_prefix: 'balancer off'
2019-03-01 19:15:20.310195 7faeff76c700  1 mgr.server handle_command 
pyc_prefix: 'balancer eval'
2019-03-01 19:15:20.310203 7faeff76c700  1 mgr.server handle_command 
pyc_prefix: 'balancer eval-verbose'
2019-03-01 19:15:20.310211 7faeff76c700  1 mgr.server handle_command 
pyc_prefix: 'balancer optimize'
2019-03-01 19:15:20.310487 7faefff6d700  1 mgr[balancer] Handling command: 
'{'prefix': 'balancer optimize', 'plan': '01-march-2019', 'target': ['mgr', 
'']}'
2019-03-01 19:15:20.530784 7faf204c9700  1 mgr send_beacon active
2019-03-01 19:15:20.559914 7faefff6d700  1 mgr.server reply handle_command (22) 
Invalid argument Traceback (most recent call last):
  File "/usr/lib64/ceph/mgr/balancer/module.py", line 340, in handle_command
r, detail = self.optimize(plan)
  File "/usr/lib64/ceph/mgr/balancer/module.py", line 670, in optimize
return self.do_crush_compat(plan)
  File "/usr/lib64/ceph/mgr/balancer/module.py", line 814, in do_crush_compat
weight = best_ws[osd]
KeyError: (64,)



Thanks, Massimo
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd unmap fails with error: rbd: sysfs write failed rbd: unmap failed: (16) Device or resource busy

2019-03-02 Thread Matthew H
You can force an rbd unmap with the command below:

rbd unmap -o force $DEV

If it still doesn't unmap, then you have pending IO blocking you.

As Ilya mentioned, for good measure you should also check whether LVM is in 
use on this RBD volume. If it is, then that could be blocking you from 
unmapping the RBD device normally.
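
A rough sequence, using the device from the original report:

lsof /dev/rbd0                  # anything still holding the device open?
fuser -vm /dev/rbd0             # or check what is using it by mount
rbd unmap -o force /dev/rbd0    # last resort, once nothing legitimate holds it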


From: ceph-users  on behalf of David Turner 

Sent: Friday, March 1, 2019 8:03 PM
To: solarflow99
Cc: ceph-users
Subject: Re: [ceph-users] rbd unmap fails with error: rbd: sysfs write failed 
rbd: unmap failed: (16) Device or resource busy

True, but not before you unmap it from the previous server. It's like 
physically connecting a hard drive to two servers at the same time. Neither 
knows what the other is doing to it, which can corrupt your data. You should 
always make sure to unmap an rbd before mapping it to another server.

On Fri, Mar 1, 2019, 6:28 PM solarflow99 
mailto:solarflo...@gmail.com>> wrote:
It has to be mounted from somewhere, if that server goes offline, you need to 
mount it from somewhere else right?


On Thu, Feb 28, 2019 at 11:15 PM David Turner 
mailto:drakonst...@gmail.com>> wrote:
Why are you mapping the same rbd to multiple servers?

On Wed, Feb 27, 2019, 9:50 AM Ilya Dryomov 
mailto:idryo...@gmail.com>> wrote:
On Wed, Feb 27, 2019 at 12:00 PM Thomas 
<74cmo...@gmail.com> wrote:
>
> Hi,
> I have noticed an error when writing to a mapped RBD.
> Therefore I unmounted the block device.
> Then I tried to unmap it w/o success:
> ld2110:~ # rbd unmap /dev/rbd0
> rbd: sysfs write failed
> rbd: unmap failed: (16) Device or resource busy
>
> The same block device is mapped on another client and there are no issues:
> root@ld4257:~# rbd info hdb-backup/ld2110
> rbd image 'ld2110':
> size 7.81TiB in 2048000 objects
> order 22 (4MiB objects)
> block_name_prefix: rbd_data.3cda0d6b8b4567
> format: 2
> features: layering
> flags:
> create_timestamp: Fri Feb 15 10:53:50 2019
> root@ld4257:~# rados -p hdb-backup  listwatchers rbd_data.3cda0d6b8b4567
> error listing watchers hdb-backup/rbd_data.3cda0d6b8b4567: (2) No such
> file or directory
> root@ld4257:~# rados -p hdb-backup  listwatchers rbd_header.3cda0d6b8b4567
> watcher=10.76.177.185:0/1144812735 
> client.21865052 cookie=1
> watcher=10.97.206.97:0/4023931980 
> client.18484780
> cookie=18446462598732841027
>
>
> Question:
> How can I force to unmap the RBD on client ld2110 (= 10.76.177.185)?

Hi Thomas,

It appears that /dev/rbd0 is still open on that node.

Was the unmount successful?  Which filesystem (ext4, xfs, etc)?

What is the output of "ps aux | grep rbd" on that node?

Try lsof, fuser, check for LVM volumes and multipath -- these have been
reported to cause this issue previously:

  http://tracker.ceph.com/issues/12763

Thanks,

Ilya
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] PG Calculations Issue

2019-03-01 Thread Matthew H
I believe the question was regarding which formula to use. There are two 
different formulas, here [1] and here [2].

The difference is the additional steps used to calculate the appropriate PG 
count for a pool. In Nautilus, though, this is mostly moot, as the mgr service now 
has a module to automatically scale PGs. It would be great to see how this 
could be backported to Mimic or Luminous though. 😉

[1]
https://ceph.com/pgcalc_assets/pgcalc.html
(this one is used by the PGCalc javascript)

[2]
https://ceph.com/pgcalc/



From: ceph-users  on behalf of David Turner 

Sent: Thursday, February 28, 2019 10:34 PM
To: Krishna Venkata
Cc: ceph-users
Subject: Re: [ceph-users] PG Calculations Issue

Those numbers look right for a pool only containing 10% of your data. Now 
continue to calculate the pg counts for the remaining 90% of your data.

On Wed, Feb 27, 2019, 12:17 PM Krishna Venkata 
mailto:kvenkata...@gmail.com>> wrote:

Greetings,


I am having issues with the way PGs are calculated by https://ceph.com/pgcalc/ 
[Ceph PGs per Pool Calculator] and the formulae mentioned on the site.

Below are my findings

The formula to calculate PGs as mentioned in the https://ceph.com/pgcalc/ :

1.  Need to pick the highest value from either of the formulas

(( Target PGs per OSD ) x ( OSD # ) x ( %Data ))/(size)

Or

( OSD# ) / ( Size )

2.  The output value is then rounded to the nearest power of 2

  1.  If the nearest power of 2 is more than 25% below the original value, the 
next higher power of 2 is used.



Based on the above procedure, we calculated PGs for 25, 32 and 64 OSDs

Our Dataset:

%Data: 0.10

Target PGs per OSD: 100

OSDs 25, 32 and 64



For 25 OSDs



(100*25* (0.10/100))/(3) = 0.833



( 25 ) / ( 3 ) = 8.33



1. Raw pg num 8.33  ( Since we need to pick the highest of (0.833, 8.33))

2. max pg 16 ( For, 8.33 the nearest power of 2 is 16)

3. 16 > 2.08  ( 25 % of 8.33 is 2.08 which is more than 25% the power of 2)



So 16 PGs

•  GUI Calculator gives the same value and matches with Formula.



For 32 OSD



(100*32*(0.10/100))/3 = 1.066

( 32 ) / ( 3 ) = 10.66



1. Raw pg num 10.66 ( Since we need to pick the highest of (1.066, 10.66))

2. max pg 16 ( For, 10.66 the nearest power of 2 is 16)

3.  16 > 2.655 ( 25 % of 10.66 is 2.655 which is more than 25% the power of 2)



So 16 PGs

•  GUI Calculator gives different value (32 PGs) which doesn’t match with 
Formula.



For 64 OSD



(100 * 64 * (0.10/100))/3 = 2.133

( 64 ) / ( 3 ) 21.33



1. Raw pg num 21.33 ( Since we need to pick the highest of (2.133, 21.33))

2. max pg 32 ( For, 21.33 the nearest power of 2 is 32)

3. 32 > 5.3325 ( 25 % of 21.33 is 5.3325 which is more than 25% the power of 2)



So 32 PGs

•  GUI Calculator gives different value (64 PGs) which doesn’t match with 
Formula.



We checked the PG calculator logic in [ 
https://ceph.com/pgcalc_assets/pgcalc.js ], which does not match the above 
formulae.



Can someone guide us to, or give a reference for, the correct formula to calculate PGs?



Thanks in advance.



Regards,

Krishna Venkata

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd space usage

2019-02-28 Thread Matthew H
It looks like he used 'rbd map' to map his volume. If so, then yes, just run 
fstrim on the device.

If it's an instance with a Cinder volume, or a Nova ephemeral disk (on Ceph), then you 
have to use virtio-scsi to be able to run discard inside your instance.
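
For example (a sketch, using the mount point from the quoted output below):

fstrim -v /mnt/nfsroot/rbd0    # one-off online trim of the mounted filesystem
# or mount with '-o discard' for continuous trim, at some runtime cost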


From: ceph-users  on behalf of Jack 

Sent: Thursday, February 28, 2019 5:39 PM
To: solarflow99
Cc: Ceph Users
Subject: Re: [ceph-users] rbd space usage

Ha, that was your issue

RBD does not know that your space (on the filesystem level) is now free
to use

You have to trim your filesystem, see fstrim(8) as well as the discard
mount option

The related SCSI commands have to be passed down the stack, so you may
need to check at other levels (for instance, your hypervisor's configuration).

Regards,

On 02/28/2019 11:31 PM, solarflow99 wrote:
> yes, but:
>
> # rbd showmapped
> id pool image snap device
> 0  rbd  nfs1  -/dev/rbd0
> 1  rbd  nfs2  -/dev/rbd1
>
>
> # df -h
> Filesystem  Size  Used Avail Use% Mounted on
> /dev/rbd0   8.0T  4.8T  3.3T  60% /mnt/nfsroot/rbd0
> /dev/rbd1   9.8T   34M  9.8T   1% /mnt/nfsroot/rbd1
>
>
> only 5T is taken up
>
>
> On Thu, Feb 28, 2019 at 2:26 PM Jack  wrote:
>
>> Are not you using 3-replicas pool ?
>>
>> (15745GB + 955GB + 1595M) * 3 ~= 51157G (there is overhead involved)
>>
>> Best regards,
>>
>> On 02/28/2019 11:09 PM, solarflow99 wrote:
>>> thanks, I still can't understand whats taking up all the space 27.75
>>>
>>> On Thu, Feb 28, 2019 at 7:18 AM Mohamad Gebai  wrote:
>>>
 On 2/27/19 4:57 PM, Marc Roos wrote:
> They are 'thin provisioned' meaning if you create a 10GB rbd, it does
> not use 10GB at the start. (afaik)

 You can use 'rbd -p rbd du' to see how much of these devices is
 provisioned and see if it's coherent.

 Mohamad

>
>
> -Original Message-
> From: solarflow99 [mailto:solarflo...@gmail.com]
> Sent: 27 February 2019 22:55
> To: Ceph Users
> Subject: [ceph-users] rbd space usage
>
> using ceph df it looks as if RBD images can use the total free space
> available of the pool it belongs to, 8.54% yet I know they are created
> with a --size parameter and thats what determines the actual space.  I
> can't understand the difference i'm seeing, only 5T is being used but
> ceph df shows 51T:
>
>
> /dev/rbd0   8.0T  4.8T  3.3T  60% /mnt/nfsroot/rbd0
> /dev/rbd1   9.8T   34M  9.8T   1% /mnt/nfsroot/rbd1
>
>
>
> # ceph df
> GLOBAL:
> SIZE AVAIL RAW USED %RAW USED
> 180T  130T   51157G 27.75
> POOLS:
> NAMEID USED   %USED MAX AVAIL
> OBJECTS
> rbd 0  15745G  8.543G
> 4043495
> cephfs_data 1   0 03G
> 0
> cephfs_metadata 21962 03G
>20
> spider_stage 9   1595M 03G47835
> spider   10   955G  0.523G
> 42541237
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


>>>
>>>
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd space usage

2019-02-28 Thread Matthew H
I think the command you are looking for is 'rbd du'

example

rbd du rbd/myimagename


From: ceph-users  on behalf of solarflow99 

Sent: Thursday, February 28, 2019 5:31 PM
To: Jack
Cc: Ceph Users
Subject: Re: [ceph-users] rbd space usage

yes, but:

# rbd showmapped
id pool image snap device
0  rbd  nfs1  -/dev/rbd0
1  rbd  nfs2  -/dev/rbd1


# df -h
Filesystem  Size  Used Avail Use% Mounted on
/dev/rbd0   8.0T  4.8T  3.3T  60% /mnt/nfsroot/rbd0
/dev/rbd1   9.8T   34M  9.8T   1% /mnt/nfsroot/rbd1


only 5T is taken up


On Thu, Feb 28, 2019 at 2:26 PM Jack 
mailto:c...@jack.fr.eu.org>> wrote:
Are not you using 3-replicas pool ?

(15745GB + 955GB + 1595M) * 3 ~= 51157G (there is overhead involved)

Best regards,

On 02/28/2019 11:09 PM, solarflow99 wrote:
> thanks, I still can't understand whats taking up all the space 27.75
>
> On Thu, Feb 28, 2019 at 7:18 AM Mohamad Gebai 
> mailto:mge...@suse.de>> wrote:
>
>> On 2/27/19 4:57 PM, Marc Roos wrote:
>>> They are 'thin provisioned' meaning if you create a 10GB rbd, it does
>>> not use 10GB at the start. (afaik)
>>
>> You can use 'rbd -p rbd du' to see how much of these devices is
>> provisioned and see if it's coherent.
>>
>> Mohamad
>>
>>>
>>>
>>> -Original Message-
>>> From: solarflow99 
>>> [mailto:solarflo...@gmail.com]
>>> Sent: 27 February 2019 22:55
>>> To: Ceph Users
>>> Subject: [ceph-users] rbd space usage
>>>
>>> using ceph df it looks as if RBD images can use the total free space
>>> available of the pool it belongs to, 8.54% yet I know they are created
>>> with a --size parameter and thats what determines the actual space.  I
>>> can't understand the difference i'm seeing, only 5T is being used but
>>> ceph df shows 51T:
>>>
>>>
>>> /dev/rbd0   8.0T  4.8T  3.3T  60% /mnt/nfsroot/rbd0
>>> /dev/rbd1   9.8T   34M  9.8T   1% /mnt/nfsroot/rbd1
>>>
>>>
>>>
>>> # ceph df
>>> GLOBAL:
>>> SIZE AVAIL RAW USED %RAW USED
>>> 180T  130T   51157G 27.75
>>> POOLS:
>>> NAMEID USED   %USED MAX AVAIL
>>> OBJECTS
>>> rbd 0  15745G  8.543G
>>> 4043495
>>> cephfs_data 1   0 03G
>>> 0
>>> cephfs_metadata 21962 03G
>>>20
>>> spider_stage 9   1595M 03G47835
>>> spider   10   955G  0.523G
>>> 42541237
>>>
>>>
>>>
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fwd: Re: Blocked ops after change from filestore on HDD to bluestore on SDD

2019-02-28 Thread Matthew H
Could you send your ceph.conf file over please? Are you setting any tunables 
for OSD or Bluestore currently?


From: ceph-users  on behalf of Uwe Sauter 

Sent: Thursday, February 28, 2019 8:33 AM
To: Marc Roos; ceph-users; vitalif
Subject: Re: [ceph-users] Fwd: Re: Blocked ops after change from filestore on 
HDD to bluestore on SDD

Do you have anything particular in mind? I'm using mdb backend with maxsize = 
1GB but currently the files are only about 23MB.


>
> I have quite a few OpenLDAP servers (slaves) running as well; make
> sure to use proper caching, which saves a lot of disk I/O.
>
>
>
>
> -Original Message-
> Sent: 28 February 2019 13:56
> To: uwe.sauter...@gmail.com; Uwe Sauter; Ceph Users
> Subject: *SPAM* Re: [ceph-users] Fwd: Re: Blocked ops after
> change from filestore on HDD to bluestore on SDD
>
> "Advanced power loss protection" is in fact a performance feature, not a
> safety one.
>
>
> On 28 February 2019, 13:03:51 GMT+03:00, Uwe Sauter
>  wrote:
>
>Hi all,
>
>thanks for your insights.
>
>Eneko,
>
>
>We tried to use a Samsung 840 Pro SSD as OSD some time ago and
> it was a no-go; it wasn't that performance was bad, it
>just didn't work for the kind of use of OSD. Any HDD was
> better than it (the disk was healthy and have been used in a
>software raid-1 for a pair of years).
>
>I suggest you check first that your Samsung 860 Pro disks work
> well for Ceph. Also, how is your host's RAM?
>
>
>As already mentioned the hosts each have 64GB RAM. Each host has 3
> SSDs for OSD usage. Each OSD is using about 1.3GB virtual
>memory / 400MB residual memory.
>
>
>
>Joachim,
>
>
>I can only recommend the use of enterprise SSDs. We've tested
> many consumer SSDs in the past, including your SSDs. Many
>of them are not suitable for long-term use and some weard out
> within 6 months.
>
>
>Unfortunately I couldn't afford enterprise grade SSDs. But I
> suspect that my workload (about 20 VMs for our infrastructure, the
>most IO demanding is probably LDAP) is light enough that wearout
> won't be a problem.
>
>The issue I'm seeing then is probably related to direct IO if using
> bluestore. But with filestore, the file system cache probably
>hides the latency issues.
>
>
>Igor,
>
>
>AFAIR Samsung 860 Pro isn't for enterprise market, you
> shouldn't use consumer SSDs for Ceph.
>
>I had some experience with Samsung 960 Pro a while ago and it
> turned out that it handled fsync-ed writes very slowly
>(comparing to the original/advertised performance). Which one
> can probably explain by the lack of power loss protection
>for these drives. I suppose it's the same in your case.
>
>Here are a couple links on the topic:
>
>
> https://www.percona.com/blog/2018/02/08/fsync-performance-storage-devices/
>
>
> https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
>
>
>Power loss protection wasn't a criteria for me as the cluster hosts
> are distributed in two buildings with separate battery backed
>UPSs. As mentioned above I suspect the main difference for my case
> between filestore and bluestore is file system cache vs. direct
>IO. Which means I will keep using filestore.
>
>Regards,
>
>Uwe
> 
>
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
> --
> With best regards,
> Vitaliy Filippov
>
>

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] REQUEST_SLOW across many OSDs at the same time

2019-02-28 Thread Matthew H
Is fstrim or discard enabled for these SSDs? If so, how did you enable it?

I've seen similar issues with poor controllers on SSDs. They tend to block I/O 
when trim kicks off.
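
For reference, a couple of ways to check on a typical Linux host (unit and option
names may differ by distro):

systemctl list-timers fstrim.timer    # is a periodic trim scheduled?
lsblk --discard                       # non-zero DISC-GRAN/DISC-MAX means discard is supported
grep discard /proc/mounts             # is anything mounted with the discard option?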

Thanks,


From: ceph-users  on behalf of Paul Emmerich 

Sent: Friday, February 22, 2019 9:04 AM
To: Massimo Sgaravatto
Cc: Ceph Users
Subject: Re: [ceph-users] REQUEST_SLOW across many OSDs at the same time

Bad SSDs can also cause this. Which SSD are you using?

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Fri, Feb 22, 2019 at 2:53 PM Massimo Sgaravatto
 wrote:
>
> A couple of hints to debug the issue (since I had to recently debug a problem 
> with the same symptoms):
>
> - As far as I understand the reported 'implicated osds' are only the primary 
> ones. In the log of the osds you should find also the relevant pg number, and 
> with this information you can get all the involved OSDs. This might be useful 
> e.g. to see if a specific OSD node is always involved. This was my case (a 
> the problem was with the patch cable connecting the node)
>
> - You can use the "ceph daemon osd.x dump_historic_ops" command to debug some 
> of these slow requests (to see which events take much time)
>
> Cheers, Massimo
>
> On Fri, Feb 22, 2019 at 10:28 AM mart.v  wrote:
>>
>> Hello everyone,
>>
>> I'm experiencing a strange behaviour. My cluster is relatively small (43 
>> OSDs, 11 nodes), running Ceph 12.2.10 (and Proxmox 5). Nodes are connected 
>> via 10 Gbit network (Nexus 6000). Cluster is mixed (SSD and HDD), but with 
>> different pools. Descibed error is only on the SSD part of the cluster.
>>
>> I noticed that few times a day the cluster slows down a bit and I have 
>> discovered this in logs:
>>
>> 2019-02-22 08:21:20.064396 mon.node1 mon.0 172.16.254.101:6789/0 1794159 : 
>> cluster [WRN] Health check failed: 27 slow requests are blocked > 32 sec. 
>> Implicated osds 10,22,33 (REQUEST_SLOW)
>> 2019-02-22 08:21:26.589202 mon.node1 mon.0 172.16.254.101:6789/0 1794169 : 
>> cluster [WRN] Health check update: 199 slow requests are blocked > 32 sec. 
>> Implicated osds 0,4,5,6,7,8,9,10,12,16,17,19,20,21,22,25,26,33,41 
>> (REQUEST_SLOW)
>> 2019-02-22 08:21:32.655671 mon.node1 mon.0 172.16.254.101:6789/0 1794183 : 
>> cluster [WRN] Health check update: 448 slow requests are blocked > 32 sec. 
>> Implicated osds 0,3,4,5,6,7,8,9,10,12,15,16,17,19,20,21,22,24,25,26,33,41 
>> (REQUEST_SLOW)
>> 2019-02-22 08:21:38.744210 mon.node1 mon.0 172.16.254.101:6789/0 1794210 : 
>> cluster [WRN] Health check update: 388 slow requests are blocked > 32 sec. 
>> Implicated osds 4,8,10,16,24,33 (REQUEST_SLOW)
>> 2019-02-22 08:21:42.790346 mon.node1 mon.0 172.16.254.101:6789/0 1794214 : 
>> cluster [INF] Health check cleared: REQUEST_SLOW (was: 18 slow requests are 
>> blocked > 32 sec. Implicated osds 8,16)
>>
>> "ceph health detail" shows nothing more
>>
>> It is happening through the whole day and the times can't be linked to any 
>> read or write intensive task (e.g. backup). I also tried to disable 
>> scrubbing, but it kept on going. These errors were not there since 
>> beginning, but unfortunately I cannot track the day they started (it is 
>> beyond my logs).
>>
>> Any ideas?
>>
>> Thank you!
>> Martin
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Blocked ops after change from filestore on HDD to bluestore on SDD

2019-02-28 Thread Matthew H
Have you made any changes to your ceph.conf? If so, would you mind copying them 
into this thread?


From: ceph-users  on behalf of Vitaliy 
Filippov 
Sent: Wednesday, February 27, 2019 4:21 PM
To: Ceph Users
Subject: Re: [ceph-users] Blocked ops after change from filestore on HDD to 
bluestore on SDD

I think this should not lead to blocked ops in any case, even if the
performance is low...

--
With best regards,
   Vitaliy Filippov
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Ceph-community] How does ceph use the STS service?

2019-02-28 Thread Matthew H
This feature is in the Nautilus release.

The first release (14.1.0) of Nautilus is available from download.ceph.com as 
of last Friday.


From: ceph-users  on behalf of admin 

Sent: Thursday, February 28, 2019 4:22 AM
To: Pritha Srivastava; Sage Weil; ceph-us...@ceph.com
Subject: Re: [ceph-users] [Ceph-community] How does ceph use the STS service?

Hi, can you tell me the version that includes STS lite?
Thanks,
myxingkong


From: Pritha Srivastava
Sent: 2019-02-27 23:53:58
To: Sage Weil
Cc: admin; 
ceph-us...@ceph.com
Subject: Re: [ceph-users] [Ceph-community] How does ceph use the STS service?
Sorry I overlooked the ceph versions in the email.

STS Lite is not a part of ceph version 12.2.11 or ceph version 13.2.2.

Thanks,
Pritha

On Wed, Feb 27, 2019 at 9:09 PM Pritha Srivastava 
mailto:prsri...@redhat.com>> wrote:
You need to attach a policy to be able to invoke GetSessionToken. Please read 
the documentation below at:

https://github.com/ceph/ceph/pull/24818/commits/512b6d8bd951239d44685b25dccaf904f19872b2

Thanks,
Pritha

On Wed, Feb 27, 2019 at 8:59 PM Sage Weil 
mailto:s...@newdream.net>> wrote:
Moving this to ceph-users.

On Wed, 27 Feb 2019, admin wrote:

> I want to use the STS service to generate temporary credentials for use by 
> third-party clients.
>
> I configured STS lite based on the documentation.
> http://docs.ceph.com/docs/master/radosgw/STSLite/
>
> This is my configuration file:
>
> [global]
> fsid = 42a7cae1-84d1-423e-93f4-04b0736c14aa
> mon_initial_members = admin, node1, node2, node3
> mon_host = 192.168.199.81,192.168.199.82,192.168.199.83,192.168.199.84
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
>
> osd pool default size = 2
>
> [client.rgw.admin]
> rgw sts key = "1234567890"
> rgw s3 auth use sts = true
>
> When I execute the getSessionToken method, return a 405 error:
>
> 
> MethodNotAllowed
> tx3-005c73aed8-5e48-default
> 5e48-default-default
> 
>
> This is my test code:
>
> import os
> import sys
> import traceback
>
> import boto3
> from boto.s3.connection import S3Connection
> from boto.sts import STSConnection
>
> try:
> host = 'http://192.168.199.81:7480'
> access_key = '2324YFZ7QDEOSRL18QHR'
> secret_key = 'rL9FabxCOw5LDbrHtmykiGSCjzpKLmEs9WPiNjVJ'
>
> client = boto3.client('sts',
>   aws_access_key_id = access_key,
>   aws_secret_access_key = secret_key,
>   endpoint_url = host)
> response = client.get_session_token(DurationSeconds=999)
> print response
> except:
> print traceback.format_exc()
>
> Who can tell me if my configuration is incorrect or if the version I tested 
> does not provide STS service?
>
> This is the version I tested:
>
> ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous 
> (stable)
>
> ceph version 13.2.2 (02899bfda814146b021136e9d8e80eba494e1126) mimic 
> (stable)___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Multi-Site Cluster RGW Sync issues

2019-02-27 Thread Matthew H
Hey Ben,

Could you include the following?


radosgw-admin mdlog list


Thanks,


From: ceph-users  on behalf of 
Benjamin.Zieglmeier 
Sent: Tuesday, February 26, 2019 9:33 AM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Multi-Site Cluster RGW Sync issues


Hello,



We have a two zone multisite configured Luminous 12.2.5 cluster. Cluster has 
been running for about 1 year, and has only ~140G of data (~350k objects). We 
recently added a third zone to the zonegroup to facilitate a migration out of 
an existing site. Sync appears to be working and running `radosgw-admin sync 
status` and `radosgw-admin sync status –rgw-zone=` reflects the 
same. The problem we are having is that once the data replication completes, 
one of the rgws serving the new zone has its radosgw process consuming all the 
CPU, and the rgw log is flooded with “ERROR: failed to get bucket instance... failed to read mdlog info with (2) 
No such file or directory”, to the tune of 1000 log entries/sec.



This has been happening for days on end now, and we are concerned about what is 
going on between these two zones. Logs are constantly filling up on the rgws, 
and we are out of ideas. Are they trying to catch up on metadata? After 
extensive searching and racking our brains, we are unable to figure out what is 
causing all these requests (and errors) between the two zones.



Thanks,

Ben
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw sync falling behind regularly

2019-02-27 Thread Matthew H
Hey Christian,

I'm making a wild guess, but I'm assuming this is 12.2.8. If so, is it possible 
that you can upgrade to 12.2.11? There have been rgw multisite bug fixes for 
metadata syncing and data syncing (both separate issues) that you could be 
hitting.

Thanks,

From: ceph-users  on behalf of Christian 
Rice 
Sent: Wednesday, February 27, 2019 7:05 PM
To: ceph-users
Subject: [ceph-users] radosgw sync falling behind regularly


Debian 9; ceph 12.8.8-bpo90+1; no rbd or cephfs, just radosgw; three clusters 
in one zonegroup.



Often we find either metadata or data sync behind, and it doesn’t look to ever 
recover until…we restart the endpoint radosgw target service.



eg at 15:45:40:



dc11-ceph-rgw1:/var/log/ceph# radosgw-admin sync status

  realm b3e2afe7-2254-494a-9a34-ce50358779fd (savagebucket)

  zonegroup de6af748-1a2f-44a1-9d44-30799cf1313e (us)

   zone 107d29a0-b732-4bf1-a26e-1f64f820e839 (dc11-prod)

  metadata sync syncing

full sync: 0/64 shards

incremental sync: 64/64 shards

metadata is behind on 2 shards

behind shards: [19,41]

oldest incremental change not applied: 2019-02-27 
14:42:24.0.408263s

  data sync source: 1e27bf9c-3a2f-4845-85b6-33a24bbe1c04 (sv5-corp)

syncing

full sync: 0/128 shards

incremental sync: 128/128 shards

data is caught up with source

source: 331d3f1e-1b72-4c56-bb5a-d1d0fcf6d0b8 (sv3-prod)

syncing

full sync: 0/128 shards

incremental sync: 128/128 shards

data is caught up with source





so at 15:46:07:



dc11-ceph-rgw1:/var/log/ceph# sudo systemctl restart 
ceph-radosgw@rgw.dc11-ceph-rgw1.service



and by the time I checked at 15:48:08:



dc11-ceph-rgw1:/var/log/ceph# radosgw-admin sync status

  realm b3e2afe7-2254-494a-9a34-ce50358779fd (savagebucket)

  zonegroup de6af748-1a2f-44a1-9d44-30799cf1313e (us)

   zone 107d29a0-b732-4bf1-a26e-1f64f820e839 (dc11-prod)

  metadata sync syncing

full sync: 0/64 shards

incremental sync: 64/64 shards

metadata is caught up with master

  data sync source: 1e27bf9c-3a2f-4845-85b6-33a24bbe1c04 (sv5-corp)

syncing

full sync: 0/128 shards

incremental sync: 128/128 shards

data is caught up with source

source: 331d3f1e-1b72-4c56-bb5a-d1d0fcf6d0b8 (sv3-prod)

syncing

full sync: 0/128 shards

incremental sync: 128/128 shards

data is caught up with source





There’s no way this is “lag.”  It’s stuck, and happens frequently, though 
perhaps not daily.  Any suggestions?  Our cluster isn’t heavily used yet, but 
it’s production.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] New OSD with weight 0, rebalance still happen...

2018-11-23 Thread Matthew H
Greetings,

You need to set the following configuration option under [osd] in your 
ceph.conf file for your new OSDs.

[osd]
osd_crush_initial_weight = 0

This will ensure your new OSDs come up with a CRUSH weight of 0, thus preventing 
the automatic rebalance that you see occurring.
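
Once the new OSDs are in, you can then bring them into service gradually, e.g.
(OSD name and weights illustrative):

ceph osd crush reweight osd.12 0.5    # step the new OSD's crush weight up in stages
ceph osd crush reweight osd.12 2.0    # ...until it reaches the disk's full weight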

Good luck,


From: ceph-users  on behalf of Marco Gaiarin 

Sent: Thursday, November 22, 2018 3:22 AM
To: ceph-us...@ceph.com
Subject: [ceph-users] New OSD with weight 0, rebalance still happen...


Ceph still surprises me: when I'm sure I've fully understood it,
something 'strange' (to my knowledge) happens.


I need to move a server out of my ceph hammer cluster (3 nodes, 4 OSDs
per node), and for some reasons I cannot simply move the disks.
So I've added a new node, and yesterday I set up the 4 new OSDs.
My plan was to add the 4 OSDs with weight 0, and then slowly lower
the old OSDs' weight and increase the weight of the new ones.

Beforehand, I ran:

ceph osd set noin

and then added the OSDs, and (as expected) the new OSDs started with weight 0.

But despite the fact that the weight is zero, a rebalance happened, with
the percentage of rebalanced data 'weighted' to the size of the new disk (e.g.
I had circa 18TB of space, I added a 2TB disk, and roughly 10% of the
data started to rebalance).


Why? Thanks.

--
dott. Marco Gaiarin GNUPG Key ID: 240A3D66
  Associazione ``La Nostra Famiglia''  http://www.lanostrafamiglia.it/
  Polo FVG   -   Via della Bontà, 7 - 33078   -   San Vito al Tagliamento (PN)
  marco.gaiarin(at)lanostrafamiglia.it   t +39-0434-842711   f +39-0434-842797

Dona il 5 PER MILLE a LA NOSTRA FAMIGLIA!
  http://www.lanostrafamiglia.it/index.php/it/sostienici/5x1000
(cf 00307430132, categoria ONLUS oppure RICERCA SANITARIA)
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph backfill problem

2018-09-20 Thread Matthew H


Without knowing more about the underlying hardware, you likely are reaching 
some type of IO resource constraint. Are your journals colocated or 
non-colocated? How fast is your backend OSD storage device?

You may also want to look at setting the norebalance flag.
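
For example:

ceph osd set norebalance      # pause rebalancing while the new OSDs settle
ceph osd unset norebalance    # resume once you are ready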

Good luck!

> On Sep 20, 2018, at 19:52, Chen Allen  wrote:
> 
> Hi there,
> 
> Has anyone experienced the following?
> Two OSD servers were down. After bringing the two servers back up, I brought 52 OSDs in 
> with a weight of just 0.05, but it caused a huge backfill load: I saw many 
> blocked requests and a number of PGs stuck inactive, and some servers were 
> impacted. So I stopped the backfilling by setting the nobackfill flag, and 
> everything went back to normal.
> But the strangest thing happened after 2 hours: the backfilling suddenly 
> started again despite the nobackfill flag being set, causing so many blocked 
> requests that we had to reweight the 52 OSDs to 0 to stabilize the storage.
> 
> Not sure why the backfill started again. If anyone has any idea about that, please 
> comment.
> 
> Thanks so much.
> Allen
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-ansible

2018-09-20 Thread Matthew H
Set up a Python virtual environment and install the required notario package 
version. You'll also want to install ansible into that virtual environment, 
along with netaddr.
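
For example, something along these lines (paths illustrative; notario >= 0.0.13
per the ceph-ansible requirements):

virtualenv ~/ceph-ansible-venv
source ~/ceph-ansible-venv/bin/activate
pip install 'notario>=0.0.13' ansible netaddr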



On Sep 20, 2018, at 18:04, solarflow99 
mailto:solarflo...@gmail.com>> wrote:

oh, was that all it was...  git clone https://github.com/ceph/ceph-ansible/
I installed the notario  package from EPEL, python2-notario-0.0.11-2.el7.noarch 
 and thats the newest they have




On Thu, Sep 20, 2018 at 3:57 PM Alfredo Deza 
mailto:ad...@redhat.com>> wrote:
Not sure how you installed ceph-ansible, the requirements mention a
version of a dependency (the notario module) which needs to be 0.0.13
or newer, and you seem to be using an older one.


On Thu, Sep 20, 2018 at 6:53 PM solarflow99 
mailto:solarflo...@gmail.com>> wrote:
>
> Hi, trying to get this to do a simple deployment, and I'm getting a strange 
> error; has anyone seen this?  I'm using CentOS 7, rel 5   ansible 2.5.3  
> python version = 2.7.5
>
> I've tried with mimic luninous and even jewel, no luck at all.
>
>
>
> TASK [ceph-validate : validate provided configuration] 
> **
> task path: 
> /home/jzygmont/ansible/ceph-ansible/roles/ceph-validate/tasks/main.yml:2
> Thursday 20 September 2018  14:05:18 -0700 (0:00:05.734)   0:00:37.439 
> 
> The full traceback is:
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/ansible/executor/task_executor.py", 
> line 138, in run
> res = self._execute()
>   File "/usr/lib/python2.7/site-packages/ansible/executor/task_executor.py", 
> line 561, in _execute
> result = self._handler.run(task_vars=variables)
>   File "/home/jzygmont/ansible/ceph-ansible/plugins/actions/validate.py", 
> line 43, in run
> notario.validate(host_vars, install_options, defined_keys=True)
> TypeError: validate() got an unexpected keyword argument 'defined_keys'
>
> fatal: [172.20.3.178]: FAILED! => {
> "msg": "Unexpected failure during module execution.",
> "stdout": ""
> }
>
> NO MORE HOSTS LEFT 
> **
>
> PLAY RECAP 
> **
> 172.20.3.178   : ok=25   changed=0unreachable=0failed=1
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com