Re: [ceph-users] Ceph Day Germany 2018

2018-01-15 Thread Wido den Hollander



On 01/16/2018 06:51 AM, Leonardo Vaz wrote:

Hey Cephers!

We are proud to announce our first Ceph Day in 2018 which happens on 
February 7 at the Deutsche Telekom AG Office in Darmstadt (25 km South 
from Frankfurt Airport).


The conference schedule[1] is being finished and the registration is 
already in progress[2].


If you're in Europe, join us at the Ceph Day Germany!



Yes! Looking forward :-) I'll be there :)

Wido


Kind regards,

Leo

[1] https://ceph.com/cephdays/germany/
[2] https://cephdaygermany.eventbrite.com/

--
Leonardo Vaz
Ceph Community Manager
Open Source and Standards Team



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph Day Germany 2018

2018-01-15 Thread Leonardo Vaz
Hey Cephers!

We are proud to announce our first Ceph Day of 2018, which takes place on
February 7 at the Deutsche Telekom AG Office in Darmstadt (25 km south of
Frankfurt Airport).

The conference schedule[1] is being finalized and registration is
already open[2].

If you're in Europe, join us at the Ceph Day Germany!

Kind regards,

Leo

[1] https://ceph.com/cephdays/germany/
[2] https://cephdaygermany.eventbrite.com/

-- 
Leonardo Vaz
Ceph Community Manager
Open Source and Standards Team
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Future

2018-01-15 Thread Alex Gorbachev
Hi Massimiliano,


On Thu, Jan 11, 2018 at 6:15 AM, Massimiliano Cuttini  
wrote:
> Hi everybody,
>
> I'm always looking at Ceph with the future in mind,
> but I see several issues that remain unresolved and block near-term
> adoption. I would like to know whether there are already answers to these:
>
> 1) Separation between client and server distribution.
> At the moment you always have to update client & server so that they run
> the same release of Ceph.
> That is fine in the early releases, but going forward I would expect ONE
> ceph client, not a separate one for every major version.
> The client should be able to determine by itself which protocol version and
> which features can be enabled, and connect on its own to at least the 3 to
> 5 previous major versions of Ceph.
>
> 2) Kernel is old -> feature mismatch
> OK, the kernel is old, so what? Just don't use it and fall back to NBD.
> And don't even make me aware of it; just handle it under the hood.
>
> 3) Management complexity
> Ceph is amazing, but it is simply too big to keep everything under control
> (too many services).
> There is now a management console, but as far as I have read it only shows
> basic performance data.
> So it doesn't manage anything at all... it's just a monitor...
>
> In the end you still have to manage everything from the command line.
> To manage via the web, the following are mandatory:
>
> create, delete, enable, disable services
> If I need to run a redundant iSCSI gateway, do I really need to cut and
> paste commands from your online docs?
> Of course not. You can script it better than any admin could.
> Just take a few arguments from an HTML form and that's all.
>
> create, delete, enable, disable users
> I have to create users and keys for 24 servers. Do you really think that is
> possible without some bad transcription or a bad copy/paste of the keys
> across all servers?
> Everybody ends up just copying the admin keys to all servers, giving very
> insecure full permissions to all clients.
>
> create MAPS (server, datacenter, rack, node, osd).
> This is essential for designing how the data needs to be replicated.
> Creating this by script or shell is not good enough; what is needed is a
> graphical editor that gives you a view of what will be copied where.
>
> check the hardware under the hood
> Checking the health of the underlying hardware is missing.
> Ceph was born as storage software that ensures redundancy and protects you
> from single failures.
> So why simply ignore checking the health of disks with SMART?
> FreeNAS does a much better job here, giving you plenty of tools to
> understand which disk is which and whether it will fail in the near future.
> Of course Ceph could also forecast issues by itself, and it needs to start
> integrating with basic hardware I/O.
> For example, it should be possible to enable/disable the UID light on a
> disk in order to know which one needs to be replaced.

As a technical note, we ran into this need with Storcium, and it is
pretty easy to utilize UID indicators using both Areca and LSI/Avago
HBAs.  You will need the standard control tools available from their
web sites, as well as hardware that supports SGPIO (most enterprise
JBODs and drives do).  There are likely similar options for other HBAs.

Areca:

UID on:

cli64 curctrl=1 set password=<password>
cli64 curctrl=<controller#> disk identify drv=<drive#>

UID OFF:

cli64 curctrl=1 set password=<password>
cli64 curctrl=<controller#> disk identify drv=0

LSI/Avago:

UID on:

sas2ircu <controller#> locate <enclosure>:<slot> ON

UID OFF:

sas2ircu <controller#> locate <enclosure>:<slot> OFF
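
For reference, a small wrapper around both tools could look like the
following (a sketch only; the controller, enclosure/slot and drive numbers
and the Areca password are placeholders that depend on the local hardware):

#!/bin/bash
# uid-led.sh -- toggle the UID/locate LED of a drive behind an Areca or
# LSI/Avago HBA. Usage: uid-led.sh areca|lsi on|off
# Assumptions: cli64/sas2ircu are in $PATH; CTRL, DRV, ENCL, SLOT and
# ARECA_PW are adjusted to the local setup.
set -e
HBA=$1; STATE=$2
CTRL=1; DRV=5; ENCL=2; SLOT=5; ARECA_PW=0000

case "$HBA" in
  areca)
    cli64 curctrl=$CTRL set password=$ARECA_PW
    if [ "$STATE" = on ]; then
      cli64 curctrl=$CTRL disk identify drv=$DRV
    else
      cli64 curctrl=$CTRL disk identify drv=0   # drv=0 clears the identify LED
    fi
    ;;
  lsi)
    sas2ircu $CTRL locate $ENCL:$SLOT ${STATE^^}   # expects ON or OFF
    ;;
esac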

HTH,
Alex Gorbachev
Storcium

> I guess this kind of feature is fairly standard across all Linux
> distributions.
>
> The management complexity could be completely overcome with a good web
> manager.
> A web manager, in the end, is just a wrapper around shell commands sent
> from the Ceph admin node to the others.
> If you think about it, such a wrapper is far easier to develop than what
> has already been built.
> I really do see Ceph as the future of storage, but there is some easily
> avoidable complexity that needs to be reduced.
>
> If there are already plans for these issues, I would really like to know.
>
> Thanks,
> Max
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Changing device-class using crushtool

2018-01-15 Thread Alex Gorbachev
Hi Wido,

On Wed, Jan 10, 2018 at 11:09 AM, Wido den Hollander  wrote:
> Hi,
>
> Is there a way to easily modify the device-class of devices on a offline
> CRUSHMap?
>
> I know I can decompile the CRUSHMap and do it, but that's a lot of work in a
> large environment.
>
> In larger environments I'm a fan of downloading the CRUSHMap, modifying it
> to my needs, testing it and injecting it at once into the cluster.
>
> crushtool can do a lot, you can also run tests using device classes, but
> there doesn't seem to be a way to modify the device-class using crushtool,
> is that correct?

This is how we do it in Storcium based on
http://docs.ceph.com/docs/master/rados/operations/crush-map/

ceph osd crush rm-device-class <osd-id> [<osd-id> ...]
ceph osd crush set-device-class <class> <osd-id> [<osd-id> ...]
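
For the fully offline workflow described in the question (download, edit,
test, then inject in one go), a sketch could look like this; the device
class, rule number and replica count below are placeholders:

ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# edit the device lines, e.g.
#   device 12 osd.12 class hdd   ->   device 12 osd.12 class ssd
crushtool -c crushmap.txt -o crushmap.new
crushtool -i crushmap.new --test --show-statistics --rule 0 --num-rep 3
ceph osd setcrushmap -i crushmap.new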

--
Best regards,
Alex Gorbachev
Storcium


>
> Wido
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Safe to delete data, metadata pools?

2018-01-15 Thread Richard Bade
Thanks John, I removed these pools on Friday and as you suspected
there was no impact.

Regards,
Rich
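
For the record, the removal itself is one command per pool; on Luminous and
later the monitors must first be told to allow pool deletion. A sketch:

ceph tell mon.\* injectargs '--mon-allow-pool-delete=true'
ceph osd pool delete data data --yes-i-really-really-mean-it
ceph osd pool delete metadata metadata --yes-i-really-really-mean-it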

On 8 January 2018 at 23:15, John Spray  wrote:
> On Mon, Jan 8, 2018 at 2:55 AM, Richard Bade  wrote:
>> Hi Everyone,
>> I've got a couple of pools that I don't believe are being used but
>> have a reasonably large number of pg's (approx 50% of our total pg's).
>> I'd like to delete them but as they were pre-existing when I inherited
>> the cluster, I wanted to make sure they aren't needed for anything
>> first.
>> Here's the details:
>> POOLS:
>> NAME   ID USED   %USED MAX AVAIL OBJECTS
>> data   0   0 088037G0
>> metadata   1   0 088037G0
>>
>> We don't run cephfs and I believe these are meant for that, but may
>> have been created by default when the cluster was set up (back on
>> dumpling or bobtail I think).
>> As far as I can tell there is no data in them. Do they need to exist
>> for some ceph function?
>> The pool names worry me a little, as they sound important.
>
> The data and metadata pools were indeed created by default in older
> versions of Ceph, for use by CephFS.  Since you're not using CephFS,
> and nobody is using the pools for anything else either (they're
> empty), you can go ahead and delete them.
>
>>
>> They have 3136 pg's each so I'd like to be rid of those so I can
>> increase the number of pg's in my actual data pools without getting
>> over the 300 pg's per osd.
>> Here's the osd dump:
>> pool 0 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash
>> rjenkins pg_num 3136 pgp_num 3136 last_change 1 crash_replay_interval
>> 45 min_read_recency_for_promote 1 min_write_recency_for_promote 1
>> stripe_width 0
>> pool 1 'metadata' replicated size 2 min_size 1 crush_ruleset 1
>> object_hash rjenkins pg_num 3136 pgp_num 3136 last_change 1
>> min_read_recency_for_promote 1 min_write_recency_for_promote 1
>> stripe_width 0
>>
>> Also, what performance impact am I likely to see when ceph removes the
>> empty pg's considering it's approx 50% of my total pg's on my 180
>> osd's.
>
> Given that they're empty, I'd expect little if any noticeable impact.
>
> John
>
>>
>> Thanks,
>> Rich
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw fails with "ERROR: failed to initialize watch: (34) Numerical result out of range"

2018-01-15 Thread Brad Hubbard
On Tue, Jan 16, 2018 at 1:35 AM, Alexander Peters  wrote:
I created the dump output but it looks very cryptic to me, so I can't really
make much sense of it. Is there anything to look for in particular?

Yes, basically we are looking for any line that ends in "= 34". You
might also find piping it through c++filt helps.

Something like...

$ c++filt 
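
For example, a filtered pass over that trace (a sketch; it assumes the
/tmp/ltrace.out path used in the ltrace invocation quoted below):

c++filt < /tmp/ltrace.out | grep '= 34$'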
> i think i am going to read up on how interpret ltrace output...
>
> BR
> Alex
>
> - Original Message -
> From: "Brad Hubbard" 
> To: "Alexander Peters" 
> CC: "Ceph Users" 
> Sent: Monday, 15 January 2018 03:09:53
> Subject: Re: [ceph-users] radosgw fails with "ERROR: failed to initialize 
> watch: (34) Numerical result out of range"
>
> On Mon, Jan 15, 2018 at 11:38 AM, Brad Hubbard  wrote:
>> On Mon, Jan 15, 2018 at 10:38 AM, Alexander Peters
>>  wrote:
>> Thanks for the reply - unfortunately the link you sent is behind a paywall, so
>> at least for now I can't read it.
>>
>> That's why I provided the cause as laid out in that article (pgp num > pg 
>> num).
>>
>> Do you have any settings in ceph.conf related to pg_num or pgp_num?
>>
>> If not, please add your details to http://tracker.ceph.com/issues/22351
>
> Rados can return ERANGE (34) in multiple places so identifying where
> might be a big step towards working this out.
>
> $ ltrace -fo /tmp/ltrace.out /usr/bin/radosgw --cluster ceph --name
> client.radosgw.ctrl02 --setuser ceph --setgroup ceph -f -d
>
> The objective is to find which function(s) return 34.
>
>>
>>>
>>> output of ceph osd dump shows that pgp num == pg num:
>>>
>>> [root@ctrl01 ~]# ceph osd dump
>>> epoch 142
>>> fsid 0e2d841f-68fd-4629-9813-ab083e8c0f10
>>> created 2017-12-20 23:04:59.781525
>>> modified 2018-01-14 21:30:57.528682
>>> flags sortbitwise,recovery_deletes,purged_snapdirs
>>> crush_version 6
>>> full_ratio 0.95
>>> backfillfull_ratio 0.9
>>> nearfull_ratio 0.85
>>> require_min_compat_client jewel
>>> min_compat_client jewel
>>> require_osd_release luminous
>>> pool 1 'glance' replicated size 3 min_size 2 crush_rule 0 object_hash
>>> rjenkins pg_num 64 pgp_num 64 last_change 119 flags hashpspool stripe_width
>>> 0 application rbd
>>> removed_snaps [1~3]
>>> pool 2 'cinder-2' replicated size 3 min_size 2 crush_rule 0 object_hash
>>> rjenkins pg_num 64 pgp_num 64 last_change 120 flags hashpspool stripe_width
>>> 0 application rbd
>>> removed_snaps [1~3]
>>> pool 3 'cinder-3' replicated size 3 min_size 2 crush_rule 0 object_hash
>>> rjenkins pg_num 64 pgp_num 64 last_change 121 flags hashpspool stripe_width
>>> 0 application rbd
>>> removed_snaps [1~3]
>>> pool 4 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash
>>> rjenkins pg_num 8 pgp_num 8 last_change 94 owner 18446744073709551615 flags
>>> hashpspool stripe_width 0 application rgw
>>> max_osd 3
>>> osd.0 up   in  weight 1 up_from 82 up_thru 140 down_at 79
>>> last_clean_interval [23,78) 10.16.0.11:6800/1795 10.16.0.11:6801/1795
>>> 10.16.0.11:6802/1795 10.16.0.11:6803/1795 exists,up
>>> abe33844-6d98-4ede-81a8-a8bdc92dada8
>>> osd.1 up   in  weight 1 up_from 73 up_thru 140 down_at 71
>>> last_clean_interval [55,72) 10.16.0.13:6800/1756 10.16.0.13:6804/1001756
>>> 10.16.0.13:6805/1001756 10.16.0.13:6806/1001756 exists,up
>>> 0dab9372-6ffe-4a23-a8b7-4edca3745a2a
>>> osd.2 up   in  weight 1 up_from 140 up_thru 140 down_at 133
>>> last_clean_interval [31,132) 10.16.0.12:6800/1749 10.16.0.12:6801/1749
>>> 10.16.0.12:6802/1749 10.16.0.12:6803/1749 exists,up
>>> 220bba17-8119-4035-9e43-5b8eaa27562f
>>>
>>>
>>> Am 15.01.2018 um 01:33 schrieb Brad Hubbard :
>>>
>>> On Mon, Jan 15, 2018 at 8:34 AM, Alexander Peters
>>>  wrote:
>>>
>>> Hello
>>>
>> I am currently experiencing a strange issue with my radosgw. It fails to
>> start and all it says is:
>>> [root@ctrl02 ~]# /usr/bin/radosgw --cluster ceph --name
>>> client.radosgw.ctrl02 --setuser ceph --setgroup ceph -f -d
>>> 2018-01-14 21:30:57.132007 7f44ddd18e00  0 deferred set uid:gid to 167:167
>>> (ceph:ceph)
>>> 2018-01-14 21:30:57.132161 7f44ddd18e00  0 ceph version 12.2.2
>>> (cf0baba3b47f9427c6c97e2144b094b7e5ba) luminous (stable), process
>>> (unknown), pid 13928
>>> 2018-01-14 21:30:57.556672 7f44ddd18e00 -1 ERROR: failed to initialize
>>> watch: (34) Numerical result out of range
>>> 2018-01-14 21:30:57.558752 7f44ddd18e00 -1 Couldn't init storage provider
>>> (RADOS)
>>>
>>> (when started via systemctl it writes the same lines to the logfile)
>>>
>> strange thing is that it is working on another env that was installed with
>>> the same set of ansible playbooks.
>>> OS is CentOS Linux release 7.4.1708 (Core)
>>>
>> Ceph is up and running (I am currently using it for storing volumes and
>> images from Openstack)
>>>
>>> Does anyone have an idea how to debug this?
>>>
>>>
>>> According to 

Re: [ceph-users] Removing cache tier for RBD pool

2018-01-15 Thread Mike Lovell
On Mon, Jan 8, 2018 at 6:08 AM, Jens-U. Mozdzen  wrote:

> Hi *,
>
> trying to remove a caching tier from a pool used for RBD / Openstack, we
> followed the procedure from http://docs.ceph.com/docs/mast
> er/rados/operations/cache-tiering/#removing-a-writeback-cache and ran
> into problems.
>
> The cluster is currently running Ceph 12.2.2, the caching tier was created
> with an earlier release of Ceph.
>
> First of all, setting the cache-mode to "forward" is reported to be
> unsafe, which is not mentioned in the documentation - if it's really meant
> to be used in this case, the need for "--yes-i-really-mean-it" should be
> documented.
>
> Unfortunately, using "rados -p hot-storage cache-flush-evict-all" not only
> reported errors ("file not found") for many objects, but left us with quite
> a number of objects in the pool and new ones being created, despite the
> "forward" mode. Even after stopping all Openstack instances ("VMs"), we
> could also see that the remaining objects in the pool were still locked.
> Manually unlocking these via rados commands worked, but
> "cache-flush-evict-all" then still reported those "file not found" errors
> and 1070 objects remained in the pool, like before. We checked the
> remaining objects via "rados stat" both in the hot-storage and the
> cold-storage pool and could see that every hot-storage object had a
> counter-part in cold-storage with identical stat info. We also compared
> some of the objects (with size > 0) and found the hot-storage and
> cold-storage entities to be identical.
>
> We aborted that attempt, reverted the mode to "writeback" and restarted
> the Openstack cluster - everything was working fine again, of course still
> using the cache tier.
>
> During a recent maintenance window, the Openstack cluster was shut down
> again and we re-tried the procedure. As there were no active users of the
> images pool, we skipped the step of forcing the cache mode to forward and
> immediately issued the "cache-flush-evict-all" command. Again 1070 objects
> remained in the hot-storage pool (and gave "file not found" errors), but
> unlike last time, none were locked.
>
> Out of curiosity we then issued loops of "rados -p hot-storage cache-flush
> " and "rados -p hot-storage cache-evict " for all
> objects in the hot-storage pool and surprisingly not only received no error
> messages at all, but were left with an empty hot-storage pool! We then
> proceeded with the further steps from the docs and were able to
> successfully remove the cache tier.
>
> This leaves us with two questions:
>
> 1. Does setting the cache mode to "forward" lead to above situation of
> remaining locks on hot-storage pool objects? Maybe the clients' unlock
> requests are forwarded to the cold-storage pool, leaving the hot-storage
> objects locked? If so, this should be documented and it'd seem impossible
> to cleanly remove a cache tier during live operations.
>
> 2. What is the significant difference between "rados
> cache-flush-evict-all" and separate "cache-flush" and "cache-evict" cycles?
> Or is it some implementation error that leads to those "file not found"
> errors with "cache-flush-evict-all", while the manual cycles work
> successfully?
>
> Thank you for any insight you might be able to share.
>
> Regards,
> Jens
>

i've removed a cache tier in environments a few times. the only locked
files i ran into were the rbd_directory and rbd_header objects for each
volume. the rbd_headers for each rbd volume are locked as long as the vm is
running. every time i've tried to remove a cache tier, i shutdown all of
the vms before starting the procedure and there wasn't any problem getting
things flushed+evicted. so i can't really give any further insight into
what might have happened other than it worked for me. i set the cache-mode
to forward every time before flushing and evicting objects.

i don't think there really is a significant technical difference between
the cache-flush-evict-all command and doing separate cache-flush and
cache-evict on individual objects. my understanding is
cache-flush-evict-all is just a short cut to getting everything in the
cache flushed and evicted. did the cache-flush-evict-all error on some
objects where the separate operations succeeded? your description doesn't
say whether that was the case, but you mention using both styles during your
second attempt.
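
for reference, the per-object cycle described in the thread can be scripted
in a few lines (a sketch; it assumes the hot-storage pool name used above):

for obj in $(rados -p hot-storage ls); do
    rados -p hot-storage cache-flush "$obj"
    rados -p hot-storage cache-evict "$obj"
done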

there being objects left in the hot storage pool is something i've seen,
even after it looks like everything has been flushed. when i dug deeper, it
looked like all of the objects left in the pool were the hitset objects
that the cache tier uses for tracking how frequently objects are used.
those hitsets need to be persisted in case an osd restarts or the pg is
migrated to another osd. the method it uses for that is just storing the
hitset as another object but one that is internal to ceph. since they're
internal, the objects are hidden from some commands like "rados ls" but
still get counted as 

Re: [ceph-users] Adding a host node back to ceph cluster

2018-01-15 Thread Marc Roos
 
Maybe for the future: 

rpm {-V|--verify} [select-options] [verify-options]

    Verifying a package compares information about the installed files in
    the package with information about the files taken from the package
    metadata stored in the rpm database. Among other things, verifying
    compares the size, digest, permissions, type, owner and group of each
    file. Any discrepancies are displayed. Files that were not installed
    from the package, for example, documentation files excluded on
    installation using the "--excludedocs" option, will be silently ignored.
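
A concrete invocation for this case might be (a sketch; it verifies every
installed ceph package against the rpm database):

rpm -qa 'ceph*' | xargs -r rpm -V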





-Original Message-
From: Geoffrey Rhodes [mailto:geoff...@rhodes.org.za] 
Sent: maandag 15 januari 2018 16:39
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Adding a host node back to ceph cluster

Good day,

I'm having an issue re-deploying a host back into my production ceph 
cluster.
Due to some bad memory (picked up by a scrub) which has been replaced I 
felt the need to re-install the host to be sure no host files were 
damaged.

Prior to decommissioning the host I set the crush weights on each osd 
to 0.
Once the osd's had flushed all data I stopped the daemons.
I then purged the osd's from the crushmap with "ceph osd purge".
Followed by "ceph osd crush rm {host}" to remove the host bucket from 
the crush map.

I also ran "ceph-deploy purge {host}" & "ceph-deploy purgedata {host}" 
from the management node.
I then reinstalled the host and made the necessary config changes 
followed by the appropriate ceph-deploy commands (ceph-deploy 
install..., ceph-deploy admin..., ceph-deploy osd create...) to bring 
the host & its osd's back into the cluster - same as I would when 
adding a new host node to the cluster.

Running ceph osd df tree shows the osd's however the host node is not 
displayed.
Inspecting the crush map I see no host bucket has been created or any 
host's osd's listed.
The osd's also did not start which explains the weight being 0 but I 
presume the osd's not starting isn't the only issue since the crush map 
lacks the newly installed host detail.

Could anybody maybe tell me where I've gone wrong?
I'm also assuming there shouldn't be an issue using the same host name 
again or do I manually add the host bucket and osd detail back into the 
crush map or should ceph-deploy not take care of that?

Thanks

OS: Ubuntu 16.04.3 LTS
Ceph version: 12.2.1 / 12.2.2 - Luminous


Kind regards
Geoffrey Rhodes



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous RGW Metadata Search

2018-01-15 Thread Youzhong Yang
Finally, the issue that has haunted me for quite some time turned out to be
a ceph.conf issue:

I had
osd_pool_default_pg_num = 100
osd_pool_default_pgp_num = 100

once I changed to
osd_pool_default_pg_num = 32
osd_pool_default_pgp_num = 32

then there was no issue starting the second rgw process.

No idea why 32 works but 100 doesn't. The debug output is useless and log
files too. Just insane.
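
For what it's worth, one plausible factor is the per-OSD PG limit introduced
in Luminous (mon_max_pg_per_osd, default 200): with only a few OSDs, pools
created at pg_num 100 can push past it, and pool creation then fails. A
quick check, as a sketch (run on the monitor host; the monitor name is taken
from the ceph.conf below):

ceph daemon mon.ceph-mon1 config get mon_max_pg_per_osd
ceph osd df   # the PGS column shows how many PGs each OSD already holds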

Anyway, thanks.


On Fri, Jan 12, 2018 at 7:25 PM, Yehuda Sadeh-Weinraub 
wrote:

> The errors you're seeing there don't look like related to
> elasticsearch. It's a generic radosgw related error that says that it
> failed to reach the rados (ceph) backend. You can try bumping up the
> messenger log (debug ms =1) and see if there's any hint in there.
>
> Yehuda
>
> On Fri, Jan 12, 2018 at 12:54 PM, Youzhong Yang 
> wrote:
> > So I did the exact same thing using Kraken and the same set of VMs, no
> > issue. What is the magic to make it work in Luminous? Anyone lucky
> enough to
> > have this RGW ElasticSearch working using Luminous?
> >
> > On Mon, Jan 8, 2018 at 10:26 AM, Youzhong Yang 
> wrote:
> >>
> >> Hi Yehuda,
> >>
> >> Thanks for replying.
> >>
> >> >radosgw failed to connect to your ceph cluster. Does the rados command
> >> >with the same connection params work?
> >>
> >> I am not quite sure what to do by running rados command to test.
> >>
> >> So I tried again, could you please take a look and check what could have
> >> gone wrong?
> >>
> >> Here are what I did:
> >>
> >>  On ceph admin node, I removed installation on ceph-rgw1 and
> >> ceph-rgw2, reinstalled rgw on ceph-rgw1, stopped the rgw service, removed
> all rgw
> >> pools. Elasticsearch is running on ceph-rgw2 node on port 9200.
> >>
> >> ceph-deploy purge ceph-rgw1
> >> ceph-deploy purge ceph-rgw2
> >> ceph-deploy purgedata ceph-rgw2
> >> ceph-deploy purgedata ceph-rgw1
> >> ceph-deploy install --release luminous ceph-rgw1
> >> ceph-deploy admin ceph-rgw1
> >> ceph-deploy rgw create ceph-rgw1
> >> ssh ceph-rgw1 sudo systemctl stop ceph-rado...@rgw.ceph-rgw1
> >> rados rmpool default.rgw.log default.rgw.log
> --yes-i-really-really-mean-it
> >> rados rmpool default.rgw.meta default.rgw.meta
> >> --yes-i-really-really-mean-it
> >> rados rmpool default.rgw.control default.rgw.control
> >> --yes-i-really-really-mean-it
> >> rados rmpool .rgw.root .rgw.root --yes-i-really-really-mean-it
> >>
> >>  On ceph-rgw1 node:
> >>
> >> export RGWHOST="ceph-rgw1"
> >> export ELASTICHOST="ceph-rgw2"
> >> export REALM="demo"
> >> export ZONEGRP="zone1"
> >> export ZONE1="zone1-a"
> >> export ZONE2="zone1-b"
> >> export SYNC_AKEY="$( cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 20
> |
> >> head -n 1 )"
> >> export SYNC_SKEY="$( cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 40
> |
> >> head -n 1 )"
> >>
> >> radosgw-admin realm create --rgw-realm=${REALM} --default
> >> radosgw-admin zonegroup create --rgw-realm=${REALM}
> >> --rgw-zonegroup=${ZONEGRP} --endpoints=http://${RGWHOST}:8000 --master
> >> --default
> >> radosgw-admin zone create --rgw-realm=${REALM}
> --rgw-zonegroup=${ZONEGRP}
> >> --rgw-zone=${ZONE1} --endpoints=http://${RGWHOST}:8000
> >> --access-key=${SYNC_AKEY} --secret=${SYNC_SKEY} --master --default
> >> radosgw-admin user create --uid=sync --display-name="zone sync"
> >> --access-key=${SYNC_AKEY} --secret=${SYNC_SKEY} --system
> >> radosgw-admin period update --commit
> >> sudo systemctl start ceph-radosgw@rgw.${RGWHOST}
> >>
> >> radosgw-admin zone create --rgw-realm=${REALM}
> --rgw-zonegroup=${ZONEGRP}
> >> --rgw-zone=${ZONE2} --access-key=${SYNC_AKEY} --secret=${SYNC_SKEY}
> >> --endpoints=http://${RGWHOST}:8002
> >> radosgw-admin zone modify --rgw-realm=${REALM}
> --rgw-zonegroup=${ZONEGRP}
> >> --rgw-zone=${ZONE2} --tier-type=elasticsearch
> >> --tier-config=endpoint=http://${ELASTICHOST}:9200,num_
> replicas=1,num_shards=10
> >> radosgw-admin period update --commit
> >>
> >> sudo systemctl restart ceph-radosgw@rgw.${RGWHOST}
> >> sudo radosgw --keyring /etc/ceph/ceph.client.admin.keyring -f
> >> --rgw-zone=${ZONE2} --rgw-frontends="civetweb port=8002"
> >> 2018-01-08 00:21:54.389432 7f0fe9cd2e80 -1 Couldn't init storage
> provider
> >> (RADOS)
> >>
> >>  As you can see, starting rgw on port 8002 failed, but rgw on port
> >> 8000 was started successfully.
> >>  Here are some more info which may be useful for diagnosis:
> >>
> >> $ cat /etc/ceph/ceph.conf
> >> [global]
> >> fsid = 3e5a32d4-e45e-48dd-a3c5-f6f28fef8edf
> >> mon_initial_members = ceph-mon1, ceph-osd1, ceph-osd2, ceph-osd3
> >> mon_host = 172.30.212.226,172.30.212.227,172.30.212.228,172.30.212.250
> >> auth_cluster_required = cephx
> >> auth_service_required = cephx
> >> auth_client_required = cephx
> >> osd_pool_default_size = 2
> >> osd_pool_default_min_size = 2
> >> osd_pool_default_pg_num = 100
> >> osd_pool_default_pgp_num = 100
> >> bluestore_compression_algorithm = zlib
> >> bluestore_compression_mode = 

Re: [ceph-users] slow requests on a specific osd

2018-01-15 Thread lists

Hi Wes,

On 15-1-2018 20:57, Wes Dillingham wrote:
My understanding is that the exact same objects would move back to the 
OSD if weight went 1 -> 0 -> 1 given the same Cluster state and same 
object names, CRUSH is deterministic so that would be the almost certain 
result.




Ok, thanks! So this would be a useless exercise. :-|

Thanks very much for your feedback, Wes!

MJ
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] slow requests on a specific osd

2018-01-15 Thread Wes Dillingham
My understanding is that the exact same objects would move back to the OSD
if weight went 1 -> 0 -> 1 given the same Cluster state and same object
names, CRUSH is deterministic so that would be the almost certain result.
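
If the goal is a better spread rather than a round trip, the usual knob on a
jewel cluster is reweight-by-utilization; a sketch (the 110% threshold is a
placeholder):

ceph osd test-reweight-by-utilization 110   # dry run, shows what would change
ceph osd reweight-by-utilization 110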

On Mon, Jan 15, 2018 at 2:46 PM, lists  wrote:

> Hi Wes,
>
> On 15-1-2018 20:32, Wes Dillingham wrote:
>
>> I dont hear a lot of people discuss using xfs_fsr on OSDs and going over
>> the mailing list history it seems to have been brought up very infrequently
>> and never as a suggestion for regular maintenance. Perhaps its not needed.
>>
> True, it's just something we've always done on all our xfs filesystems, to
> keep them speedy and snappy. I've disabled it, and then it doesn't happen.
>
> Perhaps I'll keep it disabled.
>
> But on this last question, about data distribution across OSDs:
>
> In that case, how about reweighting that osd.10 to "0", wait until
>> all data has moved off osd.10, and then setting it back to "1".
>> Would this result in *exactly* the same situation as before, or
>> would it at least cause the data to have spread move better across
>> the other OSDs?
>>
>
> Would it work like that? Or would setting it back to "1" give me again the
> same data on this OSD that we started with?
>
> Thanks for your comments,
>
> MJ
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Respectfully,

Wes Dillingham
wes_dilling...@harvard.edu
Research Computing | Senior CyberInfrastructure Storage Engineer
Harvard University | 38 Oxford Street, Cambridge, Ma 02138 | Room 204
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] slow requests on a specific osd

2018-01-15 Thread lists

Hi Wes,

On 15-1-2018 20:32, Wes Dillingham wrote:
I dont hear a lot of people discuss using xfs_fsr on OSDs and going over 
the mailing list history it seems to have been brought up very 
infrequently and never as a suggestion for regular maintenance. Perhaps 
its not needed.
True, it's just something we've always done on all our xfs filesystems, 
to keep them speedy and snappy. I've disabled it, and then it doesn't 
happen.


Perhaps I'll keep it disabled.

But on this last question, about data distribution across OSDs:


In that case, how about reweighting that osd.10 to "0", wait until
all data has moved off osd.10, and then setting it back to "1".
Would this result in *exactly* the same situation as before, or
would it at least cause the data to have spread move better across
the other OSDs?


Would it work like that? Or would setting it back to "1" give me again 
the same data on this OSD that we started with?


Thanks for your comments,
MJ
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] slow requests on a specific osd

2018-01-15 Thread Wes Dillingham
I dont hear a lot of people discuss using xfs_fsr on OSDs and going over
the mailing list history it seems to have been brought up very infrequently
and never as a suggestion for regular maintenance. Perhaps it's not needed.

One thing to consider trying, and to rule out something funky with the XFS
filesystem on that particular OSD/drive would be to remove the OSD entirely
from the cluster, reformat the disk, and then rebuild the OSD, putting a
brand new XFS on the OSD.

On Mon, Jan 15, 2018 at 7:36 AM, lists  wrote:

> Hi,
>
> On our three-node 24 OSDs ceph 10.2.10 cluster, we have started seeing
> slow requests on a specific OSD, during the the two-hour nightly xfs_fsr
> run from 05:00 - 07:00. This started after we applied the meltdown patches.
>
> The specific osd.10 also has the highest space utilization of all OSDs
> cluster-wide, with 45%, while the others are mostly around 40%. All OSDs
> are the same 4TB platters with journal on ssd, all with weight 1.
>
> Smart info for osd.10 shows nothing interesting I think:
>
> Current Drive Temperature: 27 C
>> Drive Trip Temperature:60 C
>>
>> Manufactured in week 04 of year 2016
>> Specified cycle count over device lifetime:  1
>> Accumulated start-stop cycles:  53
>> Specified load-unload count over device lifetime:  30
>> Accumulated load-unload cycles:  697
>> Elements in grown defect list: 0
>>
>> Vendor (Seagate) cache information
>>   Blocks sent to initiator = 1933129649
>>   Blocks received from initiator = 869206640
>>   Blocks read from cache and sent to initiator = 2149311508
>>   Number of read and write commands whose size <= segment size = 676356809
>>   Number of read and write commands whose size > segment size = 12734900
>>
>> Vendor (Seagate/Hitachi) factory information
>>   number of hours powered up = 13625.88
>>   number of minutes until next internal SMART test = 8
>>
>
> Now my question:
> Could it be that osd.10 just happens to contain some data chunks that are
> heavily needed by the VMs around that time, and that the added load of an
> xfs_fsr is simply too much for it to handle?
>
> In that case, how about reweighting that osd.10 to "0", wait until all
> data has moved off osd.10, and then setting it back to "1". Would this
> result in *exactly* the same situation as before, or would it at least
> cause the data to have spread move better across the other OSDs?
>
> (with the idea that better data spread across OSDs brings also better
> distribution of load between the OSDs)
>
> Or other ideas to check out?
>
> MJ
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Respectfully,

Wes Dillingham
wes_dilling...@harvard.edu
Research Computing | Senior CyberInfrastructure Storage Engineer
Harvard University | 38 Oxford Street, Cambridge, Ma 02138 | Room 204
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Switching a pool from EC to replicated online ?

2018-01-15 Thread Wes Dillingham
You would need to create a new pool and migrate the data to that new pool.

Replicated pool fronting an EC pool for RBD is a known-bad workload:
http://docs.ceph.com/docs/master/rados/operations/cache-tiering/#a-word-of-caution
but others' mileage may vary, I suppose.

In order to migrate you could do an RBD at a time, I would probably take a
snapshot and than do an `rbd cp` operation from the poolA/snap to
poolB/image
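
A sketch of that per-image flow (pool and image names are placeholders):

rbd snap create poolA/imageX@migrate
rbd cp poolA/imageX@migrate poolB/imageX
rbd snap rm poolA/imageX@migrate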

If you are okay with the VMs being powered down you could consider `rbd mv`,
though it doesn't support renames across pools; I would prefer the cp method
in any case.

You could also do a wholesale pool copy using `rados cppool` see
http://ceph.com/geen-categorie/ceph-pool-migration/

best of luck.

On Sat, Jan 13, 2018 at 6:37 PM, moftah moftah  wrote:

> Hi All,
> is there a way to switch a pool that is set to be EC to being replicated
> without the need to switch to a new pool and migrate data?
>
> I am getting poor results from EC and want to switch to replicated, but I
> already have customers on the system.
> I am using ceph 11
> the EC already have cache tier that is replicated
>
> Thanks
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
Respectfully,

Wes Dillingham
wes_dilling...@harvard.edu
Research Computing | Senior CyberInfrastructure Storage Engineer
Harvard University | 38 Oxford Street, Cambridge, Ma 02138 | Room 204
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] subscribe to ceph-user list

2018-01-15 Thread German Anders

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Bug in RadosGW resharding? Hangs again...

2018-01-15 Thread Martin Emrich

Hi!

After having a completely broken radosgw setup due to damaged buckets, I 
completely deleted all rgw pools, and started from scratch.


But my problem is reproducible. After pushing ca. 10 objects into a 
bucket, the resharding process appears to start, and the bucket is now 
unresponsive.


I just see lots of these messages in all rgw logs:

2018-01-15 16:57:45.108826 7fd1779b1700  0 block_while_resharding ERROR: 
bucket is still resharding, please retry
2018-01-15 16:57:45.119184 7fd1779b1700  0 NOTICE: resharding operation 
on bucket index detected, blocking
2018-01-15 16:57:45.260751 7fd1120e6700  0 block_while_resharding ERROR: 
bucket is still resharding, please retry
2018-01-15 16:57:45.280410 7fd1120e6700  0 NOTICE: resharding operation 
on bucket index detected, blocking
2018-01-15 16:57:45.300775 7fd15b979700  0 block_while_resharding ERROR: 
bucket is still resharding, please retry
2018-01-15 16:57:45.300971 7fd15b979700  0 WARNING: set_req_state_err 
err_no=2300 resorting to 500
2018-01-15 16:57:45.301042 7fd15b979700  0 ERROR: 
RESTFUL_IO(s)->complete_header() returned err=Input/output error


One radosgw process and two OSDs housing the bucket index/metadata are 
still busy, but it seems to be stuck again.


How long is this resharding process supposed to take? I cannot believe 
that an application is supposed to block for more than half an hour...


I feel inclined to open a bug report, but I am as yet unsure where the 
problem lies.


Some information:

* 3 RGW processes, 3 OSD hosts with 12 HDD OSDs and 6 SSD OSDs
* Ceph 12.2.2
* Auto-Resharding on, Bucket Versioning & Lifecycle rule enabled.
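
For what it's worth, the reshard queue can be inspected, and a stuck job
cancelled, with the reshard admin commands (a sketch; the bucket name is a
placeholder):

radosgw-admin reshard list
radosgw-admin reshard status --bucket=<bucket>
radosgw-admin reshard cancel --bucket=<bucket>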

Thanks,

Martin

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Adding a host node back to ceph cluster

2018-01-15 Thread Geoffrey Rhodes
Good day,

I'm having an issue re-deploying a host back into my production ceph
cluster.
Due to some bad memory (picked up by a scrub) which has been replaced I
felt the need to re-install the host to be sure no host files were damaged.

Prior to decommissioning the host I set the crush weights on each osd to 0.
Once the osd's had flushed all data I stopped the daemons.
I then purged the osd's from the crushmap with "ceph osd purge".
Followed by "ceph osd crush rm {host}" to remove the host bucket from the
crush map.

I also ran "ceph-deploy purge {host}" & "ceph-deploy purgedata {host}" from
the management node.
I then reinstalled the host and made the necessary config changes followed
by the appropriate ceph-deploy commands (ceph-deploy install...,
ceph-deploy admin..., ceph-deploy osd create...) to bring the host & its
osd's back into the cluster - same as I would when adding a new host node
to the cluster.
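
For reference, the sequence described above condensed into commands (a
sketch; ids and hostnames are placeholders and the ceph-deploy arguments
are abbreviated as in the text):

# decommission
ceph osd crush reweight osd.<id> 0        # repeat per osd, wait for data to drain
systemctl stop ceph-osd@<id>
ceph osd purge <id> --yes-i-really-mean-it
ceph osd crush rm <host>
ceph-deploy purge <host>
ceph-deploy purgedata <host>

# redeploy
ceph-deploy install --release luminous <host>
ceph-deploy admin <host>
ceph-deploy osd create ...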

Running ceph osd df tree shows the osd's however the host node is not
displayed.
Inspecting the crush map I see no host bucket has been created or any
host's osd's listed.
The osd's also did not start which explains the weight being 0 but I
presume the osd's not starting isn't the only issue since the crush map
lacks the newly installed host detail.

Could anybody maybe tell me where I've gone wrong?
I'm also assuming there shouldn't be an issue using the same host name
again. Or do I need to manually add the host bucket and osd detail back into
the crush map, or should ceph-deploy not take care of that?

Thanks

OS: Ubuntu 16.04.3 LTS
Ceph version: 12.2.1 / 12.2.2 - Luminous


Kind regards
Geoffrey Rhodes
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw fails with "ERROR: failed to initialize watch: (34) Numerical result out of range"

2018-01-15 Thread Alexander Peters
I created the dump output but it looks very cryptic to me, so I can't really
make much sense of it. Is there anything to look for in particular?

I think I am going to read up on how to interpret ltrace output...

BR
Alex

- Original Message -
From: "Brad Hubbard" 
To: "Alexander Peters" 
CC: "Ceph Users" 
Sent: Monday, 15 January 2018 03:09:53
Subject: Re: [ceph-users] radosgw fails with "ERROR: failed to initialize 
watch: (34) Numerical result out of range"

On Mon, Jan 15, 2018 at 11:38 AM, Brad Hubbard  wrote:
> On Mon, Jan 15, 2018 at 10:38 AM, Alexander Peters
>  wrote:
>> Thanks for the reply - unfortunately the link you sent is behind a paywall, so
>> at least for now I can't read it.
>
> That's why I provided the cause as laid out in that article (pgp num > pg 
> num).
>
> Do you have any settings in ceph.conf related to pg_num or pgp_num?
>
> If not, please add your details to http://tracker.ceph.com/issues/22351

Rados can return ERANGE (34) in multiple places so identifying where
might be a big step towards working this out.

$ ltrace -fo /tmp/ltrace.out /usr/bin/radosgw --cluster ceph --name
client.radosgw.ctrl02 --setuser ceph --setgroup ceph -f -d

The objective is to find which function(s) return 34.

>
>>
>> output of ceph osd dump shows that pgp num == pg num:
>>
>> [root@ctrl01 ~]# ceph osd dump
>> epoch 142
>> fsid 0e2d841f-68fd-4629-9813-ab083e8c0f10
>> created 2017-12-20 23:04:59.781525
>> modified 2018-01-14 21:30:57.528682
>> flags sortbitwise,recovery_deletes,purged_snapdirs
>> crush_version 6
>> full_ratio 0.95
>> backfillfull_ratio 0.9
>> nearfull_ratio 0.85
>> require_min_compat_client jewel
>> min_compat_client jewel
>> require_osd_release luminous
>> pool 1 'glance' replicated size 3 min_size 2 crush_rule 0 object_hash
>> rjenkins pg_num 64 pgp_num 64 last_change 119 flags hashpspool stripe_width
>> 0 application rbd
>> removed_snaps [1~3]
>> pool 2 'cinder-2' replicated size 3 min_size 2 crush_rule 0 object_hash
>> rjenkins pg_num 64 pgp_num 64 last_change 120 flags hashpspool stripe_width
>> 0 application rbd
>> removed_snaps [1~3]
>> pool 3 'cinder-3' replicated size 3 min_size 2 crush_rule 0 object_hash
>> rjenkins pg_num 64 pgp_num 64 last_change 121 flags hashpspool stripe_width
>> 0 application rbd
>> removed_snaps [1~3]
>> pool 4 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash
>> rjenkins pg_num 8 pgp_num 8 last_change 94 owner 18446744073709551615 flags
>> hashpspool stripe_width 0 application rgw
>> max_osd 3
>> osd.0 up   in  weight 1 up_from 82 up_thru 140 down_at 79
>> last_clean_interval [23,78) 10.16.0.11:6800/1795 10.16.0.11:6801/1795
>> 10.16.0.11:6802/1795 10.16.0.11:6803/1795 exists,up
>> abe33844-6d98-4ede-81a8-a8bdc92dada8
>> osd.1 up   in  weight 1 up_from 73 up_thru 140 down_at 71
>> last_clean_interval [55,72) 10.16.0.13:6800/1756 10.16.0.13:6804/1001756
>> 10.16.0.13:6805/1001756 10.16.0.13:6806/1001756 exists,up
>> 0dab9372-6ffe-4a23-a8b7-4edca3745a2a
>> osd.2 up   in  weight 1 up_from 140 up_thru 140 down_at 133
>> last_clean_interval [31,132) 10.16.0.12:6800/1749 10.16.0.12:6801/1749
>> 10.16.0.12:6802/1749 10.16.0.12:6803/1749 exists,up
>> 220bba17-8119-4035-9e43-5b8eaa27562f
>>
>>
>> Am 15.01.2018 um 01:33 schrieb Brad Hubbard :
>>
>> On Mon, Jan 15, 2018 at 8:34 AM, Alexander Peters
>>  wrote:
>>
>> Hello
>>
>> I am currently experiencing a strange issue with my radosgw. It fails to
>> start and all it says is:
>> [root@ctrl02 ~]# /usr/bin/radosgw --cluster ceph --name
>> client.radosgw.ctrl02 --setuser ceph --setgroup ceph -f -d
>> 2018-01-14 21:30:57.132007 7f44ddd18e00  0 deferred set uid:gid to 167:167
>> (ceph:ceph)
>> 2018-01-14 21:30:57.132161 7f44ddd18e00  0 ceph version 12.2.2
>> (cf0baba3b47f9427c6c97e2144b094b7e5ba) luminous (stable), process
>> (unknown), pid 13928
>> 2018-01-14 21:30:57.556672 7f44ddd18e00 -1 ERROR: failed to initialize
>> watch: (34) Numerical result out of range
>> 2018-01-14 21:30:57.558752 7f44ddd18e00 -1 Couldn't init storage provider
>> (RADOS)
>>
>> (when started via systemctl it writes the same lines to the logfile)
>>
>> strange thing is that it is working on another env that was installed with
>> the same set of ansible playbooks.
>> OS is CentOS Linux release 7.4.1708 (Core)
>>
>> Ceph is up and running (I am currently using it for storing volumes and
>> images from Openstack)
>>
>> Does anyone have an idea how to debug this?
>>
>>
>> According to https://access.redhat.com/solutions/2778161 this can
>> happen if your pgp num is higher than the pg num.
>>
>> Check "ceph osd dump" output for that possibility.
>>
>>
>> Best Regards
>> Alexander
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> 

[ceph-users] Error message in the logs: "meta sync: ERROR: failed to read mdlog info with (2) No such file or directory"

2018-01-15 Thread Victor Flávio
Hello,

We have a radosgw cluster (version 12.2.2) in multisite mode. Our cluster
consists of one master realm, with one master zonegroup and two
zones (one of which is the master zone).

We've followed the instructions of Ceph documentation to install and
configure our cluster.

The cluster works as expected, the objects and users are being replicated
between the zones, but we always are getting this error message in our logs:


2018-01-15 12:25:00.119301 7f68868e5700  1 meta sync: ERROR: failed to read
mdlog info with (2) No such file or directory


Some details about the error message(s):
 - They are only printed in the non-master zone log;
 - They are only printed when this "slave" zone tries to sync the metadata
info;
 - In each synchronization cycle of the metadata info, the number of these
error messages equals the number of metadata log shards;
 - When we run the command "radosgw-admin mdlog list", we get an empty
array as output in both zones;
 - The output of "radosgw-admin sync status" says everything is ok and synced,
which is true, despite the mdlog error messages in the log.

Has anyone had this same problem, and how did you fix it? I've tried many
times to fix it, without success.


-- 
Victor Flávio de Oliveira Santos
Fullstack Developer/DevOps
http://victorflavio.me
Twitter: @victorflavio
Skype: victorflavio.oliveira
Github: https://github.com/victorflavio
Telefone/Phone: +55 62 81616477
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] slow requests on a specific osd

2018-01-15 Thread lists

Hi,

On our three-node 24 OSDs ceph 10.2.10 cluster, we have started seeing 
slow requests on a specific OSD, during the two-hour nightly xfs_fsr 
run from 05:00 - 07:00. This started after we applied the meltdown patches.


The specific osd.10 also has the highest space utilization of all OSDs 
cluster-wide, with 45%, while the others are mostly around 40%. All OSDs 
are the same 4TB platters with journal on ssd, all with weight 1.


Smart info for osd.10 shows nothing interesting I think:


Current Drive Temperature: 27 C
Drive Trip Temperature:60 C

Manufactured in week 04 of year 2016
Specified cycle count over device lifetime:  1
Accumulated start-stop cycles:  53
Specified load-unload count over device lifetime:  30
Accumulated load-unload cycles:  697
Elements in grown defect list: 0

Vendor (Seagate) cache information
  Blocks sent to initiator = 1933129649
  Blocks received from initiator = 869206640
  Blocks read from cache and sent to initiator = 2149311508
  Number of read and write commands whose size <= segment size = 676356809
  Number of read and write commands whose size > segment size = 12734900

Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 13625.88
  number of minutes until next internal SMART test = 8


Now my question:
Could it be that osd.10 just happens to contain some data chunks that 
are heavily needed by the VMs around that time, and that the added load 
of an xfs_fsr is simply too much for it to handle?


In that case, how about reweighting that osd.10 to "0", wait until all 
data has moved off osd.10, and then setting it back to "1". Would this 
result in *exactly* the same situation as before, or would it at least 
cause the data to spread out better across the other OSDs?


(with the idea that better data spread across OSDs also brings better 
distribution of load between the OSDs)


Or other ideas to check out?

MJ
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com