Re: [ceph-users] Ceph on Solaris / Illumos

2015-04-15 Thread Mark Nelson



On 04/15/2015 08:16 AM, Jake Young wrote:

Has anyone compiled ceph (either osd or client) on a Solaris based OS?

The thread on ZFS support for osd got me thinking about using solaris as
an osd server. It would have much better ZFS performance and I wonder if
the osd performance without a journal would be 2x better.


Doubt it.  You may be able to do a little better, but you have to pay 
the piper somehow.  If you clone from the journal you will introduce 
fragmentation.  If you throw the journal away you'll suffer for 
everything but very large writes, unless you throw safety away.  I think 
if we are going to generally beat filestore (not just for optimal 
benchmarking tests!) it's going to take some very careful cleverness. 
Thankfully Sage is very clever and is working on it in newstore. Even 
there, filestore has been proving difficult to beat for writes.




A second thought I had was using the Comstar iscsi / fcoe target
software that is part of Solaris. Has anyone done anything with a ceph
rbd client for Solaris based OSs?


No idea!



Jake


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph on Solaris / Illumos

2015-04-15 Thread Jake Young
Has anyone compiled Ceph (either osd or client) on a Solaris-based OS?

The thread on ZFS support for osd got me thinking about using Solaris as an
osd server. It would have much better ZFS performance, and I wonder if the
osd performance without a journal would be 2x better.

A second thought I had was using the COMSTAR iSCSI/FCoE target software
that is part of Solaris. Has anyone done anything with a Ceph rbd client
for Solaris-based OSs?

Jake
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Do I have enough pgs?

2015-04-15 Thread Mark Nelson

On 04/15/2015 08:10 AM, Tony Harris wrote:

Hi all,

I have a cluster of 3 nodes, 18 OSDs.  I used the pgcalc to give a
suggested number of PGs - here was my list:

Group1   3 rep  18 OSDs  30% data  512PGs
Group2   3 rep  18 OSDs  30% data  512PGs
Group3   3 rep  18 OSDs  30% data  512PGs
Group4   2 rep  18 OSDs  5% data  256PGs
Group5   2 rep  18 OSDs  5% data  256PGs

My estimated growth is to 27-36 OSDs within the next 18 months, after
that probably pretty stagnant for the next several years.


I would use more, but then I tend to err on the high side for small 
clusters.  The tool I mentioned in the other data distribution thread 
shows you the most and least subscribed OSDs in each pool.  You can use 
that to determine whether you think the distribution looks reasonable.


Script is here:

https://github.com/ceph/cbt/blob/master/tools/readpgdump.py

You can run it by doing ceph pg dump | readpgdump.py
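
For example, assuming the script has been downloaded locally and made executable:

  chmod +x readpgdump.py
  ceph pg dump | ./readpgdump.py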

Mark



Thoughts?

-Tony


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Do I have enough pgs?

2015-04-15 Thread Tony Harris
Hi all,

I have a cluster of 3 nodes, 18 OSDs.  I used the pgcalc to give a
suggested number of PGs - here was my list:

Group1   3 rep  18 OSDs  30% data  512PGs
Group2   3 rep  18 OSDs  30% data  512PGs
Group3   3 rep  18 OSDs  30% data  512PGs
Group4   2 rep  18 OSDs  5% data  256PGs
Group5   2 rep  18 OSDs  5% data  256PGs

My estimated growth is to 27-36 OSDs within the next 18 months, after that
probably pretty stagnant for the next several years.

Thoughts?

-Tony
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph on Solaris / Illumos

2015-04-15 Thread Jake Young
On Wednesday, April 15, 2015, Mark Nelson mnel...@redhat.com wrote:



 On 04/15/2015 08:16 AM, Jake Young wrote:

 Has anyone compiled ceph (either osd or client) on a Solaris based OS?

 The thread on ZFS support for osd got me thinking about using solaris as
 an osd server. It would have much better ZFS performance and I wonder if
 the osd performance without a journal would be 2x better.


 Doubt it.  You may be able to do a little better, but you have to pay the
 piper some how.  If you clone from journal you will introduce
 fragmentation.  If you throw the journal away you'll suffer for everything
 but very large writes unless you throw safety away.  I think if we are
 going to generally beat filestore (not just for optimal benchmarking
 tests!) it's going to take some very careful cleverness. Thankfully Sage is
 very clever and is working on it in newstore. Even there, filestore has
 been proving difficult to beat for writes.


That's interesting. I've been under the impression that the ideal
osd config was using a stable and fast BTRFS (which doesn't exist yet) with
no journal.

In my specific case, I don't want to use an external journal. I've gone
down the path of using RAID controllers with write-back cache and BBUs with
each disk in its own RAID0 group, instead of SSD journals. (Thanks for your
performance articles BTW, they were very helpful!)

My take on your results is that IO throughput on XFS with a same-disk
journal and WB cache on the RAID card was basically the same as or better
than BTRFS with no journal.  In addition, BTRFS typically used much more CPU.

Has BTRFS performance gotten any better since you wrote the performance
articles?

Have you compared ZFS (ZoL) performance to BTRFS?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph site is very slow

2015-04-15 Thread Gregory Farnum
People are working on it but I understand there was/is a DoS attack going
on. :/
-Greg
On Wed, Apr 15, 2015 at 1:50 AM Ignazio Cassano ignaziocass...@gmail.com
wrote:

 Many thanks

 2015-04-15 10:44 GMT+02:00 Wido den Hollander w...@42on.com:

 On 04/15/2015 10:20 AM, Ignazio Cassano wrote:
  Hi all,
  why is ceph.com so slow?

 Not known right now. But you can try eu.ceph.com for your packages and
 downloads.

  It is impossible to download files for installing Ceph.
  Regards
  Ignazio
 
 
 
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 


 --
 Wido den Hollander
 42on B.V.
 Ceph trainer and consultant

 Phone: +31 (0)20 700 9902
 Skype: contact42on

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Binding a pool to certain OSDs

2015-04-15 Thread Giuseppe Civitella
So it was a PG problem. I added a couple of OSDs per host, reconfigured the
CRUSH map, and the cluster began to work properly.

Thanks
Giuseppe

2015-04-14 19:02 GMT+02:00 Saverio Proto ziopr...@gmail.com:

 No error message. You just finish the RAM memory and you blow up the
 cluster because of too many PGs.

 Saverio

 2015-04-14 18:52 GMT+02:00 Giuseppe Civitella 
 giuseppe.civite...@gmail.com:
  Hi Saverio,
 
  I first made a test on my test staging lab where I have only 4 OSD.
   On my mon servers (which run other services) I have 16GB RAM, 15GB used
  but
   5GB cached. On the OSD servers I have 3GB RAM, 3GB used but 2GB cached.
  ceph -s tells me nothing about PGs, shouldn't I get an error message
 from
  its output?
 
  Thanks
  Giuseppe
 
  2015-04-14 18:20 GMT+02:00 Saverio Proto ziopr...@gmail.com:
 
  You only have 4 OSDs ?
  How much RAM per server ?
  I think you have already too many PG. Check your RAM usage.
 
  Check on Ceph wiki guidelines to dimension the correct number of PGs.
   Remember that every time you create a new pool you add PGs to the
   system.
 
  Saverio
 
 
  2015-04-14 17:58 GMT+02:00 Giuseppe Civitella
  giuseppe.civite...@gmail.com:
   Hi all,
  
   I've been following this tutorial to realize my setup:
  
  
 http://www.sebastien-han.fr/blog/2014/08/25/ceph-mix-sata-and-ssd-within-the-same-box/
  
   I got this CRUSH map from my test lab:
   http://paste.openstack.org/show/203887/
  
   then I modified the map and uploaded it. This is the final version:
   http://paste.openstack.org/show/203888/
  
   When applied the new CRUSH map, after some rebalancing, I get this
   health
   status:
   [- avalon1 root@controller001 Ceph -] # ceph -s
   cluster af09420b-4032-415e-93fc-6b60e9db064e
health HEALTH_WARN crush map has legacy tunables;
 mon.controller001
   low
   disk space; clock skew detected on mon.controller002
monmap e1: 3 mons at
  
   {controller001=
 10.235.24.127:6789/0,controller002=10.235.24.128:6789/0,controller003=10.235.24.129:6789/0
 },
   election epoch 314, quorum 0,1,2
   controller001,controller002,controller003
osdmap e3092: 4 osds: 4 up, 4 in
 pgmap v785873: 576 pgs, 6 pools, 71548 MB data, 18095 objects
   8842 MB used, 271 GB / 279 GB avail
576 active+clean
  
   and this osd tree:
   [- avalon1 root@controller001 Ceph -] # ceph osd tree
    # id  weight  type name           up/down reweight
   -8  2   root sed
   -5  1   host ceph001-sed
   2   1   osd.2   up  1
   -7  1   host ceph002-sed
   3   1   osd.3   up  1
   -1  2   root default
   -4  1   host ceph001-sata
   0   1   osd.0   up  1
   -6  1   host ceph002-sata
   1   1   osd.1   up  1
  
   which seems not a bad situation. The problem rise when I try to
 create a
   new
   pool, the command ceph osd pool create sed 128 128 gets stuck. It
   never
   ends.  And I noticed that my Cinder installation is not able to create
   volumes anymore.
   I've been looking in the logs for errors and found nothing.
   Any hint about how to proceed to restore my ceph cluster?
   Is there something wrong with the steps I take to update the CRUSH
 map?
   Is
   the problem related to Emperor?
  
   Regards,
   Giuseppe
  
  
  
  
   2015-04-13 18:26 GMT+02:00 Giuseppe Civitella
   giuseppe.civite...@gmail.com:
  
   Hi all,
  
   I've got a Ceph cluster which serves volumes to a Cinder
 installation.
   It
   runs Emperor.
   I'd like to be able to replace some of the disks with OPAL disks and
   create a new pool which uses exclusively the latter kind of disk. I'd
   like
   to have a traditional pool and a secure one coexisting on the
 same
   ceph
   host. I'd then use Cinder multi backend feature to serve them.
   My question is: how is it possible to realize such a setup? How can I
   bind
   a pool to certain OSDs?
  
   Thanks
   Giuseppe
  
  
  
  
 
 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph on Solaris / Illumos

2015-04-15 Thread Jake Young
On Wednesday, April 15, 2015, Alexandre Marangone amara...@redhat.com
wrote:

 The LX branded zones might be a way to run OSDs on Illumos:
 https://wiki.smartos.org/display/DOC/LX+Branded+Zones

 For fun, I tried a month or so ago, managed to have a quorum. OSDs
 wouldn't start, I didn't look further as far as debugging. I'll give
 it a go when I have more time.


Hmm. That is a great idea.

I'll give LX branded zones a shot for both server and client use cases.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph on Solaris / Illumos

2015-04-15 Thread Alexandre Marangone
The LX branded zones might be a way to run OSDs on Illumos:
https://wiki.smartos.org/display/DOC/LX+Branded+Zones

For fun, I tried it a month or so ago and managed to get a quorum. The OSDs
wouldn't start; I didn't look further as far as debugging goes. I'll give
it another go when I have more time.

On Wed, Apr 15, 2015 at 7:04 AM, Mark Nelson mnel...@redhat.com wrote:


 On 04/15/2015 08:16 AM, Jake Young wrote:

 Has anyone compiled ceph (either osd or client) on a Solaris based OS?

 The thread on ZFS support for osd got me thinking about using solaris as
 an osd server. It would have much better ZFS performance and I wonder if
 the osd performance without a journal would be 2x better.


 Doubt it.  You may be able to do a little better, but you have to pay the
 piper some how.  If you clone from journal you will introduce fragmentation.
 If you throw the journal away you'll suffer for everything but very large
 writes unless you throw safety away.  I think if we are going to generally
 beat filestore (not just for optimal benchmarking tests!) it's going to take
 some very careful cleverness. Thankfully Sage is very clever and is working
 on it in newstore. Even there, filestore has been proving difficult to beat
 for writes.


 A second thought I had was using the Comstar iscsi / fcoe target
 software that is part of Solaris. Has anyone done anything with a ceph
 rbd client for Solaris based OSs?


 No idea!


 Jake


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Rados Gateway and keystone

2015-04-15 Thread ghislain.chevalier
Hi,

Despite the creation of EC2 credentials, which provide an access key and a 
secret key for a user, it is still impossible to connect using S3 
(Forbidden/Access denied).
Everything works using Swift (create container, list container, get object, put 
object, delete object).
I use the CloudBerry client to do so.

Does someone know how I can check if the interoperability between keystone and 
the rgw is correctly set up?
In the rgw pools? in the radosgw metadata?

Best regards

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On behalf of 
ghislain.cheval...@orange.com
Sent: Wednesday, April 15, 2015 13:16
To: Erik McCormick
Cc: ceph-users
Subject: Re: [ceph-users] Rados Gateway and keystone

Thanks a lot
That helps.

From: Erik McCormick [mailto:emccorm...@cirrusseven.com]
Sent: Monday, April 13, 2015 18:32
To: CHEVALIER Ghislain IMT/OLPS
Cc: ceph-users
Subject: Re: [ceph-users] Rados Gateway and keystone

I haven't really used the S3 stuff much, but the credentials should be in 
Keystone already. If you're in Horizon, you can download them under Access and 
Security -> API Access. Using the CLI you can use the openstack client, like 
openstack credential list | show | create | delete | set, or the keystone 
client, like keystone ec2-credentials-list, etc.  Then you should be 
able to feed those credentials to the rgw like a normal S3 API call.

Cheers,
Erik

On Mon, Apr 13, 2015 at 10:16 AM, ghislain.cheval...@orange.com wrote:
Hi all,

Coming back to that issue.

I successfully used Keystone users for the Rados Gateway and the Swift API, but 
I still don't understand how it can work with the S3 API, i.e. S3 users 
(AccessKey/SecretKey).

I found the swift3 initiative, but I think it is only usable in a pure OpenStack 
Swift environment, by setting up a specific plug-in.
https://github.com/stackforge/swift3

An rgw can be, at the same time, under Keystone control and standard 
radosgw-admin control if
- for Swift, you use the right authentication service (keystone or internal)
- for S3, you use the internal authentication service
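
For reference, the ceph.conf options involved (per the radosgw/keystone docs; the 
section name and values below are just placeholders) look something like:

  [client.radosgw.gateway]
      rgw keystone url = http://keystone-host:35357
      rgw keystone admin token = {admin-token}
      rgw keystone accepted roles = Member, admin
      rgw s3 auth use keystone = true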

So, my questions are still valid.
How can an rgw work for S3 users if they are stored in Keystone? Which are the 
access key and secret key?
What is the purpose of the rgw s3 auth use keystone parameter?

Best regards

--
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On behalf of 
ghislain.cheval...@orange.com
Sent: Monday, March 23, 2015 14:03
To: ceph-users
Subject: [ceph-users] Rados Gateway and keystone

Hi All,

I just want to be sure about the Keystone configuration for the Rados Gateway.

I read the documentation http://ceph.com/docs/master/radosgw/keystone/ and 
http://ceph.com/docs/master/radosgw/config-ref/?highlight=keystone
but I didn't catch whether, after having configured the rados gateway (ceph.conf) 
to use Keystone, it becomes mandatory to create all the users in it.

In other words, can an rgw be, at the same time, under Keystone control and 
standard radosgw-admin control?
How does it work for S3 users?
What is the purpose of the rgw s3 auth use keystone parameter?

Best regards

- - - - - - - - - - - - - - - - -
Ghislain Chevalier
+33299124432
+33788624370
ghislain.cheval...@orange.com

Re: [ceph-users] Ceph on Solaris / Illumos

2015-04-15 Thread Mark Nelson

On 04/15/2015 10:36 AM, Jake Young wrote:



On Wednesday, April 15, 2015, Mark Nelson mnel...@redhat.com wrote:



On 04/15/2015 08:16 AM, Jake Young wrote:

Has anyone compiled ceph (either osd or client) on a Solaris
based OS?

The thread on ZFS support for osd got me thinking about using
solaris as
an osd server. It would have much better ZFS performance and I
wonder if
the osd performance without a journal would be 2x better.


Doubt it.  You may be able to do a little better, but you have to
pay the piper some how.  If you clone from journal you will
introduce fragmentation.  If you throw the journal away you'll
suffer for everything but very large writes unless you throw safety
away.  I think if we are going to generally beat filestore (not just
for optimal benchmarking tests!) it's going to take some very
careful cleverness. Thankfully Sage is very clever and is working on
it in newstore. Even there, filestore has been proving difficult to
beat for writes.


That's interesting. I've been under the impression that the ideal
osd config was using a stable and fast BTRFS (which doesn't exist
yet) with no journal.


This is sort of unrelated to the journal specifically, but BTRFS with 
RBD will start fragmenting terribly due to how COW works (and how it 
relates to snapshots too).  More related to the journal:  At one point 
we were thinking about cloning from the journal on BTRFS, but that also 
potentially leads to nasty fragmentation even if the initial behavior 
would look very good.  I haven't done any testing that I can remember of 
BTRFS with no journal.  I'm not sure if it even still works...




In my specific case, I don't want to use an external journal. I've gone
down the path of using RAID controllers with write-back cache and BBUs
with each disk in its own RAID0 group, instead of SSD journals. (Thanks
for your performance articles BTW, they were very helpful!)

My take on your results indicates that IO throughput performance on XFS
with same disk journal and WB cache on the RAID card was basically the
same or better than BTRFS with no journal.  In addition, BTRFS typically
used much more CPU.

Has BTRFS performance gotten any better since you wrote the performance
articles?


So the trick with those articles is that the systems are fresh, and most 
of the initial articles were using rados bench which is always writing 
out new objects vs something like RBD where you are (usually) doing 
writes to existing objects that represent the blocks.  If you were to do 
a bunch of random 4k writes and then later try to do sequential reads, 
you'd see BTRFS sequential read performance tank.  We actually did tests 
like that with emperor during the firefly development cycle.  I've 
included the results. Basically the first iteration of the test cycle 
looks great on BTRFS, then you see read performance drop way down. 
Eventually write performance is also likely to drop as the disks become 
extremely fragmented (we may even see a little of that in those tests).
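
For reference, that kind of rados bench run is roughly the following (pool name
and runtime are just examples):

  rados bench -p testbench 60 write --no-cleanup   # always writes brand-new objects
  rados bench -p testbench 60 seq                  # sequential reads of what was just written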




Have you compared ZFS (ZoL) performance to BTRFS?


I did way back in 2013 when we were working with Brian Behlendorf to fix 
xattr bugs in ZOL.  It was quite a bit slower if you didn't enable SA 
xattrs.  With SA xattrs, it was much closer, but not as fast as btrfs or 
xfs.  I didn't do a lot of tuning though and Ceph wasn't making good use 
of ZFS features, so it's very possible things have changed.







Emeror Raw Performance Data.ods
Description: application/vnd.oasis.opendocument.spreadsheet
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph repo - RSYNC?

2015-04-15 Thread Paul Mansfield

Sorry for starting a new thread, I've only just subscribed to the list
and the archive on the mail listserv is far from complete at the moment.

on 8th March David Moreau Simard said
  http://www.spinics.net/lists/ceph-users/msg16334.html
that there was a rsync'able mirror of the ceph repo at
http://ceph.mirror.iweb.ca/


My problem is that the repo doesn't include Hammer. Is there someone who
can get that added to the mirror?

thanks very much
Paul
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph on Debian Jessie stopped working

2015-04-15 Thread Chad William Seys
Hi All,
Earlier, ceph on Debian Jessie was working.  Jessie is running 3.16.7.

Now when I modprobe rbd, no /dev/rbd devices appear.

# dmesg | grep -e rbd -e ceph
[   15.814423] Key type ceph registered
[   15.814461] libceph: loaded (mon/osd proto 15/24)
[   15.831092] rbd: loaded
[   22.084573] rbd: no image name provided
[   22.230176] rbd: no image name provided


Some files appear under /sys
ls /sys/devices/rbd
power  uevent

ceph-fuse /mnt/cephfs just hangs.

I haven't changed the ceph config, but possibly there were package updates.  I 
did install an earlier Jessie kernel from a machine which is still working and 
rebooted.  No luck.

Any ideas of what to check next?

Thanks,
Chad.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph repo - RSYNC?

2015-04-15 Thread Robert LeBlanc
http://eu.ceph.com/ has rsync and Hammer.

On Wed, Apr 15, 2015 at 10:17 AM, Paul Mansfield 
paul.mansfi...@alcatel-lucent.com wrote:


 Sorry for starting a new thread, I've only just subscribed to the list
 and the archive on the mail listserv is far from complete at the moment.

 on 8th March David Moreau Simard said
   http://www.spinics.net/lists/ceph-users/msg16334.html
 that there was a rsync'able mirror of the ceph repo at
 http://ceph.mirror.iweb.ca/


 My problem is that the repo doesn't include Hammer. Is there someone who
 can get that added to the mirror?

 thanks very much
 Paul
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] mds crashing

2015-04-15 Thread Kyle Hutson
I upgraded to 0.94.1 from 0.94 on Monday, and everything had been going
pretty well.

Then, about noon today, we had an mds crash. And then the failover mds
crashed. And this cascaded through all 4 mds servers we have.

If I try to start it ('service ceph start mds' on CentOS 7.1), it appears
to be OK for a little while. ceph -w goes through 'replay' 'reconnect'
'rejoin' 'clientreplay' and 'active' but nearly immediately after getting
to 'active', it crashes again.

I have the mds log at
http://people.beocat.cis.ksu.edu/~kylehutson/ceph-mds.hobbit01.log

Some possibly, but not necessarily, useful background info:
- Yesterday we took our erasure coded pool and increased both pg_num and
pgp_num from 2048 to 4096. We still have several objects misplaced (~17%),
but those seem to be continuing to clean themselves up.
- We are in the midst of a large (300+ TB) rsync from our old (non-ceph)
filesystem to this filesystem.
- Before we realized the mds crashes, we had just changed the size of our
metadata pool from 2 to 4.
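
For reference, those changes amount to commands along these lines (pool names
are placeholders):

  ceph osd pool set <ec-pool> pg_num 4096
  ceph osd pool set <ec-pool> pgp_num 4096
  ceph osd pool set <metadata-pool> size 4
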
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph repo - RSYNC?

2015-04-15 Thread David Moreau Simard
Hey, you're right.

Thanks for bringing that to my attention, it's syncing now :)

Should be available soon.

David Moreau Simard

On 2015-04-15 12:17 PM, Paul Mansfield wrote:
 Sorry for starting a new thread, I've only just subscribed to the list
 and the archive on the mail listserv is far from complete at the moment.

 on 8th March David Moreau Simard said
http://www.spinics.net/lists/ceph-users/msg16334.html
 that there was a rsync'able mirror of the ceph repo at
 http://ceph.mirror.iweb.ca/


 My problem is that the repo doesn't include Hammer. Is there someone who
 can get that added to the mirror?

 thanks very much
 Paul
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Upgrade from Giant 0.87-1 to Hammer 0.94-1

2015-04-15 Thread Steffen W Sørensen
Also, our Calamari web UI won't authenticate anymore. I can't see any issues in 
any log under /var/log/calamari. Any hints on what to look for are appreciated, 
TIA!

# dpkg -l | egrep -i calamari\|ceph
ii  calamari-clients   1.2.3.1-2-gc1f14b2all
  Inktank Calamari user interface
ii  calamari-server1.3-rc-16-g321cd58amd64  
  Inktank package containing the Calamari management srever
ii  ceph   0.94.1-1~bpo70+1  amd64  
  distributed storage and file system
ii  ceph-common0.94.1-1~bpo70+1  amd64  
  common utilities to mount and interact with a ceph storage cluster
ii  ceph-deploy1.5.23~bpo70+1all
  Ceph-deploy is an easy to use configuration tool
ii  ceph-fs-common 0.94.1-1~bpo70+1  amd64  
  common utilities to mount and interact with a ceph file system
ii  ceph-fuse  0.94.1-1~bpo70+1  amd64  
  FUSE-based client for the Ceph distributed file system
ii  ceph-mds   0.94.1-1~bpo70+1  amd64  
  metadata server for the ceph distributed file system
ii  curl   7.29.0-1~bpo70+1.ceph amd64  
  command line tool for transferring data with URL syntax
ii  libcephfs1 0.94.1-1~bpo70+1  amd64  
  Ceph distributed file system client library
ii  libcurl3:amd64 7.29.0-1~bpo70+1.ceph amd64  
  easy-to-use client-side URL transfer library (OpenSSL flavour)
ii  libcurl3-gnutls:amd64  7.29.0-1~bpo70+1.ceph amd64  
  easy-to-use client-side URL transfer library (GnuTLS flavour)
ii  libleveldb1:amd64  1.12.0-1~bpo70+1.ceph amd64  
  fast key-value storage library
ii  python-ceph0.94.1-1~bpo70+1  amd64  
  Meta-package for python libraries for the Ceph libraries
ii  python-cephfs  0.94.1-1~bpo70+1  amd64  
  Python libraries for the Ceph libcephfs library
ii  python-rados   0.94.1-1~bpo70+1  amd64  
  Python libraries for the Ceph librados library
ii  python-rbd 0.94.1-1~bpo70+1  amd64  
  Python libraries for the Ceph librbd library


 On 16/04/2015, at 00.41, Steffen W Sørensen ste...@me.com wrote:
 
 Hi,
 
 Successfully upgrade a small development 4x node Giant 0.87-1 cluster to 
 Hammer 0.94-1, each node with 6x OSD - 146GB, 19 pools, mainly 2 in usage.
 Only minor thing now ceph -s complaining over too may PGs, previously Giant 
 had complain of too few, so various pools were bumped up till health status 
 was okay as before upgrading. Admit, that after bumping PGs up in Giant we 
 had changed pool sizes from 3 to 2  min 1 in fear of perf. when 
 backfilling/recovering PGs.
 
 
 # ceph -s
cluster 16fe2dcf-2629-422f-a649-871deba78bcd
 health HEALTH_WARN
            too many PGs per OSD (1237 > max 300)
 monmap e29: 3 mons at 
 {0=10.0.3.4:6789/0,1=10.0.3.2:6789/0,2=10.0.3.1:6789/0}
election epoch 1370, quorum 0,1,2 2,1,0
 mdsmap e142: 1/1/1 up {0=2=up:active}, 1 up:standby
 osdmap e3483: 24 osds: 24 up, 24 in
  pgmap v3719606: 14848 pgs, 19 pools, 530 GB data, 133 kobjects
1055 GB used, 2103 GB / 3159 GB avail
   14848 active+clean
 
 Can we just reduce PGs again and should we decrement in minor steps one pool 
 at a time…
 
 Any thoughts, TIA!
 
 /Steffen
 
 
 1. restart the monitor daemons on each node
 2. then, restart the osd daemons on each node
 3. then, restart the mds daemons on each node
 4. then, restart the radosgw daemon on each node
 
 Regards.
 
 -- 
 François Lafont

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] many slow requests on different osds (scrubbing disabled)

2015-04-15 Thread Dominik Mostowiec
Hi,
For a few days now we have been noticing many slow requests on our cluster.
Cluster:
ceph version 0.67.11
3 x mon
36 hosts - 10 osd ( 4T ) + 2 SSD (journals)
Scrubbing and deep scrubbing are disabled but the count of slow requests is
still increasing.
Disk utilisation has been very low since we disabled scrubbing.
Logs from one slow write, with debug osd = 20/20:
osd.284 - master: http://pastebin.com/xPtpNU6n
osd.186 - replica: http://pastebin.com/NS1gmhB0
osd.177 - replica: http://pastebin.com/Ln9L2Z5Z
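
For reference, that debug level can be raised at runtime with something like the
following (osd.284 from above is just the example):

  ceph tell osd.284 injectargs '--debug-osd 20/20'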

Can you help me find the reason for this?

-- 
Regards
Dominik
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Upgrade from Giant 0.87-1 to Hammer 0.94-1

2015-04-15 Thread Steffen W Sørensen
Hi,

Successfully upgraded a small development 4-node Giant 0.87-1 cluster to Hammer 
0.94-1, each node with 6x OSD - 146GB, 19 pools, mainly 2 in use.
The only minor thing now is ceph -s complaining about too many PGs; previously 
Giant had complained of too few, so various pools were bumped up until the health 
status was okay, as before upgrading. Admittedly, after bumping PGs up in Giant 
we had changed pool sizes from 3 to 2 (and min 1), for fear of performance 
problems when backfilling/recovering PGs.


# ceph -s
cluster 16fe2dcf-2629-422f-a649-871deba78bcd
 health HEALTH_WARN
             too many PGs per OSD (1237 > max 300)
 monmap e29: 3 mons at 
{0=10.0.3.4:6789/0,1=10.0.3.2:6789/0,2=10.0.3.1:6789/0}
election epoch 1370, quorum 0,1,2 2,1,0
 mdsmap e142: 1/1/1 up {0=2=up:active}, 1 up:standby
 osdmap e3483: 24 osds: 24 up, 24 in
  pgmap v3719606: 14848 pgs, 19 pools, 530 GB data, 133 kobjects
1055 GB used, 2103 GB / 3159 GB avail
   14848 active+clean

Can we just reduce PGs again and should we decrement in minor steps one pool at 
a time…

Any thoughts, TIA!

/Steffen


 1. restart the monitor daemons on each node
 2. then, restart the osd daemons on each node
 3. then, restart the mds daemons on each node
 4. then, restart the radosgw daemon on each node
 
 Regards.
 
 -- 
 François Lafont

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mds crashing

2015-04-15 Thread Kyle Hutson
Thank you, John!

That was exactly the bug we were hitting. My Google-fu didn't lead me to
this one.

On Wed, Apr 15, 2015 at 4:16 PM, John Spray john.sp...@redhat.com wrote:

 On 15/04/2015 20:02, Kyle Hutson wrote:

 I upgraded to 0.94.1 from 0.94 on Monday, and everything had been going
 pretty well.

 Then, about noon today, we had an mds crash. And then the failover mds
 crashed. And this cascaded through all 4 mds servers we have.

 If I try to start it ('service ceph start mds' on CentOS 7.1), it appears
 to be OK for a little while. ceph -w goes through 'replay' 'reconnect'
 'rejoin' 'clientreplay' and 'active' but nearly immediately after getting
 to 'active', it crashes again.

  I have the mds log at http://people.beocat.cis.ksu.edu/~kylehutson/ceph-mds.hobbit01.log

 For the possibly, but not necessarily, useful background info.
 - Yesterday we took our erasure coded pool and increased both pg_num and
 pgp_num from 2048 to 4096. We still have several objects misplaced (~17%),
 but those seem to be continuing to clean themselves up.
 - We are in the midst of a large (300+ TB) rsync from our old (non-ceph)
 filesystem to this filesystem.
 - Before we realized the mds crashes, we had just changed the size of our
 metadata pool from 2 to 4.


 It looks like you're seeing http://tracker.ceph.com/issues/10449, which
 is a situation where the SessionMap object becomes too big for the MDS to
  save.  The cause of it in that case was stuck requests from a misbehaving
 client running a slightly older kernel.

 Assuming you're using the kernel client and having a similar problem, you
 could try to work around this situation by forcibly unmounting the clients
 while the MDS is offline, such that during clientreplay the MDS will remove
 them from the SessionMap after timing out, and then next time it tries to
 save the map it won't be oversized.  If that works, you could then look
 into getting newer kernels on the clients to avoid hitting the issue again
 -- the #10449 ticket has some pointers about which kernel changes were
 relevant.
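
  With the kernel client that typically boils down to something like this on
  each affected client (the mount point is just an example):

    umount -f /mnt/cephfs    # force unmount
    umount -l /mnt/cephfs    # lazy unmount, as a fallback if -f itself hangs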

 Cheers,
 John

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mds crashing

2015-04-15 Thread Adam Tygart
What is significantly smaller? We have 67 requests in the 16,400,000
range and 250 in the 18,900,000 range.

Thanks,

Adam

On Wed, Apr 15, 2015 at 8:38 PM, Yan, Zheng uker...@gmail.com wrote:
 On Thu, Apr 16, 2015 at 9:07 AM, Adam Tygart mo...@ksu.edu wrote:
 We are using 3.18.6-gentoo. Based on that, I was hoping that the
 kernel bug referred to in the bug report would have been fixed.


 The bug was supposed to be fixed, but you hit the bug again. could you
 check if the kernel client has any hang mds request. (check
 /sys/kernel/debug/ceph/*/mdsc on the machine that contain cephfs
 mount. If there is any request whose ID is significant smaller than
 other requests' IDs)

 Regards
 Yan, Zheng

 --
 Adam

 On Wed, Apr 15, 2015 at 8:02 PM, Yan, Zheng uker...@gmail.com wrote:
 On Thu, Apr 16, 2015 at 5:29 AM, Kyle Hutson kylehut...@ksu.edu wrote:
 Thank you, John!

 That was exactly the bug we were hitting. My Google-fu didn't lead me to
 this one.


 here is the bug report http://tracker.ceph.com/issues/10449. It's a
 kernel client bug which causes the session map size increase
 infinitely. which version of linux kernel are using?

 Regards
 Yan, Zheng



 On Wed, Apr 15, 2015 at 4:16 PM, John Spray john.sp...@redhat.com wrote:

 On 15/04/2015 20:02, Kyle Hutson wrote:

 I upgraded to 0.94.1 from 0.94 on Monday, and everything had been going
 pretty well.

 Then, about noon today, we had an mds crash. And then the failover mds
 crashed. And this cascaded through all 4 mds servers we have.

 If I try to start it ('service ceph start mds' on CentOS 7.1), it appears
 to be OK for a little while. ceph -w goes through 'replay' 'reconnect'
 'rejoin' 'clientreplay' and 'active' but nearly immediately after 
 getting to
 'active', it crashes again.

 I have the mds log at
 http://people.beocat.cis.ksu.edu/~kylehutson/ceph-mds.hobbit01.log

 For the possibly, but not necessarily, useful background info.
 - Yesterday we took our erasure coded pool and increased both pg_num and
 pgp_num from 2048 to 4096. We still have several objects misplaced 
 (~17%),
 but those seem to be continuing to clean themselves up.
 - We are in the midst of a large (300+ TB) rsync from our old (non-ceph)
 filesystem to this filesystem.
 - Before we realized the mds crashes, we had just changed the size of our
 metadata pool from 2 to 4.


 It looks like you're seeing http://tracker.ceph.com/issues/10449, which is
 a situation where the SessionMap object becomes too big for the MDS to
 save.The cause of it in that case was stuck requests from a misbehaving
 client running a slightly older kernel.

 Assuming you're using the kernel client and having a similar problem, you
 could try to work around this situation by forcibly unmounting the clients
 while the MDS is offline, such that during clientreplay the MDS will 
 remove
 them from the SessionMap after timing out, and then next time it tries to
 save the map it won't be oversized.  If that works, you could then look 
 into
 getting newer kernels on the clients to avoid hitting the issue again -- 
 the
 #10449 ticket has some pointers about which kernel changes were relevant.

 Cheers,
 John



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mds crashing

2015-04-15 Thread Adam Tygart
We are using 3.18.6-gentoo. Based on that, I was hoping that the
kernel bug referred to in the bug report would have been fixed.

--
Adam

On Wed, Apr 15, 2015 at 8:02 PM, Yan, Zheng uker...@gmail.com wrote:
 On Thu, Apr 16, 2015 at 5:29 AM, Kyle Hutson kylehut...@ksu.edu wrote:
 Thank you, John!

 That was exactly the bug we were hitting. My Google-fu didn't lead me to
 this one.


 here is the bug report http://tracker.ceph.com/issues/10449. It's a
 kernel client bug which causes the session map size increase
 infinitely. which version of linux kernel are using?

 Regards
 Yan, Zheng



 On Wed, Apr 15, 2015 at 4:16 PM, John Spray john.sp...@redhat.com wrote:

 On 15/04/2015 20:02, Kyle Hutson wrote:

 I upgraded to 0.94.1 from 0.94 on Monday, and everything had been going
 pretty well.

 Then, about noon today, we had an mds crash. And then the failover mds
 crashed. And this cascaded through all 4 mds servers we have.

 If I try to start it ('service ceph start mds' on CentOS 7.1), it appears
 to be OK for a little while. ceph -w goes through 'replay' 'reconnect'
 'rejoin' 'clientreplay' and 'active' but nearly immediately after getting 
 to
 'active', it crashes again.

 I have the mds log at
 http://people.beocat.cis.ksu.edu/~kylehutson/ceph-mds.hobbit01.log

 For the possibly, but not necessarily, useful background info.
 - Yesterday we took our erasure coded pool and increased both pg_num and
 pgp_num from 2048 to 4096. We still have several objects misplaced (~17%),
 but those seem to be continuing to clean themselves up.
 - We are in the midst of a large (300+ TB) rsync from our old (non-ceph)
 filesystem to this filesystem.
 - Before we realized the mds crashes, we had just changed the size of our
 metadata pool from 2 to 4.


 It looks like you're seeing http://tracker.ceph.com/issues/10449, which is
 a situation where the SessionMap object becomes too big for the MDS to
 save.The cause of it in that case was stuck requests from a misbehaving
 client running a slightly older kernel.

 Assuming you're using the kernel client and having a similar problem, you
 could try to work around this situation by forcibly unmounting the clients
 while the MDS is offline, such that during clientreplay the MDS will remove
 them from the SessionMap after timing out, and then next time it tries to
 save the map it won't be oversized.  If that works, you could then look into
 getting newer kernels on the clients to avoid hitting the issue again -- the
 #10449 ticket has some pointers about which kernel changes were relevant.

 Cheers,
 John



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mds crashing

2015-04-15 Thread Yan, Zheng
On Thu, Apr 16, 2015 at 9:48 AM, Adam Tygart mo...@ksu.edu wrote:
 What is significantly smaller? We have 67 requests in the 16,400,000
 range and 250 in the 18,900,000 range.


That explains the crash. Could you help me debug this issue?

- send /sys/kernel/debug/ceph/*/mdsc to me

- run 'echo module ceph +p > /sys/kernel/debug/dynamic_debug/control'
on the cephfs mount machine
- restart the mds and wait until it crashes again
- run 'echo module ceph -p > /sys/kernel/debug/dynamic_debug/control'
on the cephfs mount machine
- send the kernel messages of the cephfs mount machine to me (they should be in
/var/log/kern.log or /var/log/messages)

To recover from the crash, you can either force-reset the machine that contains
the cephfs mount, or add mds wipe sessions = 1 to the mds section of ceph.conf.
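
i.e. a ceph.conf stanza like:

  [mds]
      mds wipe sessions = 1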

Regards
Yan, Zheng


 Thanks,

 Adam

 On Wed, Apr 15, 2015 at 8:38 PM, Yan, Zheng uker...@gmail.com wrote:
 On Thu, Apr 16, 2015 at 9:07 AM, Adam Tygart mo...@ksu.edu wrote:
 We are using 3.18.6-gentoo. Based on that, I was hoping that the
 kernel bug referred to in the bug report would have been fixed.


 The bug was supposed to be fixed, but you hit the bug again. could you
 check if the kernel client has any hang mds request. (check
 /sys/kernel/debug/ceph/*/mdsc on the machine that contain cephfs
 mount. If there is any request whose ID is significant smaller than
 other requests' IDs)

 Regards
 Yan, Zheng

 --
 Adam

 On Wed, Apr 15, 2015 at 8:02 PM, Yan, Zheng uker...@gmail.com wrote:
 On Thu, Apr 16, 2015 at 5:29 AM, Kyle Hutson kylehut...@ksu.edu wrote:
 Thank you, John!

 That was exactly the bug we were hitting. My Google-fu didn't lead me to
 this one.


 here is the bug report http://tracker.ceph.com/issues/10449. It's a
 kernel client bug which causes the session map size increase
 infinitely. which version of linux kernel are using?

 Regards
 Yan, Zheng



 On Wed, Apr 15, 2015 at 4:16 PM, John Spray john.sp...@redhat.com wrote:

 On 15/04/2015 20:02, Kyle Hutson wrote:

 I upgraded to 0.94.1 from 0.94 on Monday, and everything had been going
 pretty well.

 Then, about noon today, we had an mds crash. And then the failover mds
 crashed. And this cascaded through all 4 mds servers we have.

 If I try to start it ('service ceph start mds' on CentOS 7.1), it 
 appears
 to be OK for a little while. ceph -w goes through 'replay' 'reconnect'
 'rejoin' 'clientreplay' and 'active' but nearly immediately after 
 getting to
 'active', it crashes again.

 I have the mds log at
 http://people.beocat.cis.ksu.edu/~kylehutson/ceph-mds.hobbit01.log

 For the possibly, but not necessarily, useful background info.
 - Yesterday we took our erasure coded pool and increased both pg_num and
 pgp_num from 2048 to 4096. We still have several objects misplaced 
 (~17%),
 but those seem to be continuing to clean themselves up.
 - We are in the midst of a large (300+ TB) rsync from our old (non-ceph)
 filesystem to this filesystem.
 - Before we realized the mds crashes, we had just changed the size of 
 our
 metadata pool from 2 to 4.


 It looks like you're seeing http://tracker.ceph.com/issues/10449, which 
 is
 a situation where the SessionMap object becomes too big for the MDS to
 save.The cause of it in that case was stuck requests from a misbehaving
 client running a slightly older kernel.

 Assuming you're using the kernel client and having a similar problem, you
 could try to work around this situation by forcibly unmounting the 
 clients
 while the MDS is offline, such that during clientreplay the MDS will 
 remove
 them from the SessionMap after timing out, and then next time it tries to
 save the map it won't be oversized.  If that works, you could then look 
 into
 getting newer kernels on the clients to avoid hitting the issue again -- 
 the
 #10449 ticket has some pointers about which kernel changes were relevant.

 Cheers,
 John



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Upgrade from Giant 0.87-1 to Hammer 0.94-1

2015-04-15 Thread Christian Balzer

Hello,

On Thu, 16 Apr 2015 00:41:29 +0200 Steffen W Sørensen wrote:

 Hi,
 
 Successfully upgrade a small development 4x node Giant 0.87-1 cluster to
 Hammer 0.94-1, each node with 6x OSD - 146GB, 19 pools, mainly 2 in
 usage. Only minor thing now ceph -s complaining over too may PGs,
 previously Giant had complain of too few, so various pools were bumped
 up till health status was okay as before upgrading. Admit, that after
 bumping PGs up in Giant we had changed pool sizes from 3 to 2  min 1 in
 fear of perf. when backfilling/recovering PGs.


That later change would have _increased_ the recommended number of PGs, not
decreased it.

With your cluster 2048 PGs total (all pools combined!) would be the sweet
spot, see:

http://ceph.com/pgcalc/
 
It seems to me that you increased PG counts assuming that the formula is
per pool.
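
As a rough worked example for this cluster (24 OSDs, size 2 pools, targeting
about 100 PGs per OSD):

  (24 OSDs x 100) / 2 replicas = 1200
  rounded up to the next power of two = 2048 PGs total, split across all pools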

 
 # ceph -s
 cluster 16fe2dcf-2629-422f-a649-871deba78bcd
  health HEALTH_WARN
 too many PGs per OSD (1237 > max 300)
  monmap e29: 3 mons at
 {0=10.0.3.4:6789/0,1=10.0.3.2:6789/0,2=10.0.3.1:6789/0} election epoch
 1370, quorum 0,1,2 2,1,0 mdsmap e142: 1/1/1 up {0=2=up:active}, 1
 up:standby osdmap e3483: 24 osds: 24 up, 24 in
   pgmap v3719606: 14848 pgs, 19 pools, 530 GB data, 133 kobjects
 1055 GB used, 2103 GB / 3159 GB avail
14848 active+clean
 

This is an insanely high PG count for this cluster and is certain to
impact performance and resource requirements (all these PGs need to peer
after all).

 Can we just reduce PGs again and should we decrement in minor steps one
 pool at a time…
 
No, as per the documentation you can only increase PGs and PGPs.

So your options are to totally flatten this cluster or, if pools with
important data exist, to copy them to new, correctly sized pools and
delete all the oversized ones after that.

Christian



-- 
Christian BalzerNetwork/Systems Engineer
ch...@gol.com   Global OnLine Japan/Fusion Communications
http://www.gol.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mds crashing

2015-04-15 Thread Yan, Zheng
On Thu, Apr 16, 2015 at 9:07 AM, Adam Tygart mo...@ksu.edu wrote:
 We are using 3.18.6-gentoo. Based on that, I was hoping that the
 kernel bug referred to in the bug report would have been fixed.


The bug was supposed to be fixed, but you hit it again. Could you check
whether the kernel client has any hung mds requests? (Check
/sys/kernel/debug/ceph/*/mdsc on the machine that contains the cephfs
mount, and see if there is any request whose ID is significantly smaller
than the other requests' IDs.)
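
For example, something like this (assuming debugfs is mounted under
/sys/kernel/debug) will print the outstanding requests and their IDs:

  cat /sys/kernel/debug/ceph/*/mdsc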

Regards
Yan, Zheng

 --
 Adam

 On Wed, Apr 15, 2015 at 8:02 PM, Yan, Zheng uker...@gmail.com wrote:
 On Thu, Apr 16, 2015 at 5:29 AM, Kyle Hutson kylehut...@ksu.edu wrote:
 Thank you, John!

 That was exactly the bug we were hitting. My Google-fu didn't lead me to
 this one.


 here is the bug report http://tracker.ceph.com/issues/10449. It's a
 kernel client bug which causes the session map size increase
 infinitely. which version of linux kernel are using?

 Regards
 Yan, Zheng



 On Wed, Apr 15, 2015 at 4:16 PM, John Spray john.sp...@redhat.com wrote:

 On 15/04/2015 20:02, Kyle Hutson wrote:

 I upgraded to 0.94.1 from 0.94 on Monday, and everything had been going
 pretty well.

 Then, about noon today, we had an mds crash. And then the failover mds
 crashed. And this cascaded through all 4 mds servers we have.

 If I try to start it ('service ceph start mds' on CentOS 7.1), it appears
 to be OK for a little while. ceph -w goes through 'replay' 'reconnect'
 'rejoin' 'clientreplay' and 'active' but nearly immediately after getting 
 to
 'active', it crashes again.

 I have the mds log at
 http://people.beocat.cis.ksu.edu/~kylehutson/ceph-mds.hobbit01.log

 For the possibly, but not necessarily, useful background info.
 - Yesterday we took our erasure coded pool and increased both pg_num and
 pgp_num from 2048 to 4096. We still have several objects misplaced (~17%),
 but those seem to be continuing to clean themselves up.
 - We are in the midst of a large (300+ TB) rsync from our old (non-ceph)
 filesystem to this filesystem.
 - Before we realized the mds crashes, we had just changed the size of our
 metadata pool from 2 to 4.


 It looks like you're seeing http://tracker.ceph.com/issues/10449, which is
 a situation where the SessionMap object becomes too big for the MDS to
 save.The cause of it in that case was stuck requests from a misbehaving
 client running a slightly older kernel.

 Assuming you're using the kernel client and having a similar problem, you
 could try to work around this situation by forcibly unmounting the clients
 while the MDS is offline, such that during clientreplay the MDS will remove
 them from the SessionMap after timing out, and then next time it tries to
 save the map it won't be oversized.  If that works, you could then look 
 into
 getting newer kernels on the clients to avoid hitting the issue again -- 
 the
 #10449 ticket has some pointers about which kernel changes were relevant.

 Cheers,
 John



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] live migration fails with image on ceph

2015-04-15 Thread Yuming Ma (yumima)
The issue is reproducible in svl-3 with rbd cache set to false.

On the 5th ping-pong, the instance experienced ping drops and did not
recover for 20+ minutes:

(os-clients)[root@fedora21 nimbus-env]# nova live-migration lmtest1
(os-clients)[root@fedora21 nimbus-env]# nova show lmtest1 |grep -E
'hypervisor_hostname|task_state|vm_state'
| OS-EXT-SRV-ATTR:hypervisor_hostname  | svl-3-cc-nova1-002.cisco.com
 |
| OS-EXT-STS:task_state| migrating
 |
| OS-EXT-STS:vm_state  | active
 |
(os-clients)[root@fedora21 nimbus-env]# nova show lmtest1 |grep -E
'hypervisor_hostname|task_state|vm_state'
| OS-EXT-SRV-ATTR:hypervisor_hostname  | svl-3-cc-nova1-001.cisco.com
 |
| OS-EXT-STS:task_state| -
 |
| OS-EXT-STS:vm_state  | active
 |
(os-clients)[root@fedora21 nimbus-env]# ping -c3 -S60 10.33.143.215
PING 10.33.143.215 (10.33.143.215) 56(84) bytes of data.

--- 10.33.143.215 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2001ms

(os-clients)[root@fedora21 nimbus-env]# ping -c3 -S60 10.33.143.215
PING 10.33.143.215 (10.33.143.215) 56(84) bytes of data.

--- 10.33.143.215 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 1999ms

(os-clients)[root@fedora21 nimbus-env]# ping -c3 -S60 10.33.143.215
PING 10.33.143.215 (10.33.143.215) 56(84) bytes of data.

--- 10.33.143.215 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 1999ms



-- Yuming




On 4/10/15, 4:51 PM, Josh Durgin jdur...@redhat.com wrote:

On 04/08/2015 09:37 PM, Yuming Ma (yumima) wrote:
 Josh,

 I think we are using plain live migration and not mirroring block drives
 as the other test did.

Do you have the migration flags or more from the libvirt log? Also
which versions of qemu is this?

The libvirt log message about qemuMigrationCancelDriveMirror from your
first email is suspicious. Being unable to stop it may mean it was not
running (fine, but libvirt shouldn't have tried to stop it), or it kept
running (bad esp. if it's trying to copy to the same rbd).

 What are the chances or scenario that disk image
 can be corrupted during the live migration for both source and target
 are connected to the same volume and RBD caches is turned on:

Generally rbd caching with live migration is safe. The way to get
corruption is to have drive-mirror try to copy over the rbd on the
destination while the source is still using the disk...

Did you observe fs corruption after a live migration, or just other odd
symptoms? Since a reboot fixed it, it sounds more like memory corruption
to me, unless it was fsck'd during reboot.

Josh

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mds crashing

2015-04-15 Thread Yan, Zheng
On Thu, Apr 16, 2015 at 5:29 AM, Kyle Hutson kylehut...@ksu.edu wrote:
 Thank you, John!

 That was exactly the bug we were hitting. My Google-fu didn't lead me to
 this one.


here is the bug report http://tracker.ceph.com/issues/10449. It's a
kernel client bug which causes the session map size increase
infinitely. which version of linux kernel are using?

Regards
Yan, Zheng



 On Wed, Apr 15, 2015 at 4:16 PM, John Spray john.sp...@redhat.com wrote:

 On 15/04/2015 20:02, Kyle Hutson wrote:

 I upgraded to 0.94.1 from 0.94 on Monday, and everything had been going
 pretty well.

 Then, about noon today, we had an mds crash. And then the failover mds
 crashed. And this cascaded through all 4 mds servers we have.

 If I try to start it ('service ceph start mds' on CentOS 7.1), it appears
 to be OK for a little while. ceph -w goes through 'replay' 'reconnect'
 'rejoin' 'clientreplay' and 'active' but nearly immediately after getting to
 'active', it crashes again.

 I have the mds log at
 http://people.beocat.cis.ksu.edu/~kylehutson/ceph-mds.hobbit01.log

 For the possibly, but not necessarily, useful background info.
 - Yesterday we took our erasure coded pool and increased both pg_num and
 pgp_num from 2048 to 4096. We still have several objects misplaced (~17%),
 but those seem to be continuing to clean themselves up.
 - We are in the midst of a large (300+ TB) rsync from our old (non-ceph)
 filesystem to this filesystem.
 - Before we realized the mds crashes, we had just changed the size of our
 metadata pool from 2 to 4.


 It looks like you're seeing http://tracker.ceph.com/issues/10449, which is
 a situation where the SessionMap object becomes too big for the MDS to
 save.The cause of it in that case was stuck requests from a misbehaving
 client running a slightly older kernel.

 Assuming you're using the kernel client and having a similar problem, you
 could try to work around this situation by forcibly unmounting the clients
 while the MDS is offline, such that during clientreplay the MDS will remove
 them from the SessionMap after timing out, and then next time it tries to
 save the map it won't be oversized.  If that works, you could then look into
 getting newer kernels on the clients to avoid hitting the issue again -- the
 #10449 ticket has some pointers about which kernel changes were relevant.

 Cheers,
 John



 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Managing larger ceph clusters

2015-04-15 Thread Stillwell, Bryan
I'm curious what people managing larger ceph clusters are doing with
configuration management and orchestration to simplify their lives?

We've been using ceph-deploy to manage our ceph clusters so far, but
feel that moving the management of our clusters to standard tools would
provide a little more consistency and help prevent some mistakes that
have happened while using ceph-deploy.

We're looking at using the same tools we use in our OpenStack
environment (puppet/ansible), but I'm interested in hearing from people
using chef/salt/juju as well.

Some of the cluster operation tasks that I can think of along with
ideas/concerns I have are:

Keyring management
  Seems like hiera-eyaml is a natural fit for storing the keyrings.

ceph.conf
  I believe the puppet ceph module can be used to manage this file, but
  I'm wondering if using a template (erb?) might be a better method for
  keeping it organized and properly documented.
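
  For example, the sort of minimal file I'd expect such a template to
  render (all values below are placeholders, not our real settings):

    [global]
    fsid = <cluster-uuid>
    mon initial members = mon01, mon02, mon03
    mon host = 10.0.0.1, 10.0.0.2, 10.0.0.3
    auth cluster required = cephx
    auth service required = cephx
    auth client required = cephx
    osd journal size = 10240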

Pool configuration
  The puppet module seems to be able to handle managing replicas and the
  number of placement groups, but I don't see support for erasure coded
  pools yet.  This is probably something we would want the initial
  configuration to be set up by puppet, but not something we would want
  puppet changing on a production cluster.
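
  For example, the kind of one-off commands I'd expect puppet to run at
  initial setup (pool names, PG counts and the EC profile are placeholders):

    ceph osd pool create volumes 2048 2048 replicated
    ceph osd pool set volumes size 3
    ceph osd erasure-code-profile set ec42 k=4 m=2 ruleset-failure-domain=host
    ceph osd pool create backups-ec 1024 1024 erasure ec42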

CRUSH maps
  Describing the infrastructure in yaml makes sense.  Things like which
  servers are in which rows/racks/chassis.  Also describing the type of
  server (model, number of HDDs, number of SSDs) makes sense.

CRUSH rules
  I could see puppet managing the various rules based on the backend
  storage (HDD, SSD, primary affinity, erasure coding, etc).
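
  For example, assuming separate hdd and ssd roots already exist in the
  CRUSH map (names are placeholders, untested):

    ceph osd crush rule create-simple hdd-rule hdd host
    ceph osd crush rule create-simple ssd-rule ssd host
    ceph osd crush rule create-erasure ec-rule ec42
    ceph osd primary-affinity osd.12 0.5   # needs 'mon osd allow primary affinity = true'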

Replacing a failed HDD disk
  Do you automatically identify the new drive and start using it right
  away?  I've seen people talk about using a combination of udev and
  special GPT partition IDs to automate this.  If you have a cluster
  with thousands of drives I think automating the replacement makes
  sense.  How do you handle the journal partition on the SSD?  Does
  removing the old journal partition and creating a new one create a
  hole in the partition map (because the old partition is removed and
  the new one is created at the end of the drive)?
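
  If I understand the GPT/udev approach, the replacement itself would boil
  down to something like (device names are placeholders):

    ceph-disk prepare /dev/sdX /dev/sdY   # new data disk, journal on the SSD
    # udev matches the ceph GPT partition type GUIDs and runs
    # ceph-disk activate automatically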

Replacing a failed SSD journal
  Has anyone automated recreating the journal drive using Sebastien
  Han's instructions, or do you have to rebuild all the OSDs as well?


http://www.sebastien-han.fr/blog/2014/11/27/ceph-recover-osds-after-ssd-journal-failure/
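
  If I've read that procedure right, per OSD it is roughly as follows
  (osd.12 is a placeholder; the journal symlink in the osd directory must
  already point at the new partition):

    ceph osd set noout
    service ceph stop osd.12
    ceph-osd -i 12 --mkjournal
    service ceph start osd.12
    ceph osd unset noout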

Adding new OSD servers
  How are you adding multiple new OSD servers to the cluster?  I could
  see an ansible playbook which disables nobackfill, noscrub, and
  nodeep-scrub followed by adding all the OSDs to the cluster being
  useful.
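
  Roughly what I picture that play doing (illustrative only):

    ceph osd set nobackfill
    ceph osd set noscrub
    ceph osd set nodeep-scrub
    # ... add and activate the new OSDs here ...
    ceph osd unset nobackfill
    ceph osd unset noscrub
    ceph osd unset nodeep-scrub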

Upgrading releases
  I've found an ansible playbook for doing a rolling upgrade which looks
  like it would work well, but are there other methods people are using?


http://www.sebastien-han.fr/blog/2015/03/30/ceph-rolling-upgrades-with-ansible/
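
  The core of that rolling upgrade, as I understand it, is per host
  (illustrative only; the exact restart command depends on the distro/init):

    ceph osd set noout
    # upgrade the ceph packages on the host, then restart its daemons:
    service ceph restart
    # once every host is done and the cluster is healthy again:
    ceph osd unset noout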

Decommissioning hardware
  Seems like another ansible playbook for reducing the OSDs weights to
  zero, marking the OSDs out, stopping the service, removing the OSD ID,
  removing the CRUSH entry, unmounting the drives, and finally removing
  the server would be the best method here.  Any other ideas on how to
  approach this?
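
  Concretely, the per-OSD steps I have in mind, which the playbook would
  loop over (osd.12 is a placeholder):

    ceph osd crush reweight osd.12 0
    # wait for the data to drain off, then:
    ceph osd out 12
    service ceph stop osd.12
    ceph osd crush remove osd.12
    ceph auth del osd.12
    ceph osd rm 12
    umount /var/lib/ceph/osd/ceph-12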


That's all I can think of right now.  Are there any other tasks that
people have run into that are missing from this list?

Thanks,
Bryan


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mds crashing

2015-04-15 Thread John Spray

On 15/04/2015 20:02, Kyle Hutson wrote:
I upgraded to 0.94.1 from 0.94 on Monday, and everything had been 
going pretty well.


Then, about noon today, we had an mds crash. And then the failover mds 
crashed. And this cascaded through all 4 mds servers we have.


If I try to start it ('service ceph start mds' on CentOS 7.1), it 
appears to be OK for a little while. ceph -w goes through 'replay' 
'reconnect' 'rejoin' 'clientreplay' and 'active' but nearly 
immediately after getting to 'active', it crashes again.


I have the mds log at 
http://people.beocat.cis.ksu.edu/~kylehutson/ceph-mds.hobbit01.log 


For the possibly, but not necessarily, useful background info.
- Yesterday we took our erasure coded pool and increased both pg_num 
and pgp_num from 2048 to 4096. We still have several objects misplaced 
(~17%), but those seem to be continuing to clean themselves up.
- We are in the midst of a large (300+ TB) rsync from our old 
(non-ceph) filesystem to this filesystem.
- Before we realized the mds crashes, we had just changed the size of 
our metadata pool from 2 to 4.


It looks like you're seeing http://tracker.ceph.com/issues/10449, which 
is a situation where the SessionMap object becomes too big for the MDS 
to save.  The cause of it in that case was stuck requests from a 
misbehaving client running a slightly older kernel.


Assuming you're using the kernel client and having a similar problem, 
you could try to work around this situation by forcibly unmounting the 
clients while the MDS is offline, such that during clientreplay the MDS 
will remove them from the SessionMap after timing out, and then next 
time it tries to save the map it won't be oversized.  If that works, you 
could then look into getting newer kernels on the clients to avoid 
hitting the issue again -- the #10449 ticket has some pointers about 
which kernel changes were relevant.
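
For example, on each client (assuming a CephFS kernel mount at /mnt/cephfs;
adjust the path to your setup):

  umount -f /mnt/cephfs    # force the unmount while the MDS is offline
  uname -r                 # then compare the kernel version against #10449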


Cheers,
John
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Is ceph.com down?

2015-04-15 Thread Lindsay Mathieson
Can't open at the moment, neither the website nor apt.

Trying from Brisbane, Australia.
-- 
Lindsay
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v0.80.8 and librbd performance

2015-04-15 Thread Josh Durgin

On 04/14/2015 08:01 PM, shiva rkreddy wrote:

The clusters are in a test environment, so it's a new deployment of 0.80.9.
OS on the cluster nodes is reinstalled as well, so there shouldn't be
any fs aging unless the disks are slowing down.

The perf measurement is done by initiating multiple cinder create/delete
commands and tracking when the volume becomes available or is completely gone
from the cinder list output.

Even running the rbd rm command from the cinder node results in similar
behaviour.

I'll try increasing rbd_concurrent_management in ceph.conf.
  Is the param name rbd_concurrent_management or rbd-concurrent-management?


'rbd concurrent management ops' - spaces, hyphens, and underscores are
equivalent in ceph configuration.

A log with 'debug ms = 1' and 'debug rbd = 20' from 'rbd rm' on both 
versions might give clues about what's going slower.
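
For example (the pool/volume names are placeholders; any of the option
spellings works on the command line as well):

  # ceph.conf on the cinder node:
  [client]
  rbd concurrent management ops = 20

  # one-off delete with debug logging written to a file:
  rbd rm volumes/<volume-uuid> --debug-ms=1 --debug-rbd=20 --log-file=rbd-rm.log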


Josh


On Tue, Apr 14, 2015 at 12:36 PM, Josh Durgin jdur...@redhat.com wrote:

I don't see any commits that would be likely to affect that between
0.80.7 and 0.80.9.

Is this after upgrading an existing cluster?
Could this be due to fs aging beneath your osds?

How are you measuring create/delete performance?

You can try increasing rbd concurrent management ops in ceph.conf on
the cinder node. This affects delete speed, since rbd tries to
delete each object in a volume.

Josh


*From:* shiva rkreddy shiva.rkre...@gmail.com
mailto:shiva.rkre...@gmail.com
*Sent:* Apr 14, 2015 5:53 AM
*To:* Josh Durgin
*Cc:* Ken Dreyer; Sage Weil; Ceph Development; ceph-us...@ceph.com
mailto:ceph-us...@ceph.com
*Subject:* Re: v0.80.8 and librbd performance

Hi Josh,

We are using firefly 0.80.9 and see both cinder create/delete
numbers slow down compared 0.80.7.
I don't see any specific tuning requirements and our cluster is
run pretty much on default configuration.
Do you recommend any tuning or can you please suggest some log
signatures we need to be looking at?

Thanks
shiva

On Wed, Mar 4, 2015 at 1:53 PM, Josh Durgin jdur...@redhat.com
mailto:jdur...@redhat.com wrote:

On 03/03/2015 03:28 PM, Ken Dreyer wrote:

On 03/03/2015 04:19 PM, Sage Weil wrote:

Hi,

This is just a heads up that we've identified a
performance regression in
v0.80.8 from previous firefly releases.  A v0.80.9
is working it's way
through QA and should be out in a few days.  If you
haven't upgraded yet
you may want to wait.

Thanks!
sage


Hi Sage,

I've seen a couple Redmine tickets on this (eg
http://tracker.ceph.com/__issues/9854
http://tracker.ceph.com/issues/9854 ,
http://tracker.ceph.com/__issues/10956
http://tracker.ceph.com/issues/10956). It's not
totally clear to me
which of the 70+ unreleased commits on the firefly
branch fix this
librbd issue.  Is it only the three commits in
https://github.com/ceph/ceph/__pull/3410
https://github.com/ceph/ceph/pull/3410 , or are there
more?


Those are the only ones needed to fix the librbd performance
regression, yes.

Josh

--
To unsubscribe from this list: send the line unsubscribe
ceph-devel in
the body of a message to majord...@vger.kernel.org
mailto:majord...@vger.kernel.org
More majordomo info at
http://vger.kernel.org/__majordomo-info.html
http://vger.kernel.org/majordomo-info.html





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Is ceph.com down?

2015-04-15 Thread Wido den Hollander
On 04/15/2015 09:30 AM, Lindsay Mathieson wrote:
 Can't open at the moment, niever the website or apt.
 

Yes, it's down here as well. You can try eu.ceph.com if you need the
packages.

Or this one: http://ceph.mirror.digitalpacific.com.au/ (working on
au.ceph.com)
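
For example, on Debian/Ubuntu you can temporarily point apt at the mirror
(release and codename below are just an example):

  echo deb http://eu.ceph.com/debian-firefly/ trusty main | \
      sudo tee /etc/apt/sources.list.d/ceph.list
  sudo apt-get update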

 Trying from Brisbane, Australia.
 
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 


-- 
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph site is very slow

2015-04-15 Thread Wido den Hollander
On 04/15/2015 10:20 AM, Ignazio Cassano wrote:
 Hi all,
 why ceph.com is very slow ?

Not known right now. But you can try eu.ceph.com for your packages and
downloads.

 It is impossible download files for installing ceph.
 Regards
 Ignazio
 
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 


-- 
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph site is very slow

2015-04-15 Thread Ignazio Cassano
Many thanks

2015-04-15 10:44 GMT+02:00 Wido den Hollander w...@42on.com:

 On 04/15/2015 10:20 AM, Ignazio Cassano wrote:
  Hi all,
  why ceph.com is very slow ?

 Not known right now. But you can try eu.ceph.com for your packages and
 downloads.

  It is impossible download files for installing ceph.
  Regards
  Ignazio
 
 
 
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 


 --
 Wido den Hollander
 42on B.V.
 Ceph trainer and consultant

 Phone: +31 (0)20 700 9902
 Skype: contact42on
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph site is very slow

2015-04-15 Thread Ignazio Cassano
Hi all,
why is ceph.com so slow?
It is impossible to download files for installing ceph.
Regards
Ignazio
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] how to compute Ceph durability?

2015-04-15 Thread ghislain.chevalier
Thanks Mark
Loic also gave me this link

It would be a good start for sure

Best regards

-----Original Message-----
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mark Nelson
Sent: Tuesday, April 14, 2015 2:11 PM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] how to compute Ceph durability?

Hi Ghislain,

Mark Kampe was working on durability models a couple of years ago, but I'm not 
sure if they ever were completed or if anyone has reviewed them. 
  The source code is available here:

https://github.com/ceph/ceph-tools/tree/master/models/reliability

This was before EC was in Ceph, so I'm guessing new models would need to be 
created for that, but this may at least be a good place to start.

Mark

On 04/14/2015 07:04 AM, ghislain.cheval...@orange.com wrote:
 Hi All,

 Am I alone in having this need?

 *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf
 Of* ghislain.cheval...@orange.com
 *Sent:* Friday, March 20, 2015 11:47 AM *To:* ceph-users *Subject:*
 [ceph-users] how to compute Ceph durability?

 Hi all,

 I would like to compute the durability of data stored in a  ceph 
 environment according to the cluster topology (failure domains) and 
 the data resiliency (replication/erasure coding).

 Does a tool exist ?

 Best regards

 *- - - - - - - - - - - - - - - - -*
 *Ghislain Chevalier ORANGE*
 +33299124432

 +33788624370
 ghislain.cheval...@orange.com
 mailto:ghislain.cheval...@orange-ftgroup.com




 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Rados Gateway and keystone

2015-04-15 Thread ghislain.chevalier
Thanks a lot
That helps.

From: Erik McCormick [mailto:emccorm...@cirrusseven.com]
Sent: Monday, April 13, 2015 6:32 PM
To: CHEVALIER Ghislain IMT/OLPS
Cc: ceph-users
Subject: Re: [ceph-users] Rados Gateway and keystone

I haven't really used the S3 stuff much, but the credentials should be in
keystone already. If you're in horizon, you can download them under Access and
Security -> API Access. On the CLI you can use the openstack client, like
openstack credential list | show | create | delete | set, or the
keystone client, like keystone ec2-credentials-list, etc.  Then you should be
able to feed those credentials to the rgw like a normal S3 API call.
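
For example, with the old keystone CLI (the IDs are placeholders, and I
haven't verified the rgw side of this myself):

  keystone ec2-credentials-create --user-id <user-id> --tenant-id <tenant-id>
  keystone ec2-credentials-list
  # the access/secret pair from that output is what an S3 client would then
  # present to the rgw endpoint, e.g.:
  s3cmd --access_key=<access> --secret_key=<secret> --host=rgw.example.com ls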

Cheers,
Erik

On Mon, Apr 13, 2015 at 10:16 AM, ghislain.cheval...@orange.com wrote:
Hi all,

Coming back to that issue.

I successfully used keystone users for the rados gateway and the swift API, but
I still don't understand how it can work with the S3 API, i.e. S3 users
(AccessKey/SecretKey).

I found the swift3 initiative, but I think it's only usable in a pure OpenStack
Swift environment, by setting up a specific plug-in.
https://github.com/stackforge/swift3

An rgw can be, at the same time, under keystone control and standard
radosgw-admin control if
- for swift, you use the right authentication service (keystone or internal)
- for S3, you use the internal authentication service

So, my questions are still valid.
How can an rgw work for S3 users if they are stored in keystone? What are the
accesskey and secretkey?
What is the purpose of the rgw s3 auth use keystone parameter?

Best regards

--
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of ghislain.cheval...@orange.com
Sent: Monday, March 23, 2015 2:03 PM
To: ceph-users
Subject: [ceph-users] Rados Gateway and keystone

Hi All,

I would just like to be sure about the keystone configuration for the Rados Gateway.

I read the documentation http://ceph.com/docs/master/radosgw/keystone/ and 
http://ceph.com/docs/master/radosgw/config-ref/?highlight=keystone
but I didn't catch whether, after having configured the rados gateway (ceph.conf)
to use keystone, it becomes mandatory to create all the users in it.

In other words, can an rgw be, at the same time, under keystone control and
standard radosgw-admin control?
How does it work for S3 users?
What is the purpose of the rgw s3 auth use keystone parameter?

Best regards

- - - - - - - - - - - - - - - - -
Ghislain Chevalier
+33299124432
+33788624370
ghislain.cheval...@orange.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


_

Ce message et ses