Re: [ceph-users] Physical maintenance

2016-07-17 Thread Kees Meijs
Hi,

Thanks guys, this worked like a charm. Activating the OSDs wasn't
necessary: it seemed udev(7) helped me with that.

Cheers,
Kees

On 13-07-16 14:47, Kees Meijs wrote:
> So to sum up, I'd best:
>
>   * set the noout flag
>   * stop the OSDs one by one
>   * shut down the physical node
>   * yank the OSD drives to prevent ceph-disk(8) from automatically
>     activating them at boot time
>   * do my maintenance
>   * start the physical node
>   * reseat and activate the OSD drives one by one
>   * unset the noout flag
>
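
In command form, the checklist above comes down to roughly the following (a sketch
only; the OSD IDs, the device name and the init system are illustrative and depend
on your release):

    sudo ceph osd set noout
    sudo systemctl stop ceph-osd@0     # repeat for each OSD on the node (IDs are examples)
    sudo systemctl stop ceph-osd@1
    sudo shutdown -h now
    # ...pull the drives, do the maintenance, power the node back on, reseat the drives...
    sudo ceph-disk activate /dev/sdb1  # usually unnecessary; as noted above, udev handles it (example device)
    sudo ceph osd unset noout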

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] [RGW] how to choose the best placement groups?

2016-07-17 Thread Khang Nguyễn Nhật
Hi all,
I have a cluster consisting of 3 monitors, 1 RGW and 1 host with 24 OSDs (2 TB/OSD),
and the following pools:
ap-southeast.rgw.data.root
ap-southeast.rgw.control
ap-southeast.rgw.gc
ap-southeast.rgw.log
ap-southeast.rgw.intent-log
ap-southeast.rgw.usage
ap-southeast.rgw.users.keys
ap-southeast.rgw.users.email
ap-southeast.rgw.users.swift
ap-southeast.rgw.users.uid
ap-southeast.rgw.buckets.index
ap-southeast.rgw.buckets.data
ap-southeast.rgw.buckets.non-ec
ap-southeast.rgw.meta
In which "ap-southeast.rgw.buckets.data" is a erasure pool(k=20, m=4) and
all of the remaining pool are replicated(size=3). I've used (100*OSDs)/size
 to calculate the number of PGs, e.g. 100*24/3 = 800(nearest power of 2:
1024) for replicated pools and 100*24/24=100(nearest power of 2: 128) for
erasure pool. I'm not sure this is the best placement group number, someone
can give me some advice ?
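
For reference, that rule of thumb in script form (a sketch only, assuming the usual
"round up to the next power of two" convention; not an official sizing tool):

    #!/bin/bash
    # (100 * OSDs) / pool size, rounded up to the next power of two
    osds=24

    pg_count() {
        local size=$1
        local target=$(( 100 * osds / size ))
        local pg=1
        while (( pg < target )); do
            pg=$(( pg * 2 ))
        done
        echo "$pg"
    }

    echo "replicated pools (size=3):  $(pg_count 3)"    # -> 1024
    echo "erasure pool (k=20, m=4):   $(pg_count 24)"   # -> 128
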
Thanks!
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs-journal-tool lead to data missing and show up

2016-07-17 Thread Yan, Zheng
On Thu, Jul 14, 2016 at 4:42 PM, txm  wrote:
> I am a user of CephFS.
>
> Recently I ran into a problem while using cephfs-journal-tool.
>
> Some strange things happened, described below.
>
> 1. After using cephfs-journal-tool and cephfs-table-tool (I ran into the
> "negative object nums" issue, so I tried these tools to repair the CephFS), I
> remounted the CephFS.
> 2. Then I found that some old data (a directory and a file under it) was missing.
> 3. But after I created a new file at the root of the CephFS, the missing directory
> showed up. Then I deleted the newly created file, and the "missing directory"
> disappeared again soon after.
> 4. So this is the problem: when I create something under the root of the CephFS,
> the missing directory shows up, and when I delete it, the "missing directory"
> disappears.
>
> Here are my questions:
> 1. Is this damage caused by cephfs-journal-tool?
> 2. If so, how did this damage come about from using cephfs-journal-tool, and what
> should I do next to fix it?
>
> Apparently I didn't lose my data, but this is strange all the same.
>

This is likely caused by an incorrect inode dirstat. The scrub
functionality of the development-version MDS can fix it.
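
On a new enough MDS that is roughly (a hedged sketch; the exact syntax depends on
the release, so check the documentation for your version):

    # forward scrub from the root, repairing recursive stats where supported
    sudo ceph daemon mds.<name> scrub_path / recursive repair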

Yan, Zheng

>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph noob - getting error when I try to "ceph-deploy osd activate" on a node

2016-07-17 Thread Will Dennis

> On Jul 17, 2016, at 4:21 PM, Ruben Kerkhof  wrote:
> 
> Yes, that's it. You should see OSD processes running, and the OSDs
> should be marked 'up' when you run 'ceph osd tree'.

Looks like I’m good then:
 
[wdennis@ceph2 ~]$ sudo ceph osd tree
ID WEIGHT  TYPE NAME  UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.04408 root default
-2 0.01469 host ceph2
 0 0.00490 osd.0   up  1.0  1.0
 1 0.00490 osd.1   up  1.0  1.0
 2 0.00490 osd.2   up  1.0  1.0
-3 0.01469 host ceph3
 3 0.00490 osd.3   up  1.0  1.0
 4 0.00490 osd.4   up  1.0  1.0
 5 0.00490 osd.5   up  1.0  1.0
-4 0.01469 host ceph4
 6 0.00490 osd.6   up  1.0  1.0
 7 0.00490 osd.7   up  1.0  1.0
 8 0.00490 osd.8   up  1.0  1.0


> 
> Just thought of a fourth issue: please make sure your disks are
> absolutely empty!
> I reused disks that I had previously used for ZFS, and ZFS leaves metadata
> behind at the end of the disk.
> This confuses blkid greatly (and me too).
> ceph-disk prepare --zap is not enough to resolve this.
> 
> I've stuck the following in my kickstart file which I use to prepare
> my OSD servers.
> 
> %pre
> #!/bin/bash
> for disk in $(ls -1 /dev/sd* | awk '/[a-z]$/ {print}'); do
>     test -b "$disk" || continue
>     size_in_bytes=$(blockdev --getsize64 ${disk})
>     offset=$((size_in_bytes - 8 * 1024 * 1024))
>
>     echo "Wiping ${disk}"
>     # wipe start
>     dd if=/dev/zero of=${disk} bs=1M count=8 status=none
>     # wipe end
>     dd if=/dev/zero of=${disk} bs=1M count=8 seek=${offset} oflag=seek_bytes status=none
> done
> %end

Again, good to know - thanks! The prior use was just the previous Ceph install, 
no other fs use…



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Lessons learned upgrading Hammer -> Jewel

2016-07-17 Thread Adrian Saul

I have SELinux disabled, and the RPM post-upgrade scripts still run restorecon on
/var/lib/ceph regardless.

In my case I chose to kill the restorecon processes to save outage time; it
didn't affect completion of the package upgrade.


From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mykola 
Dvornik
Sent: Friday, 15 July 2016 6:54 PM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Lessons learned upgrading Hammer -> Jewel

I would also advise people to mind SELinux if it is enabled on the OSD
nodes. The re-labeling should be done as part of the upgrade, and this is a rather
time-consuming process.
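
A dry run can give a rough feel for how much there is to relabel before committing
to an outage window (a sketch; adjust the path as needed):

    # count what restorecon would touch, without changing anything
    sudo restorecon -R -n -v /var/lib/ceph | wc -l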


-Original Message-
From: Mart van Santen <m...@greenhost.nl>
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Lessons learned upgrading Hammer -> Jewel
Date: Fri, 15 Jul 2016 10:48:40 +0200


Hi Wido,

Thank you; we are currently in the same process, so this information is very
useful. Can you share why you upgraded from Hammer directly to Jewel? Is there a
reason to skip Infernalis? A hammer->infernalis->jewel upgrade seemed like the
logical path to me.

(We did indeed see the same "Failed to encode map eXXX with expected crc" errors
when upgrading to the latest Hammer.)


Regards,

Mart






On 07/15/2016 03:08 AM, 席智勇 wrote:
Good job, thank you for sharing, Wido.
It's very useful.

2016-07-14 14:33 GMT+08:00 Wido den Hollander <w...@42on.com>:

To add, the RGWs upgraded just fine as well.

No regions in use here (yet!), so that upgraded as it should.

Wido

> On 13 July 2016 at 16:56, Wido den Hollander <w...@42on.com> wrote:
>
>
> Hello,
>
> Over the last 3 days I worked at a customer with an 1800-OSD cluster which had
> to be upgraded from Hammer 0.94.5 to Jewel 10.2.2.
>
> The cluster in this case is 99% RGW, but also some RBD.
>
> I wanted to share some of the things we encountered during this upgrade.
>
> All 180 nodes are running CentOS 7.1 on an IPv6-only network.
>
> ** Hammer Upgrade **
> At first we upgraded from 0.94.5 to 0.94.7; this went well except for the
> fact that the monitors got spammed with this kind of message:
>
>   "Failed to encode map eXXX with expected crc"
>
> Some searching on the list brought me to:
>
>   ceph tell osd.* injectargs -- --clog_to_monitors=false
>
>  This reduced the load on the 5 monitors and made recovery succeed smoothly.
>
>  ** Monitors to Jewel **
>  The next step was to upgrade the monitors from Hammer to Jewel.
>
>  Using Salt we upgraded the packages and afterwards it was simple:
>
>killall ceph-mon
>chown -R ceph:ceph /var/lib/ceph
>chown -R ceph:ceph /var/log/ceph
>
> Now, a systemd quirk: 'systemctl start ceph.target' does not work, so I had to
> manually enable the monitor and start it:
>
>   systemctl enable ceph-mon@srv-zmb04-05.service
>   systemctl start ceph-mon@srv-zmb04-05.service
>
> Afterwards the monitors were running just fine.
>
> ** OSDs to Jewel **
> To upgrade the OSDs to Jewel we initially used Salt to update the packages on
> all systems to 10.2.2; we then used a shell script which we ran on one node
> at a time.
>
> The failure domain here is 'rack', so we executed this in one rack, then the 
> next one, etc, etc.
>
> Script can be found on Github: 
> https://gist.github.com/wido/06eac901bd42f01ca2f4f1a1d76c49a6
>
> Be aware that the chown can take a long, long, very long time!
>
> We ran into an issue where some OSDs crashed after starting, but after trying
> again they would start.
>
>   "void FileStore::init_temp_collections()"
>
> I reported this in the tracker as I'm not sure what is happening here: 
> http://tracker.ceph.com/issues/16672
>
> ** New OSDs with Jewel **
> We also had some new nodes which we wanted to add to the Jewel cluster.
>
> Using Salt and ceph-disk we ran into a partprobe issue. There was already a
> pull request with the fix, but it was not included in Jewel 10.2.2.
>
> We manually applied the PR and it fixed our issues: 
> https://github.com/ceph/ceph/pull/9330
>
> Hope this helps other people with their upgrades to Jewel!
>
> Wido
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph noob - getting error when I try to "ceph-deploy osd activate" on a node

2016-07-17 Thread Ruben Kerkhof
On Sun, Jul 17, 2016 at 10:01 PM, Will Dennis  wrote:
>
> On Jul 17, 2016, at 7:05 AM, Ruben Kerkhof  wrote:
>
> First, there's an issue with the version of parted in CentOS 7.2:
> https://bugzilla.redhat.com/1339705
>
>
> Saw this sort of thing:
>
> [ceph2][WARNIN] update_partition: Calling partprobe on created device
> /dev/sde
> [ceph2][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle
> --timeout=600
> [ceph2][WARNIN] command: Running command: /sbin/partprobe /dev/sde
> [ceph2][WARNIN] update_partition: partprobe /dev/sde failed : Error: Error
> informing the kernel about modifications to partition /dev/sde1 -- Device or
> resource busy.  This means Linux won't know about any changes you made to
> /dev/sde1 until you reboot -- so you shouldn't mount it or use it in any way
> before rebooting.
> [ceph2][WARNIN] Error: Failed to add partition 1 (Device or resource busy)
> [ceph2][WARNIN]  (ignored, waiting 60s)
> [ceph2][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle
> --timeout=600
> [ceph2][WARNIN] command: Running command: /sbin/partprobe /dev/sde
> [ceph2][WARNIN] update_partition: partprobe /dev/sde failed : Error: Error
> informing the kernel about modifications to partition /dev/sde1 -- Device or
> resource busy.  This means Linux won't know about any changes you made to
> /dev/sde1 until you reboot -- so you shouldn't mount it or use it in any way
> before rebooting.
> [ceph2][WARNIN] Error: Failed to add partition 1 (Device or resource busy)
> [ceph2][WARNIN]  (ignored, waiting 60s)
> [ceph2][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle
> --timeout=600
> [ceph2][WARNIN] command: Running command: /sbin/partprobe /dev/sde
> [ceph2][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle
> --timeout=600
> [ceph2][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde uuid path is
> /sys/dev/block/8:64/dm/uuid
> [ceph2][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde uuid path is
> /sys/dev/block/8:64/dm/uuid
> [ceph2][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde1 uuid path is
> /sys/dev/block/8:65/dm/uuid
> [ceph2][WARNIN] populate_data_path_device: Creating xfs fs on /dev/sde1
>
> Is this because of the aforementioned bug? It seemed to succeed after a few
> retries in each case of it happening.

It is, yes. Most of the time it succeeds after retrying, but I've seen
it fail too.

>
> Secondly, the disks are now activated by udev. Instead of using
> activate, use prepare
> and udev handles the rest.
>
>
> I saw this sort of thing after each disk prepare:
>
> [ceph2][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdc uuid path is
> /sys/dev/block/8:32/dm/uuid
> [ceph2][WARNIN] command_check_call: Running command: /sbin/sgdisk
> --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d -- /dev/sdc
> [ceph2][DEBUG ] Warning: The kernel is still using the old partition table.
> [ceph2][DEBUG ] The new table will be used at the next reboot.
> [ceph2][DEBUG ] The operation has completed successfully.
> [ceph2][WARNIN] update_partition: Calling partprobe on prepared device
> /dev/sdc
> [ceph2][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle
> --timeout=600
> [ceph2][WARNIN] command: Running command: /sbin/partprobe /dev/sdc
> [ceph2][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle
> --timeout=600
> [ceph2][WARNIN] command_check_call: Running command: /usr/bin/udevadm
> trigger --action=add --sysname-match sdc1
> [ceph2][INFO  ] checking OSD status...
> [ceph2][DEBUG ] find the location of an executable
> [ceph2][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat
> --format=json
> [ceph_deploy.osd][DEBUG ] Host ceph2 is now ready for osd use.
>
> Is the ‘udevadm’ stuff I see there what you are talking about? How may I
> verify that the disks are activated & ready for use?

Yes, that's it. You should see OSD processes running, and the OSDs
should be marked 'up' when you run 'ceph osd tree'.
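
A couple of quick checks along those lines (illustrative):

    pgrep -a ceph-osd       # one ceph-osd process per OSD should be running
    sudo ceph osd tree      # all OSDs should show as "up"
    sudo ceph -s            # overall cluster health
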
>
>
> Third, this doesn't work well if you're also using LVM on your host
> since for some reason
> this causes udev to not send the necessary add/change events.
>
>
> Not using LVM on these hosts, but good to know.

Just thought of a fourth issue: please make sure your disks are
absolutely empty!
I reused disks that I had previously used for ZFS, and ZFS leaves metadata
behind at the end of the disk.
This confuses blkid greatly (and me too).
ceph-disk prepare --zap is not enough to resolve this.

I've stuck the following in my kickstart file which I use to prepare
my OSD servers.

%pre
#!/bin/bash
for disk in $(ls -1 /dev/sd* | awk '/[a-z]$/ {print}'); do
    test -b "$disk" || continue
    size_in_bytes=$(blockdev --getsize64 ${disk})
    offset=$((size_in_bytes - 8 * 1024 * 1024))

    echo "Wiping ${disk}"
    # wipe start
    dd if=/dev/zero of=${disk} bs=1M count=8 status=none
    # wipe end
    dd if=/dev/zero of=${disk} bs=1M count=8 seek=${offset} oflag=seek_bytes status=none
done
%end

Kind regards,

Ruben

Re: [ceph-users] Ceph noob - getting error when I try to "ceph-deploy osd activate" on a node

2016-07-17 Thread Will Dennis

> On Jul 17, 2016, at 7:05 AM, Ruben Kerkhof  wrote:
> 
> First, there's an issue with the version of parted in CentOS 7.2:
> https://bugzilla.redhat.com/1339705 

Saw this sort of thing:

[ceph2][WARNIN] update_partition: Calling partprobe on created device /dev/sde
[ceph2][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle 
--timeout=600
[ceph2][WARNIN] command: Running command: /sbin/partprobe /dev/sde
[ceph2][WARNIN] update_partition: partprobe /dev/sde failed : Error: Error 
informing the kernel about modifications to partition /dev/sde1 -- Device or 
resource busy.  This means Linux won't know about any changes you made to 
/dev/sde1 until you reboot -- so you shouldn't mount it or use it in any way 
before rebooting.
[ceph2][WARNIN] Error: Failed to add partition 1 (Device or resource busy)
[ceph2][WARNIN]  (ignored, waiting 60s)
[ceph2][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle 
--timeout=600
[ceph2][WARNIN] command: Running command: /sbin/partprobe /dev/sde
[ceph2][WARNIN] update_partition: partprobe /dev/sde failed : Error: Error 
informing the kernel about modifications to partition /dev/sde1 -- Device or 
resource busy.  This means Linux won't know about any changes you made to 
/dev/sde1 until you reboot -- so you shouldn't mount it or use it in any way 
before rebooting.
[ceph2][WARNIN] Error: Failed to add partition 1 (Device or resource busy)
[ceph2][WARNIN]  (ignored, waiting 60s)
[ceph2][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle 
--timeout=600
[ceph2][WARNIN] command: Running command: /sbin/partprobe /dev/sde
[ceph2][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle 
--timeout=600
[ceph2][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde uuid path is 
/sys/dev/block/8:64/dm/uuid
[ceph2][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde uuid path is 
/sys/dev/block/8:64/dm/uuid
[ceph2][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde1 uuid path is 
/sys/dev/block/8:65/dm/uuid
[ceph2][WARNIN] populate_data_path_device: Creating xfs fs on /dev/sde1

Is this because of the aforementioned bug? It seemed to succeed after a few 
retries in each case of it happening.

> Secondly, the disks are now activated by udev. Instead of using
> activate, use prepare
> and udev handles the rest.

I saw this sort of thing after each disk prepare:

[ceph2][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdc uuid path is 
/sys/dev/block/8:32/dm/uuid
[ceph2][WARNIN] command_check_call: Running command: /sbin/sgdisk 
--typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d -- /dev/sdc
[ceph2][DEBUG ] Warning: The kernel is still using the old partition table.
[ceph2][DEBUG ] The new table will be used at the next reboot.
[ceph2][DEBUG ] The operation has completed successfully.
[ceph2][WARNIN] update_partition: Calling partprobe on prepared device /dev/sdc
[ceph2][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle 
--timeout=600
[ceph2][WARNIN] command: Running command: /sbin/partprobe /dev/sdc
[ceph2][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle 
--timeout=600
[ceph2][WARNIN] command_check_call: Running command: /usr/bin/udevadm trigger 
--action=add --sysname-match sdc1
[ceph2][INFO  ] checking OSD status...
[ceph2][DEBUG ] find the location of an executable
[ceph2][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat 
--format=json
[ceph_deploy.osd][DEBUG ] Host ceph2 is now ready for osd use.

Is the ‘udevadm’ stuff I see there what you are talking about? How may I verify 
that the disks are activated & ready for use?

> 
> Third, this doesn't work well if you're also using LVM on your host
> since for some reason
> this causes udev to not send the necessary add/change events.

Not using LVM on these hosts, but good to know.

> 
> Hope this helps,
> 
> Ruben

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph noob - getting error when I try to "ceph-deploy osd activate" on a node

2016-07-17 Thread Ruben Kerkhof
On Sun, Jul 17, 2016 at 5:21 AM, Will Dennis  wrote:
> OK, so nuked everything Ceph-related on my nodes, and started over. Now
> running Jewel (ceph version 10.2.2). Everything went fine until "ceph-deploy
> osd activate” again; now I’m seeing the following --
>
> [ceph_deploy.osd][DEBUG ] activating host ceph2 disk /dev/sdc
> [ceph_deploy.osd][DEBUG ] will use init type: systemd
> [ceph2][DEBUG ] find the location of an executable
> [ceph2][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v activate
> --mark-init systemd --mount /dev/sdc
> [ceph2][WARNIN] main_activate: path = /dev/sdc
> [ceph2][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdc uuid path is
> /sys/dev/block/8:32/dm/uuid
> [ceph2][WARNIN] command: Running command: /sbin/blkid -p -s TYPE -o value --
> /dev/sdc
> [ceph2][WARNIN] Traceback (most recent call last):
> [ceph2][WARNIN]   File "/usr/sbin/ceph-disk", line 9, in 
> [ceph2][WARNIN] load_entry_point('ceph-disk==1.0.0', 'console_scripts',
> 'ceph-disk')()
> [ceph2][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py",
> line 4994, in run
> [ceph2][WARNIN] main(sys.argv[1:])
> [ceph2][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py",
> line 4945, in main
> [ceph2][WARNIN] args.func(args)
> [ceph2][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py",
> line 3299, in main_activate
> [ceph2][WARNIN] reactivate=args.reactivate,
> [ceph2][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py",
> line 3009, in mount_activate
> [ceph2][WARNIN] e,
> [ceph2][WARNIN] ceph_disk.main.FilesystemTypeError: Cannot discover
> filesystem type: device /dev/sdc: Line is truncated:
> [ceph2][ERROR ] RuntimeError: command returned non-zero exit status: 1
> [ceph_deploy][ERROR ] RuntimeError: Failed to execute command:
> /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /dev/sdc
>
> Still the same sort of error… no output from the blkid command used. Again,
> if I use either “-s PTTYPE” or get rid of the “-s TYPE” altogether, I get a
> value returned (“gpt”)…

I've run into a few issues as well activating OSDs on CentOS 7.

First, there's an issue with the version of parted in CentOS 7.2:
https://bugzilla.redhat.com/1339705

Secondly, the disks are now activated by udev. Instead of using
activate, use prepare
and udev handles the rest.
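
In other words, something like this should be all that's needed (a sketch reusing
the host and disk names from the log above; adjust to taste):

    ceph-deploy osd prepare ceph2:/dev/sdc    # udev activates the OSD once the partitions appear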

Third, this doesn't work well if you're also using LVM on your host
since for some reason
this causes udev to not send the necessary add/change events.

Hope this helps,

Ruben
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com