Re: [ceph-users] Panic in kernel CephFS client after kernel update
Thanks! I’ll remove my patch from my local build of the 4.19 kernel and upgrade to 4.19.77. Appreciate the quick fix.

Thanks,

--
Kenneth Van Alstyne
Systems Architect
M: 228.547.8045
15052 Conference Center Dr, Chantilly, VA 20151
perspecta

On Oct 5, 2019, at 7:29 AM, Ilya Dryomov <idryo...@gmail.com> wrote:

On Tue, Oct 1, 2019 at 9:12 PM Jeff Layton <jlay...@kernel.org> wrote:

On Tue, 2019-10-01 at 15:04 -0400, Sasha Levin wrote:

On Tue, Oct 01, 2019 at 01:54:45PM -0400, Jeff Layton wrote:

On Tue, 2019-10-01 at 19:03 +0200, Ilya Dryomov wrote:

On Tue, Oct 1, 2019 at 6:41 PM Kenneth Van Alstyne <kvanalst...@knightpoint.com> wrote:

All:
    I’m not sure whether this should go to LKML or here, but I’ll start here. After upgrading from Linux kernel 4.19.60 to 4.19.75 (or 76), I started running into kernel panics in the “ceph” module. Based on the call trace, I believe I was able to narrow it down to the following commit in the Linux kernel 4.19 source tree:

commit 81281039a673d30f9d04d38659030a28051a
Author: Yan, Zheng <z...@redhat.com>
Date:   Sun Jun 2 09:45:38 2019 +0800

    ceph: use ceph_evict_inode to cleanup inode's resource

    [ Upstream commit 87bc5b895d94a0f40fe170d4cf5771c8e8f85d15 ]

    remove_session_caps() relies on __wait_on_freeing_inode() to wait for a freeing inode to remove its caps. But the VFS wakes freeing-inode waiters before calling destroy_inode().

    Cc: sta...@vger.kernel.org
    Link: https://tracker.ceph.com/issues/40102
    Signed-off-by: "Yan, Zheng" <z...@redhat.com>
    Reviewed-by: Jeff Layton <jlay...@redhat.com>
    Signed-off-by: Ilya Dryomov <idryo...@gmail.com>
    Signed-off-by: Sasha Levin <sas...@kernel.org>

Backing this patch out and recompiling my kernel has since resolved my issues (as far as I can tell thus far). The issue was fairly easy to trigger by simply creating and deleting files; I tested using ‘dd’ and was pretty consistently able to reproduce it. Since the issue occurred in a VM, I do have a screenshot of the crashed machine, and to avoid attaching an image, I’ll link to where the images are: http://kvanals.kvanals.org/.ceph_kernel_panic_images/

Am I way off base, or has anyone else run into this issue?

Hi Kenneth,

This might be a botched backport. The first version of this patch had a conflict with Al's change that introduced ceph_free_inode(), and Zheng had to adjust it for that. However, it looks like it was taken into 4.19 verbatim, even though 4.19 does not have ceph_free_inode(). Zheng, Jeff, please take a look ASAP. (Sorry for the resend -- I got Sasha's old address.)

Thanks Ilya, I think you're right -- this patch should not have been merged on any pre-5.2 kernels. We should go ahead and revert this for now and do a one-off backport for v4.19. Sasha, what do we need to do to make that happen?

I think the easiest would be to just revert the broken one and apply a clean backport, which you'll send me?

Thanks, Sasha. You can revert the old patch as soon as you're ready. It'll take me a bit to put together and test a proper backport, but I'll try to have something ready within the next day or so.

Kenneth, this is now fixed in 4.19.77. Thanks for the report!

Ilya

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
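Kenneth mentions the panic was easy to trigger by creating and deleting files with ‘dd’, but his exact commands are not in the thread. The loop below is a hypothetical sketch of a reproducer of that shape; the target directory is an assumption and should point at a kernel-mounted CephFS path (it defaults to a temp directory here only so the script runs anywhere).

```shell
#!/bin/sh
# Hypothetical reproducer sketch: hammer inode creation/eviction, the
# code path the bad backport broke. Point TARGET_DIR at a kernel-mounted
# CephFS path; the default below is just a safe placeholder.
TARGET_DIR="${1:-${TMPDIR:-/tmp}/cephfs-panic-test}"
mkdir -p "$TARGET_DIR"
i=0
while [ "$i" -lt 50 ]; do
    # Write a small file, then immediately delete it to force inode eviction.
    dd if=/dev/zero of="$TARGET_DIR/f$i" bs=1M count=1 2>/dev/null
    rm -f "$TARGET_DIR/f$i"
    i=$((i + 1))
done
echo "completed $i create/delete cycles"
```

On an affected 4.19.75/76 kernel this kind of loop against CephFS was apparently enough to panic the client; on 4.19.77 it should run cleanly.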
[ceph-users] Panic in kernel CephFS client after kernel update
All:
    I’m not sure whether this should go to LKML or here, but I’ll start here. After upgrading from Linux kernel 4.19.60 to 4.19.75 (or 76), I started running into kernel panics in the “ceph” module. Based on the call trace, I believe I was able to narrow it down to the following commit in the Linux kernel 4.19 source tree:

commit 81281039a673d30f9d04d38659030a28051a
Author: Yan, Zheng <z...@redhat.com>
Date:   Sun Jun 2 09:45:38 2019 +0800

    ceph: use ceph_evict_inode to cleanup inode's resource

    [ Upstream commit 87bc5b895d94a0f40fe170d4cf5771c8e8f85d15 ]

    remove_session_caps() relies on __wait_on_freeing_inode() to wait for a freeing inode to remove its caps. But the VFS wakes freeing-inode waiters before calling destroy_inode().

    Cc: sta...@vger.kernel.org
    Link: https://tracker.ceph.com/issues/40102
    Signed-off-by: "Yan, Zheng" <z...@redhat.com>
    Reviewed-by: Jeff Layton <jlay...@redhat.com>
    Signed-off-by: Ilya Dryomov <idryo...@gmail.com>
    Signed-off-by: Sasha Levin <sas...@kernel.org>

Backing this patch out and recompiling my kernel has since resolved my issues (as far as I can tell thus far). The issue was fairly easy to trigger by simply creating and deleting files; I tested using ‘dd’ and was pretty consistently able to reproduce it. Since the issue occurred in a VM, I do have a screenshot of the crashed machine, and to avoid attaching an image, I’ll link to where the images are: http://kvanals.kvanals.org/.ceph_kernel_panic_images/

Am I way off base, or has anyone else run into this issue?

Thanks,

--
Kenneth Van Alstyne
Systems Architect
perspecta
Re: [ceph-users] Ceph capacity versus pool replicated size discrepancy?
Got it! I can calculate individual clone usage using “rbd du”, but does anything exist to show total clone usage across the pool? Otherwise it looks like the phantom space is just missing.

Thanks,

--
Kenneth Van Alstyne
Systems Architect
perspecta

On Aug 13, 2019, at 11:05 PM, Konstantin Shalygin <k0...@k0ste.ru> wrote:

Hey guys, this is probably a really silly question, but I’m trying to reconcile where all of my space has gone in one cluster that I am responsible for. The cluster is made up of 36 2TB SSDs across 3 nodes (12 OSDs per node), all using FileStore on XFS. We are running Ceph Luminous 12.2.8 on this particular cluster. The only pool where data is heavily stored is the “rbd” pool, of which 7.09TiB is consumed. With a replication of “3”, I would expect the raw used to be close to 21TiB, but it’s actually closer to 35TiB. Some additional details are below. Any thoughts?

[cluster] root@dashboard:~# ceph df
GLOBAL:
    SIZE     AVAIL    RAW USED  %RAW USED
    62.8TiB  27.8TiB  35.1TiB   55.81
POOLS:
    NAME                        ID  USED     %USED  MAX AVAIL  OBJECTS
    rbd                         0   7.09TiB  53.76  6.10TiB    3056783
    data                        3   29.4GiB  0.47   6.10TiB    7918
    metadata                    4   57.2MiB  0      6.10TiB    95
    .rgw.root                   5   1.09KiB  0      6.10TiB    4
    default.rgw.control         6   0B       0      6.10TiB    8
    default.rgw.meta            7   0B       0      6.10TiB    0
    default.rgw.log             8   0B       0      6.10TiB    207
    default.rgw.buckets.index   9   0B       0      6.10TiB    0
    default.rgw.buckets.data    10  0B       0      6.10TiB    0
    default.rgw.buckets.non-ec  11  0B       0      6.10TiB    0

[cluster] root@dashboard:~# ceph --version
ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)

[cluster] root@dashboard:~# ceph osd dump | grep 'replicated size'
pool 0 'rbd' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 682 pgp_num 682 last_change 414873 flags hashpspool min_write_recency_for_promote 1 stripe_width 0 application rbd
pool 3 'data' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 682 pgp_num 682 last_change 409614 flags hashpspool crash_replay_interval 45 min_write_recency_for_promote 1 stripe_width 0 application cephfs
pool 4 'metadata' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 682 pgp_num 682 last_change 409617 flags hashpspool min_write_recency_for_promote 1 stripe_width 0 application cephfs
pool 5 '.rgw.root' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 409 pgp_num 409 last_change 409710 lfor 0/336229 flags hashpspool stripe_width 0 application rgw
pool 6 'default.rgw.control' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 409 pgp_num 409 last_change 409711 lfor 0/336232 flags hashpspool stripe_width 0 application rgw
pool 7 'default.rgw.meta' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 409 pgp_num 409 last_change 409713 lfor 0/336235 flags hashpspool stripe_width 0 application rgw
pool 8 'default.rgw.log' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 409 pgp_num 409 last_change 409712 lfor 0/336238 flags hashpspool stripe_width 0 application rgw
pool 9 'default.rgw.buckets.index' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 409 pgp_num 409 last_change 409714 lfor 0/336241 flags hashpspool stripe_width 0 application rgw
pool 10 'default.rgw.buckets.data' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 409 pgp_num 409 last_change 409715 lfor 0/336244 flags hashpspool stripe_width 0 application rgw
pool 11 'default.rgw.buckets.non-ec' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 409 pgp_num 409 last_change 409716 lfor 0/336247 flags hashpspool stripe_width 0 application rgw

[cluster] root@dashboard:~# ceph osd lspools
0 rbd,3 data,4 metadata,5 .rgw.root,6 default.rgw.control,7 default.rgw.meta,8 default.rgw.log,9 default.rgw.buckets.index,10 default.rgw.buckets.data,11 default.rgw.buckets.non-ec,

[cluster] root@dashboard:~#
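On the question of pool-wide clone usage: “rbd du” can also be pointed at a pool rather than a single image, in which case it reports every image (and its snapshots) plus a TOTAL line. This is a sketch using the pool name from the thread; it of course needs an admin host with access to the cluster, so it is shown for reference rather than as something runnable here.

```shell
# Pool-wide usage, including snapshot/clone space, with a TOTAL line.
# 'rbd' is the pool name from this thread.
rbd du --pool rbd

# Scripted alternative: walk the images one at a time, which is useful
# if you want to post-process the per-image USED column.
for img in $(rbd ls --pool rbd); do
    rbd du --pool rbd "$img"
done
```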
[ceph-users] Ceph capacity versus pool replicated size discrepancy?
…c                   0B       0        0        0        0  0  0  0            0B       0            0B
default.rgw.control  0B       8        0        24       0  0  0  0            0B       0            0B
default.rgw.log      0B       207      0        621      0  0  0  21644149     20.6GiB  14422618     0B
default.rgw.meta     0B       0        0        0        0  0  0  0            0B       0            0B
metadata             57.2MiB  95       0        285      0  0  0  780          189MiB   86885        476MiB
rbd                  7.09TiB  3053998  1539909  9161994  0  0  0  23432304830  1.07PiB  11174458128  232TiB

total_objects  3062230
total_used     35.0TiB
total_avail    27.8TiB
total_space    62.8TiB

[cluster] root@dashboard:~# for pool in `rados lspools`; do echo $pool; ceph osd pool get $pool size; echo; done
rbd
size: 3

data
size: 3

metadata
size: 3

.rgw.root
size: 3

default.rgw.control
size: 3

default.rgw.meta
size: 3

default.rgw.log
size: 3

default.rgw.buckets.index
size: 3

default.rgw.buckets.data
size: 3

default.rgw.buckets.non-ec
size: 3

Thanks,

--
Kenneth Van Alstyne
Systems Architect
perspecta
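The discrepancy in this thread can be put in rough numbers: with size 3, the 7.09 TiB in “rbd” plus the small amounts in the other pools should account for about 21.4 TiB raw, yet 35.1 TiB is reported, leaving roughly 13.7 TiB that only clone/snapshot space (plus filesystem overhead) could explain. A back-of-the-envelope check, using only figures from the ‘ceph df’ output above:

```python
# Back-of-the-envelope check of expected vs. reported raw usage,
# using the figures from the 'ceph df' output in this thread.
REPLICATION = 3

# Pool USED figures converted to TiB (only the non-trivial pools matter).
pool_used_tib = {
    "rbd": 7.09,
    "data": 29.4 / 1024,        # 29.4 GiB
    "metadata": 57.2 / 1024**2, # 57.2 MiB
}

expected_raw = sum(pool_used_tib.values()) * REPLICATION
reported_raw = 35.1  # RAW USED from 'ceph df'

print(f"expected raw used: {expected_raw:.2f} TiB")
print(f"reported raw used: {reported_raw:.2f} TiB")
print(f"unaccounted:       {reported_raw - expected_raw:.2f} TiB")
```

The gap of roughly 13.7 TiB (about 4.6 TiB of logical data before replication) is consistent with the 1.5M clone objects visible in the ‘rados df’ output above not being counted in the pool USED column.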
Re: [ceph-users] Data distribution question
Unfortunately it looks like he’s still on Luminous, but if upgrading is an option, the options are indeed significantly better. If I recall correctly, at least the balancer module is available in Luminous.

Thanks,

--
Kenneth Van Alstyne
Systems Architect
Knight Point Systems, LLC
Service-Disabled Veteran-Owned Business
1775 Wiehle Avenue Suite 101 | Reston, VA 20190
c: 228-547-8045 f: 571-266-3106
www.knightpoint.com
DHS EAGLE II Prime Contractor: FC1 SDVOSB Track
GSA Schedule 70 SDVOSB: GS-35F-0646S
GSA MOBIS Schedule: GS-10F-0404Y
ISO 9001 / ISO 2 / ISO 27001 / CMMI Level 3

Notice: This e-mail message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, copy, use, disclosure, or distribution is STRICTLY prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.

On Apr 30, 2019, at 12:15 PM, Jack <c...@jack.fr.eu.org> wrote:

Hi,

I see that you are using rgw. RGW comes with many pools, yet most of them are used for metadata and configuration; those do not store much data. Such pools do not need more than a couple of PGs each (I use pg_num = 8). You need to allocate your PGs to the pool that actually stores the data.

Please do the following, to let us know more:

Print the pg_num per pool:
for i in $(rados lspools); do echo -n "$i: "; ceph osd pool get $i pg_num; done

Print the usage per pool:
ceph df

Also, instead of doing a "ceph osd reweight-by-utilization", check out the balancer plugin: http://docs.ceph.com/docs/mimic/mgr/balancer/

Finally, in Nautilus, PGs can now upscale and downscale automatically. See https://ceph.com/rados/new-in-nautilus-pg-merging-and-autotuning/

On 04/30/2019 06:34 PM, Shain Miley wrote:

Hi,

We have a cluster with 235 OSDs running version 12.2.11 with a combination of 4 and 6 TB drives. The data distribution across OSDs varies from 52% to 94%. I have been trying to figure out how to get this a bit more balanced, as we are running into 'backfillfull' issues on a regular basis. I've tried adding more PGs, but this did not seem to do much in terms of the imbalance.

Here is the end output from 'ceph osd df':
MIN/MAX VAR: 0.73/1.31 STDDEV: 7.73

We have 8199 PGs total, with 6775 of them in the pool that has 97% of the data. The other pools are not really used (data, metadata, .rgw.root, .rgw.control, etc). I have thought about deleting those unused pools so that most if not all of the PGs are being used by the pool with the majority of the data. However, before I do that, is there anything else I can do or try in order to see if I can balance out the data more uniformly?

Thanks in advance,
Shain
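Turning on the balancer module that Jack and Kenneth mention is a short sequence on Luminous and later. This is a sketch for reference (it needs an admin host): the choice of mode matters, since ‘upmap’ requires all clients to be Luminous or newer, while ‘crush-compat’ works with older clients.

```shell
# Sketch: enable the ceph-mgr balancer module (available since Luminous).
ceph mgr module enable balancer
ceph balancer mode crush-compat   # or 'upmap' if every client is >= Luminous
ceph balancer on
ceph balancer status              # check the current mode and plan activity
```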
Re: [ceph-users] Data distribution question
Shain:
    Have you looked into doing a “ceph osd reweight-by-utilization” by chance? I’ve found that data distribution is rarely perfect, and on aging clusters I always have to do this periodically.

Thanks,

--
Kenneth Van Alstyne
Systems Architect
Knight Point Systems, LLC

On Apr 30, 2019, at 11:34 AM, Shain Miley <smi...@npr.org> wrote:

Hi,

We have a cluster with 235 OSDs running version 12.2.11 with a combination of 4 and 6 TB drives. The data distribution across OSDs varies from 52% to 94%. I have been trying to figure out how to get this a bit more balanced, as we are running into 'backfillfull' issues on a regular basis. I've tried adding more PGs, but this did not seem to do much in terms of the imbalance.

Here is the end output from 'ceph osd df':
MIN/MAX VAR: 0.73/1.31 STDDEV: 7.73

We have 8199 PGs total, with 6775 of them in the pool that has 97% of the data. The other pools are not really used (data, metadata, .rgw.root, .rgw.control, etc). I have thought about deleting those unused pools so that most if not all of the PGs are being used by the pool with the majority of the data. However, before I do that, is there anything else I can do or try in order to see if I can balance out the data more uniformly?

Thanks in advance,
Shain

--
NPR | Shain Miley | Manager of Infrastructure, Digital Media | smi...@npr.org | 202.513.3649
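reweight-by-utilization has a dry-run counterpart that is worth running first, since it shows which OSDs would be touched without moving any data. A sketch of the usual sequence (admin host required):

```shell
# Dry run: report what reweight-by-utilization *would* change.
ceph osd test-reweight-by-utilization

# Apply, only adjusting OSDs more than 10% above the mean utilization
# (120 is the default threshold; 110 is more aggressive).
ceph osd reweight-by-utilization 110

# Watch recovery/backfill progress as data moves.
ceph -s
```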
Re: [ceph-users] VM management setup
This is purely anecdotal (obviously), but I have found that OpenNebula is not only easy to set up, it is relatively lightweight and has very good Ceph support. 5.8.0 was recently released, but it has a few bugs related to live migrations with Ceph as the backend datastore. You may want to look at 5.6.1 or wait for 5.8.1 to be released, since the issues have already been fixed upstream.

Thanks,

--
Kenneth Van Alstyne
Systems Architect
Knight Point Systems, LLC

On Apr 5, 2019, at 2:34 PM, jes...@krogh.cc wrote:

Hi. I know this is a bit off-topic, but I'm seeking recommendations and advice anyway. We're looking for a "management" solution for VMs - currently in the 40-50 VM range - and would like better tooling for managing them: potentially migrating them across multiple hosts, setting up block devices, etc. This is only to be used internally in a department where a bunch of engineering people will manage it; no customers or anything of that kind. Up until now we have been using virt-manager with KVM and have been quite satisfied while we were at the "few VMs" stage, but it seems like it is time to move on. Thus we're looking for something "simple" that can help manage a Ceph+KVM based setup - the simpler and more to the point, the better. Any recommendations? I've already found a lot of names:

OpenStack
CloudStack
Proxmox

But recommendations are truly welcome. Thanks.
Re: [ceph-users] Nautilus upgrade but older releases reported by features
Anecdotally, I see the same behaviour, but there seem to be no negative side effects. The “jewel” clients below are more than likely the (Linux) kernel client:

[cinder] root@aurae-dashboard:~# ceph features
{
    "mon": [
        { "features": "0x3ffddff8ffac", "release": "luminous", "num": 1 }
    ],
    "mds": [
        { "features": "0x3ffddff8ffac", "release": "luminous", "num": 1 }
    ],
    "osd": [
        { "features": "0x3ffddff8ffac", "release": "luminous", "num": 1 }
    ],
    "client": [
        { "features": "0x27018fb86aa42ada", "release": "jewel", "num": 5 },
        { "features": "0x3ffddff8ffac", "release": "luminous", "num": 8 }
    ],
    "mgr": [
        { "features": "0x3ffddff8ffac", "release": "luminous", "num": 1 }
    ]
}

[cinder] root@aurae-dashboard:~# ceph -s
  cluster:
    id:     650c5366-efa8-4636-a1a1-08740513ac3c
    health: HEALTH_OK
  services:
    mon: 1 daemons, quorum aurae-storage-1 (age 45h)
    mgr: aurae-storage-1(active, since 45h)
    mds: cephfs:1 {0=aurae-storage-1=up:active}
    osd: 1 osds: 1 up (since 45h), 1 in (since 43h)
    rgw: 1 daemon active (radosgw.aurae-storage-1)
  data:
    pools:   10 pools, 832 pgs
    objects: 1.42k objects, 3.0 GiB
    usage:   4.1 GiB used, 91 GiB / 95 GiB avail
    pgs:     832 active+clean
  io:
    client: 36 KiB/s wr, 0 op/s rd, 3 op/s wr

[cinder] root@aurae-dashboard:~# ceph versions
{
    "mon": { "ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (stable)": 1 },
    "mgr": { "ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (stable)": 1 },
    "osd": { "ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (stable)": 1 },
    "mds": { "ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (stable)": 1 },
    "rgw": { "ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (stable)": 1 },
    "overall": { "ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (stable)": 5 }
}

Thanks,

--
Kenneth Van Alstyne
Systems Architect
Knight Point Systems, LLC

On Mar 27, 2019, at 6:52 AM, John Hearns <hear...@googlemail.com> wrote:

Sure:

# ceph versions
{
    "mon": { "ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (stable)": 3 },
    "mgr": { "ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (stable)": 2 },
    "osd": { "ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (stable)": 12 },
    "mds": { "ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (stable)": 3 },
    "rgw": { "ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (stable)": 4 },
    "overall": { "ceph version 14.2.0 (3a54b2b6d167d4a2a19e003a705696d4fe619afc) nautilus (stable)": 24 }
}

On Wed, 27 Mar 2019 at 11:20, Konstantin Shalygin <k0...@k0ste.ru> wrote:

We recently updated a cluster to the Nautilus release by updating Debian packages from the Ceph site, then rebooted all servers. 'ceph features' still reports older releases; for example, the osd section:

"osd": [
    { "features": "0x3ffddff8ffac", "release": "luminous", "num": 12 }
]

I think I am not understanding what exactly is meant by "release" here. Can we alter the osd (mon, clients, etc.) such that they report nautilus?

Show your `ceph versions` please.

k

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
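The “release” names in ‘ceph features’ are derived from feature bitmasks, so one way to see what the kernel (“jewel”) clients lack relative to the daemons is plain bit arithmetic on the masks from the output above. This is a rough sketch only: mapping individual bits back to feature names requires the feature-bit table in the Ceph source, and kernel-client masks encode bits differently from the release they are labelled with, which is exactly why the label lags.

```python
# Rough sketch: compare feature bitmasks from 'ceph features' output.
# Interpreting individual bit positions requires the Ceph feature-bit
# table; this only shows *which* bits differ.

def missing_bits(server_mask: int, client_mask: int) -> int:
    """Bits set in server_mask but absent from client_mask."""
    return server_mask & ~client_mask

luminous_daemons = 0x3ffddff8ffac    # mon/mgr/osd/mds mask from the thread
kernel_clients = 0x27018fb86aa42ada  # the 'jewel' kernel clients above

print(hex(missing_bits(luminous_daemons, kernel_clients)))
```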
Re: [ceph-users] Filestore OSD on CephFS?
I’d actually rather it not be an extra cluster, but can the destination pool name be different? If not, I have conflicting image names in the “rbd” pool on either side.

Thanks,

--
Kenneth Van Alstyne
Systems Architect
Knight Point Systems, LLC

On Jan 16, 2019, at 9:38 AM, Robert Sander <r.san...@heinlein-support.de> wrote:

On 16.01.19 16:03, Kenneth Van Alstyne wrote:

To be clear, I know the question comes across as ludicrous. It *seems* like this is going to work okay for the light workload use case that I have in mind — I just didn’t want to risk impacting the underlying cluster too much or hit any other caveats that perhaps someone else has run into before.

Why is setting up a distinct pool as destination for your RBD mirrors not an option? Does it have to be an extra cluster?

Regards,

--
Robert Sander
Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin
https://www.heinlein-support.de
Tel: 030 / 405051-43
Fax: 030 / 405051-19
Amtsgericht Berlin-Charlottenburg - HRB 93818 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
Re: [ceph-users] Filestore OSD on CephFS?
Burkhard:
    Thank you, this is literally what I was looking for. A VM with RBD images attached was my first choice (and what we do for a test and integration lab today), but I am trying to give as much space as possible to the underlying cluster without having to frequently add/remove OSDs and rebalance the “sub-cluster”. I didn’t think about a loopback-mapped file on CephFS — but at that point, to your point, I might as well use RBD. :-)

To be clear, I know the question comes across as ludicrous. It *seems* like this is going to work okay for the light workload use case that I have in mind — I just didn’t want to risk impacting the underlying cluster too much or hit any other caveats that perhaps someone else has run into before. I doubt many people have tried CephFS as a FileStore OSD backend, since in general it seems like a pretty silly idea.

Thanks,

--
Kenneth Van Alstyne
Systems Architect
Knight Point Systems, LLC

On Jan 16, 2019, at 8:27 AM, Burkhard Linke <burkhard.li...@computational.bio.uni-giessen.de> wrote:

Hi,

just some comments: CephFS has an overhead for accessing files (a capabilities round trip to the MDS on first access, cap cache management, a limited number of concurrent caps depending on MDS cache size...), so using a CephFS filesystem as storage for a FileStore OSD will add some extra overhead. I would use a loopback file, since it reduces the CephFS overhead (one file, one cap), but it might also introduce other restrictions, e.g. a fixed file size. If you can use a Ceph cluster as 'backend storage', you can also use an RBD image. This should remove most of the restrictions you have already mentioned (except the fixed size again). You can also use multiple images to have multiple OSDs ;-)

Regards,
Burkhard
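Burkhard's loopback suggestion would look roughly like the sketch below. The paths are hypothetical, and the loop-device and mkfs steps need root, so they are shown commented rather than run; only the sparse-file creation is live.

```shell
#!/bin/sh
# Sketch of a loopback-backed OSD store on CephFS (paths hypothetical).
# In practice BACKING would live on the CephFS mount, e.g. /mnt/cephfs/osd.img;
# a temp path is used here only so the sketch runs anywhere.
BACKING="${TMPDIR:-/tmp}/cephfs-osd-backing.img"

# Sparse file: allocates no space up front, but the size is fixed,
# which is the restriction Burkhard points out.
truncate -s 1G "$BACKING"
stat -c '%s' "$BACKING"

# The remaining steps require root and a real CephFS mount, so they are
# left commented:
# losetup --find --show "$BACKING"      # prints the loop device, e.g. /dev/loop0
# mkfs.xfs /dev/loop0
# mount /dev/loop0 /var/lib/ceph/osd/ceph-0
```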
Re: [ceph-users] Filestore OSD on CephFS?
Marc:
    To clarify, there will be no direct client workload (which is what I mean by “active production workload”), but rather RBD images from a remote cluster imported via either RBD export/import or as an RBD mirror destination. Obviously the best solution is dedicated hardware, but I don’t have that. The single OSD is simply due to the underlying cluster already being either erasure coded or replicated.

Thanks,

--
Kenneth Van Alstyne
Systems Architect
Knight Point Systems, LLC

On Jan 16, 2019, at 8:14 AM, Marc Roos <m.r...@f1-outsourcing.eu> wrote:

How can there be a "catastrophic reason" if you have "no active, production workload"...? Do as you please. I am also using 1x replication for temp and test environments. But if you have only one OSD, why use Ceph? Choose the correct 'tool' for the job.

-----Original Message-----
From: Kenneth Van Alstyne [mailto:kvanalst...@knightpoint.com]
Sent: 16 January 2019 15:04
To: ceph-users
Subject: [ceph-users] Filestore OSD on CephFS?

Disclaimer: Even I will admit that I know this is going to sound like a silly/crazy/insane question, but I have a reason for wanting to do this and asking the question. It's also worth noting that no active, production workload will be used on this "cluster", so I'm worried more about data integrity than performance or availability.

Can anyone think of any catastrophic reason why I cannot use an existing cluster's CephFS filesystem as a single OSD for a small cluster? I've tested it, and it seems to work with the following caveats:

- 50% performance degradation (due to the double-write penalty, since the journal and OSD data are both on the same backing cluster)
- Max object name and namespace length limits, which can be overcome with the following OSD parameters:
  - osd max object name len = 256
  - osd max object namespace len = 64
- Due to the above name/namespace length limits, the cluster should be limited to RBD (which is exactly what I want to do)

Some details of my cluster are below if anyone cares, and I'm getting a consistent, solid, roughly 50% of the underlying cluster's performance in benchmarks using "rados bench":

# ceph --cluster cephfs status
  cluster:
    id:     0f8904ce-754b-48d4-aa58-7ee6fe9e2cca
    health: HEALTH_OK
  services:
    mon:        1 daemons, quorum storage
    mgr:        storage(active)
    osd:        1 osds: 1 up, 1 in
    rbd-mirror: 1 daemon active
  data:
    pools:   1 pools, 32 pgs
    objects: 10 objects, 133 B
    usage:   12 MiB used, 87 GiB / 87 GiB avail
    pgs:     32 active+clean
  io:
    client: 85 B/s wr, 0 op/s rd, 0 op/s wr

# ceph --cluster cephfs versions
{
    "mon": { "ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)": 1 },
    "mgr": { "ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)": 1 },
    "osd": { "ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)": 1 },
    "mds": {},
    "rbd-mirror": { "ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)": 1 },
    "overall": { "ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)": 4 }
}

# ceph --cluster cephfs osd df
ID CLASS WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE VAR  PGS
 0 hdd   0.08510 1.0      87 GiB 16 MiB 87 GiB 0.02 1.00  32
              TOTAL 87 GiB 16 MiB 87 GiB 0.02
MIN/MAX VAR: 1.00/1.00 STDDEV: 0

# ceph --cluster cephfs df
GLOBAL:
    SIZE   AVAIL  RAW USED %RAW USED
    87 GiB 87 GiB 16 MiB   0.02
POOLS:
    NAME ID USED  %USED MAX AVAIL OBJECTS
    rbd  1  133 B 0     83 GiB    10

# df -h /var/lib/ceph/osd/cephfs-0/
Filesystem            Size  Used Avail Use% Mounted on
10.0.0.1:/ceph-remote  87G   12M   87G   1% /var/lib/ceph

Thanks,

--
Kenneth Van Alstyne
Systems Architect
Knight Point Systems, LLC
[ceph-users] Filestore OSD on CephFS?
Disclaimer: Even I will admit that I know this is going to sound like a silly/crazy/insane question, but I have a reason for wanting to do this and for asking the question. It’s also worth noting that no active, production workload will be run on this “cluster”, so I’m worried more about data integrity than performance or availability. Can anyone think of any catastrophic reason why I cannot use an existing cluster’s CephFS filesystem as a single OSD for a small cluster? I’ve tested it and it seems to work, with the following caveats:

- 50% performance degradation (due to the double-write penalty, since the journal and the OSD data are both on the same backing cluster)
- Max object name and namespace length limits, which can be overcome with the following OSD parameters:
  - osd max object name len = 256
  - osd max object namespace len = 64
- Due to the above name/namespace length limits, the cluster should be limited to RBD (which is exactly what I want to do)

Some details of my cluster are below, if anyone cares. I’m getting a consistent, solid roughly 50% of the underlying cluster’s performance in benchmarks using “rados bench”:

# ceph --cluster cephfs status
  cluster:
    id:     0f8904ce-754b-48d4-aa58-7ee6fe9e2cca
    health: HEALTH_OK
  services:
    mon:        1 daemons, quorum storage
    mgr:        storage(active)
    osd:        1 osds: 1 up, 1 in
    rbd-mirror: 1 daemon active
  data:
    pools:   1 pools, 32 pgs
    objects: 10 objects, 133 B
    usage:   12 MiB used, 87 GiB / 87 GiB avail
    pgs:     32 active+clean
  io:
    client: 85 B/s wr, 0 op/s rd, 0 op/s wr

# ceph --cluster cephfs versions
{
    "mon": { "ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)": 1 },
    "mgr": { "ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)": 1 },
    "osd": { "ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)": 1 },
    "mds": {},
    "rbd-mirror": { "ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)": 1 },
    "overall": { "ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)": 4 }
}

# ceph --cluster cephfs osd df
ID CLASS WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE VAR  PGS
 0 hdd   0.08510 1.0      87 GiB 16 MiB 87 GiB 0.02 1.00  32
              TOTAL 87 GiB 16 MiB 87 GiB 0.02
MIN/MAX VAR: 1.00/1.00  STDDEV: 0

# ceph --cluster cephfs df
GLOBAL:
    SIZE   AVAIL  RAW USED %RAW USED
    87 GiB 87 GiB 16 MiB   0.02
POOLS:
    NAME ID USED  %USED MAX AVAIL OBJECTS
    rbd  1  133 B 0     83 GiB    10

# df -h /var/lib/ceph/osd/cephfs-0/
Filesystem            Size Used Avail Use% Mounted on
10.0.0.1:/ceph-remote 87G  12M  87G   1%   /var/lib/ceph

Thanks,
--
Kenneth Van Alstyne
Systems Architect
Knight Point Systems, LLC
Service-Disabled Veteran-Owned Business
1775 Wiehle Avenue Suite 101 | Reston, VA 20190
c: 228-547-8045 f: 571-266-3106
www.knightpoint.com
DHS EAGLE II Prime Contractor: FC1 SDVOSB Track
GSA Schedule 70 SDVOSB: GS-35F-0646S
GSA MOBIS Schedule: GS-10F-0404Y
ISO 9001 / ISO 2 / ISO 27001 / CMMI Level 3

Notice: This e-mail message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, copy, use, disclosure, or distribution is STRICTLY prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
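[Editor’s note: the two OSD parameters mentioned in the caveats would normally go in the small cluster’s ceph.conf; a minimal sketch, with the [osd] section placement standard but the exact file layout for this setup an assumption:]

```ini
# ceph.conf of the inner cluster whose OSD is backed by CephFS
# (hypothetical placement; option names as given in the post)
[osd]
osd max object name len = 256
osd max object namespace len = 64
```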
Re: [ceph-users] RBD Mirror Proxy Support?
D’oh! I was hoping that the destination pools could have unique names, regardless of the source pool name.

Thanks,
--
Kenneth Van Alstyne

On Jan 14, 2019, at 11:07 AM, Jason Dillaman wrote:

On Mon, Jan 14, 2019 at 11:09 AM Kenneth Van Alstyne wrote:

In this case, I’m imagining Clusters A/B both having write access to a third “Cluster C”. So A/B -> C, rather than A -> C -> B, B -> C -> A, or A -> B -> C. I admit, in the event that I need to replicate back to either primary cluster, there may be challenges.

While this is possible, in addition to the failback question, you would also need to use unique pool names in clusters A and B, since on cluster C you are currently prevented from adding more than a single peer per pool.
Thanks,
--
Kenneth Van Alstyne

On Jan 14, 2019, at 9:50 AM, Jason Dillaman wrote:

On Mon, Jan 14, 2019 at 10:10 AM Kenneth Van Alstyne wrote:

Thanks for the reply Jason — I was actually thinking of emailing you directly, but thought it may be beneficial to keep the conversation on the list so that everyone can see the thread. Can you think of a reason why one-way RBD mirroring would not work to a shared tertiary cluster? I need to build out a test lab to see how that would work for us.

I guess I don't understand what the tertiary cluster is doing? If the goal is to replicate from cluster A -> cluster B -> cluster C, that is not currently supported, since (by design choice) we don't currently re-write the RBD image journal entries from the source cluster to the destination cluster but instead just directly apply the journal entries to the destination image (to save IOPS).
Thanks,
--
Kenneth Van Alstyne

On Jan 12, 2019, at 4:01 PM, Jason Dillaman wrote:

On Fri, Jan 11, 2019 at 2:09 PM Kenneth Van Alstyne wrote:

Hello all (and maybe this would be better suited for the ceph devel mailing list): I’d like to use RBD mirroring between two sites (to each other), but I have the following limitations:
- The clusters use the same name (“ceph”)

That's actually not an issue. The "ceph" name is used to locate configuration files for RBD mirroring (a la /etc/ceph/.conf and /etc/ceph/.client..keyring). You just need to map that cluster config file name to the remote cluster name in the RBD mirroring configuration. Additionally, starting with Nautilus, the configuration details for connecting to a remote cluster can now be stored in the monitor (via the rbd CLI and dashboard), so there won't be any need to fiddle with configuration files for remote clusters anymore.

- The clusters share IP address space on
Re: [ceph-users] RBD Mirror Proxy Support?
In this case, I’m imagining Clusters A/B both having write access to a third “Cluster C”. So A/B -> C, rather than A -> C -> B, B -> C -> A, or A -> B -> C. I admit, in the event that I need to replicate back to either primary cluster, there may be challenges.

Thanks,
--
Kenneth Van Alstyne

On Jan 14, 2019, at 9:50 AM, Jason Dillaman wrote:

On Mon, Jan 14, 2019 at 10:10 AM Kenneth Van Alstyne wrote:

Thanks for the reply Jason — I was actually thinking of emailing you directly, but thought it may be beneficial to keep the conversation on the list so that everyone can see the thread. Can you think of a reason why one-way RBD mirroring would not work to a shared tertiary cluster? I need to build out a test lab to see how that would work for us.

I guess I don't understand what the tertiary cluster is doing? If the goal is to replicate from cluster A -> cluster B -> cluster C, that is not currently supported, since (by design choice) we don't currently re-write the RBD image journal entries from the source cluster to the destination cluster but instead just directly apply the journal entries to the destination image (to save IOPS).
Thanks,
--
Kenneth Van Alstyne

On Jan 12, 2019, at 4:01 PM, Jason Dillaman wrote:

On Fri, Jan 11, 2019 at 2:09 PM Kenneth Van Alstyne wrote:

Hello all (and maybe this would be better suited for the ceph devel mailing list): I’d like to use RBD mirroring between two sites (to each other), but I have the following limitations:
- The clusters use the same name (“ceph”)

That's actually not an issue. The "ceph" name is used to locate configuration files for RBD mirroring (a la /etc/ceph/.conf and /etc/ceph/.client..keyring). You just need to map that cluster config file name to the remote cluster name in the RBD mirroring configuration. Additionally, starting with Nautilus, the configuration details for connecting to a remote cluster can now be stored in the monitor (via the rbd CLI and dashboard), so there won't be any need to fiddle with configuration files for remote clusters anymore.

- The clusters share IP address space on a private, non-routed storage network

Unfortunately, that is an issue, since the rbd-mirror daemon needs to be able to connect to both clusters. If the two clusters are at least on different subnets and your management servers can talk to each side, you might be able to run the rbd-mirror daemon there.
There are management servers on each side that can talk to the respective storage networks, but the storage networks cannot talk directly to each other. I recall reading, some years back, of possibly adding support for an RBD mirror proxy, which would potentially solve my issues. Has anything been done in this regard?

No, I haven't really seen much demand for such support, so it's never bubbled up as a priority yet.

If not, is my best bet perhaps a tertiary cluster that both can reach and do one-way replication to?

Thanks,
--
Kenneth Van Alstyne
Re: [ceph-users] RBD Mirror Proxy Support?
Thanks for the reply Jason — I was actually thinking of emailing you directly, but thought it may be beneficial to keep the conversation on the list so that everyone can see the thread. Can you think of a reason why one-way RBD mirroring would not work to a shared tertiary cluster? I need to build out a test lab to see how that would work for us.

Thanks,
--
Kenneth Van Alstyne

On Jan 12, 2019, at 4:01 PM, Jason Dillaman wrote:

On Fri, Jan 11, 2019 at 2:09 PM Kenneth Van Alstyne wrote:

Hello all (and maybe this would be better suited for the ceph devel mailing list): I’d like to use RBD mirroring between two sites (to each other), but I have the following limitations:
- The clusters use the same name (“ceph”)

That's actually not an issue. The "ceph" name is used to locate configuration files for RBD mirroring (a la /etc/ceph/.conf and /etc/ceph/.client..keyring). You just need to map that cluster config file name to the remote cluster name in the RBD mirroring configuration.
Additionally, starting with Nautilus, the configuration details for connecting to a remote cluster can now be stored in the monitor (via the rbd CLI and dashboard), so there won't be any need to fiddle with configuration files for remote clusters anymore.

- The clusters share IP address space on a private, non-routed storage network

Unfortunately, that is an issue, since the rbd-mirror daemon needs to be able to connect to both clusters. If the two clusters are at least on different subnets and your management servers can talk to each side, you might be able to run the rbd-mirror daemon there.

There are management servers on each side that can talk to the respective storage networks, but the storage networks cannot talk directly to each other. I recall reading, some years back, of possibly adding support for an RBD mirror proxy, which would potentially solve my issues. Has anything been done in this regard?

No, I haven't really seen much demand for such support, so it's never bubbled up as a priority yet.

If not, is my best bet perhaps a tertiary cluster that both can reach and do one-way replication to?

Thanks,
--
Kenneth Van Alstyne
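[Editor’s note: a minimal sketch of the pre-Nautilus file layout Jason describes for two clusters that are both named "ceph" internally. The alias "remote" and the keyring filename are illustrative assumptions, not from the post:]

```
# On the rbd-mirror host, the peer cluster is referenced through a
# locally chosen config-file alias rather than its internal name:
#
#   /etc/ceph/ceph.conf                          local cluster ("ceph")
#   /etc/ceph/remote.conf                        peer cluster's monitors
#   /etc/ceph/remote.client.rbd-mirror.keyring   peer credentials
#
# The alias is then what you pass as the cluster name, e.g.:
#   rbd --cluster remote mirror pool info rbd
```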
--
Jason
[ceph-users] RBD Mirror Proxy Support?
Hello all (and maybe this would be better suited for the ceph devel mailing list): I’d like to use RBD mirroring between two sites (to each other), but I have the following limitations:
- The clusters use the same name (“ceph”)
- The clusters share IP address space on a private, non-routed storage network

There are management servers on each side that can talk to the respective storage networks, but the storage networks cannot talk directly to each other. I recall reading, some years back, of possibly adding support for an RBD mirror proxy, which would potentially solve my issues. Has anything been done in this regard? If not, is my best bet perhaps a tertiary cluster that both can reach and do one-way replication to?

Thanks,
--
Kenneth Van Alstyne
Re: [ceph-users] Image has watchers, but cannot determine why
Thanks for the reply — I was pretty darn sure, since I live migrated all VMs off of that box and then killed everything but a handful of system processes (init, sshd, etc.) and the watcher was STILL present. In saying that, I halted the machine (since nothing was running on it any longer) and the watcher did indeed go away and I was able to remove the images. Very, very strange. (But situation solved… except I don’t know what the cause was, really.)

Thanks,
--
Kenneth Van Alstyne

On Jan 10, 2019, at 4:03 AM, Ilya Dryomov wrote:

On Wed, Jan 9, 2019 at 5:17 PM Kenneth Van Alstyne wrote:

Hey folks, I’m looking into what I would think would be a simple problem, but is turning out to be more complicated than I would have anticipated. A virtual machine managed by OpenNebula was blown away, but the backing RBD images remain. Upon investigating, it appears that the images still have watchers on the KVM node that that VM previously lived on. I can confirm that there are no mapped RBD images on the machine and the qemu-system-x86_64 process is indeed no longer running. Any ideas?
Additional details are below:

# rbd info one-73-145-10
rbd image 'one-73-145-10':
        size 1024 GB in 262144 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.27174d6b8b4567
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
        flags:
        parent: rbd/one-73@snap
        overlap: 102400 kB

# rbd status one-73-145-10
Watchers:
        watcher=10.0.235.135:0/3820784110 client.33810559 cookie=140234310778880

# rados -p rbd listwatchers rbd_header.27174d6b8b4567
watcher=10.0.235.135:0/3820784110 client.33810559 cookie=140234310778880

This appears to be a RADOS (i.e. not a kernel client) watch. Are you sure that nothing of the sort is running on that node? In order for the watch to stay live, the watcher has to send periodic ping messages to the OSD. Perhaps determine the primary OSD with "ceph osd map rbd rbd_header.27174d6b8b4567", set debug_ms to 1 on that OSD, and monitor the log for a few minutes?

Thanks,

                Ilya
[ceph-users] Image has watchers, but cannot determine why
Hey folks, I’m looking into what I would think would be a simple problem, but is turning out to be more complicated than I would have anticipated. A virtual machine managed by OpenNebula was blown away, but the backing RBD images remain. Upon investigating, it appears that the images still have watchers on the KVM node that that VM previously lived on. I can confirm that there are no mapped RBD images on the machine and the qemu-system-x86_64 process is indeed no longer running. Any ideas? Additional details are below:

# rbd info one-73-145-10
rbd image 'one-73-145-10':
        size 1024 GB in 262144 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.27174d6b8b4567
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
        flags:
        parent: rbd/one-73@snap
        overlap: 102400 kB

# rbd status one-73-145-10
Watchers:
        watcher=10.0.235.135:0/3820784110 client.33810559 cookie=140234310778880

# rados -p rbd listwatchers rbd_header.27174d6b8b4567
watcher=10.0.235.135:0/3820784110 client.33810559 cookie=140234310778880

# ip addr show | grep -i 10.0.235.135
    inet 10.0.235.135/16 scope global i-storage

# rbd showmapped

# ps -efww | grep -i qemu | grep -i rbd | grep -i 145

# ceph version
ceph version 10.2.10 (5dc1e4c05cb68dbf62ae6fce3f0700e4654fdbbe)

Thanks,
--
Kenneth Van Alstyne
Re: [ceph-users] Anyone tested Samsung 860 DCT SSDs?
Thanks for the feedback, everyone. Based on the TBW figures, it sounds like these drives are terrible for us, as the idea is NOT to use them simply for archive. This will be a high read/write workload, so that's totally a show stopper. I’m interested in the Seagate Nytro myself.

Thanks,
--
Kenneth Van Alstyne

> On Oct 12, 2018, at 9:31 AM, Corin Langosch wrote:
>
> Hi
>
> It has only TBW of 349 TB, so might die quite soon. But what about the
> "Seagate Nytro 1551 DuraWrite 3DWPD Mainstream Endurance 960GB, SATA"?
> Seems really cheap too and has TBW 5.25PB. Anybody tested that? What
> about (RBD) performance?
>
> Cheers
> Corin
>
> On Fri, 2018-10-12 at 13:53 +, Kenneth Van Alstyne wrote:
>> Cephers:
>> As the subject suggests, has anyone tested Samsung 860 DCT
>> SSDs? They are really inexpensive and we are considering buying some
>> to test.
>>
>> Thanks,
>>
>> --
>> Kenneth Van Alstyne
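[Editor’s note: the endurance figures being compared here are plain drive-writes-per-day arithmetic. A quick sketch of the conversion; the five-year warranty window is an assumption, though it does reproduce the 5.25 PB figure quoted for the 960 GB Nytro:]

```python
def tbw_from_dwpd(dwpd: float, capacity_tb: float, warranty_years: float = 5.0) -> float:
    """Total terabytes written implied by a drive-writes-per-day rating."""
    return dwpd * capacity_tb * 365 * warranty_years

# 3 DWPD on a 0.96 TB drive over an assumed 5-year warranty:
nytro_tbw = tbw_from_dwpd(3, 0.96)   # 5256 TB, i.e. ~5.25 PB

# Going the other way, the 860 DCT's quoted 349 TBW on a 960 GB drive
# works out to roughly 0.2 drive writes per day over the same window:
dct_dwpd = 349 / (0.96 * 365 * 5)
```

Which makes concrete why the 860 DCT reads as an archive drive and the Nytro as a mixed-workload one.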
[ceph-users] Anyone tested Samsung 860 DCT SSDs?
Cephers: As the subject suggests, has anyone tested Samsung 860 DCT SSDs? They are really inexpensive and we are considering buying some to test.

Thanks,
--
Kenneth Van Alstyne
Re: [ceph-users] OSD Crash When Upgrading from Jewel to Luminous?
After looking into this further, is it possible that adjusting the CRUSH weight of the OSDs while running mis-matched versions of the ceph-osd daemon across the cluster can cause this issue? Under certain circumstances in our cluster, this may happen automatically on the backend. I can’t duplicate the issue in a lab, but highly suspect this is what happened.

Thanks,
--
Kenneth Van Alstyne

On Aug 17, 2018, at 4:01 PM, Gregory Farnum wrote:

Do you have more logs that indicate what state machine event the crashing OSDs received? This obviously shouldn't have happened, but it's a plausible failure mode, especially if it's a relatively rare combination of events.
-Greg

On Fri, Aug 17, 2018 at 4:49 PM Kenneth Van Alstyne wrote:

Hello all: I ran into an issue recently with one of my clusters when upgrading from 10.2.10 to 12.2.7. I have previously tested the upgrade in a lab and upgraded one of our five production clusters with no issues.
On the second cluster, however, I ran into an issue where all OSDs that were NOT running Luminous yet (which was about 40% of the cluster at the time) all crashed with the same backtrace, which I have pasted below:

===
0> 2018-08-13 17:35:13.160849 7f145c9ec700 -1 osd/PG.cc: In function 'PG::RecoveryState::Crashed::Crashed(boost::statechart::state::my_context)' thread 7f145c9ec700 time 2018-08-13 17:35:13.157319
osd/PG.cc: 5860: FAILED assert(0 == "we got a bad state machine event")

ceph version 10.2.10 (5dc1e4c05cb68dbf62ae6fce3f0700e4654fdbbe)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x7f) [0x55b9bf08614f]
2: (PG::RecoveryState::Crashed::Crashed(boost::statechart::state, (boost::statechart::history_mode)0>::my_context)+0xc4) [0x55b9bea62db4]
3: (()+0x447366) [0x55b9bea9a366]
4: (boost::statechart::simple_state, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x2f7) [0x55b9beac8b77]
5: (boost::statechart::state_machine, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x6b) [0x55b9beaab5bb]
6: (PG::handle_peering_event(std::shared_ptr, PG::RecoveryCtx*)+0x384) [0x55b9bea7db14]
7: (OSD::process_peering_events(std::__cxx11::list > const&, ThreadPool::TPHandle&)+0x263) [0x55b9be9d1723]
8: (ThreadPool::BatchWorkQueue::_void_process(void*, ThreadPool::TPHandle&)+0x2a) [0x55b9bea1274a]
9: (ThreadPool::worker(ThreadPool::WorkThread*)+0xeb0) [0x55b9bf076d40]
10: (ThreadPool::WorkThread::entry()+0x10) [0x55b9bf077ef0]
11: (()+0x7507) [0x7f14e2c96507]
12: (clone()+0x3f) [0x7f14e0ca214f]
NOTE: a copy of the executable, or `objdump -rdS ` is needed to interpret this.
===

Once I restarted the impacted OSDs, which brought them up to 12.2.7, everything recovered just fine and the cluster is healthy.
The only rub is that losing that many OSDs simultaneously caused a significant I/O disruption to the production servers for several minutes while I brought up the remaining OSDs. I have been trying to duplicate this issue in a lab again before continuing the upgrades on the other three clusters, but am coming up short. Has anyone seen anything like this and am I missing something obvious? Given how quickly the issue happened and the fact that I’m having a hard time reproducing this issue, I am limited in the amount of logging and debug information I have available, unfortunately. If it helps, all ceph-mon, ceph-mds, radosgw, and ceph-mgr daemons were running 12.2.7, while 30 of the 50 total ceph-osd daemons were also on 12.2.7 when the remaining 20 ceph-osd daemons (on 10.2.10) crashed.

Thanks,
--
Kenneth Van Alstyne
[ceph-users] OSD Crash When Upgrading from Jewel to Luminous?
Hello all: I ran into an issue recently with one of my clusters when upgrading from 10.2.10 to 12.2.7. I have previously tested the upgrade in a lab and upgraded one of our five production clusters with no issues. On the second cluster, however, I ran into an issue where all OSDs that were NOT running Luminous yet (which was about 40% of the cluster at the time) all crashed with the same backtrace, which I have pasted below:

===
0> 2018-08-13 17:35:13.160849 7f145c9ec700 -1 osd/PG.cc: In function 'PG::RecoveryState::Crashed::Crashed(boost::statechart::state::my_context)' thread 7f145c9ec700 time 2018-08-13 17:35:13.157319
osd/PG.cc: 5860: FAILED assert(0 == "we got a bad state machine event")

ceph version 10.2.10 (5dc1e4c05cb68dbf62ae6fce3f0700e4654fdbbe)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x7f) [0x55b9bf08614f]
2: (PG::RecoveryState::Crashed::Crashed(boost::statechart::state, (boost::statechart::history_mode)0>::my_context)+0xc4) [0x55b9bea62db4]
3: (()+0x447366) [0x55b9bea9a366]
4: (boost::statechart::simple_state, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x2f7) [0x55b9beac8b77]
5: (boost::statechart::state_machine, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x6b) [0x55b9beaab5bb]
6: (PG::handle_peering_event(std::shared_ptr, PG::RecoveryCtx*)+0x384) [0x55b9bea7db14]
7: (OSD::process_peering_events(std::__cxx11::list > const&, ThreadPool::TPHandle&)+0x263) [0x55b9be9d1723]
8: (ThreadPool::BatchWorkQueue::_void_process(void*, ThreadPool::TPHandle&)+0x2a) [0x55b9bea1274a]
9: (ThreadPool::worker(ThreadPool::WorkThread*)+0xeb0) [0x55b9bf076d40]
10: (ThreadPool::WorkThread::entry()+0x10) [0x55b9bf077ef0]
11: (()+0x7507) [0x7f14e2c96507]
12: (clone()+0x3f) [0x7f14e0ca214f]
NOTE: a copy of the executable, or `objdump -rdS ` is needed to interpret this.
===

Once I restarted the impacted OSDs, which brought them up to 12.2.7, everything recovered just fine and the cluster is healthy. The only rub is that losing that many OSDs simultaneously caused a significant I/O disruption to the production servers for several minutes while I brought up the remaining OSDs.

I have been trying to duplicate this issue in a lab again before continuing the upgrades on the other three clusters, but am coming up short. Has anyone seen anything like this, and am I missing something obvious? Given how quickly the issue happened and the fact that I’m having a hard time reproducing this issue, I am limited in the amount of logging and debug information I have available, unfortunately. If it helps, all ceph-mon, ceph-mds, radosgw, and ceph-mgr daemons were running 12.2.7, while 30 of the 50 total ceph-osd daemons were also on 12.2.7 when the remaining 20 ceph-osd daemons (on 10.2.10) crashed.

Thanks,
--
Kenneth Van Alstyne
Systems Architect
Knight Point Systems, LLC
Service-Disabled Veteran-Owned Business
1775 Wiehle Avenue Suite 101 | Reston, VA 20190
c: 228-547-8045 f: 571-266-3106
www.knightpoint.com
DHS EAGLE II Prime Contractor: FC1 SDVOSB Track
GSA Schedule 70 SDVOSB: GS-35F-0646S
GSA MOBIS Schedule: GS-10F-0404Y
ISO 2 / ISO 27001 / CMMI Level 3

Notice: This e-mail message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, copy, use, disclosure, or distribution is STRICTLY prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
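For what it's worth, a hedged sketch of common upgrade hygiene for mixed-version restarts follows. It would not have prevented the assert itself, but setting the `noout` flag limits the rebalancing churn (and some of the I/O disruption) while OSDs are restarted. The commands are prefixed with `echo` so the sketch is a dry run that prints the commands; drop the prefix to execute against a real cluster.

```shell
#!/bin/sh
# Hedged sketch: upgrade hygiene for mixed-version OSD restarts.
# CEPH is set to "echo ceph" so this prints commands instead of
# running them; remove the "echo" on a live cluster.
CEPH="echo ceph"

# Keep CRUSH from marking restarting OSDs "out" and triggering recovery
$CEPH osd set noout

# Check which daemons still report the old release before proceeding
$CEPH tell 'osd.*' version

# ...restart/upgrade the remaining OSDs one host at a time here...

# Restore normal out-marking once every OSD reports the new version
$CEPH osd unset noout
```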
[ceph-users] Snapshot cleanup performance impact on client I/O?
Hey folks:

I was wondering if the community can provide any advice — over time and due to some external issues, we have managed to accumulate thousands of snapshots of RBD images, which are now in need of cleaning up. I have recently attempted to roll through a “for” loop to perform a “rbd snap rm” on each snapshot, sequentially, waiting until the rbd command finishes before moving onto the next one, of course. I noticed that shortly after starting this, I started seeing thousands of slow ops, and a few of our guest VMs became unresponsive, naturally.

My questions are:
- Is this expected behavior?
- Is the background cleanup asynchronous from the “rbd snap rm” command?
- If so, are there any OSD parameters I can set to reduce the impact on production?
- Would “rbd snap purge” be any different? I expect not, since fundamentally, rbd is performing the same action that I do via the loop.

Relevant details are as follows, though I’m not sure cluster size *really* has any effect here:
- Ceph: version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)
- 5 storage nodes, each with:
  - 10x 2TB 7200 RPM SATA Spindles (for a total of 50 OSDs)
  - 2x Samsung MZ7LM240 SSDs (used as journal for the OSDs)
  - 64GB RAM
  - 2x Intel(R) Xeon(R) CPU E5-2609 v3 @ 1.90GHz
  - 20GBit LACP Port Channel via Intel X520 Dual Port 10GbE NIC

Let me know if I’ve missed something fundamental.

Thanks,
--
Kenneth Van Alstyne
Systems Architect
Knight Point Systems, LLC
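On the third question above, one OSD knob worth trying is `osd_snap_trim_sleep`, which inserts a delay between snap-trim operations so background trimming yields to client I/O. A hedged dry-run sketch follows; the value 0.1 is illustrative rather than a recommendation, the pool, image, and snapshot names are placeholders, and the commands are prefixed with `echo` so the sketch prints instead of executing.

```shell
#!/bin/sh
# Hedged sketch of a throttled snapshot cleanup.  RBD/CEPH are set to
# "echo ..." so this is a dry run; drop the "echo" on a live cluster.
RBD="echo rbd"
CEPH="echo ceph"
POOL=rbd
IMAGE=vm-disk-01           # placeholder image name
PAUSE=0                    # seconds between deletions; raise to e.g. 30 in a real run

# Ask every OSD to sleep between snap-trim operations (runtime
# injection; reverts when the OSD restarts)
$CEPH tell 'osd.*' injectargs '--osd_snap_trim_sleep 0.1'

# Delete snapshots one at a time, pausing so trimming can catch up
# before the next deletion is queued
for SNAP in snap-2019-01 snap-2019-02; do   # placeholder snapshot list
    $RBD snap rm "$POOL/$IMAGE@$SNAP"
    sleep "$PAUSE"
done
```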
Re: [ceph-users] Ceph Performance Questions with rbd images access by qemu-kvm
Got it — I’ll keep that in mind. That may just be what I need to “get by” for now. Ultimately, we’re looking to buy at least three nodes of servers that can hold 40+ OSDs backed by 2TB+ SATA disks.

Thanks,
--
Kenneth Van Alstyne
Systems Architect
Knight Point Systems, LLC

> On Sep 1, 2015, at 11:26 AM, Robert LeBlanc wrote:
>
> Just swapping out spindles for SSD will not give you orders of magnitude performance gains as it does in regular cases. This is because Ceph has a lot of overhead for each I/O, which limits the performance of the SSDs. In my testing, two Intel S3500 SSDs with an 8 core Atom (Intel(R) Atom(TM) CPU C2750 @ 2.40GHz) and size=1 and fio with 8 jobs and QD=8 sync,direct 4K read/writes produced 2,600 IOPS. Don't get me wrong, it will help, but don't expect spectacular results.
>
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>
> On Tue, Sep 1, 2015 at 8:01 AM, Kenneth Van Alstyne wrote:
> Thanks for the awesome advice folks. Until I can go larger scale (50+ SATA disks), I’m thinking my best option here is to just swap out these 1TB SATA disks with 1TB SSDs. Am I oversimplifying the short term solution?
>
> Thanks,
> --
> Kenneth Van Alstyne
> Systems Architect
> Knight Point Systems, LLC
>
> On Aug 31, 2015, at 7:29 PM, Christian Balzer wrote:
>
> Hello,
>
> On Mon, 31 Aug 2015 12:28:15 -0500 Kenneth Van Alstyne wrote:
>
> In addition to the spot on comments by Warren and Quentin, verify this by watching your nodes with atop, iostat, etc.
> The culprit (HDDs) should be plainly visible.
>
> More inline:
>
> Christian, et al:
>
> Sorry for the lack of information. I wasn’t sure what of our hardware specifications or Ceph configuration was useful information at this point. Thanks for the feedback — any feedback is appreciated at this point, as I’ve been beating my head against a wall trying to figure out what’s going on. (If anything. Maybe the spindle count is indeed our upper limit, or our SSDs really suck? :-) )
>
> Your SSDs aren't the problem.
>
> To directly address your questions, see answers below:
> - CBT is the Ceph Benchmarking Tool. Since my question was more generic rather than with CBT itself, it was probably more useful to post in the ceph-users list rather than cbt.
> - 8 Cores are from 2x quad core Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz
>
> Not your problem either.
>
> - The SSDs are indeed Intel S3500s. I agree — not ideal, but supposedly capable of up to 75,000 random 4KB reads/writes. Throughput and longevity are quite low for an SSD, rated at about 400MB/s reads and 100MB/s writes, though. When we added these as journals in front of the SATA spindles, both VM performance and rados benchmark numbers were relatively unchanged.
>
> The only thing relevant in regards to journal SSDs is the sequential write speed (SYNC); they don't seek and normally don't get read either.
> This is why a 200GB DC S3700 is a better journal SSD than the 200GB S3710, which is faster in any other aspect but sequential writes. ^o^
>
> Latency should have gone down with the SSD journals in place, but that's
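Christian's point that only sequential SYNC write speed matters for a journal SSD can be checked directly. A minimal sketch using `dd` with `oflag=dsync` follows (not from the thread; the target path is a placeholder, so point it at a scratch file on the SSD under test, and raise `count` for a longer, more meaningful measurement):

```shell
#!/bin/sh
# Hedged sketch: measure the one SSD property that matters for a
# filestore journal -- sequential synchronous write throughput.
TARGET=${TARGET:-/tmp/journal-sync-test}   # placeholder scratch path

# Write 4KB chunks with O_DSYNC, so every write waits for the device
# to acknowledge -- the same pattern the OSD journal produces.
dd if=/dev/zero of="$TARGET" bs=4k count=256 oflag=dsync 2>&1 | tail -n 1

# Remove the scratch file afterwards
rm -f "$TARGET"
```

A good journal SSD sustains this synchronous pattern at close to its rated sequential write speed; a poor one collapses to a fraction of it.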
Re: [ceph-users] Ceph Performance Questions with rbd images access by qemu-kvm
Thanks for the awesome advice folks. Until I can go larger scale (50+ SATA disks), I’m thinking my best option here is to just swap out these 1TB SATA disks with 1TB SSDs. Am I oversimplifying the short term solution?

Thanks,
--
Kenneth Van Alstyne
Systems Architect
Knight Point Systems, LLC

> On Aug 31, 2015, at 7:29 PM, Christian Balzer wrote:
>
> Hello,
>
> On Mon, 31 Aug 2015 12:28:15 -0500 Kenneth Van Alstyne wrote:
>
> In addition to the spot on comments by Warren and Quentin, verify this by watching your nodes with atop, iostat, etc.
> The culprit (HDDs) should be plainly visible.
>
> More inline:
>
>> Christian, et al:
>>
>> Sorry for the lack of information. I wasn’t sure what of our hardware specifications or Ceph configuration was useful information at this point. Thanks for the feedback — any feedback is appreciated at this point, as I’ve been beating my head against a wall trying to figure out what’s going on. (If anything. Maybe the spindle count is indeed our upper limit, or our SSDs really suck? :-) )
>>
> Your SSDs aren't the problem.
>
>> To directly address your questions, see answers below:
>> - CBT is the Ceph Benchmarking Tool. Since my question was more generic rather than with CBT itself, it was probably more useful to post in the ceph-users list rather than cbt.
>> - 8 Cores are from 2x quad core Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz
> Not your problem either.
>
>> - The SSDs are indeed Intel S3500s. I agree — not ideal, but supposedly capable of up to 75,000 random 4KB reads/writes. Throughput and longevity are quite low for an SSD, rated at about 400MB/s reads and 100MB/s writes, though. When we added these as journals in front of the SATA spindles, both VM performance and rados benchmark numbers were relatively unchanged.
>>
> The only thing relevant in regards to journal SSDs is the sequential write speed (SYNC); they don't seek and normally don't get read either.
> This is why a 200GB DC S3700 is a better journal SSD than the 200GB S3710, which is faster in any other aspect but sequential writes. ^o^
>
> Latency should have gone down with the SSD journals in place, but that's their main function/benefit.
>
>> - Regarding throughput vs IOPS, indeed — the throughput that I’m seeing is nearly worst case scenario, with all I/O being 4KB block size. With RBD cache enabled and the writeback option set in the VM configuration, I was hoping more coalescing would occur, increasing the I/O block size.
>>
> That can only help with non-SYNC writes, so your MySQL VMs and certain file system ops will have to bypass that, and that hurts.
>
>> As an aside, the orchestration layer on top of KVM is OpenNebula, if that’s of any interest.
>>
> It is actually, as I've been eying OpenNebula (alas, no Debian Jessie packages). However, not relevant to your problem indeed.
>
>> VM information:
>> - Number = 15
>> - Workload = Mixed (I know, I know — that’s as vague of an answer as they come) A handful of VMs are running some MySQL databases and some web applications in Apache Tomcat. One is running a syslog server. Everything else is mostly static web page serving for a low number of users.
>>
> As others have mentioned, would you expect this load to work well with just 2 HDDs and via NFS to introduce network latency?
>
>> I can duplicate the blocked request issue pretty consistently, just by running something simple like a “yum -y update” in one VM. While that is running, ceph -w and ceph -s show the following:
>>
>> root@dashboard:~# ceph -s
>>     cluster f79d8c2a-3c14-49be-942d-83fc5f193a25
>>      health HEALTH_WARN
>>             1 requests are blocked > 32 sec
>>      monmap e3: 3 mons at {storage-1=10.0.0.1:6789/0,storage-2=10.0.0.2:6789/0,storage-3=10.0.0.3:6789/0}
>>             election epoch 136, quorum 0,
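For completeness, the client-side writeback cache discussed above is tunable. A hedged `ceph.conf` sketch follows; the values are purely illustrative (not recommendations from this thread), and as Christian notes, a larger cache only helps non-SYNC writes, since SYNC writes bypass it.

```ini
[client]
# Hedged example of client-side RBD cache tuning; values illustrative.
# A larger cache gives writeback more room to coalesce small non-sync
# writes before flushing to the OSDs.
rbd cache = true
rbd cache writethrough until flush = true
rbd cache size = 67108864            ; 64 MB (default is 32 MB)
rbd cache max dirty = 50331648       ; 48 MB of dirty data before writes block
rbd cache target dirty = 33554432    ; begin flushing at 32 MB
```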
Re: [ceph-users] Ceph Performance Questions with rbd images access by qemu-kvm
If spindle count is indeed the problem, is there anything else I can do to improve caching or I/O coalescing to deal with my crippling IOPS limit due to the low number of spindles?

Thanks,
--
Kenneth Van Alstyne
Systems Architect
Knight Point Systems, LLC

> On Aug 31, 2015, at 11:01 AM, Christian Balzer wrote:
>
> Hello,
>
> On Mon, 31 Aug 2015 08:31:57 -0500 Kenneth Van Alstyne wrote:
>
>> Sorry about the repost from the cbt list, but it was suggested I post here as well:
>>
> I wasn't even aware a CBT (what the heck does that acronym stand for?) existed...
>
>> I am attempting to track down some performance issues in a Ceph cluster recently deployed. Our configuration is as follows: 3 storage nodes,
> 3 nodes is, of course, bare minimum.
>
>> each with:
>> - 8 Cores
> Of what, apples? Detailed information makes for better replies.
>
>> - 64GB of RAM
> Ample.
>
>> - 2x 1TB 7200 RPM Spindle
> Even if your cores were to be rotten apple ones, that's very few spindles, so your CPU is unlikely to be the bottleneck.
>
>> - 1x 120GB Intel SSD
> Details, again. From your P.S. I conclude that these are S3500's, definitely not my choice for journals when it comes to speed and endurance.
>
>> - 2x 10GBit NICs (In LACP Port-channel)
> Massively overspec'ed considering your storage sinks/wells aka HDDs.
>
>> The OSD pool min_size is set to “1” and “size” is set to “3”. When creating a new pool and running RADOS benchmarks, performance isn’t bad — about what I would expect from this hardware configuration:
>>
> Rados bench uses by default 4MB "blocks", which is the optimum size for (default) RBD pools.
> Bandwidth does not equal IOPS (which are commonly measured in 4KB blocks).
>
>> WRITES:
>> Total writes made:      207
>> Write size:             4194304
>> Bandwidth (MB/sec):     80.017
>> Stddev Bandwidth:       34.9212
>> Max bandwidth (MB/sec): 120
>> Min bandwidth (MB/sec): 0
>> Average Latency:        0.797667
>> Stddev Latency:         0.313188
>> Max latency:            1.72237
>> Min latency:            0.253286
>>
>> RAND READS:
>> Total time run:         10.127990
>> Total reads made:       1263
>> Read size:              4194304
>> Bandwidth (MB/sec):     498.816
>> Average Latency:        0.127821
>> Max latency:            0.464181
>> Min latency:            0.0220425
>>
>> This all looks fine, until we try to use the cluster for its purpose, which is to house images for qemu-kvm, which are accessed using librbd.
> Not that it probably matters, but knowing if this is OpenStack, Ganeti, or something else might be of interest.
>
>> I/O inside VMs has excessive wait times (in the hundreds of ms at times, making some operating systems, like Windows, unusable) and throughput struggles to exceed 10MB/s (or less). Looking at ceph health, we see very low op/s numbers as well as throughput, and the requests blocked number seems very high. Any ideas as to what to look at here?
>>
> Again, details.
>
> How many VMs?
> What are they doing?
> Keep in mind that the BEST sustained result you could hope for here (ignoring Ceph overhead and network latency) is the IOPS of 2 HDDs, so about 300 IOPS at best. TOTAL.
>
>>      health HEALTH_WARN
>>             8 requests are blocked > 32 sec
>>      monmap e3: 3 mons at {storage-1=10.0.0.1:6789/0,storage-2=10.0.0.2:6789/0,storage-3=10.0.0.3:6789/0}
>>             election epoch 128, quorum 0,1,2 storage-1,storage-2,storage-3
>>      osdmap e69615: 6 osds: 6 up, 6 in
>>      pgmap v3148541: 224 pgs, 1 pools, 819 GB
> 256 or 512 PGs would have been the "correct" number here, but that's of little importance.
>
>> data, 227 kobjects
>>             2726 GB used, 2844 GB / 5571 GB avail
>>                  224 active+clean
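Christian's bandwidth-versus-IOPS point can be measured directly by re-running rados bench with 4KB objects instead of the default 4MB. A hedged dry-run sketch follows (the pool name is a placeholder and the commands are prefixed with `echo`; drop the prefix and use a throwaway test pool on a real cluster):

```shell
#!/bin/sh
# Hedged sketch: measure IOPS rather than bandwidth with rados bench.
# RADOS is set to "echo rados" so this prints the commands (dry run).
RADOS="echo rados"
POOL=bench-test   # placeholder throwaway pool

# 30-second 4KB write test, 16 concurrent ops; keep the objects so the
# read pass below has something to read
$RADOS bench -p "$POOL" 30 write -b 4096 -t 16 --no-cleanup

# Random 4KB reads against the objects written above
$RADOS bench -p "$POOL" 30 rand -t 16

# Remove the benchmark objects afterwards
$RADOS -p "$POOL" cleanup
```

With 4KB objects, the reported "ops" figure approximates the small-block IOPS the VMs actually experience, which is the number to compare against the spindle ceiling.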
[ceph-users] Ceph Performance Questions with rbd images access by qemu-kvm
Sorry about the repost from the cbt list, but it was suggested I post here as well:

I am attempting to track down some performance issues in a Ceph cluster recently deployed. Our configuration is as follows: 3 storage nodes, each with:
- 8 Cores
- 64GB of RAM
- 2x 1TB 7200 RPM Spindle
- 1x 120GB Intel SSD
- 2x 10GBit NICs (In LACP Port-channel)

The OSD pool min_size is set to “1” and “size” is set to “3”. When creating a new pool and running RADOS benchmarks, performance isn’t bad — about what I would expect from this hardware configuration:

WRITES:
Total writes made:      207
Write size:             4194304
Bandwidth (MB/sec):     80.017
Stddev Bandwidth:       34.9212
Max bandwidth (MB/sec): 120
Min bandwidth (MB/sec): 0
Average Latency:        0.797667
Stddev Latency:         0.313188
Max latency:            1.72237
Min latency:            0.253286

RAND READS:
Total time run:         10.127990
Total reads made:       1263
Read size:              4194304
Bandwidth (MB/sec):     498.816
Average Latency:        0.127821
Max latency:            0.464181
Min latency:            0.0220425

This all looks fine, until we try to use the cluster for its purpose, which is to house images for qemu-kvm, which are accessed using librbd. I/O inside VMs has excessive wait times (in the hundreds of ms at times, making some operating systems, like Windows, unusable) and throughput struggles to exceed 10MB/s (or less). Looking at ceph health, we see very low op/s numbers as well as throughput, and the requests blocked number seems very high. Any ideas as to what to look at here?
     health HEALTH_WARN
            8 requests are blocked > 32 sec
     monmap e3: 3 mons at {storage-1=10.0.0.1:6789/0,storage-2=10.0.0.2:6789/0,storage-3=10.0.0.3:6789/0}
            election epoch 128, quorum 0,1,2 storage-1,storage-2,storage-3
     osdmap e69615: 6 osds: 6 up, 6 in
     pgmap v3148541: 224 pgs, 1 pools, 819 GB data, 227 kobjects
            2726 GB used, 2844 GB / 5571 GB avail
                 224 active+clean
     client io 3957 B/s rd, 3494 kB/s wr, 30 op/s

Of note, on the other list, I was asked to provide the following:
- ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)
- The SSD is split into 8GB partitions. These 8GB partitions are used as journal devices, specified in /etc/ceph/ceph.conf. For example:
  [osd.0]
  host = storage-1
  osd journal = /dev/mapper/INTEL_SSDSC2BB120G4_CVWL4363006R120LGNp1
- rbd_cache is enabled and qemu cache is set to “writeback”
- rbd_concurrent_management_ops is unset, so it appears the default is “10”

Thanks,
--
Kenneth Van Alstyne
Systems Architect
Knight Point Systems, LLC
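A back-of-envelope check of the spindle ceiling discussed in this thread can be sketched as follows. It assumes a rule-of-thumb ~150 IOPS per 7200 RPM SATA disk (an assumption, not a measured figure) and that replication multiplies the backend write load by the pool's "size":

```shell
#!/bin/sh
# Hedged back-of-envelope: sustained small-write IOPS ceiling of a
# 3-node, 2-HDD-per-node cluster with 3x replication.
SPINDLES=6         # 3 nodes x 2 HDDs
IOPS_PER_DISK=150  # rough rule of thumb for a 7200 RPM SATA disk
REPLICAS=3         # pool size = 3

echo "Raw aggregate write IOPS:  $((SPINDLES * IOPS_PER_DISK))"
echo "Client-visible write IOPS: $((SPINDLES * IOPS_PER_DISK / REPLICAS))"
```

The client-visible figure works out to roughly 300 write IOPS total, which lines up with Christian's "about 300 IOPS at best. TOTAL." estimate earlier in the thread, and explains why 15 mixed-workload VMs see hundreds of milliseconds of I/O wait.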