Re: [ceph-users] VM management setup

2019-04-24 Thread Alexandre DERUMIER
+1 for proxmox. (I'm contributor and I can say that ceph support is very good)

- Mail original -
De: jes...@krogh.cc
À: "ceph-users" 
Envoyé: Vendredi 5 Avril 2019 21:34:02
Objet: [ceph-users] VM management setup

Hi. Knowing this is a bit off-topic but seeking recommendations 
and advise anyway. 

We're seeking a "management" solution for VM's - currently in the 40-50 
VM - but would like to have better access in managing them and potintially 
migrate them across multiple hosts, setup block devices, etc, etc. 

This is only to be used internally in a department where a bunch of 
engineering people will manage it, no costumers and that kind of thing. 

Up until now we have been using virt-manager with kvm - and have been 
quite satisfied when we were in the "few vms", but it seems like the 
time to move on. 

Thus we're looking for something "simple" that can help manage a ceph+kvm 
based setup - the simpler and more to the point the better. 

Any recommendations? 

.. found a lot of names allready .. 
OpenStack 
CloudStack 
Proxmox 
.. 

But recommendations are truely welcome. 

Thanks. 

___ 
ceph-users mailing list 
ceph-users@lists.ceph.com 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] msgr2 and cephfs

2019-04-24 Thread Aaron Bassett
Ah nevermind, I found ceph mon set addrs and I'm good to go. 

Aaron

> On Apr 24, 2019, at 4:36 PM, Aaron Bassett  
> wrote:
> 
> Yea ok thats what I guessed. I'm struggling to get my mons to listen on both 
> ports. On startup they report:
> 
> 2019-04-24 19:58:43.652 7fcf9cd3c040 -1 WARNING: 'mon addr' config option 
> [v2:172.17.40.143:3300/0,v1:172.17.40.143:6789/0] does not match monmap file
> continuing with monmap configuration
> 2019-04-24 19:58:43.652 7fcf9cd3c040  0 starting mon.bos-r1-r3-head1 rank 0 
> at public addrs v2:172.17.40.143:3300/0 at bind addrs v2:172.17.40.143:3300/0 
> mon_data /var/lib/ceph/mon/ceph-bos-r1-r3-head1 fsid 
> 4a361f9c-e28b-4b6b-ab59-264dcb51da97
> 
> 
> which means i assume I have to jump through the add/remove mons hoops or just 
> burn it down and start over? FWIW the docs seem to indicate they'll listen to 
> both by default (in nautilus).
> 
> Aaron
> 
>> On Apr 24, 2019, at 4:29 PM, Jason Dillaman  wrote:
>> 
>> AFAIK, the kernel clients for CephFS and RBD do not support msgr2 yet.
>> 
>> On Wed, Apr 24, 2019 at 4:19 PM Aaron Bassett
>>  wrote:
>>> 
>>> Hi,
>>> I'm standing up a new cluster on nautilus to play with some of the new 
>>> features, and I've somehow got my monitors only listening on msgrv2 port 
>>> (3300) and not the legacy port (6789). I'm running kernel 4.15 on my 
>>> clients. Can I mount cephfs via port 3300 or do I have to figure out how to 
>>> get my mons listening to both?
>>> 
>>> Thanks,
>>> Aaron
>>> CONFIDENTIALITY NOTICE
>>> This e-mail message and any attachments are only for the use of the 
>>> intended recipient and may contain information that is privileged, 
>>> confidential or exempt from disclosure under applicable law. If you are not 
>>> the intended recipient, any disclosure, distribution or other use of this 
>>> e-mail message or attachments is prohibited. If you have received this 
>>> e-mail message in error, please delete and notify the sender immediately. 
>>> Thank you.
>>> 
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.ceph.com_listinfo.cgi_ceph-2Dusers-2Dceph.com=DwIFaQ=Tpa2GKmmYSmpYS4baANxQwQYqA0vwGXwkJOPBegaiTs=5nKer5huNDFQXjYpOR4o_7t5CRI8wb5Vb_v1pBywbYw=zjPqBuK3C5vPalm69GpAWDz3vdkT0jYEVhvV0NG3OyI=wUk0q5ArWhrXvqzMNGRcL3qzKPjAoDQ481ek_5j4BQ0=
>> 
>> 
>> 
>> -- 
>> Jason
> 


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] msgr2 and cephfs

2019-04-24 Thread Aaron Bassett
Yea ok thats what I guessed. I'm struggling to get my mons to listen on both 
ports. On startup they report:

2019-04-24 19:58:43.652 7fcf9cd3c040 -1 WARNING: 'mon addr' config option 
[v2:172.17.40.143:3300/0,v1:172.17.40.143:6789/0] does not match monmap file
 continuing with monmap configuration
2019-04-24 19:58:43.652 7fcf9cd3c040  0 starting mon.bos-r1-r3-head1 rank 0 at 
public addrs v2:172.17.40.143:3300/0 at bind addrs v2:172.17.40.143:3300/0 
mon_data /var/lib/ceph/mon/ceph-bos-r1-r3-head1 fsid 
4a361f9c-e28b-4b6b-ab59-264dcb51da97


which means i assume I have to jump through the add/remove mons hoops or just 
burn it down and start over? FWIW the docs seem to indicate they'll listen to 
both by default (in nautilus).

Aaron

> On Apr 24, 2019, at 4:29 PM, Jason Dillaman  wrote:
> 
> AFAIK, the kernel clients for CephFS and RBD do not support msgr2 yet.
> 
> On Wed, Apr 24, 2019 at 4:19 PM Aaron Bassett
>  wrote:
>> 
>> Hi,
>> I'm standing up a new cluster on nautilus to play with some of the new 
>> features, and I've somehow got my monitors only listening on msgrv2 port 
>> (3300) and not the legacy port (6789). I'm running kernel 4.15 on my 
>> clients. Can I mount cephfs via port 3300 or do I have to figure out how to 
>> get my mons listening to both?
>> 
>> Thanks,
>> Aaron
>> CONFIDENTIALITY NOTICE
>> This e-mail message and any attachments are only for the use of the intended 
>> recipient and may contain information that is privileged, confidential or 
>> exempt from disclosure under applicable law. If you are not the intended 
>> recipient, any disclosure, distribution or other use of this e-mail message 
>> or attachments is prohibited. If you have received this e-mail message in 
>> error, please delete and notify the sender immediately. Thank you.
>> 
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.ceph.com_listinfo.cgi_ceph-2Dusers-2Dceph.com=DwIFaQ=Tpa2GKmmYSmpYS4baANxQwQYqA0vwGXwkJOPBegaiTs=5nKer5huNDFQXjYpOR4o_7t5CRI8wb5Vb_v1pBywbYw=zjPqBuK3C5vPalm69GpAWDz3vdkT0jYEVhvV0NG3OyI=wUk0q5ArWhrXvqzMNGRcL3qzKPjAoDQ481ek_5j4BQ0=
> 
> 
> 
> -- 
> Jason


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] msgr2 and cephfs

2019-04-24 Thread Jason Dillaman
AFAIK, the kernel clients for CephFS and RBD do not support msgr2 yet.

On Wed, Apr 24, 2019 at 4:19 PM Aaron Bassett
 wrote:
>
> Hi,
> I'm standing up a new cluster on nautilus to play with some of the new 
> features, and I've somehow got my monitors only listening on msgrv2 port 
> (3300) and not the legacy port (6789). I'm running kernel 4.15 on my clients. 
> Can I mount cephfs via port 3300 or do I have to figure out how to get my 
> mons listening to both?
>
> Thanks,
> Aaron
> CONFIDENTIALITY NOTICE
> This e-mail message and any attachments are only for the use of the intended 
> recipient and may contain information that is privileged, confidential or 
> exempt from disclosure under applicable law. If you are not the intended 
> recipient, any disclosure, distribution or other use of this e-mail message 
> or attachments is prohibited. If you have received this e-mail message in 
> error, please delete and notify the sender immediately. Thank you.
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] msgr2 and cephfs

2019-04-24 Thread Aaron Bassett
Hi,
I'm standing up a new cluster on nautilus to play with some of the new 
features, and I've somehow got my monitors only listening on msgrv2 port (3300) 
and not the legacy port (6789). I'm running kernel 4.15 on my clients. Can I 
mount cephfs via port 3300 or do I have to figure out how to get my mons 
listening to both?

Thanks,
Aaron
CONFIDENTIALITY NOTICE
This e-mail message and any attachments are only for the use of the intended 
recipient and may contain information that is privileged, confidential or 
exempt from disclosure under applicable law. If you are not the intended 
recipient, any disclosure, distribution or other use of this e-mail message or 
attachments is prohibited. If you have received this e-mail message in error, 
please delete and notify the sender immediately. Thank you.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] VM management setup

2019-04-24 Thread ceph
Hello,

I would also recommend proxmox
It is very easy to install and to Manage your kvm/lxc with Huge amount of 
Support for possible storages.

Just my 2 Cents
Hth
- Mehmet 


Am 6. April 2019 17:48:32 MESZ schrieb Marc Roos :
>
>We have also hybrid ceph/libvirt-kvm setup, using some scripts to do 
>live migration, do you have auto failover in your setup?
>
>
>
>-Original Message-
>From: jes...@krogh.cc [mailto:jes...@krogh.cc] 
>Sent: 05 April 2019 21:34
>To: ceph-users
>Subject: [ceph-users] VM management setup
>
>Hi. Knowing this is a bit off-topic but seeking recommendations and 
>advise anyway.
>
>We're seeking a "management" solution for VM's - currently in the 40-50
>
>VM - but would like to have better access in managing them and 
>potintially migrate them across multiple hosts, setup block devices, 
>etc, etc.
>
>This is only to be used internally in a department where a bunch of 
>engineering people will manage it, no costumers and that kind of thing.
>
>Up until now we have been using virt-manager with kvm - and have been 
>quite satisfied when we were in the "few vms", but it seems like the 
>time to move on.
>
>Thus we're looking for something "simple" that can help manage a 
>ceph+kvm based setup -  the simpler and more to the point the better.
>
>Any recommendations?
>
>.. found a lot of names allready ..
>OpenStack
>CloudStack
>Proxmox
>..
>
>But recommendations are truely welcome.
>
>Thanks.
>
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] rbd omap disappeared

2019-04-24 Thread lin zhou
my cluster occur a big error this morning.
many osd suicide because of heartbeat_map timeout.
when I start all osd manually.it looks well.

but when I using rbd info for a rbd from rbd ls, it say not such file or
directory.
And the I use the way in
https://fnordahl.com/2017/04/17/ceph-rbd-volume-header-recovery/ to
recovery omap.

At last,most rbd works well.But still have 5 rbd can not reocvery.

2019-04-24 16:00:40.045342 7fd5fdffb700 -1 librbd::image::OpenRequest:
failed to retreive immutable metadata: (5) Input/output error
2019-04-24 16:00:40.045419 7fd5fd7fa700 -1 librbd::ImageState:
0x5593262fb830 failed to open image: (5) Input/output error
rbd: error opening image volume-bef74858-0fcb-4a0b-b197-9618e6824c46: (5)
Input/output error

I try to locate its omap head object, and stop master osd, delete the pg
data in master osd, then restart master osd. It looks three replication is
the same  when I using attr.

But I can not get the rbd info.Alwayse the vm among this  rbd is still
alive and works well.

ceph -s
cluster 2bec9425-ea5f-4a48-b56a-fe88e126bced
 health HEALTH_WARN
noout flag(s) set
 monmap e1: 3 mons at {a=
10.191.175.249:6789/0,b=10.191.175.250:6789/0,c=10.191.175.251:6789/0}
election epoch 26, quorum 0,1,2 a,b,c
 osdmap e22551: 1080 osds: 1078 up, 1078 in
flags noout,sortbitwise,require_jewel_osds
  pgmap v29327873: 90112 pgs, 3 pools, 69753 GB data, 30081 kobjects
214 TB used, 1500 TB / 1715 TB avail
   90111 active+clean
   1 active+clean+scrubbing+deep
  client io 57082 kB/s rd, 207 MB/s wr, 1091 op/s rd, 7658 op/s wr

the log of osd.219 which is the firstest log heartbeat no reply.

2019-04-24 04:00:54.905504 7f5eb7fab700  1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f5e97182700' had timed out after 15
...
2019-04-24 04:00:54.905536 7f5eb7fab700  1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f5e9b18a700' had timed out after 15
2019-04-24 04:00:57.499903 7f5df2997700  0 -- 10.191.175.20:6855/3169631 >>
10.191.175.33:6947/3001469 pipe(0x55d93b470800 sd=1914 :6855 s=2 pgs=514958
cs=17 l=0 c=0x55d91a3c1200).fault with nothing to send, going to standby
... repeat large
2019-04-24 04:00:57.651603 7f5e77735700  0 -- 10.191.175.20:6855/3169631 >>
10.191.175.36:6845/1062649 pipe(0x55d92d2cb400 sd=105 :45295 s=2 pgs=493749
cs=11 l=0 c=0x55d943755200).fault with nothing to send, going to standby
2019-04-24 04:00:59.905846 7f5eb7fab700  1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f5e97182700' had timed out after 15
...
2019-04-24 04:01:04.666159 7f5dbb6d7700  0 -- 10.191.175.20:6856/3169631 >>
:/0 pipe(0x55d9234b1400 sd=1926 :6856 s=0 pgs=0 cs=0 l=0
c=0x55d924743600).accept failed to getpeername (107) Transport endpoint is
not connected
2019-04-24 04:01:04.905958 7f5eb7fab700  1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f5e97182700' had timed out after 15
2019-04-24 04:01:14.550812 7f5db7793700  0 -- 10.191.175.20:6856/3169631 >>
:/0 pipe(0x55d9234b2800 sd=1927 :6856 s=0 pgs=0 cs=0 l=0
c=0x55d924742a00).accept failed to getpeername (107) Transport endpoint is
not connected
...
2019-04-24 04:01:14.691430 7f5db7793700  0 -- 10.191.175.20:6858/3169631 >>
:/0 pipe(0x55d936a78800 sd=1383 :6858 s=0 pgs=0 cs=0 l=0
c=0x55d94ecf5a80).accept failed to getpeername (107) Transport endpoint is
not connected
2019-04-24 04:01:14.692176 7f5e90469700  0 -- 10.191.175.20:6855/3169631 >>
10.191.175.31:6835/2092007 pipe(0x55d93c41a800 sd=67 :17328 s=2 pgs=571754
cs=83 l=0 c=0x55d939913180).fault, initiating reconnect
2019-04-24 04:01:14.693742 7f5e4c889700  0 -- 10.191.175.20:6855/3169631 >>
10.191.175.31:6835/2092007 pipe(0x55d93c41a800 sd=67 :17766 s=1 pgs=571754
cs=84 l=0 c=0x55d939913180).connect got RESETSESSION
2019-04-24 04:01:14.697098 7f5db256c700  0 -- 10.191.175.20:6908/169631 >>
:/0 pipe(0x55d95606c800 sd=352 :6908 s=0 pgs=0 cs=0 l=0
c=0x55d920d79500).accept failed to getpeername (107) Transport endpoint is
not connected
2019-04-24 04:01:14.697516 7f5e08523700  0 -- 10.191.175.20:6908/169631 >>
:/0 pipe(0x55d91b30c000 sd=1926 :6908 s=0 pgs=0 cs=0 l=0
c=0x55d9263cb780).accept failed to getpeername (107) Transport endpoint is
not connected
...
2019-04-24 04:01:14.704225 7f5dd7790700  0 -- 10.191.175.20:6908/169631 >>
:/0 pipe(0x55d9531c5400 sd=1927 :6908 s=0 pgs=0 cs=0 l=0
c=0x55d9263ca280).accept failed to getpeername (107) Transport endpoint is
not connected
2019-04-24 04:01:14.704511 7f5e78c4a700  0 -- 10.191.175.20:6855/3169631 >>
10.191.175.37:6801/2131766 pipe(0x55d939274800 sd=507 :10332 s=1 pgs=604906
cs=3 l=0 c=0x55d925714600).connect got RESETSESSION
...
2019-04-24 04:01:14.705970 7f5dc4980700  0 -- 10.191.175.20:6855/3169631 >>
10.191.175.38:6833/3181256 pipe(0x55d950fed400 sd=455 :58907 s=1 pgs=563194
cs=17 l=0 c=0x55d9320a2480).connect got RESETSESSION
2019-04-24 04:01:14.696315 7f5db548e700  0 -- 10.191.175.20:6908/169631 >>
:/0 pipe(0x55d939e45400 sd=1929 :6908 s=0 

[ceph-users] unable to manually flush cache: failed to flush /xxx: (2) No such file or directory

2019-04-24 Thread Nikola Ciprich
Hi,

we're having issue on one of our clusters, while wanting
to remove cache tier, trying to manually flush cache always
ends up with error:

rados -p ssd-cache cache-flush-evict-all
.
.
.
failed to flush /rb.0.965780.238e1f29.1641: (2) No such file or 
directory
   rb.0.965780.238e1f29.02c8
failed to flush /rb.0.965780.238e1f29.02c8: (2) No such file or 
directory
   rb.0.965780.238e1f29.9113
failed to flush /rb.0.965780.238e1f29.9113: (2) No such file or 
directory
   rb.0.965780.238e1f29.9b0f
failed to flush /rb.0.965780.238e1f29.9b0f: (2) No such file or 
directory
   rb.0.965780.238e1f29.62b6
failed to flush /rb.0.965780.238e1f29.62b6: (2) No such file or 
directory
   rb.0.965780.238e1f29.030c
.
.
.


cluster is healthy, running 13.2.5

any idea on what might be wrong?

should I provide more details, please let me know

BR

nik



-- 
-
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.:   +420 591 166 214
fax:+420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] getting pg inconsistent periodly

2019-04-24 Thread Zhenshi Zhou
Hi,

I remember that there's some bug about cephfs when upgrading from 12.2.5.
Is it safe to upgrade the cluster now?

Thanks

Janne Johansson  于2019年4月24日周三 下午4:06写道:

>
>
> Den ons 24 apr. 2019 kl 08:46 skrev Zhenshi Zhou :
>
>> Hi,
>>
>> I'm running a cluster for a period of time. I find the cluster usually
>> run into unhealthy state recently.
>>
>> With 'ceph health detail', one or two pg are inconsistent. What's
>> more, pg in wrong state each day are not placed on the same disk,
>> so that I don't think it's a disk problem.
>>
>> The cluster is using version 12.2.5. Any idea about this strange issue?
>>
>>
> There was lots of fixes for releases around that version,
> do read https://ceph.com/releases/12-2-7-luminous-released/
> and later release notes on the 12.2.x series.
>
>
> --
> May the most significant bit of your life be positive.
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] getting pg inconsistent periodly

2019-04-24 Thread Janne Johansson
Den ons 24 apr. 2019 kl 08:46 skrev Zhenshi Zhou :

> Hi,
>
> I'm running a cluster for a period of time. I find the cluster usually
> run into unhealthy state recently.
>
> With 'ceph health detail', one or two pg are inconsistent. What's
> more, pg in wrong state each day are not placed on the same disk,
> so that I don't think it's a disk problem.
>
> The cluster is using version 12.2.5. Any idea about this strange issue?
>
>
There was lots of fixes for releases around that version,
do read https://ceph.com/releases/12-2-7-luminous-released/
and later release notes on the 12.2.x series.


-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] getting pg inconsistent periodly

2019-04-24 Thread Zhenshi Zhou
Hi,

I'm running a cluster for a period of time. I find the cluster usually
run into unhealthy state recently.

With 'ceph health detail', one or two pg are inconsistent. What's
more, pg in wrong state each day are not placed on the same disk,
so that I don't think it's a disk problem.

The cluster is using version 12.2.5. Any idea about this strange issue?

Thanks
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com