Re: [ceph-users] [URGENT]. Can't connect to CEPH after upgrade from 0.72 to 0.80
Hi Mark,

Update: after restarting libvirtd, cloudstack-agent, and the management server God knows how many times - it WORKS now! Not sure what was happening here, but it works again... I know for sure it was not the CEPH cluster, since it was fine and accessible via qemu-img, etc.

Thanks, Mark, for your time on my issue.

Best,
Andrija

On 13 July 2014 10:20, Mark Kirkwood wrote:

> On 13/07/14 19:15, Mark Kirkwood wrote:
>> On 13/07/14 18:38, Andrija Panic wrote:
>>> Any suggestion on the need to recompile libvirt? I got info from Wido
>>> that libvirt does NOT need to be recompiled.
>
> Thinking about this a bit more - Wido *may* have meant:
>
> - *libvirt* does not need to be rebuilt
> - ...but you need to get/build a later ceph client, i.e. 0.80
>
> Of course, depending on how your libvirt build was set up (e.g. static
> linkage), this *might* have meant you needed to rebuild it too.
>
> Regards
>
> Mark

--
Andrija Panić
http://admintweets.com
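For anyone hitting the same symptoms, the restart sequence described above boils down to something like this (a hedged sketch: CentOS 6.x SysV service names are assumed, and the CloudStack management service name may differ between installs):

$ service libvirtd restart               # on each CloudStack/KVM host
$ service cloudstack-agent restart       # on each CloudStack/KVM host
$ service cloudstack-management restart  # on the management server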
Re: [ceph-users] [URGENT]. Can't connect to CEPH after upgrade from 0.72 to 0.80
On 13/07/14 19:15, Mark Kirkwood wrote:
> On 13/07/14 18:38, Andrija Panic wrote:
>> Any suggestion on the need to recompile libvirt? I got info from Wido
>> that libvirt does NOT need to be recompiled.

Thinking about this a bit more - Wido *may* have meant:

- *libvirt* does not need to be rebuilt
- ...but you need to get/build a later ceph client, i.e. 0.80

Of course, depending on how your libvirt build was set up (e.g. static
linkage), this *might* have meant you needed to rebuild it too.

Regards

Mark
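One quick way to tell which case applies is to check whether the libvirt binary links the ceph client libraries dynamically (a minimal sketch; the binary path is assumed, and a statically linked build would show nothing here):

$ ldd /usr/sbin/libvirtd | grep -E 'librbd|librados'  # dynamic linkage against the ceph client, if any
$ ceph --version                                      # version of the installed ceph client

If librbd/librados show up in the ldd output, updating the ceph client packages should be enough; if nothing shows, the build likely bundled the 0.72 client and needs recompiling.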
Re: [ceph-users] [URGENT]. Can't connect to CEPH after upgrade from 0.72 to 0.80
On 13/07/14 18:38, Andrija Panic wrote:
> Hi Mark,
>
> Actually, CEPH is running fine, and I have deployed a NEW host (newly
> compiled libvirt with ceph 0.8 devel, and a newer kernel) - and it
> works... so I'm migrating some VMs to this new host...
>
> I have 3 physical hosts, each running both a MON and 2x OSD - on all 3,
> cloudstack/libvirt doesn't work...
>
> Any suggestion on the need to recompile libvirt? I got info from Wido
> that libvirt does NOT need to be recompiled.

Looking at the differences between src/include/ceph_features.h in 0.72
and 0.81 [1] (note, not *quite* the same version as you are using),
there are erasure codes and other new features advertised by the later
version that the client will need to match. Now *some* of these (crush
tunables) can be switched off via:

$ ceph osd crush tunables legacy

...which would have been worth a try, but my guess is it would not have
worked, as (for example) I *don't* think the erasure codes feature can
be switched off. Hence, unless I'm mistaken (which is always possible),
I think you did in fact need to recompile.

regards

Mark

[1] e.g:

--- ceph_features.h.72  2014-07-13 19:00:36.805825203 +1200
+++ ceph_features.h.81  2014-07-13 19:02:22.065826068 +1200
@@ -40,6 +40,18 @@
 #define CEPH_FEATURE_MON_SCRUB            (1ULL<<33)
 #define CEPH_FEATURE_OSD_PACKED_RECOVERY  (1ULL<<34)
 #define CEPH_FEATURE_OSD_CACHEPOOL        (1ULL<<35)
+#define CEPH_FEATURE_CRUSH_V2             (1ULL<<36)  /* new indep; SET_* steps */
+#define CEPH_FEATURE_EXPORT_PEER          (1ULL<<37)
+#define CEPH_FEATURE_OSD_ERASURE_CODES    (1ULL<<38)
+#define CEPH_FEATURE_OSD_TMAP2OMAP        (1ULL<<38)  /* overlap with EC */
+/* The process supports new-style OSDMap encoding. Monitors also use
+   this bit to determine if peers support NAK messages. */
+#define CEPH_FEATURE_OSDMAP_ENC           (1ULL<<39)
+#define CEPH_FEATURE_MDS_INLINE_DATA      (1ULL<<40)
+#define CEPH_FEATURE_CRUSH_TUNABLES3      (1ULL<<41)
+#define CEPH_FEATURE_OSD_PRIMARY_AFFINITY (1ULL<<41)  /* overlap w/ tunables3 */
+#define CEPH_FEATURE_MSGR_KEEPALIVE2      (1ULL<<42)
+#define CEPH_FEATURE_OSD_POOLRESEND       (1ULL<<43)

 /*
  * The introduction of CEPH_FEATURE_OSD_SNAPMAPPER caused the feature
@@ -102,7 +114,16 @@
          CEPH_FEATURE_OSD_SNAPMAPPER |       \
          CEPH_FEATURE_MON_SCRUB |            \
          CEPH_FEATURE_OSD_PACKED_RECOVERY |  \
-         CEPH_FEATURE_OSD_CACHEPOOL |        \
+         CEPH_FEATURE_OSD_CACHEPOOL |        \
+         CEPH_FEATURE_CRUSH_V2 |             \
+         CEPH_FEATURE_EXPORT_PEER |          \
+         CEPH_FEATURE_OSD_ERASURE_CODES |    \
+         CEPH_FEATURE_OSDMAP_ENC |           \
+         CEPH_FEATURE_MDS_INLINE_DATA |      \
+         CEPH_FEATURE_CRUSH_TUNABLES3 |      \
+         CEPH_FEATURE_OSD_PRIMARY_AFFINITY | \
+         CEPH_FEATURE_MSGR_KEEPALIVE2 |      \
+         CEPH_FEATURE_OSD_POOLRESEND |       \
          0ULL)

 #define CEPH_FEATURES_SUPPORTED_DEFAULT  CEPH_FEATURES_ALL
@@ -112,6 +133,8 @@
  */
 #define CEPH_FEATURES_CRUSH \
        (CEPH_FEATURE_CRUSH_TUNABLES |  \
-        CEPH_FEATURE_CRUSH_TUNABLES2)
+        CEPH_FEATURE_CRUSH_TUNABLES2 | \
+        CEPH_FEATURE_CRUSH_TUNABLES3 | \
+        CEPH_FEATURE_CRUSH_V2)

 #endif
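Worth noting alongside the above: you can inspect which tunables profile the cluster is advertising before and after switching it (a sketch; "show-tunables" assumes a 0.80-era or newer ceph CLI):

$ ceph osd crush show-tunables    # dump the crush tunables currently in effect
$ ceph osd crush tunables legacy  # revert to pre-0.80 values so old clients can connect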
Re: [ceph-users] [URGENT]. Can't connect to CEPH after upgrade from 0.72 to 0.80
Hi Mark,

Actually, CEPH is running fine, and I have deployed a NEW host (newly
compiled libvirt with ceph 0.8 devel, and a newer kernel) - and it
works... so I'm migrating some VMs to this new host...

I have 3 physical hosts, each running both a MON and 2x OSD - on all 3,
cloudstack/libvirt doesn't work...

Any suggestion on the need to recompile libvirt? I got info from Wido
that libvirt does NOT need to be recompiled.

Best

On 13 July 2014 08:35, Mark Kirkwood wrote:

> On 13/07/14 17:07, Andrija Panic wrote:
>> Hi,
>>
>> Sorry to bother you, but I have an urgent situation: I upgraded CEPH
>> from 0.72 to 0.80 (CentOS 6.5), and now none of my CloudStack HOSTS
>> can connect.
>>
>> I did a basic "yum update ceph" on the first MON leader, and all CEPH
>> services on that HOST were restarted - then did the same on the other
>> CEPH nodes (I have 1 MON + 2 OSDs per physical host). Then I set the
>> variables to optimal with "ceph osd crush tunables optimal", and after
>> some rebalancing, ceph shows HEALTH_OK.
>>
>> Also, I can create new images with qemu-img -f rbd rbd:/cloudstack
>>
>> Libvirt 1.2.3 was compiled while ceph was 0.72, but I got instructions
>> from Wido that I don't need to REcompile now with ceph 0.80...
>>
>> Libvirt logs:
>>
>> libvirt: Storage Driver error : Storage pool not found: no storage pool
>> with matching uuid ‡Îhyš>
>>
>> Note the strange "uuid" - not sure what is happening?
>>
>> Did I forget to do something after the CEPH upgrade?
>
> Have you got any ceph logs to examine on the host running libvirt? When
> I try to connect a v0.72 client to a v0.81 cluster I get:
>
> 2014-07-13 18:21:23.860898 7fc3bd2ca700 0 -- 192.168.122.41:0/1002012 >>
> 192.168.122.21:6789/0 pipe(0x7fc3c00241f0 sd=3 :49451 s=1 pgs=0 cs=0 l=1
> c=0x7fc3c0024450).connect protocol feature mismatch, my f < peer
> 5f missing 50
>
> Regards
>
> Mark

--
Andrija Panić
http://admintweets.com
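The qemu-img check quoted above can be scripted as a quick smoke test of the librbd path (a sketch only: the pool and image names here are made up, and the "-f rbd" form assumes a qemu built with rbd support, as in the original report):

$ qemu-img create -f rbd rbd:cloudstack/upgrade-test 1G  # exercises the write path through librbd
$ qemu-img info rbd:cloudstack/upgrade-test              # exercises the read path
$ rbd -p cloudstack rm upgrade-test                      # clean up the test image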
Re: [ceph-users] [URGENT]. Can't connect to CEPH after upgrade from 0.72 to 0.80
On 13/07/14 17:07, Andrija Panic wrote:
> Hi,
>
> Sorry to bother you, but I have an urgent situation: I upgraded CEPH
> from 0.72 to 0.80 (CentOS 6.5), and now none of my CloudStack HOSTS
> can connect.
>
> I did a basic "yum update ceph" on the first MON leader, and all CEPH
> services on that HOST were restarted - then did the same on the other
> CEPH nodes (I have 1 MON + 2 OSDs per physical host). Then I set the
> variables to optimal with "ceph osd crush tunables optimal", and after
> some rebalancing, ceph shows HEALTH_OK.
>
> Also, I can create new images with qemu-img -f rbd rbd:/cloudstack
>
> Libvirt 1.2.3 was compiled while ceph was 0.72, but I got instructions
> from Wido that I don't need to REcompile now with ceph 0.80...
>
> Libvirt logs:
>
> libvirt: Storage Driver error : Storage pool not found: no storage pool
> with matching uuid Îhy
>
> Note the strange "uuid" - not sure what is happening?
>
> Did I forget to do something after the CEPH upgrade?

Have you got any ceph logs to examine on the host running libvirt? When I
try to connect a v0.72 client to a v0.81 cluster I get:

2014-07-13 18:21:23.860898 7fc3bd2ca700 0 -- 192.168.122.41:0/1002012 >>
192.168.122.21:6789/0 pipe(0x7fc3c00241f0 sd=3 :49451 s=1 pgs=0 cs=0 l=1
c=0x7fc3c0024450).connect protocol feature mismatch, my f < peer
5f missing 50

Regards

Mark
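The "missing" value in that mismatch line is simply the set of feature bits the peer advertises that the client lacks (peer AND NOT mine). With the values from the log above - which look truncated, since the real masks are 64-bit - the arithmetic checks out (bash sketch):

$ printf 'missing: %x\n' $(( 0x5f & ~0xf ))  # peer features & ~(my features)
missing: 50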
[ceph-users] [URGENT]. Can't connect to CEPH after upgrade from 0.72 to 0.80
Hi,

Sorry to bother you, but I have an urgent situation: I upgraded CEPH
from 0.72 to 0.80 (CentOS 6.5), and now none of my CloudStack HOSTS can
connect.

I did a basic "yum update ceph" on the first MON leader, and all CEPH
services on that HOST were restarted - then did the same on the other
CEPH nodes (I have 1 MON + 2 OSDs per physical host). Then I set the
variables to optimal with "ceph osd crush tunables optimal", and after
some rebalancing, ceph shows HEALTH_OK.

Also, I can create new images with qemu-img -f rbd rbd:/cloudstack

Libvirt 1.2.3 was compiled while ceph was 0.72, but I got instructions
from Wido that I don't need to REcompile now with ceph 0.80...

Libvirt logs:

libvirt: Storage Driver error : Storage pool not found: no storage pool
with matching uuid Îhy

Note the strange "uuid" - not sure what is happening?

Did I forget to do something after the CEPH upgrade?
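The upgrade sequence described above, as a per-node sketch (hedged: service names assume the sysvinit ceph packages on CentOS 6.x):

$ yum update ceph                  # on the MON leader first, then each remaining node
$ service ceph restart             # restart the MON + OSDs on that node
$ ceph health                      # wait for HEALTH_OK before moving to the next node
$ ceph osd crush tunables optimal  # only once every node - and ideally every client - is on 0.80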