Re: [ceph-users] [URGENT]. Can't connect to CEPH after upgrade from 0.72 to 0.80

2014-07-13 Thread Andrija Panic
Hi Mark,

update:

after restarting libvirtd, the cloudstack-agent and the management server God
knows how many times - it WORKS now!

Not sure what was happening here, but it works again... I know for sure it
was not the CEPH cluster, since that was fine and accessible via qemu-img, etc...

Thanks, Mark, for your time on my issue...
Best.
Andrija




On 13 July 2014 10:20, Mark Kirkwood  wrote:

> On 13/07/14 19:15, Mark Kirkwood wrote:
>
>> On 13/07/14 18:38, Andrija Panic wrote:
>>
>
>>> Any suggestion on whether I need to recompile libvirt? I got info from Wido
>>> that libvirt does NOT need to be recompiled
>>>
>>>
> Thinking about this a bit more - Wido *may* have meant:
>
> - *libvirt* does not need to be rebuilt
> - ...but you need to get/build a later ceph client, i.e. 0.80
>
> Of course, depending on how your libvirt build was set up (e.g. static
> linkage), this *might* have meant you needed to rebuild it too.
>
> Regards
>
> Mark
>
>


-- 

Andrija Panić
--
  http://admintweets.com
--


Re: [ceph-users] [URGENT]. Can't connect to CEPH after upgrade from 0.72 to 0.80

2014-07-13 Thread Mark Kirkwood

On 13/07/14 19:15, Mark Kirkwood wrote:

On 13/07/14 18:38, Andrija Panic wrote:



Any suggestion on whether I need to recompile libvirt? I got info from Wido
that libvirt does NOT need to be recompiled



Thinking about this a bit more - Wido *may* have meant:

- *libvirt* does not need to be rebuilt
- ...but you need to get/build a later ceph client, i.e. 0.80

Of course, depending on how your libvirt build was set up (e.g. static
linkage), this *might* have meant you needed to rebuild it too.


Regards

Mark



Re: [ceph-users] [URGENT]. Can't connect to CEPH after upgrade from 0.72 to 0.80

2014-07-13 Thread Mark Kirkwood

On 13/07/14 18:38, Andrija Panic wrote:

Hi Mark,
actually, CEPH is running fine, and I have deployed a NEW host (newly
compiled libvirt with ceph 0.8 devel, and a newer kernel) - and it works...
so I am migrating some VMs to this new host...

I have 3 physical hosts, each running a MON and 2x OSDs, and on all 3
cloudstack/libvirt doesn't work...

Any suggestion on whether I need to recompile libvirt? I got info from Wido
that libvirt does NOT need to be recompiled



Looking at the differences between src/include/ceph_features.h in 0.72 
and 0.81 [1] (note, not *quite* the same version as you are using), 
there are erasure codes and other new features advertised by the 
later version that the client will need to match. Now *some* of these 
(the crush tunables) can be switched off via:


$ ceph osd crush tunables legacy

...which would have been worth a try, but my guess is it would not have 
worked, as (for example) I *don't* think the erasure codes feature can be 
switched off. Hence, unless I'm mistaken (which is always possible), I 
think you did in fact need to recompile.


regards

Mark


[1] e.g:
--- ceph_features.h.72  2014-07-13 19:00:36.805825203 +1200
+++ ceph_features.h.81  2014-07-13 19:02:22.065826068 +1200
@@ -40,6 +40,18 @@
 #define CEPH_FEATURE_MON_SCRUB  (1ULL<<33)
 #define CEPH_FEATURE_OSD_PACKED_RECOVERY (1ULL<<34)
 #define CEPH_FEATURE_OSD_CACHEPOOL (1ULL<<35)
+#define CEPH_FEATURE_CRUSH_V2  (1ULL<<36)  /* new indep; SET_* steps */
+#define CEPH_FEATURE_EXPORT_PEER   (1ULL<<37)
+#define CEPH_FEATURE_OSD_ERASURE_CODES (1ULL<<38)
+#define CEPH_FEATURE_OSD_TMAP2OMAP (1ULL<<38)   /* overlap with EC */
+/* The process supports new-style OSDMap encoding. Monitors also use
+   this bit to determine if peers support NAK messages. */
+#define CEPH_FEATURE_OSDMAP_ENC    (1ULL<<39)
+#define CEPH_FEATURE_MDS_INLINE_DATA (1ULL<<40)
+#define CEPH_FEATURE_CRUSH_TUNABLES3 (1ULL<<41)
+#define CEPH_FEATURE_OSD_PRIMARY_AFFINITY (1ULL<<41)  /* overlap w/ tunables3 */
+#define CEPH_FEATURE_MSGR_KEEPALIVE2   (1ULL<<42)
+#define CEPH_FEATURE_OSD_POOLRESEND    (1ULL<<43)

 /*
  * The introduction of CEPH_FEATURE_OSD_SNAPMAPPER caused the feature
@@ -102,7 +114,16 @@
 CEPH_FEATURE_OSD_SNAPMAPPER |  \
 CEPH_FEATURE_MON_SCRUB |   \
 CEPH_FEATURE_OSD_PACKED_RECOVERY | \
-CEPH_FEATURE_OSD_CACHEPOOL | \
+CEPH_FEATURE_OSD_CACHEPOOL |   \
+CEPH_FEATURE_CRUSH_V2 |\
+CEPH_FEATURE_EXPORT_PEER | \
+CEPH_FEATURE_OSD_ERASURE_CODES |   \
+CEPH_FEATURE_OSDMAP_ENC |  \
+CEPH_FEATURE_MDS_INLINE_DATA | \
+CEPH_FEATURE_CRUSH_TUNABLES3 | \
+CEPH_FEATURE_OSD_PRIMARY_AFFINITY |\
+CEPH_FEATURE_MSGR_KEEPALIVE2 | \
+CEPH_FEATURE_OSD_POOLRESEND |  \
 0ULL)

 #define CEPH_FEATURES_SUPPORTED_DEFAULT  CEPH_FEATURES_ALL
@@ -112,6 +133,8 @@
  */
 #define CEPH_FEATURES_CRUSH\
(CEPH_FEATURE_CRUSH_TUNABLES |  \
-CEPH_FEATURE_CRUSH_TUNABLES2)
+CEPH_FEATURE_CRUSH_TUNABLES2 | \
+CEPH_FEATURE_CRUSH_TUNABLES3 | \
+CEPH_FEATURE_CRUSH_V2)

 #endif
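
If it helps, here is a rough sketch of my own (illustrative only, untested,
and not anything from the ceph tree) that takes the hex "missing" mask from a
"protocol feature mismatch" log line and prints which of the 0.80-era bits
above are set in it:

/*
 * feat_decode.c - hypothetical helper, not part of ceph.
 * Bit values are copied from the 0.81 ceph_features.h excerpt above;
 * note that OSD_TMAP2OMAP shares bit 38 with OSD_ERASURE_CODES and
 * OSD_PRIMARY_AFFINITY shares bit 41 with CRUSH_TUNABLES3.
 */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

#define CEPH_FEATURE_CRUSH_V2             (1ULL<<36)
#define CEPH_FEATURE_EXPORT_PEER          (1ULL<<37)
#define CEPH_FEATURE_OSD_ERASURE_CODES    (1ULL<<38)
#define CEPH_FEATURE_OSDMAP_ENC           (1ULL<<39)
#define CEPH_FEATURE_MDS_INLINE_DATA      (1ULL<<40)
#define CEPH_FEATURE_CRUSH_TUNABLES3      (1ULL<<41)
#define CEPH_FEATURE_MSGR_KEEPALIVE2      (1ULL<<42)
#define CEPH_FEATURE_OSD_POOLRESEND       (1ULL<<43)

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <missing-mask-in-hex>\n", argv[0]);
        return 1;
    }

    /* base 16, so a leading "0x" is accepted but not required */
    uint64_t missing = strtoull(argv[1], NULL, 16);

    const struct { const char *name; uint64_t bit; } feats[] = {
        { "CRUSH_V2",          CEPH_FEATURE_CRUSH_V2 },
        { "EXPORT_PEER",       CEPH_FEATURE_EXPORT_PEER },
        { "OSD_ERASURE_CODES", CEPH_FEATURE_OSD_ERASURE_CODES },
        { "OSDMAP_ENC",        CEPH_FEATURE_OSDMAP_ENC },
        { "MDS_INLINE_DATA",   CEPH_FEATURE_MDS_INLINE_DATA },
        { "CRUSH_TUNABLES3",   CEPH_FEATURE_CRUSH_TUNABLES3 },
        { "MSGR_KEEPALIVE2",   CEPH_FEATURE_MSGR_KEEPALIVE2 },
        { "OSD_POOLRESEND",    CEPH_FEATURE_OSD_POOLRESEND },
    };

    for (size_t i = 0; i < sizeof(feats) / sizeof(feats[0]); i++)
        if (missing & feats[i].bit)
            printf("missing feature bit %2d: %s\n",
                   __builtin_ctzll(feats[i].bit), feats[i].name);

    return 0;
}

Compiling that and feeding it the full mask from the client log on the
libvirt host should show exactly which features the old 0.72 client lacks.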



Re: [ceph-users] [URGENT]. Can't connect to CEPH after upgrade from 0.72 to 0.80

2014-07-12 Thread Andrija Panic
Hi Mark,
actually, CEPH is running fine, and I have deployed a NEW host (newly
compiled libvirt with ceph 0.8 devel, and a newer kernel) - and it works...
so I am migrating some VMs to this new host...

I have 3 physical hosts, each running a MON and 2x OSDs, and on all 3
cloudstack/libvirt doesn't work...

Any suggestion on whether I need to recompile libvirt? I got info from Wido
that libvirt does NOT need to be recompiled


Best


On 13 July 2014 08:35, Mark Kirkwood  wrote:

> On 13/07/14 17:07, Andrija Panic wrote:
>
>> Hi,
>>
>> Sorry to bother you, but I have an urgent situation: I upgraded CEPH from
>> 0.72 to 0.80 (CentOS 6.5), and now none of my CloudStack HOSTS can connect.
>>
>> I did a basic "yum update ceph" on the first MON leader, and all CEPH
>> services on that HOST were restarted - then did the same on the other CEPH
>> nodes (I have 1 MON + 2 OSDs per physical host). Then I set the tunables to
>> optimal with "ceph osd crush tunables optimal", and after some rebalancing,
>> ceph shows HEALTH_OK.
>>
>> Also, I can create new images with qemu-img -f rbd rbd:/cloudstack
>>
>> Libvirt 1.2.3 was compiled while ceph was 0.72, but I got instructions
>> from Wido that I don't need to REcompile now with ceph 0.80...
>>
>> Libvirt logs:
>>
>> libvirt: Storage Driver error : Storage pool not found: no storage pool
>> with matching uuid ‡Îhyš>
>> Note there are some strange "uuid" - not sure what is happening ?
>>
>> Did I forget to do something after CEPH upgrade ?
>>
>
> Have you got any ceph logs to examine on the host running libvirt? When I
> try to connect a v0.72 client to a v0.81 cluster I get:
>
> 2014-07-13 18:21:23.860898 7fc3bd2ca700  0 -- 192.168.122.41:0/1002012 >>
> 192.168.122.21:6789/0 pipe(0x7fc3c00241f0 sd=3 :49451 s=1 pgs=0 cs=0 l=1
> c=0x7fc3c0024450).connect protocol feature mismatch, my f < peer
> 5f missing 50
>
> Regards
>
> Mark
>
>


-- 

Andrija Panić
--
  http://admintweets.com
--


Re: [ceph-users] [URGENT]. Can't connect to CEPH after upgrade from 0.72 to 0.80

2014-07-12 Thread Mark Kirkwood

On 13/07/14 17:07, Andrija Panic wrote:

Hi,

Sorry to bother you, but I have an urgent situation: I upgraded CEPH from 0.72
to 0.80 (CentOS 6.5), and now none of my CloudStack HOSTS can connect.

I did a basic "yum update ceph" on the first MON leader, and all CEPH
services on that HOST were restarted - then did the same on the other CEPH
nodes (I have 1 MON + 2 OSDs per physical host). Then I set the tunables to
optimal with "ceph osd crush tunables optimal", and after some rebalancing,
ceph shows HEALTH_OK.

Also, I can create new images with qemu-img -f rbd rbd:/cloudstack

Libvirt 1.2.3 was compiled while ceph was 0.72, but I got instructions
from Wido that I don't need to REcompile now with ceph 0.80...

Libvirt logs:

libvirt: Storage Driver error : Storage pool not found: no storage pool
with matching uuid ‡Îhyš

Have you got any ceph logs to examine on the host running libvirt? When 
I try to connect a v0.72 client to a v0.81 cluster I get:


2014-07-13 18:21:23.860898 7fc3bd2ca700  0 -- 192.168.122.41:0/1002012 >> 192.168.122.21:6789/0 pipe(0x7fc3c00241f0 sd=3 :49451 s=1 pgs=0 cs=0 l=1 c=0x7fc3c0024450).connect protocol feature mismatch, my f < peer 5f missing 50
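
(That "missing" value is, as far as I can tell - this is just my reading of
it, not a quote from the ceph source - the peer's feature bits that the
client does not advertise, i.e. roughly the following, shown here with
made-up placeholder masks since the real hex values are truncated above:)

/* Illustration only: both masks below are hypothetical placeholders,
 * not the (truncated) values from the log line above. */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint64_t mine = 0x000000ffffffffffULL; /* hypothetical 0.72 client feature mask */
    uint64_t peer = 0x00000fffffffffffULL; /* hypothetical 0.80 cluster feature mask */
    uint64_t missing = peer & ~mine;       /* peer bits the client lacks */

    printf("missing %llx\n", (unsigned long long)missing);
    return 0;
}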


Regards

Mark



[ceph-users] [URGENT]. Can't connect to CEPH after upgrade from 0.72 to 0.80

2014-07-12 Thread Andrija Panic
Hi,

Sorry to bother you, but I have an urgent situation: I upgraded CEPH from 0.72
to 0.80 (CentOS 6.5), and now none of my CloudStack HOSTS can connect.

I did a basic "yum update ceph" on the first MON leader, and all CEPH
services on that HOST were restarted - then did the same on the other CEPH
nodes (I have 1 MON + 2 OSDs per physical host). Then I set the tunables to
optimal with "ceph osd crush tunables optimal", and after some rebalancing,
ceph shows HEALTH_OK.

Also, I can create new images with qemu-img -f rbd rbd:/cloudstack

Libvirt 1.2.3 was compiled while ceph was 0.72, but I got instructions from
Wido that I don't need to REcompile now with ceph 0.80...

Libvirt logs:

libvirt: Storage Driver error : Storage pool not found: no storage pool
with matching uuid ‡Îhyš

Note there are some strange "uuid" - not sure what is happening ?

Did I forget to do something after CEPH upgrade ?