Integration work

2012-08-28 Thread Ross Turk


Hi, ceph-devel! It's me, your friendly community guy.

Inktank has an engineering team dedicated to Ceph, and we want to work 
on the right stuff. From time to time, I'd like to check in with you to 
make sure that we are.


Over the past several months, Inktank's engineers have focused on core 
stability, radosgw, and feature expansion for RBD. At the same time, 
they have been regularly allocating cycles to integration work. 
Recently, this has consisted of improvements to the way Ceph works 
within OpenStack (even though OpenStack isn't the only technology that 
we think Ceph should play nicely with).


What other sorts of integrations would you like to see Inktank engineers 
work on? For example, are you interested in seeing Inktank spend more of 
its resources improving interoperability with Apache CloudStack or 
Eucalyptus? How about Xen?


Please share your thoughts. We want to contribute in the best way 
possible with the resources we have, and your input can help.


Thx,
Ross

--
Ross Turk
Community, Ceph
@rossturk @inktank @ceph





Re: Integration work

2012-08-28 Thread Plaetinck, Dieter
On Tue, 28 Aug 2012 11:12:16 -0700
Ross Turk  wrote:

> 
> Hi, ceph-devel! It's me, your friendly community guy.
> 
> Inktank has an engineering team dedicated to Ceph, and we want to work 
> on the right stuff. From time to time, I'd like to check in with you to 
> make sure that we are.
> 
> Over the past several months, Inktank's engineers have focused on core 
> stability, radosgw, and feature expansion for RBD. At the same time, 
> they have been regularly allocating cycles to integration work. 
> Recently, this has consisted of improvements to the way Ceph works 
> within OpenStack (even though OpenStack isn't the only technology that 
> we think Ceph should play nicely with).
> 
> What other sorts of integrations would you like to see Inktank engineers 
> work on?

Are we only supposed to suggest integrations with other software?
If not, I would suggest writing documentation,
and also integration with configuration management tools like Puppet/Chef.
Both of these can shorten the "time from zero to working cluster",
which IMHO is critical for attracting new users (myself included).


Re: Integration work

2012-08-28 Thread Dieter Kasper
Hi Ross,

Focusing on core stability and feature expansion for RBD was the right approach
in the past, and I feel you have reached an adequate maturity level here.

Performance enhancements - especially reducing the latency of a single IO and
increasing IOPS - and a stronger engagement on the CephFS client would be very
much appreciated. A stable and fast CephFS client would allow an efficient
integration with:
- (clustered) NFS (v3 and v4)
- (clustered) Samba v4


Cheers,
-Dieter





Re: Integration work

2012-08-28 Thread Smart Weblications GmbH - Florian Wiessner
On 28.08.2012 20:51, Dieter Kasper wrote:

> Performance enhancements - especially to reduce the latency of a single IO / 
> increase IOPS -
> and a stronger engagement on the CephFS client would be very much appreciated.
> A stable and fast CephFS client would allow an efficient integration with
> - (clustered) NFS (v3 and v4)
> - (clustered) Samba v4

Have you tried OCFS2 on top of RBD in the meantime, until CephFS gets ready?


-- 

Kind regards,

Florian Wiessner

Smart Weblications GmbH
Martinsberger Str. 1
D-95119 Naila

fon.: +49 9282 9638 200
fax.: +49 9282 9638 205
24/7: +49 900 144 000 00 - 0,99 EUR/Min*
http://www.smart-weblications.de

--
Registered office: Naila
Managing Director: Florian Wiessner
Commercial register no.: HRB 3840, Amtsgericht Hof
*from a German landline; calls from mobile networks may cost more


Re: Integration work

2012-08-28 Thread Dieter Kasper
On Tue, Aug 28, 2012 at 08:57:02PM +0200, Smart Weblications GmbH - Florian Wiessner wrote:
> On 28.08.2012 20:51, Dieter Kasper wrote:
> 
> > Performance enhancements - especially to reduce the latency of a single IO 
> > / increase IOPS -
> > and a stronger engagement on the CephFS client would be very much 
> > appreciated.
> > A stable and fast CephFS client would allow an efficient integration with
> > - (clustered) NFS (v3 and v4)
> > - (clustered) Samba v4
> 
> Have you tried OCFS2 on top of RBD in the meantime, until CephFS gets ready?
No, I haven't, but I know its limitations.

OCFS2 (like GFS/GFS2 from Sistina/Red Hat) is built on the cluster-FS design
of the 90s.
I'm looking for a cluster FS that is based on the assumptions that:
+ the system is inherently dynamic
+ failures in a cluster are the norm, rather than the exception
+ the characteristics of workloads are constantly shifting over time
+ the system is inevitably built incrementally
+ the system is self-managing
= Ceph (RBD + CephFS)


Cheers,
Dieter Kasper


-- 
Principal Consultant, Data Center Storage Architecture and Technology
FTS CTO
FUJITSU TECHNOLOGY SOLUTIONS GMBH
Mies-van-der-Rohe-Straße 8 / 4F
80807 München
Germany

Telephone:  +49 89 62060 1898
Telefax:    +49 89 62060 329 1898
Mobile:     +49 170 8563173
Email:      dieter.kas...@ts.fujitsu.com
Internet:   http://ts.fujitsu.com
Company Details: http://ts.fujitsu.com/imprint.html



Re: Integration work

2012-08-28 Thread Tren Blackburn
On Tue, Aug 28, 2012 at 11:51 AM, Dieter Kasper  wrote:
> Hi Ross,
>
> focusing on core stability and feature expansion for RBD was the right approach
> in the past and I feel you have reached an adequate maturity level here.
>
> Performance enhancements - especially to reduce the latency of a single IO / 
> increase IOPS -
> and a stronger engagement on the CephFS client would be very much appreciated.
> A stable and fast CephFS client would allow an efficient integration with
> - (clustered) NFS (v3 and v4)
> - (clustered) Samba v4

+1 to CephFS being worked on. Things like multi-MDS support being improved
would be amazing.

Regards,

Tren


Re: Integration work

2012-08-28 Thread Florian Haas
On 08/28/2012 11:32 AM, Plaetinck, Dieter wrote:
> On Tue, 28 Aug 2012 11:12:16 -0700
> Ross Turk  wrote:
> 
>>
>> Hi, ceph-devel! It's me, your friendly community guy.
>>
>> Inktank has an engineering team dedicated to Ceph, and we want to work 
>> on the right stuff. From time to time, I'd like to check in with you to 
>> make sure that we are.
>>
>> Over the past several months, Inktank's engineers have focused on core 
>> stability, radosgw, and feature expansion for RBD. At the same time, 
>> they have been regularly allocating cycles to integration work. 
>> Recently, this has consisted of improvements to the way Ceph works 
>> within OpenStack (even though OpenStack isn't the only technology that 
>> we think Ceph should play nicely with).
>>
>> What other sorts of integrations would you like to see Inktank engineers 
>> work on?
> 
> are we only supposed to give answers wrt. integration with other software?
> if not, I would suggest to write documentation.

If I may say so, the amount of work that John has poured into this in
recent weeks has been incredible (http://www.ceph.com/docs/master/). So
while it's definitely neither complete nor perfect, I'm sure he would
appreciate some more specific information as to where you believe the
documentation is lacking.

For my part, in the documentation space, I would love for the admin
tools to become self-documenting. For example, I would love a "help"
subcommand at any level of the ceph shell, listing the supported
subcommands at that level. As in "ceph help", "ceph mon help", "ceph osd
getmap help".

Even better, the ceph shell could support a general-purpose hook that
bash-completion can use (kind of like "hg" does in Mercurial), and this
and the above-conjectured help facility could arguably share quite a bit
of code.
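
For illustration only, here is a minimal sketch of that idea: a single nested
command table that both a "help" subcommand and a bash-completion hook could
share. The command names and structure below are hypothetical and do not
reflect the actual ceph CLI internals.

    # Hypothetical sketch: one shared command table driving both "help"
    # output and shell completion. Not the real ceph tool.
    COMMANDS = {
        "mon": {"stat": None, "dump": None},
        "osd": {"getmap": None, "tree": None},
    }

    def resolve(table, words):
        """Walk the command table as far as the given words allow."""
        node = table
        for word in words:
            if isinstance(node, dict) and word in node:
                node = node[word]
            else:
                break
        return node

    def show_help(words):
        """'ceph help', 'ceph mon help', ...: list subcommands at that level."""
        node = resolve(COMMANDS, words)
        if isinstance(node, dict):
            print("subcommands:", " ".join(sorted(node)))
        else:
            print("no further subcommands")

    def complete(words):
        """Completion candidates for the last (possibly partial) word."""
        node = resolve(COMMANDS, words[:-1])
        prefix = words[-1] if words else ""
        return sorted(c for c in node if c.startswith(prefix)) if isinstance(node, dict) else []

    if __name__ == "__main__":
        show_help(["mon"])              # -> subcommands: dump stat
        print(complete(["osd", "g"]))   # -> ['getmap']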

> and also integration with CM like puppet/chef

+1, although people are already working on both. So maybe this is just
about the need to tell more people about that. :)

Cheers,
Florian


Re: Integration work

2012-08-28 Thread Tommi Virtanen
On Tue, Aug 28, 2012 at 5:03 PM, Florian Haas  wrote:
> I for my part, in the documentation space, would love for the admin
> tools to become self-documenting. For example, I would love a "help"
> subcommand at any level of the ceph shell, listing the supported
> subcommands in that level. As in "ceph help", "ceph mon help", "ceph osd
> getmap help".
>
> Even better, the ceph shell could support a general-purpose hook that
> bash-completion can use (kind of like "hg" does in Mercurial), and this
> and the above-conjectured help facility could arguably share quite a bit
> of code.

I would love to see all of that. But, a lot of the "ceph" tool
functionality is implemented by shoveling strings in and out of the
monitors. It largely doesn't understand what's happening.

If we were to redo that from scratch, I'd convert it to have some
sort of API to the monitors, and make the CLI understand all the relevant
things. Understandably, that can feel a little bit more rigid; to add
a command means adding it to both the server and the client, whereas
currently the client is very generic.
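
To make that trade-off concrete, here is a small hypothetical contrast between
the two styles. Neither FakeMonitor, send_command() nor MonitorAPI is a real
Ceph interface; they only illustrate a generic string-forwarding client versus
a typed client that understands its commands.

    # Hypothetical illustration only; none of these names exist in Ceph.
    class FakeMonitor:
        def send_command(self, cmd: str) -> str:
            # Stand-in for the wire protocol: the monitor parses the string
            # and replies with a string.
            return f"monitor executed: {cmd!r}"

    # Style 1: the generic client. It forwards whatever the user typed and
    # prints the reply. Adding a new command needs no client change, but the
    # client cannot validate arguments or offer per-command help/completion.
    def run_generic(monitor: FakeMonitor, argv: list) -> None:
        print(monitor.send_command(" ".join(argv)))

    # Style 2: a typed client. It knows each command's arguments, so it can
    # validate and document them - at the cost of updating both the server
    # and the client whenever a command is added.
    class MonitorAPI:
        def __init__(self, monitor: FakeMonitor):
            self.monitor = monitor

        def osd_pool_create(self, name: str, pg_num: int) -> str:
            """Create a pool with the given placement-group count."""
            if pg_num <= 0:
                raise ValueError("pg_num must be positive")
            return self.monitor.send_command(f"osd pool create {name} {pg_num}")

    if __name__ == "__main__":
        mon = FakeMonitor()
        run_generic(mon, ["osd", "pool", "create", "foo", "128"])
        print(MonitorAPI(mon).osd_pool_create("foo", 128))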

>> and also integration with CM like puppet/chef
> +1, although people are already working on both. So maybe this is just
> about the need to tell more people about that. :)

Please do give constructive feedback on

http://ceph.com/docs/master/install/chef/
http://ceph.com/docs/master/config-cluster/chef/


Re: Integration work

2012-08-28 Thread Josh Durgin

On 08/28/2012 02:15 PM, Tommi Virtanen wrote:
> On Tue, Aug 28, 2012 at 5:03 PM, Florian Haas  wrote:
>> I for my part, in the documentation space, would love for the admin
>> tools to become self-documenting. For example, I would love a "help"
>> subcommand at any level of the ceph shell, listing the supported
>> subcommands in that level. As in "ceph help", "ceph mon help", "ceph osd
>> getmap help".
>>
>> Even better, the ceph shell could support a general-purpose hook that
>> bash-completion can use (kind of like "hg" does in Mercurial), and this
>> and the above-conjectured help facility could arguably share quite a bit
>> of code.
>
> I would love to see all of that. But, a lot of the "ceph" tool
> functionality is implemented by shoveling strings in and out of the
> monitors. It largely doesn't understand what's happening.

It doesn't need to understand what's happening to give basic usage info
though - the monitors can provide that themselves in the short term,
while we don't have an admin API like you describe below.

I added a feature request for this a little while back:

http://www.tracker.newdream.net/issues/2894

> If we were to redo that from scratch, I'd convert it to have some
> sort of API to the monitors, and make the CLI understand all the relevant
> things. Understandably, that can feel a little bit more rigid; to add
> a command means adding it to both the server and the client, whereas
> currently the client is very generic.




Re: Integration work

2012-08-29 Thread Amon Ott
On Tuesday 28 August 2012 you wrote:
> On Tue, Aug 28, 2012 at 11:51 AM, Dieter Kasper  wrote:
> > Hi Ross,
> >
> > focusing on core stability and feature expansion for RBD was the right
> > approach in the past and I feel you have reached an adequate maturity
> > level here.
> >
> > Performance enhancements - especially to reduce the latency of a single
> > IO / increase IOPS - and a stronger engagement on the CephFS client would
> > be very much appreciated. A stable and fast CephFS client would allow an
> > efficient integration with - (clustered) NFS (v3 and v4)
> > - (clustered) Samba v4
>
> +1 to CephFS being worked on. Things like the multi-mds being improved
> upon would be amazing.

Stable CephFS is what we need, too - many concurrent write accesses from 
multiple clients with a real file system underneath. However, most stability 
problems we have had so far were crashes in the daemons, not in the Linux 
kernel. What would be great for me is some solution to what we call the domino 
effect - one daemon crashes, the next takes over, crashes at the same place 
(same data...), until the whole cluster is down. There will always be bugs, 
but they should not kill the whole cluster.

In our tests, a single MDS was no real bottleneck; it was only lacking 
stability. I have not tested the newest releases, so it might be better now. 
Improved performance with many small files being written concurrently would 
be great, but CephFS has been getting significantly faster over the last year 
and performance is being worked on all the time.

Amon Ott
-- 
Dr. Amon Ott
m-privacy GmbH   Tel: +49 30 24342334
Am Köllnischen Park 1Fax: +49 30 99296856
10179 Berlin http://www.m-privacy.de

Amtsgericht Charlottenburg, HRB 84946

Managing Directors:
 Dipl.-Kfm. Holger Maczkowsky,
 Roman Maczkowsky

GnuPG-Key-ID: 0x2DD3A649


Re: Integration work

2012-08-29 Thread Sylvain Munaut
Hi,

> How about Xen?

I vote for this :)

Using RBD storage for Xen VM images / disks is IMHO a very nice fit,
the same way people do with QEMU. This should even allow live
migration of VMs.

Currently we have to rely on the RBD kernel driver, which has some
downsides (no caching; a recent kernel is needed to get the latest Ceph
patches). There also seem to be some weird interactions between RBD
and Xen that lead to significant performance hits that are not present
when using only RBD or only Xen.

One possibility would be to develop a blktap driver for Xen to provide
a block device backend in userspace using librbd rather than the
kernel-mode rbd driver.
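
For reference, a blktap-style userspace backend would sit on top of librbd
roughly as in the sketch below, which uses the existing Python bindings
(rados/rbd) instead of the kernel driver. The conffile path, pool name "rbd",
and image name "xen-disk0" are assumptions for illustration, and error
handling is minimal; this is a sketch of the I/O path, not a blktap driver.

    # Sketch of userspace RBD access via librbd's Python bindings - the kind
    # of path a blktap backend could build on instead of the kernel rbd
    # driver. Pool "rbd" and image "xen-disk0" are assumed to exist.
    import rados
    import rbd

    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx("rbd")          # pool name (assumption)
        try:
            image = rbd.Image(ioctx, "xen-disk0")  # image name (assumption)
            try:
                print("image size:", image.size())
                data = image.read(0, 4096)          # read(offset, length)
                image.write(b"\0" * 512, 4096)      # write(data, offset)
            finally:
                image.close()
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()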

Cheers,

Sylvain


Re: Integration work

2012-08-29 Thread Wido den Hollander

On 08/29/2012 10:20 AM, Sylvain Munaut wrote:
> Hi,
>
>> How about Xen?
>
> I vote for this :)
>
> Using RBD storage for Xen VM images / disks is IMHO a very nice fit,
> the same way people do with QEMU. This should even allow live
> migration of VMs.

Correct me if I'm wrong, but when I was at Citrix in May this year
somebody there told me that Xen was going 100% QEMU?

By going 100% QEMU they would also get RBD support.

Wido

> Currently we have to rely on the RBD kernel driver, which has some
> downsides (no caching; a recent kernel is needed to get the latest Ceph
> patches). There also seem to be some weird interactions between RBD
> and Xen that lead to significant performance hits that are not present
> when using only RBD or only Xen.
>
> One possibility would be to develop a blktap driver for Xen to provide
> a block device backend in userspace using librbd rather than the
> kernel-mode rbd driver.
>
> Cheers,
>
>  Sylvain





Re: Integration work

2012-08-29 Thread Sylvain Munaut
> Correct me if I'm wrong, but when I was at Citrix in May this year somebody
> there told me that Xen was going 100% Qemu?

Huh ... I've never heard this. Also the guys in ##xen haven't either.
I'm not really involved in Xen dev and don't follow it closely, but
that seems unlikely. The few slides I looked at from the Xen Summit a
couple of days ago show that they really like their PV model.

AFAIK QEMU is only used for HVM guests to emulate the hardware. And even for
those HVM guests, it's recommended to use PV drivers for performance,
which bypass the QEMU layer altogether.

Cheers,

Sylvain


Re: Integration work

2012-08-29 Thread Wido den Hollander

On 08/29/2012 02:35 PM, Sylvain Munaut wrote:
>> Correct me if I'm wrong, but when I was at Citrix in May this year somebody
>> there told me that Xen was going 100% QEMU?
>
> Huh ... I've never heard this. Also the guys in ##xen haven't either.
> I'm not really involved in Xen dev and don't follow it closely, but
> that seems unlikely. The few slides I looked at from the Xen Summit a
> couple of days ago show that they really like their PV model.

I must be wrong then!

> AFAIK QEMU is only used for HVM guests to emulate the hardware. And even for
> those HVM guests, it's recommended to use PV drivers for performance,
> which bypass the QEMU layer altogether.

Not sure where this came from then, but in that case it would take work
from Citrix to get RBD into Xen.

Wido

> Cheers,
>
>  Sylvain





Re: Integration work

2012-08-29 Thread Tommi Virtanen
On Wed, Aug 29, 2012 at 9:40 AM, Wido den Hollander  wrote:
>> Huh ... I've never heard this. Also the guys in ##xen haven't either.
>> I'm not really involved in xen dev and don't follow it closely but
>> that seems unlikely. The few slides I looked at from the Xen Summit a
>> couple days ago show that they really like their PV model.
> I must be wrong then!

They are (at least, Red Hat is) looking at using more qemu for
xen-hvm. Whether that has any effect on the PV side, I wouldn't know.
It might make sense for them to use virtio even for PV, so they might
use qemu to implement the hypervisor side of virtio too, and that
would get you librbd support.


Re: Integration work

2012-08-29 Thread Joseph Glanville
On 29 August 2012 23:43, Tommi Virtanen  wrote:
> On Wed, Aug 29, 2012 at 9:40 AM, Wido den Hollander  wrote:
>>> Huh ... I've never heard this. Also the guys in ##xen haven't either.
>>> I'm not really involved in xen dev and don't follow it closely but
>>> that seems unlikely. The few slides I looked at from the Xen Summit a
>>> couple days ago show that they really like their PV model.
>> I must be wrong then!
>
> They are (at least, Red Hat is) looking at using more qemu for
> xen-hvm. Whether that has any effect on the PV side, I wouldn't know.
> It might make sense for them to use virtio even for PV, so they might
> use qemu to implement the hypervisor side of virtio too, and that
> would get you librbd support.

I don't think there is much going on in terms of increasing use of QEMU,
only that Xen can now use upstream QEMU rather than the Xen-specific
fork (qemu-xen-traditional).
There was a GSoC project to build a virtio front/backend for Xen, but I
am not sure if this would be the way to go.
As far as I can see, Xen dominates KVM in terms of network and I/O
performance on every benchmark, so apart from compatibility the gains
of using virtio don't seem that great... Xen's blkback/netback PV
system is just that much faster and more scalable with large numbers
of domains or 100k+ IOPS.

With regard to blktap: blktap is currently in a state where blktap2
is included in a minimal number of distros and is non-upstreamable.
blktap3, which is coming, will be fully userspace, but I have never been
a big fan of userspace block devices, YMMV.
That being said, building blktap devices is really easy (similar to tuntap).

Ideally improving the kernel RBD device would provide the best
performance across the board and the most compatibility (anything can
use a raw block device).

Joseph.

-- 
CTO | Orion Virtualisation Solutions | www.orionvm.com.au
Phone: 1300 56 99 52 | Mobile: 0428 754 846


Re: Integration work

2012-08-30 Thread João Eduardo Luís
On 08/28/2012 10:20 PM, Josh Durgin wrote:
> On 08/28/2012 02:15 PM, Tommi Virtanen wrote:
>> On Tue, Aug 28, 2012 at 5:03 PM, Florian Haas 
>> wrote:
>>> I for my part, in the documentation space, would love for the admin
>>> tools to become self-documenting. For example, I would love a "help"
>>> subcommand at any level of the ceph shell, listing the supported
>>> subcommands in that level. As in "ceph help", "ceph mon help", "ceph osd
>>> getmap help".
>>>
>>> Even better, the ceph shell could support a general-purpose hook that
>>> bash-completion can use (kind of like "hg" does in Mercurial), and this
>>> and the above-conjectured help facility could arguably share quite a bit
>>> of code.
>>
>> I would love to see all of that. But, a lot of the "ceph" tool
>> functionality is implemented by shoveling strings in and out of the
>> monitors. It largely doesn't understand what's happening.
> 
> It doesn't need to understand what's happening to give basic usage info
> though - the monitors can provide that themselves in the short term
> while we don't have an admin api like you describe below.
> 
> I added a feature request for this a little while back:
> 
> http://www.tracker.newdream.net/issues/2894

I believe this is pretty straightforward to get done.


-- 
João Eduardo Luís
gpg key: 477C26E5 from pool.keyserver.eu





RE: Integration work

2012-08-31 Thread Ryan Nicholson
Ross, All:

I've read through several recommendations, and I'd like to add 2 to that list 
for consideration.

First: for my local project, I'm using RBD with Oracle VM and Oracle VM Manager, 
mainly because of the other engineers' familiarity with the Oracle platforms, and 
because those platforms are certified by Microsoft to run Windows on Xen.

Now, out of necessity, I'll be working on a storage plugin that allows Oracle VM 
to understand RBD - to create pools, etc. - for our project. I would be interested 
to know if anyone else has already started on their own version of the same.

Secondly: through some trials, I've found that if one loses all of one's monitors 
in a way that they also lose their disks, one basically loses the cluster. I 
would like to recommend a lower-priority shift in design that allows for 
"recovery of the entire monitor set from data/snapshots automatically stored at 
the OSDs". 

For example, a monitor boots:
- the keyring file and ceph.conf are available
- the monitor sees that it is missing its local copy of maps, etc.
- it goes to the first OSDs it sees and pulls down a snapshot of the same
- it checks for another running monitor and syncs with it; if there is none,
- it boots at quorum 0, verifying OSD states
- life continues.

The big deal here is that while the entire cluster is able to recover from 
failures using one storage philosophy, the monitors are using an entirely 
different, more legacy storage philosophy - basically local RAID and safety in 
numbers. Perhaps this has already been considered, and I would be interested in 
knowing what people think here as well. Or perhaps I missed something and this 
is already done?

Thanks, for your time!

Ryan Nicholson





Re: Integration work

2012-09-04 Thread Tommi Virtanen
On Fri, Aug 31, 2012 at 11:02 PM, Ryan Nicholson  wrote:
> Secondly: Through some trials, I've found that if one loses all of his 
> Monitors in a way that they also lose their disks, one basically loses their 
> cluster. I would like to recommend a lower priority shift in design that 
> allows for "recovery of the entire monitor set from data/snapshots 
> automatically stored at the osd's".
>
> For example, a monitor boots:
> -keyring file and ceph.conf are available
> -monitor sees that it is missing its local copy of maps, etc.
> -goes onto the first OSD's it sees and pulls down a snapshot of the 
> same
> -checks for another running monitor, syncs with it, if not,
> -boots at quorum 0, verifying OSD states
> -life continues.

A monitor fetching initial information from an OSD is full of
challenges. The monitor won't know what IP addresses and ports the
OSDs are at, the OSDs won't trust the monitor to talk to them, etc. (it
lost its crypto keys, after all). It wouldn't even know which OSD to
talk to, and I highly doubt having the backup on every OSD would be a
good idea.

> The big deal here, is that while the entire cluster is able to recover from 
> failures using one storage philosophy, the monitors are using an entirely 
> different, and more legacy storage philosophy - basically local RAID/power in 
> numbers. Perhaps this has already been considered, and I would be interested 
> in knowing what people think here, as well. Or perhaps I missed something and 
> this is already done?

That's why you run multiple monitors: they provide high availability
for the monitor service as a whole. Losing all of your monitors
disrupts operation of the cluster. Losing all of their stable storage
really is disastrous. This is why you are supposed to deploy them in
different failure domains, e.g. in different rows or rooms.

If a monitor has its mon. keyring and ceph.conf, it should be able to
join an existing monitor cluster as a new member, no special-case
recovery needed.

I'm not sure what kind of architecture you have that makes losing all
of the monitor disks somehow likely, but perhaps you should just
take backups of their disks with plain old backup tools? Don't try to
store that backup in the same Ceph cluster, though. It would be
interesting to hear more about what you're thinking of here.