Re: CephFS Backend for Hadoop
Dmitrii,

Many thanks for your insight here.

~James

On Wed, Jul 26, 2017 at 11:34 PM, Patrizio Bassi wrote:
>
> 2017-07-26 9:17 GMT+02:00 Mark Shuttleworth:
>
>> On 26/07/17 07:14, Patrizio Bassi wrote:
>>
>>> Deploying Hadoop via Juju in an OpenStack tenant requires a separate
>>> model (as far as I could design it).
>>> So we may use the new Juju 2.2 cross-model relations to relate the
>>> Hadoop charms to the OpenStack Ceph units.
>>>
>>> Does that sound feasible?
>>
>> Yes, that sounds feasible. I'm not sure how Ceph identity / permissions
>> will work in that case (i.e. who has access to which data, how Ceph will
>> correlate tenants in OpenStack both through Cinder and through a direct
>> relationship). In principle though, as long as the networking is arranged
>> so that IP addresses and routes enable traffic to flow between your tenant
>> network and your Ceph network, and as long as both sets of machines can see
>> the Juju controller, they can exchange messages and traffic.
>>
>> Mark
>
> Dear Mark,
>
> On the relation-joined event we may create a new Ceph storage pool
> dedicated to the incoming unit (i.e. prefixed with the
> controller/model/unit/charm name by default). Can cephx proto
>
> Regarding networking, OpenStack Neutron by default blocks traffic from
> tenant VMs to the admin network, which is required to access the same
> Ceph mon/osd. It requires changing Neutron or implementing an external
> NAT, for instance (our solution at the moment).
>
> Patrizio

--
Juju mailing list
Juju@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju
Re: CephFS Backend for Hadoop
2017-07-26 9:17 GMT+02:00 Mark Shuttleworth:

> On 26/07/17 07:14, Patrizio Bassi wrote:
>
>> Deploying Hadoop via Juju in an OpenStack tenant requires a separate
>> model (as far as I could design it).
>> So we may use the new Juju 2.2 cross-model relations to relate the
>> Hadoop charms to the OpenStack Ceph units.
>>
>> Does that sound feasible?
>
> Yes, that sounds feasible. I'm not sure how Ceph identity / permissions
> will work in that case (i.e. who has access to which data, how Ceph will
> correlate tenants in OpenStack both through Cinder and through a direct
> relationship). In principle though, as long as the networking is arranged
> so that IP addresses and routes enable traffic to flow between your tenant
> network and your Ceph network, and as long as both sets of machines can see
> the Juju controller, they can exchange messages and traffic.
>
> Mark

Dear Mark,

On the relation-joined event we may create a new Ceph storage pool
dedicated to the incoming unit (i.e. prefixed with the
controller/model/unit/charm name by default). Can cephx proto

Regarding networking, OpenStack Neutron by default blocks traffic from
tenant VMs to the admin network, which is required to access the same
Ceph mon/osd. It requires changing Neutron or implementing an external
NAT, for instance (our solution at the moment).

Patrizio
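The per-unit pool idea above could be sketched roughly as follows. This is only an illustration of the naming step, not an existing charm or Ceph API: the function name pool_name_for_unit, the dot separator, the "data" suffix and the conservative character filtering are all assumptions, since the message only says the pool would be prefixed with the controller/model/unit/charm name.

```python
import re

# Assumption: restrict names to a conservative character set. The thread
# does not specify one; this is chosen only to keep the example predictable.
_UNSAFE = re.compile(r"[^A-Za-z0-9_.-]+")


def pool_name_for_unit(controller, model, unit, suffix="data"):
    """Build a hypothetical per-unit pool name.

    `unit` is a Juju unit name such as "namenode/0"; the slash is replaced
    so the result is a single dot-separated token.
    """
    parts = [controller, model, unit.replace("/", "-"), suffix]
    cleaned = [_UNSAFE.sub("-", part).strip("-") for part in parts]
    if not all(cleaned):
        raise ValueError("every name component must be non-empty")
    return ".".join(cleaned)
```

For example, pool_name_for_unit("prod", "hadoop", "namenode/0") gives "prod.hadoop.namenode-0.data". A relation-joined handler could compute such a name for the joining unit before asking Ceph for the pool; how cephx keys would then be scoped to that pool is left open here, as it is in the message.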
Re: CephFS Backend for Hadoop
Hi {James, Patrizio},

Be careful about using CephFS in production before Ceph Luminous, though
(it is at RC now).

Although CephFS was declared stable in Jewel,

http://ceph.com/releases/v10-2-0-jewel-released/
"CephFS: This is the first release in which CephFS is declared stable!
Several features are disabled by default, including snapshots and
multiple active MDS servers"

having multiple active MDS servers is considered experimental for
anything prior to Luminous (12.2.x), and running in 1 active/multiple
standby mode has certain issues (scalability & performance, availability):

http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-June/010728.html

http://docs.ceph.com/docs/kraken/cephfs/best-practices/
"For the best chance of a happy healthy filesystem, use a single active
MDS and do not use snapshots. Both of these are the default. Note that
creating multiple MDS daemons is fine, as these will simply be used as
standbys. However, for best stability you should avoid adjusting max_mds
upwards, as this would cause multiple daemons to be active at once."

http://docs.ceph.com/docs/master/cephfs/experimental-features/#multiple-active-metadata-servers
"Prior to the Luminous (12.2.x) release, running multiple active
metadata servers within a single filesystem was considered experimental.
Creating multiple active metadata servers is now permitted by default on
new filesystems..."

http://ceph.com/releases/v12-1-0-luminous-rc-released/
"Multiple active MDS daemons is now considered stable. The number of
active MDS servers may be adjusted up or down on an active CephFS file
system."

http://docs.ceph.com/docs/master/cephfs/multimds/
"Even with multiple active MDS daemons, a highly available system still
requires standby daemons to take over if any of the servers running an
active daemon fail."

As far as I can see, a Ceph filesystem's metadata will be sharded across
multiple MDS servers if configured. So having a multi-MDS setup does not
remove the need for standby servers and failover - this setup provides
more parallelism, but MDS high availability is still needed for
individual shards.

http://docs.ceph.com/docs/master/cephfs/standby/
"Each CephFS filesystem has a number of ranks, one by default, which
start at zero. A rank may be thought of as a metadata shard. Controlling
the number of ranks in a filesystem is described in Configuring multiple
active MDS daemons
...
Each file system may specify a number of standby daemons to be
considered healthy. This number includes daemons in standby-replay
waiting for a rank to fail (remember that a standby-replay daemon will
not be assigned to take over a failure for another rank or a failure in
a another CephFS file system)."

Also, if you need multiple CephFS file systems, it looks like you will
need this many MDS instances:

* *

"Each CephFS ceph-mds process (a daemon) initially starts up without a
rank. It may be assigned one by the monitor cluster. A daemon may only
hold one rank at a time. Daemons only give up a rank when the ceph-mds
process stops."

It is interesting how rank assignment is performed by the monitor
cluster - I would very much like to avoid cases where multiple or all
ranks of a single file system end up on one machine running multiple
active MDS daemons.

--

I think the scope of work in charm-cephfs would be to:

- implement standby MDS configuration;
- implement multi-active MDS configuration.

Best Regards,
Dmitrii Shcherbakov

Field Software Engineer
IRC (freenode): Dmitrii-Sh

On Wed, Jul 26, 2017 at 9:14 AM, Patrizio Bassi wrote:
>
> On Wed, 26 Jul 2017 at 06:28, James Beedy wrote:
>
>> Hello all,
>>
>> I will be evaluating CephFS as a backend for Hadoop over the next few
>> weeks, and will probably start investigating how this can be delivered
>> via the charms in the morning. If anyone has ventured into this realm,
>> or has an idea of what the best way to deliver this might be, I would
>> love to hear from you.
>>
>> Thanks,
>>
>> James
>
> I do!
>
> Probably I won't be able to test before the end of the year, but I plan
> to host Hadoop clusters in OpenStack tenants and I would like to share
> the same Ceph OSDs that provide infrastructure storage to OpenStack
> Nova/Cinder.
>
> Deploying Hadoop via Juju in an OpenStack tenant requires a separate
> model (as far as I could design it).
> So we may use the new Juju 2.2 cross-model relations to relate the
> Hadoop charms to the OpenStack Ceph units.
>
> Does that sound feasible?
>
> Regards,
>
> Patrizio
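On the MDS sizing and rank placement points in Dmitrii's message above, a rough sketch may help when planning a deployment. The exact expression in that message did not survive, so the arithmetic below is only one reading of the quoted docs ("A daemon may only hold one rank at a time", plus the need for standbys). The function names, the flat `standbys` count and the shape of the `placement` mapping are assumptions for illustration, not anything Ceph or charm-cephfs provides.

```python
from collections import defaultdict


def min_mds_daemons(ranks_per_fs, standbys=1):
    """Lower bound on ceph-mds daemons for a set of file systems.

    ranks_per_fs maps file system name -> number of active ranks.
    Each rank needs its own daemon; `standbys` is an assumed policy
    value for how many spare daemons to keep, not a Ceph default.
    """
    return sum(ranks_per_fs.values()) + standbys


def colocated_ranks(placement):
    """Find hosts holding more than one active rank of the same file system.

    placement maps (fs_name, rank) -> host name, however that mapping is
    obtained. Returns {(fs_name, host): [ranks]} for the offending hosts.
    """
    by_fs_host = defaultdict(list)
    for (fs_name, rank), host in placement.items():
        by_fs_host[(fs_name, host)].append(rank)
    return {key: sorted(ranks) for key, ranks in by_fs_host.items() if len(ranks) > 1}
```

With two ranks for one file system, one rank for another and two standbys, min_mds_daemons({"hadoop": 2, "home": 1}, standbys=2) comes to 5. colocated_ranks flags only the case Dmitrii wants to avoid, several ranks of one file system on a single machine; different file systems sharing a host are not reported.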
Re: CephFS Backend for Hadoop
On 26/07/17 07:14, Patrizio Bassi wrote:
> Deploying Hadoop via Juju in an OpenStack tenant requires a separate
> model (as far as I could design it).
> So we may use the new Juju 2.2 cross-model relations to relate the
> Hadoop charms to the OpenStack Ceph units.
>
> Does that sound feasible?

Yes, that sounds feasible. I'm not sure how Ceph identity / permissions
will work in that case (i.e. who has access to which data, how Ceph will
correlate tenants in OpenStack both through Cinder and through a direct
relationship). In principle though, as long as the networking is arranged
so that IP addresses and routes enable traffic to flow between your tenant
network and your Ceph network, and as long as both sets of machines can see
the Juju controller, they can exchange messages and traffic.

Mark
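A quick way to check the networking precondition Mark describes, from inside a tenant VM, is a plain TCP connect to a Ceph monitor. This is a minimal sketch: the helper name can_reach is made up, and 6789 is the conventional monitor port, so adjust it if the cluster listens elsewhere.

```python
import socket


def can_reach(host, port=6789, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unroutable addresses.
        return False
```

A True result only shows that routes and security rules let the traffic through; it says nothing about the identity and permission questions raised above.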
Re: CephFS Backend for Hadoop
On Wed, 26 Jul 2017 at 06:28, James Beedy wrote:

> Hello all,
>
> I will be evaluating CephFS as a backend for Hadoop over the next few
> weeks, and will probably start investigating how this can be delivered
> via the charms in the morning. If anyone has ventured into this realm,
> or has an idea of what the best way to deliver this might be, I would
> love to hear from you.
>
> Thanks,
>
> James

I do!

Probably I won't be able to test before the end of the year, but I plan
to host Hadoop clusters in OpenStack tenants and I would like to share
the same Ceph OSDs that provide infrastructure storage to OpenStack
Nova/Cinder.

Deploying Hadoop via Juju in an OpenStack tenant requires a separate
model (as far as I could design it).
So we may use the new Juju 2.2 cross-model relations to relate the
Hadoop charms to the OpenStack Ceph units.

Does that sound feasible?

Regards,

Patrizio
CephFS Backend for Hadoop
Hello all,

I will be evaluating CephFS as a backend for Hadoop over the next few
weeks, and will probably start investigating how this can be delivered
via the charms in the morning. If anyone has ventured into this realm,
or has an idea of what the best way to deliver this might be, I would
love to hear from you.

Thanks,

James