Re: [ceph-users] experimental features
On Sun, 7 Dec 2014, Justin Erenkrantz wrote:
> On Fri, Dec 5, 2014 at 12:46 PM, Mark Nelson mark.nel...@inktank.com wrote:
> > I'm in favor of the allow experimental features but instead call it:
> >
> > ALLOW UNRECOVERABLE DATA CORRUPTING FEATURES
> >
> > which makes things a little more explicit. With great power comes great responsibility.
>
> +1. For Subversion, we utilize SVN_I_LOVE_CORRUPTED_XXX for a few options that can cause data corruption.
>
> -- justin

Thanks, I think we should go for this one.

1. a generic option

     enable unrecoverable data corrupting features = foo bar baz

2. rename keyvaluestore-dev to keyvaluestore (and require that 'osd-objectstore-keyvaluestore' be listed above).

3. include 'ms-type-async', 'filestore-zfs-snap' as experimental.

And whatever else we want to flag...

sage

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] experimental features
On Mon, 08 Dec 2014 08:33:25 -0600 Mark Nelson wrote:
> I've been thinking for a while that we need another, more general command than ceph health to inform you about your cluster. I.e., I personally don't like having min/max PG warnings in ceph health (they can be independently controlled by ceph.conf options but that kind of approach won't scale). I'd like another command that I can run that tells me about this kind of thing.
>
> Same thing with experimental features. I don't want ceph health warning me if they've been enabled, but I do want to know if they've ever been enabled, when, and whether they are still in effect.

Very much agreed.

Expanding on this, setting a cluster to no-scrub will result in a warning (very arguably), and while a slow request after 30 seconds is WRN worthy, after something like a minute it ought to be ERR level, as this is likely to have a massive impact on clients.

Christian

> Mark
>
> On 12/08/2014 06:57 AM, Fred Yang wrote:
> > You will have to consider that in the real world whoever built the cluster might not document the dangerous option to make support staff or a successor aware of it. Thus any experimental feature considered not safe for production should be included in a warning message in 'ceph health' and in the logs, either logged periodically or logged as a warning message upon restart.
> >
> > Feature-wise, 'ceph health detail' should give you a report over all important features/options of the cluster as well.
> >
> > -Fred
> >
> > On Sun, Dec 7, 2014, 11:15 PM Justin Erenkrantz jus...@erenkrantz.com wrote:
> > > On Fri, Dec 5, 2014 at 12:46 PM, Mark Nelson mark.nel...@inktank.com wrote:
> > > > I'm in favor of the allow experimental features but instead call it:
> > > >
> > > > ALLOW UNRECOVERABLE DATA CORRUPTING FEATURES
> > > >
> > > > which makes things a little more explicit. With great power comes great responsibility.
> > >
> > > +1. For Subversion, we utilize SVN_I_LOVE_CORRUPTED_XXX for a few options that can cause data corruption.
> > >
> > > -- justin
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org
> > > More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Christian Balzer        Network/Systems Engineer
ch...@gol.com           Global OnLine Japan/Fusion Communications
http://www.gol.com/
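Christian's suggested escalation for slow requests can be written as a small threshold rule. The following is only an illustrative Python sketch: the function name, the parameter names, and the "OK" level are assumptions made for this example, and only the 30-second and roughly-one-minute thresholds come from the message above. It does not describe how Ceph actually classifies slow requests.

```python
def slow_request_severity(age_seconds, warn_after=30.0, err_after=60.0):
    """Classify a pending request by how long it has been outstanding.

    Hypothetical helper: thresholds follow the suggestion in the thread
    (WRN from 30 seconds, ERR from about a minute).
    """
    if age_seconds >= err_after:
        return "ERR"   # likely to have a massive impact on clients
    if age_seconds >= warn_after:
        return "WRN"   # slow, worth a warning
    return "OK"        # not considered slow in this sketch
```

Keeping both thresholds as parameters mirrors Mark's concern that fixed ceph.conf knobs for every warning do not scale; a caller could supply them from wherever the policy lives.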
Re: [ceph-users] experimental features
You will have to consider that in the real world whoever built the cluster might not document the dangerous option to make support staff or a successor aware of it. Thus any experimental feature considered not safe for production should be included in a warning message in 'ceph health' and in the logs, either logged periodically or logged as a warning message upon restart.

Feature-wise, 'ceph health detail' should give you a report over all important features/options of the cluster as well.

-Fred

On Sun, Dec 7, 2014, 11:15 PM Justin Erenkrantz jus...@erenkrantz.com wrote:
> On Fri, Dec 5, 2014 at 12:46 PM, Mark Nelson mark.nel...@inktank.com wrote:
> > I'm in favor of the allow experimental features but instead call it:
> >
> > ALLOW UNRECOVERABLE DATA CORRUPTING FEATURES
> >
> > which makes things a little more explicit. With great power comes great responsibility.
>
> +1. For Subversion, we utilize SVN_I_LOVE_CORRUPTED_XXX for a few options that can cause data corruption.
>
> -- justin
Re: [ceph-users] experimental features
On Fri, Dec 5, 2014 at 12:46 PM, Mark Nelson mark.nel...@inktank.com wrote:
> I'm in favor of the allow experimental features but instead call it:
>
> ALLOW UNRECOVERABLE DATA CORRUPTING FEATURES
>
> which makes things a little more explicit. With great power comes great responsibility.

+1. For Subversion, we utilize SVN_I_LOVE_CORRUPTED_XXX for a few options that can cause data corruption.

-- justin
Re: [ceph-users] experimental features
I've been thinking for a while that we need another, more general command than ceph health to inform you about your cluster. I.e., I personally don't like having min/max PG warnings in ceph health (they can be independently controlled by ceph.conf options but that kind of approach won't scale). I'd like another command that I can run that tells me about this kind of thing.

Same thing with experimental features. I don't want ceph health warning me if they've been enabled, but I do want to know if they've ever been enabled, when, and whether they are still in effect.

Mark

On 12/08/2014 06:57 AM, Fred Yang wrote:
> You will have to consider that in the real world whoever built the cluster might not document the dangerous option to make support staff or a successor aware of it. Thus any experimental feature considered not safe for production should be included in a warning message in 'ceph health' and in the logs, either logged periodically or logged as a warning message upon restart.
>
> Feature-wise, 'ceph health detail' should give you a report over all important features/options of the cluster as well.
>
> -Fred
>
> On Sun, Dec 7, 2014, 11:15 PM Justin Erenkrantz jus...@erenkrantz.com wrote:
> > On Fri, Dec 5, 2014 at 12:46 PM, Mark Nelson mark.nel...@inktank.com wrote:
> > > I'm in favor of the allow experimental features but instead call it:
> > >
> > > ALLOW UNRECOVERABLE DATA CORRUPTING FEATURES
> > >
> > > which makes things a little more explicit. With great power comes great responsibility.
> >
> > +1. For Subversion, we utilize SVN_I_LOVE_CORRUPTED_XXX for a few options that can cause data corruption.
> >
> > -- justin
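What Mark asks for (has a feature ever been enabled, when, and is it still in effect) amounts to a small persistent history rather than a health flag. The sketch below is a hypothetical Python illustration of that record keeping. `FeatureRecord` and `FeatureHistory` are names invented here, nothing like them is claimed to exist in Ceph, and persistence to disk or to the monitors is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional, Tuple


@dataclass
class FeatureRecord:
    first_enabled: datetime                  # when the feature was first turned on
    disabled_at: Optional[datetime] = None   # None while the feature is in effect

    @property
    def active(self) -> bool:
        return self.disabled_at is None


class FeatureHistory:
    """Hypothetical per-cluster log of experimental feature usage."""

    def __init__(self) -> None:
        self._records: Dict[str, FeatureRecord] = {}

    def enable(self, name: str, when: Optional[datetime] = None) -> None:
        when = when or datetime.now(timezone.utc)
        record = self._records.get(name)
        if record is None:
            self._records[name] = FeatureRecord(first_enabled=when)
        else:
            record.disabled_at = None  # re-enabled; keep the original timestamp

    def disable(self, name: str, when: Optional[datetime] = None) -> None:
        record = self._records.get(name)
        if record is not None and record.active:
            # The record is kept on purpose: the cluster was once exposed.
            record.disabled_at = when or datetime.now(timezone.utc)

    def report(self) -> List[Tuple[str, datetime, bool]]:
        """Return (name, first_enabled, still_active) sorted by name."""
        return [(name, rec.first_enabled, rec.active)
                for name, rec in sorted(self._records.items())]
```

The design choice worth noting is that `disable` never deletes a record, so a report can still say that a feature was in use at some point even after the option has been removed from the configuration.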
[ceph-users] experimental features
A while back we merged Haomai's experimental OSD backend KeyValueStore. We named the config option 'keyvaluestore_dev', hoping to make it clear to users that it was still under development, not fully tested, and not yet ready for production. In retrospect, I don't think '_dev' was sufficiently scary because many users tried it and ran into unexpected trouble.

There are several other features we've recently added or are considering adding that fall into this category. Having them in the tree is great because it streamlines QA and testing, but I want to make sure that users are not able to enable the features without being aware of the risks.

A few possible suggestions:

- scarier option names, like

    osd objectstore = keyvaluestore_experimental_danger_danger
    ms type = async_experimental_danger_danger
    ms type = xio_experimental_danger_danger

  Once the feature becomes stable, they'll have to adjust their config, or we'll need to support both names going forward.

- a separate config option that allows any experimental option

    allow experimental features danger danger = true
    osd objectstore = keyvaluestore
    ms type = xio

  This runs the risk that the user will enable experimental features to get X, and later start using Y without realizing Y is also experimental.

- enumerate experimental options we want to enable

    allow experimental features danger danger = keyvaluestore, xio
    ms type = xio
    osd objectstore = keyvaluestore

  This has the property that no config change is necessary when the feature drops its experimental status.

In all of these cases, we can also make a point of sending something to the log on daemon startup. I don't think too many people will notice this, but it is better than nothing.

Other ideas?

sage
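The third suggestion (an explicit allow list, plus a log message at startup) could be checked roughly as follows. This is a minimal Python sketch under stated assumptions, not Ceph's implementation: the `EXPERIMENTAL` table, `parse_allowed`, and `check_config` are hypothetical names, the configuration is modeled as a plain dict, and only the option names and values are taken from the message above.

```python
import logging

# Hypothetical registry mapping (option, value) to the feature name that
# must appear in the allow list. The table itself is an assumption.
EXPERIMENTAL = {
    ("osd objectstore", "keyvaluestore"): "keyvaluestore",
    ("ms type", "async"): "async",
    ("ms type", "xio"): "xio",
}


def parse_allowed(raw):
    """Split a comma-separated allow list such as 'keyvaluestore, xio'."""
    return {name.strip() for name in raw.split(",") if name.strip()}


def check_config(config, allowed, log=logging.getLogger("daemon")):
    """Return the experimental features in use; raise if one is not listed."""
    in_use = []
    for (option, value), feature in EXPERIMENTAL.items():
        if config.get(option) != value:
            continue  # this experimental setting is not selected
        if feature not in allowed:
            raise ValueError(
                f"'{option} = {value}' is experimental; "
                f"list '{feature}' explicitly to enable it"
            )
        # Startup log message, as suggested at the end of the message above.
        log.warning("experimental feature enabled: %s", feature)
        in_use.append(feature)
    return in_use
```

With a blanket `= true` switch, the `feature not in allowed` test would be skipped entirely, which is exactly the "enable X, later drift into Y" risk described for the second suggestion. When a feature stops being experimental, removing its entry from the table is enough and existing configs keep working.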
Re: [ceph-users] experimental features
* On 05 Dec 2014, Sage Weil wrote:
> adding that fall into this category. Having them in the tree is great because it streamlines QA and testing, but I want to make sure that users are not able to enable the features without being aware of the risks.
>
> A few possible suggestions:
>
> - scarier option names, like
> - a separate config option that allows any experimental option
> - enumerate experimental options we want to enable
>
> Other ideas?

A separate config file for experimental options:

  /etc/ceph/danger-danger.conf

--
David Champion • d...@uchicago.edu • University of Chicago
Re: [ceph-users] experimental features
On 12/05/2014 11:39 AM, Gregory Farnum wrote:
> On Fri, Dec 5, 2014 at 9:36 AM, Sage Weil sw...@redhat.com wrote:
> > A while back we merged Haomai's experimental OSD backend KeyValueStore. We named the config option 'keyvaluestore_dev', hoping to make it clear to users that it was still under development, not fully tested, and not yet ready for production. In retrospect, I don't think '_dev' was sufficiently scary because many users tried it and ran into unexpected trouble.
> >
> > There are several other features we've recently added or are considering adding that fall into this category. Having them in the tree is great because it streamlines QA and testing, but I want to make sure that users are not able to enable the features without being aware of the risks.
> >
> > A few possible suggestions:
> >
> > - scarier option names, like
> >
> >     osd objectstore = keyvaluestore_experimental_danger_danger
> >     ms type = async_experimental_danger_danger
> >     ms type = xio_experimental_danger_danger
> >
> >   Once the feature becomes stable, they'll have to adjust their config, or we'll need to support both names going forward.
> >
> > - a separate config option that allows any experimental option
> >
> >     allow experimental features danger danger = true
> >     osd objectstore = keyvaluestore
> >     ms type = xio
> >
> >   This runs the risk that the user will enable experimental features to get X, and later start using Y without realizing Y is also experimental.
> >
> > - enumerate experimental options we want to enable
> >
> >     allow experimental features danger danger = keyvaluestore, xio
> >     ms type = xio
> >     osd objectstore = keyvaluestore
> >
> >   This has the property that no config change is necessary when the feature drops its experimental status.
> >
> > In all of these cases, we can also make a point of sending something to the log on daemon startup. I don't think too many people will notice this, but it is better than nothing.
> >
> > Other ideas?
>
> I don't think these should even be going into release packages for users to work with. We can build them on the dev gitbuilders for QA and testing without them ever reaching the hands of users grabbing our production packages. ;)
> -Greg

I'm in favor of the allow experimental features but instead call it:

ALLOW UNRECOVERABLE DATA CORRUPTING FEATURES

which makes things a little more explicit. With great power comes great responsibility.

Mark
Re: [ceph-users] experimental features
On 12/05/2014 11:47 AM, David Champion wrote:
> * On 05 Dec 2014, Sage Weil wrote:
> > adding that fall into this category. Having them in the tree is great because it streamlines QA and testing, but I want to make sure that users are not able to enable the features without being aware of the risks.
> >
> > A few possible suggestions:
> >
> > - scarier option names, like
> > - a separate config option that allows any experimental option
> > - enumerate experimental options we want to enable
> >
> > Other ideas?
>
> A separate config file for experimental options:
>
>   /etc/ceph/danger-danger.conf

One of the questions I have in this is: once you've enabled experimental features, should the cluster be considered experimental forever, even after the feature has become stable? Maybe some kind of subtle corruption has worked its way in and will take a while to manifest. It seems to me that if you've enabled experimental features on a cluster, all bets are off.

Having the features in a separate ceph.conf file would imply that you just get rid of the danger.conf file and things are back to normal, but that's not really how it is imho.

Mark
Re: [ceph-users] experimental features
On Fri, Dec 5, 2014 at 9:36 AM, Sage Weil sw...@redhat.com wrote:
> A while back we merged Haomai's experimental OSD backend KeyValueStore. We named the config option 'keyvaluestore_dev', hoping to make it clear to users that it was still under development, not fully tested, and not yet ready for production. In retrospect, I don't think '_dev' was sufficiently scary because many users tried it and ran into unexpected trouble.
>
> There are several other features we've recently added or are considering adding that fall into this category. Having them in the tree is great because it streamlines QA and testing, but I want to make sure that users are not able to enable the features without being aware of the risks.
>
> A few possible suggestions:
>
> - scarier option names, like
>
>     osd objectstore = keyvaluestore_experimental_danger_danger
>     ms type = async_experimental_danger_danger
>     ms type = xio_experimental_danger_danger
>
>   Once the feature becomes stable, they'll have to adjust their config, or we'll need to support both names going forward.
>
> - a separate config option that allows any experimental option
>
>     allow experimental features danger danger = true
>     osd objectstore = keyvaluestore
>     ms type = xio
>
>   This runs the risk that the user will enable experimental features to get X, and later start using Y without realizing Y is also experimental.
>
> - enumerate experimental options we want to enable
>
>     allow experimental features danger danger = keyvaluestore, xio
>     ms type = xio
>     osd objectstore = keyvaluestore
>
>   This has the property that no config change is necessary when the feature drops its experimental status.
>
> In all of these cases, we can also make a point of sending something to the log on daemon startup. I don't think too many people will notice this, but it is better than nothing.
>
> Other ideas?

I don't think these should even be going into release packages for users to work with. We can build them on the dev gitbuilders for QA and testing without them ever reaching the hands of users grabbing our production packages. ;)
-Greg
Re: [ceph-users] experimental features
I prefer the third option (enumeration). I don't see a point where we would enable experimental features on our production clusters, but it would be nice to have the same bits and procedures between our dev/beta and production clusters.

On Fri, Dec 5, 2014 at 10:36 AM, Sage Weil sw...@redhat.com wrote:
> A while back we merged Haomai's experimental OSD backend KeyValueStore. We named the config option 'keyvaluestore_dev', hoping to make it clear to users that it was still under development, not fully tested, and not yet ready for production. In retrospect, I don't think '_dev' was sufficiently scary because many users tried it and ran into unexpected trouble.
>
> There are several other features we've recently added or are considering adding that fall into this category. Having them in the tree is great because it streamlines QA and testing, but I want to make sure that users are not able to enable the features without being aware of the risks.
>
> A few possible suggestions:
>
> - scarier option names, like
>
>     osd objectstore = keyvaluestore_experimental_danger_danger
>     ms type = async_experimental_danger_danger
>     ms type = xio_experimental_danger_danger
>
>   Once the feature becomes stable, they'll have to adjust their config, or we'll need to support both names going forward.
>
> - a separate config option that allows any experimental option
>
>     allow experimental features danger danger = true
>     osd objectstore = keyvaluestore
>     ms type = xio
>
>   This runs the risk that the user will enable experimental features to get X, and later start using Y without realizing Y is also experimental.
>
> - enumerate experimental options we want to enable
>
>     allow experimental features danger danger = keyvaluestore, xio
>     ms type = xio
>     osd objectstore = keyvaluestore
>
>   This has the property that no config change is necessary when the feature drops its experimental status.
>
> In all of these cases, we can also make a point of sending something to the log on daemon startup. I don't think too many people will notice this, but it is better than nothing.
>
> Other ideas?
>
> sage
Re: [ceph-users] experimental features
On Sat, Dec 6, 2014 at 4:36 AM, Sage Weil sw...@redhat.com wrote:
> - enumerate experimental options we want to enable
> ...
> This has the property that no config change is necessary when the feature drops its experimental status.

It keeps the risky options in one place too, so they are easier to spot.

> In all of these cases, we can also make a point of sending something to the log on daemon startup. I don't think too many people will notice this, but it is better than nothing.

Perhaps change the cluster health status to FRAGILE? or AT_RISK?