Re: [ceph-users] dropping filestore+btrfs testing for luminous
On 2017-06-30T16:48:04, Sage Weil wrote:
> > Simply disabling the tests while keeping the code in the distribution is
> > setting up users who happen to be using Btrfs for failure.
>
> I don't think we can wait *another* cycle (year) to stop testing this.
>
> We can, however,
>
> - prominently feature this in the luminous release notes, and
> - require the 'enable experimental unrecoverable data corrupting features =
>   btrfs' in order to use it, so that users are explicitly opting in to
>   luminous+btrfs territory.
>
> The only good(ish) news is that we aren't touching FileStore if we can
> help it, so it is less likely to regress than other things. And we'll
> continue testing filestore+btrfs on jewel for some time.

That makes sense. Though btrfs is something users really shouldn't run unless they get a heavily debugged and supported version from somewhere.

I'd also not mind just plain out dropping it completely, since I don't believe any of our users runs it; they're all on XFS and will upconvert to BlueStore.

That might be a good reason though: upgrading folks should be able to get the OSDs on btrfs up (if they still have any) and go directly to BlueStore, without having to first go via XFS.

Regards,
Lars

--
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] dropping filestore+btrfs testing for luminous
On 04/07/2017 at 19:00, Jack wrote:
> You may just upgrade to Luminous, then replace filestore by bluestore

You don't just "replace" filestore with bluestore on a production cluster: you transition over several weeks/months from the first to the second. The two must be rock stable and have predictable performance characteristics for that to work. We took more than 6 months with Firefly to migrate from XFS to Btrfs and studied/tuned the cluster along the way.

Simply replacing one store with another without any experience of the real-world behavior of the new one is just playing with fire (and a huge heap of customer data).

Best regards,
Lionel
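The gradual transition described here is usually done one OSD at a time: drain the OSD, wait for the cluster to re-heal, then redeploy it on the new backend before touching the next one. A minimal dry-run sketch of that pacing; `migrate_osd` and the `RUN` switch are illustrative names, not existing Ceph tooling, and the `ceph-disk prepare --bluestore` step is an assumption about the deployment tool of that era:

```shell
# Dry-run sketch of a one-OSD-at-a-time filestore -> bluestore migration.
# RUN defaults to "echo" so commands are only printed; set RUN="" to
# execute them for real. The pacing is the point, not the exact commands.
RUN="${RUN:-echo}"

migrate_osd() {
    id="$1"; dev="$2"
    $RUN ceph osd out "$id"
    # ...in real life: wait here until 'ceph health' reports HEALTH_OK
    # again, i.e. the OSD's data has been fully backfilled elsewhere...
    $RUN systemctl stop "ceph-osd@$id"
    $RUN ceph osd crush remove "osd.$id"
    $RUN ceph auth del "osd.$id"
    $RUN ceph osd rm "$id"
    # recreate the OSD on the new backend (ceph-disk syntax, assumed)
    $RUN ceph-disk prepare --bluestore "$dev"
}
```

Run it against one OSD, watch stability and performance for a while, then move on — which matches the months-long cadence Lionel describes.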
Re: [ceph-users] dropping filestore+btrfs testing for luminous
> You may just upgrade to Luminous, then replace filestore by bluestore

Don't be scared; as Sage said:

> The only good(ish) news is that we aren't touching FileStore if we can
> help it, so it is less likely to regress than other things. And we'll
> continue testing filestore+btrfs on jewel for some time.

In my opinion, it should be fine that way.

On 04/07/2017 18:54, Lionel Bouton wrote:
> On 30/06/2017 at 18:48, Sage Weil wrote:
>> On Fri, 30 Jun 2017, Lenz Grimmer wrote:
>>> Hi Sage,
>>>
>>> On 06/30/2017 05:21 AM, Sage Weil wrote:
>>> The easiest thing is to 1/ Stop testing filestore+btrfs for luminous onward. We've recommended against btrfs for a long time and are moving toward bluestore anyway.
>>>
>>> Searching the documentation for "btrfs" does not really give a user any clue that the use of Btrfs is discouraged.
>>>
>>> Where exactly has this been recommended?
>>>
>>> The documentation currently states:
>>>
>>> http://docs.ceph.com/docs/master/rados/configuration/ceph-conf/?highlight=btrfs#osds
>>>
>>> "We recommend using the xfs file system or the btrfs file system when running mkfs."
>>>
>>> http://docs.ceph.com/docs/master/rados/configuration/filesystem-recommendations/?highlight=btrfs#filesystems
>>>
>>> "btrfs is still supported and has a comparatively compelling set of features, but be mindful of its stability and support status in your Linux distribution."
>>>
>>> http://docs.ceph.com/docs/master/start/os-recommendations/?highlight=btrfs#ceph-dependencies
>>>
>>> "If you use the btrfs file system with Ceph, we recommend using a recent Linux kernel (3.14 or later)."
>>>
>>> As an end user, none of these statements would really sound as recommendations *against* using Btrfs to me.
>>>
>>> I'm therefore concerned about just disabling the tests related to filestore on Btrfs while still including and shipping it. This has potential to introduce regressions that won't get caught and fixed.
>>
>> Ah, crap. This is what happens when devs don't read their own documentation. I recommend against btrfs every time it ever comes up, the downstream distributions all support only xfs, but yes, it looks like the docs never got updated... despite the xfs focus being 5ish years old now.
>>
>> I'll submit a PR to clean this up, but
>> 2/ Leave btrfs in the mix for jewel, and manually tolerate and filter out the occasional ENOSPC errors we see. (They make the test runs noisy but are pretty easy to identify.) If we don't stop testing filestore on btrfs now, I'm not sure when we would ever be able to stop, and that's pretty clearly not sustainable. Does that seem reasonable? (Pretty please?)
>>
>>> If you want to get rid of filestore on Btrfs, start a proper deprecation process and inform users that support for it is going to be removed in the near future. The documentation must be updated accordingly and it must be clearly emphasized in the release notes.
>>>
>>> Simply disabling the tests while keeping the code in the distribution is setting up users who happen to be using Btrfs for failure.
>>
>> I don't think we can wait *another* cycle (year) to stop testing this.
>>
>> We can, however,
>>
>> - prominently feature this in the luminous release notes, and
>> - require the 'enable experimental unrecoverable data corrupting features = btrfs' in order to use it, so that users are explicitly opting in to luminous+btrfs territory.
>>
>> The only good(ish) news is that we aren't touching FileStore if we can help it, so it is less likely to regress than other things. And we'll continue testing filestore+btrfs on jewel for some time.
>>
>> Is that good enough?
>
> Not sure how we will handle the transition. Is bluestore considered stable in Jewel? Then our current clusters (recently migrated from Firefly to Hammer) will have support for both BTRFS+Filestore and Bluestore when the next upgrade takes place. If Bluestore is only considered stable on Luminous, I don't see how we can manage the transition easily. The only path I see is to:
> - migrate to XFS+filestore with Jewel (which will not only take time but will be a regression for us: this will cause performance and sizing problems on at least one of our clusters and we will lose the silent corruption detection from BTRFS)
> - then upgrade to Luminous and migrate again to Bluestore.
>
> I was not expecting the transition from Btrfs+Filestore to Bluestore to be this convoluted (we were planning to add Bluestore OSDs one at a time and study the performance/stability for months before migrating the whole clusters). Is there any way to restrict your BTRFS tests to at least a given stable configuration (BTRFS is known to have problems with the high rate of snapshot deletion Ceph generates by default, for example, and we use 'filestore btrfs snap = false')?
>
> Best regards,
>
> Lionel
Re: [ceph-users] dropping filestore+btrfs testing for luminous
On 30/06/2017 at 18:48, Sage Weil wrote:
> On Fri, 30 Jun 2017, Lenz Grimmer wrote:
>> Hi Sage,
>>
>> On 06/30/2017 05:21 AM, Sage Weil wrote:
>>
>>> The easiest thing is to
>>>
>>> 1/ Stop testing filestore+btrfs for luminous onward. We've recommended against btrfs for a long time and are moving toward bluestore anyway.
>>
>> Searching the documentation for "btrfs" does not really give a user any clue that the use of Btrfs is discouraged.
>>
>> Where exactly has this been recommended?
>>
>> The documentation currently states:
>>
>> http://docs.ceph.com/docs/master/rados/configuration/ceph-conf/?highlight=btrfs#osds
>>
>> "We recommend using the xfs file system or the btrfs file system when running mkfs."
>>
>> http://docs.ceph.com/docs/master/rados/configuration/filesystem-recommendations/?highlight=btrfs#filesystems
>>
>> "btrfs is still supported and has a comparatively compelling set of features, but be mindful of its stability and support status in your Linux distribution."
>>
>> http://docs.ceph.com/docs/master/start/os-recommendations/?highlight=btrfs#ceph-dependencies
>>
>> "If you use the btrfs file system with Ceph, we recommend using a recent Linux kernel (3.14 or later)."
>>
>> As an end user, none of these statements would really sound as recommendations *against* using Btrfs to me.
>>
>> I'm therefore concerned about just disabling the tests related to filestore on Btrfs while still including and shipping it. This has potential to introduce regressions that won't get caught and fixed.
>
> Ah, crap. This is what happens when devs don't read their own documentation. I recommend against btrfs every time it ever comes up, the downstream distributions all support only xfs, but yes, it looks like the docs never got updated... despite the xfs focus being 5ish years old now.
>
> I'll submit a PR to clean this up, but
>
>>> 2/ Leave btrfs in the mix for jewel, and manually tolerate and filter out the occasional ENOSPC errors we see. (They make the test runs noisy but are pretty easy to identify.)
>>>
>>> If we don't stop testing filestore on btrfs now, I'm not sure when we would ever be able to stop, and that's pretty clearly not sustainable. Does that seem reasonable? (Pretty please?)
>>
>> If you want to get rid of filestore on Btrfs, start a proper deprecation process and inform users that support for it is going to be removed in the near future. The documentation must be updated accordingly and it must be clearly emphasized in the release notes.
>>
>> Simply disabling the tests while keeping the code in the distribution is setting up users who happen to be using Btrfs for failure.
>
> I don't think we can wait *another* cycle (year) to stop testing this.
>
> We can, however,
>
> - prominently feature this in the luminous release notes, and
> - require the 'enable experimental unrecoverable data corrupting features = btrfs' in order to use it, so that users are explicitly opting in to luminous+btrfs territory.
>
> The only good(ish) news is that we aren't touching FileStore if we can help it, so it is less likely to regress than other things. And we'll continue testing filestore+btrfs on jewel for some time.
>
> Is that good enough?

Not sure how we will handle the transition. Is bluestore considered stable in Jewel? Then our current clusters (recently migrated from Firefly to Hammer) will have support for both BTRFS+Filestore and Bluestore when the next upgrade takes place. If Bluestore is only considered stable on Luminous, I don't see how we can manage the transition easily.

The only path I see is to:
- migrate to XFS+filestore with Jewel (which will not only take time but will be a regression for us: this will cause performance and sizing problems on at least one of our clusters and we will lose the silent corruption detection from BTRFS)
- then upgrade to Luminous and migrate again to Bluestore.

I was not expecting the transition from Btrfs+Filestore to Bluestore to be this convoluted (we were planning to add Bluestore OSDs one at a time and study the performance/stability for months before migrating the whole clusters). Is there any way to restrict your BTRFS tests to at least a given stable configuration (BTRFS is known to have problems with the high rate of snapshot deletion Ceph generates by default, for example, and we use 'filestore btrfs snap = false')?

Best regards,

Lionel
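For reference, the snapshot workaround Lionel mentions is an ordinary OSD option in ceph.conf; the option string is exactly as quoted in his mail, while placing it in the `[osd]` section is the usual convention (a sketch, not a recommendation):

```ini
[osd]
# Stop FileStore from using btrfs snapshots for consistency checkpoints;
# sidesteps btrfs's trouble with Ceph's high snapshot create/delete rate.
filestore btrfs snap = false
```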
Re: [ceph-users] dropping filestore+btrfs testing for luminous
> On 30 June 2017 at 18:48, Sage Weil wrote:
>
> On Fri, 30 Jun 2017, Lenz Grimmer wrote:
> > Hi Sage,
> >
> > On 06/30/2017 05:21 AM, Sage Weil wrote:
> >
> > > The easiest thing is to
> > >
> > > 1/ Stop testing filestore+btrfs for luminous onward. We've recommended against btrfs for a long time and are moving toward bluestore anyway.
> >
> > Searching the documentation for "btrfs" does not really give a user any clue that the use of Btrfs is discouraged.
> >
> > Where exactly has this been recommended?
> >
> > The documentation currently states:
> >
> > http://docs.ceph.com/docs/master/rados/configuration/ceph-conf/?highlight=btrfs#osds
> >
> > "We recommend using the xfs file system or the btrfs file system when running mkfs."
> >
> > http://docs.ceph.com/docs/master/rados/configuration/filesystem-recommendations/?highlight=btrfs#filesystems
> >
> > "btrfs is still supported and has a comparatively compelling set of features, but be mindful of its stability and support status in your Linux distribution."
> >
> > http://docs.ceph.com/docs/master/start/os-recommendations/?highlight=btrfs#ceph-dependencies
> >
> > "If you use the btrfs file system with Ceph, we recommend using a recent Linux kernel (3.14 or later)."
> >
> > As an end user, none of these statements would really sound as recommendations *against* using Btrfs to me.
> >
> > I'm therefore concerned about just disabling the tests related to filestore on Btrfs while still including and shipping it. This has potential to introduce regressions that won't get caught and fixed.
>
> Ah, crap. This is what happens when devs don't read their own documentation. I recommend against btrfs every time it ever comes up, the downstream distributions all support only xfs, but yes, it looks like the docs never got updated... despite the xfs focus being 5ish years old now.
>
> I'll submit a PR to clean this up, but
>
> > > 2/ Leave btrfs in the mix for jewel, and manually tolerate and filter out the occasional ENOSPC errors we see. (They make the test runs noisy but are pretty easy to identify.)
> > >
> > > If we don't stop testing filestore on btrfs now, I'm not sure when we would ever be able to stop, and that's pretty clearly not sustainable. Does that seem reasonable? (Pretty please?)
> >
> > If you want to get rid of filestore on Btrfs, start a proper deprecation process and inform users that support for it is going to be removed in the near future. The documentation must be updated accordingly and it must be clearly emphasized in the release notes.
> >
> > Simply disabling the tests while keeping the code in the distribution is setting up users who happen to be using Btrfs for failure.
>
> I don't think we can wait *another* cycle (year) to stop testing this.
>
> We can, however,
>
> - prominently feature this in the luminous release notes, and
> - require the 'enable experimental unrecoverable data corrupting features = btrfs' in order to use it, so that users are explicitly opting in to luminous+btrfs territory.
>
> The only good(ish) news is that we aren't touching FileStore if we can help it, so it is less likely to regress than other things. And we'll continue testing filestore+btrfs on jewel for some time.
>
> Is that good enough?

Sounds good to me. Every cluster I run into runs XFS. People running btrfs did that deliberately, and by adding that flag you encourage them to go to BlueStore.

Wido

> sage
Re: [ceph-users] dropping filestore+btrfs testing for luminous
On 30/06/2017 18:48, Sage Weil wrote:
> We can, however,
>
> - prominently feature this in the luminous release notes, and
> - require the 'enable experimental unrecoverable data corrupting features = btrfs' in order to use it, so that users are explicitly opting in to luminous+btrfs territory.
>
> Is that good enough?
>
> sage

This seems sane to me.
Re: [ceph-users] dropping filestore+btrfs testing for luminous
On Fri, 30 Jun 2017, Lenz Grimmer wrote:
> Hi Sage,
>
> On 06/30/2017 05:21 AM, Sage Weil wrote:
>
> > The easiest thing is to
> >
> > 1/ Stop testing filestore+btrfs for luminous onward. We've recommended against btrfs for a long time and are moving toward bluestore anyway.
>
> Searching the documentation for "btrfs" does not really give a user any clue that the use of Btrfs is discouraged.
>
> Where exactly has this been recommended?
>
> The documentation currently states:
>
> http://docs.ceph.com/docs/master/rados/configuration/ceph-conf/?highlight=btrfs#osds
>
> "We recommend using the xfs file system or the btrfs file system when running mkfs."
>
> http://docs.ceph.com/docs/master/rados/configuration/filesystem-recommendations/?highlight=btrfs#filesystems
>
> "btrfs is still supported and has a comparatively compelling set of features, but be mindful of its stability and support status in your Linux distribution."
>
> http://docs.ceph.com/docs/master/start/os-recommendations/?highlight=btrfs#ceph-dependencies
>
> "If you use the btrfs file system with Ceph, we recommend using a recent Linux kernel (3.14 or later)."
>
> As an end user, none of these statements would really sound as recommendations *against* using Btrfs to me.
>
> I'm therefore concerned about just disabling the tests related to filestore on Btrfs while still including and shipping it. This has potential to introduce regressions that won't get caught and fixed.

Ah, crap. This is what happens when devs don't read their own documentation. I recommend against btrfs every time it ever comes up, the downstream distributions all support only xfs, but yes, it looks like the docs never got updated... despite the xfs focus being 5ish years old now.

I'll submit a PR to clean this up, but

> > 2/ Leave btrfs in the mix for jewel, and manually tolerate and filter out the occasional ENOSPC errors we see. (They make the test runs noisy but are pretty easy to identify.)
> >
> > If we don't stop testing filestore on btrfs now, I'm not sure when we would ever be able to stop, and that's pretty clearly not sustainable. Does that seem reasonable? (Pretty please?)
>
> If you want to get rid of filestore on Btrfs, start a proper deprecation process and inform users that support for it is going to be removed in the near future. The documentation must be updated accordingly and it must be clearly emphasized in the release notes.
>
> Simply disabling the tests while keeping the code in the distribution is setting up users who happen to be using Btrfs for failure.

I don't think we can wait *another* cycle (year) to stop testing this.

We can, however,

- prominently feature this in the luminous release notes, and
- require the 'enable experimental unrecoverable data corrupting features = btrfs' in order to use it, so that users are explicitly opting in to luminous+btrfs territory.

The only good(ish) news is that we aren't touching FileStore if we can help it, so it is less likely to regress than other things. And we'll continue testing filestore+btrfs on jewel for some time.

Is that good enough?

sage
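Concretely, the opt-in Sage proposes would be a ceph.conf setting. The option string below is quoted verbatim from his mail; placing it in the `[osd]` section is an assumption about where it would go:

```ini
[osd]
# Explicit opt-in to the unsupported luminous+btrfs combination;
# without this, btrfs-backed FileStore OSDs would refuse to start.
enable experimental unrecoverable data corrupting features = btrfs
```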
Re: [ceph-users] dropping filestore+btrfs testing for luminous
On Fri, 30 Jun 2017 16:29:43 + David Turner wrote:

> I actually don't see either of these as issues with just flat out saying that Btrfs will not be supported in Luminous. It's a full new release and it sounds like it is no longer a relevant Filestore backend in Luminous. People can either plan to migrate their OSDs to Bluestore once they reach Luminous or just not upgrade to Luminous. Upgrading is optional and not mandatory.

You tell that to the people in charge when there's a critical bug in a version that's no longer maintained. At the release cycle speed of Ceph this tends to be an option only for those of us who are happy to freeze a cluster at a certain version until it dies of natural causes.

That being said, anybody who deployed BTRFS within the last 1-2 years should have seen the writing on the wall, but the ability to read between the lines is no excuse for skipping a "proper deprecation", indeed. And at this time that probably should be extended formally to ZFS.

Christian

> On Fri, Jun 30, 2017 at 11:47 AM Lenz Grimmer wrote:
>
> > Hi Sage,
> >
> > On 06/30/2017 05:21 AM, Sage Weil wrote:
> >
> > > The easiest thing is to
> > >
> > > 1/ Stop testing filestore+btrfs for luminous onward. We've recommended against btrfs for a long time and are moving toward bluestore anyway.
> >
> > Searching the documentation for "btrfs" does not really give a user any clue that the use of Btrfs is discouraged.
> >
> > Where exactly has this been recommended?
> >
> > The documentation currently states:
> >
> > http://docs.ceph.com/docs/master/rados/configuration/ceph-conf/?highlight=btrfs#osds
> >
> > "We recommend using the xfs file system or the btrfs file system when running mkfs."
> >
> > http://docs.ceph.com/docs/master/rados/configuration/filesystem-recommendations/?highlight=btrfs#filesystems
> >
> > "btrfs is still supported and has a comparatively compelling set of features, but be mindful of its stability and support status in your Linux distribution."
> >
> > http://docs.ceph.com/docs/master/start/os-recommendations/?highlight=btrfs#ceph-dependencies
> >
> > "If you use the btrfs file system with Ceph, we recommend using a recent Linux kernel (3.14 or later)."
> >
> > As an end user, none of these statements would really sound as recommendations *against* using Btrfs to me.
> >
> > I'm therefore concerned about just disabling the tests related to filestore on Btrfs while still including and shipping it. This has potential to introduce regressions that won't get caught and fixed.
> >
> > > 2/ Leave btrfs in the mix for jewel, and manually tolerate and filter out the occasional ENOSPC errors we see. (They make the test runs noisy but are pretty easy to identify.)
> > >
> > > If we don't stop testing filestore on btrfs now, I'm not sure when we would ever be able to stop, and that's pretty clearly not sustainable. Does that seem reasonable? (Pretty please?)
> >
> > If you want to get rid of filestore on Btrfs, start a proper deprecation process and inform users that support for it is going to be removed in the near future. The documentation must be updated accordingly and it must be clearly emphasized in the release notes.
> >
> > Simply disabling the tests while keeping the code in the distribution is setting up users who happen to be using Btrfs for failure.
> >
> > Just my 0.02€,
> >
> > Lenz

--
Christian Balzer        Network/Systems Engineer
ch...@gol.com           Rakuten Communications
Re: [ceph-users] dropping filestore+btrfs testing for luminous
I actually don't see either of these as issues with just flat out saying that Btrfs will not be supported in Luminous. It's a full new release, and it sounds like it is no longer a relevant Filestore backend in Luminous. People can either plan to migrate their OSDs to Bluestore once they reach Luminous or just not upgrade to Luminous. Upgrading is optional and not mandatory.

On Fri, Jun 30, 2017 at 11:47 AM Lenz Grimmer wrote:

> Hi Sage,
>
> On 06/30/2017 05:21 AM, Sage Weil wrote:
>
> > The easiest thing is to
> >
> > 1/ Stop testing filestore+btrfs for luminous onward. We've recommended against btrfs for a long time and are moving toward bluestore anyway.
>
> Searching the documentation for "btrfs" does not really give a user any clue that the use of Btrfs is discouraged.
>
> Where exactly has this been recommended?
>
> The documentation currently states:
>
> http://docs.ceph.com/docs/master/rados/configuration/ceph-conf/?highlight=btrfs#osds
>
> "We recommend using the xfs file system or the btrfs file system when running mkfs."
>
> http://docs.ceph.com/docs/master/rados/configuration/filesystem-recommendations/?highlight=btrfs#filesystems
>
> "btrfs is still supported and has a comparatively compelling set of features, but be mindful of its stability and support status in your Linux distribution."
>
> http://docs.ceph.com/docs/master/start/os-recommendations/?highlight=btrfs#ceph-dependencies
>
> "If you use the btrfs file system with Ceph, we recommend using a recent Linux kernel (3.14 or later)."
>
> As an end user, none of these statements would really sound as recommendations *against* using Btrfs to me.
>
> I'm therefore concerned about just disabling the tests related to filestore on Btrfs while still including and shipping it. This has potential to introduce regressions that won't get caught and fixed.
>
> > 2/ Leave btrfs in the mix for jewel, and manually tolerate and filter out the occasional ENOSPC errors we see. (They make the test runs noisy but are pretty easy to identify.)
> >
> > If we don't stop testing filestore on btrfs now, I'm not sure when we would ever be able to stop, and that's pretty clearly not sustainable. Does that seem reasonable? (Pretty please?)
>
> If you want to get rid of filestore on Btrfs, start a proper deprecation process and inform users that support for it is going to be removed in the near future. The documentation must be updated accordingly and it must be clearly emphasized in the release notes.
>
> Simply disabling the tests while keeping the code in the distribution is setting up users who happen to be using Btrfs for failure.
>
> Just my 0.02€,
>
> Lenz
Re: [ceph-users] dropping filestore+btrfs testing for luminous
On Fri, 30 Jun 2017, Lenz Grimmer said:
> > 1/ Stop testing filestore+btrfs for luminous onward. We've recommended
> > against btrfs for a long time and are moving toward bluestore anyway.
>
> Searching the documentation for "btrfs" does not really give a user any clue that the use of Btrfs is discouraged.
>
> Where exactly has this been recommended?

As a new user, I certainly picked up on btrfs being discouraged, or at least not as stable as XFS, e.g.:

http://docs.ceph.com/docs/master/rados/configuration/filesystem-recommendations/?highlight=btrfs

"We currently recommend XFS for production deployments. We used to recommend btrfs for testing, development, and any non-critical deployments ..."

http://docs.ceph.com/docs/master/start/hardware-recommendations/?highlight=btrfs

"btrfs is not quite stable enough for production"

> If you want to get rid of filestore on Btrfs, start a proper deprecation process and inform users that support for it is going to be removed in the near future. The documentation must be updated accordingly and it must be clearly emphasized in the release notes.

But this sounds sane.

Sean Purdy
CV-Library Ltd
Re: [ceph-users] dropping filestore+btrfs testing for luminous
Hi Sage,

On 06/30/2017 05:21 AM, Sage Weil wrote:

> The easiest thing is to
>
> 1/ Stop testing filestore+btrfs for luminous onward. We've recommended
> against btrfs for a long time and are moving toward bluestore anyway.

Searching the documentation for "btrfs" does not really give a user any clue that the use of Btrfs is discouraged.

Where exactly has this been recommended?

The documentation currently states:

http://docs.ceph.com/docs/master/rados/configuration/ceph-conf/?highlight=btrfs#osds

"We recommend using the xfs file system or the btrfs file system when running mkfs."

http://docs.ceph.com/docs/master/rados/configuration/filesystem-recommendations/?highlight=btrfs#filesystems

"btrfs is still supported and has a comparatively compelling set of features, but be mindful of its stability and support status in your Linux distribution."

http://docs.ceph.com/docs/master/start/os-recommendations/?highlight=btrfs#ceph-dependencies

"If you use the btrfs file system with Ceph, we recommend using a recent Linux kernel (3.14 or later)."

As an end user, none of these statements would really sound as recommendations *against* using Btrfs to me.

I'm therefore concerned about just disabling the tests related to filestore on Btrfs while still including and shipping it. This has potential to introduce regressions that won't get caught and fixed.

> 2/ Leave btrfs in the mix for jewel, and manually tolerate and filter out
> the occasional ENOSPC errors we see. (They make the test runs noisy but
> are pretty easy to identify.)
>
> If we don't stop testing filestore on btrfs now, I'm not sure when we
> would ever be able to stop, and that's pretty clearly not sustainable.
> Does that seem reasonable? (Pretty please?)

If you want to get rid of filestore on Btrfs, start a proper deprecation process and inform users that support for it is going to be removed in the near future. The documentation must be updated accordingly and it must be clearly emphasized in the release notes.

Simply disabling the tests while keeping the code in the distribution is setting up users who happen to be using Btrfs for failure.

Just my 0.02€,

Lenz
Re: [ceph-users] dropping filestore+btrfs testing for luminous
On 06/30/17 05:21, Sage Weil wrote:
> We're having a series of problems with the valgrind included in xenial[1]
> that have led us to restrict all valgrind tests to centos nodes. At the
> same time, we're also seeing spurious ENOSPC errors from btrfs on both
> centos and xenial kernels[2], making trusty the only distro where btrfs
> works reliably.

Do you guys know about balance filters and how to use them to prevent ENOSPC? See:

https://btrfs.wiki.kernel.org/index.php/Balance_Filters

Basically, btrfs sometimes (when using snaps heavily) just accumulates many partially used chunks, so you rebalance the data inside them so that it can remove the fully empty ones and reuse the space. The above page says to run commands like:

> btrfs balance start -dusage=50 /

where you start at 50 or so and raise it and rerun if you want, until you have reclaimed enough space.

So, to make the automated tests eat less of your time, you could script something that runs that after some number of unit tests, or the btrfs filestore itself could do it after removing some amount of snapshots, or after ENOSPC and retry. I don't know what's easy to implement, just making sure you're aware of the option.
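The raise-and-rerun loop described above is easy to script; a minimal sketch, where `reclaim_empty_chunks` and the overridable `BTRFS_CMD` variable are illustrative names (not part of btrfs-progs) and the threshold steps are arbitrary:

```shell
# reclaim_empty_chunks: run 'btrfs balance' with progressively higher
# -dusage thresholds, so mostly-empty data chunks are compacted first
# and fuller ones only if still needed. $1 = mountpoint.
# BTRFS_CMD can be overridden (e.g. with "echo") to dry-run the loop
# without a real btrfs mount.
reclaim_empty_chunks() {
    mnt="${1:-/}"
    for usage in 10 25 50 75; do
        echo "rebalancing data chunks with usage <= ${usage}%" >&2
        ${BTRFS_CMD:-btrfs} balance start "-dusage=${usage}" "$mnt" || return 1
    done
}
```

In a test harness you would call this after every N runs, or on the first ENOSPC, and then retry the failed operation once.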
[ceph-users] dropping filestore+btrfs testing for luminous
We're having a series of problems with the valgrind included in xenial[1] that have led us to restrict all valgrind tests to centos nodes. At the same time, we're also seeing spurious ENOSPC errors from btrfs on both centos and xenial kernels[2], making trusty the only distro where btrfs works reliably. Teuthology doesn't handle this well when it tries to put together the test matrix (we can't test filestore+btrfs+valgrind).

The easiest thing is to

1/ Stop testing filestore+btrfs for luminous onward. We've recommended against btrfs for a long time and are moving toward bluestore anyway.

2/ Leave btrfs in the mix for jewel, and manually tolerate and filter out the occasional ENOSPC errors we see. (They make the test runs noisy but are pretty easy to identify.)

If we don't stop testing filestore on btrfs now, I'm not sure when we would ever be able to stop, and that's pretty clearly not sustainable. Does that seem reasonable? (Pretty please?)

sage

[1] http://tracker.ceph.com/issues/18126 and http://tracker.ceph.com/issues/20360
[2] http://tracker.ceph.com/issues/20169
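The scheduling conflict described above can be made concrete by enumerating the combinations: valgrind jobs need centos nodes, btrfs only behaves on trusty, so no runnable job can cover filestore+btrfs+valgrind. A toy sketch of that filtering — the distro and backend names are from this thread, but `build_matrix` is an illustrative helper, not teuthology code:

```shell
# Enumerate (distro x objectstore x valgrind) combinations and drop the
# ones the thread says are unrunnable: valgrind only on centos, and
# filestore+btrfs only on trusty kernels.
build_matrix() {
    for distro in trusty xenial centos; do
        for store in filestore-xfs filestore-btrfs bluestore; do
            for valgrind in yes no; do
                # valgrind tests are restricted to centos nodes
                [ "$valgrind" = yes ] && [ "$distro" != centos ] && continue
                # btrfs only works reliably on trusty
                [ "$store" = filestore-btrfs ] && [ "$distro" != trusty ] && continue
                echo "$distro $store $valgrind"
            done
        done
    done
}
```

Counting the output gives 9 runnable combinations out of 18, and none of them pairs filestore-btrfs with valgrind, which is exactly the hole in coverage being discussed.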