Re: Feedback on a base "fake" type in the testing repo
High level thoughts:

1. Using an interface{} instead of patching function pointers sounds like a great thing.

2. Writing a provider via faking out rather than implementing a double means that it is really hard to run the cross-provider interface tests (environs/jujutest), which means we run into what we have with Azure, where *some* of the functions return a different error than the one we expect (see the recent 'juju restore' regression). Our interface tests aren't amazing, but if we can push more on that, then we can avoid having implementations leak through their abstractions.

3. I do realize that time-to-something-working is a lot faster with a Fake implementation, but ongoing support and maintenance might cost more. (I love how quickly the GCE provider was available; I'm concerned whether we know that it actually acts like EC2/OpenStack when it comes to edge cases.)

John =:->

On Wed, Feb 11, 2015 at 10:53 PM, Eric Snow wrote:
> tl;dr Using fakes for testing works well so I wrote a base fake type. [1]
>
> While working on the GCE provider, Wayne and I started taking a
> different approach to unit testing than the usual 1. expose an
> external dependency as an unexported var; 2. export it in
> export_test.go; 3. patch it out in tests. [2] Instead we wrote an
> interface for the provider's low-level API and implemented the
> provider relative to that. [3] Then in tests we used a fake
> implementation of that interface instead of the concrete one. Instead
> of making the actual API requests, the fake simply tracked method
> calls and controlled return values.
>
> Using the fake had a number of benefits:
>
> 1. Our tests were very focused on what code gets exercised, meaning we
> don't take responsibility for what amounts to testing our dependencies
> at the same time as the code at hand.
> 2. The tests ran faster since they aren't making HTTP requests (even
> if just to localhost).
> 3. The provider code isn't cluttered up with
> vars-only-there-to-be-patched-out. [2]
> 4. The test code isn't cluttered with many separate s.PatchValue calls. [2]
> 5. Setting up the faked return values was cleaner.
> 6. Checking which methods were called (in order), along with the
> arguments, was easier and cleaner.
>
> In addition to all that, taking the fake approach required that we
> encapsulate our low-level dependencies in a single interface. This
> was good because it forced us to spell out those dependencies. That
> helped us write the provider better. The low-level interface also
> made the code more maintainable since it makes the boundary layers
> explicit and presents them in a concise form (the interface
> definition).
>
> So I've taken the base fake type from the GCE patch and pulled it into
> a patch against the testing repo. [1] I've made some adjustments based
> on use cases I've had in subsequent patches. Nate had the bright idea
> of getting some feedback from the team before landing anything, "since
> it's the kind of thing that'll start popping up everywhere, and I want
> to make sure it passes muster with others."
>
> I'm not suggesting that fakes are the solution to all testing problems,
> so let's please avoid that discussion. :) Neither am I proposing that
> any existing tests should be converted to use fakes, nor am I
> suggesting the need for any policy recommending the use of fakes. As
> well, we still need to make sure we have enough "full stack" test
> coverage to be confident about integration between layers. Relatedly,
> the concrete implementation of low-level interfaces needs adequate
> test coverage to ensure the underlying APIs (remote, etc.) continue to
> match expected behavior. So yeah, neither fakes nor isolated unit
> tests meet all our testing needs. I just think the base fake type
> I've written will help facilitate a cleaner approach where it's
> appropriate.
>
> Thoughts?
>
> -eric
>
> [1] https://github.com/juju/testing/pull/48
> [2] We were still feeling out the approach so we still took the
> var-and-patch route in some cases.
> [3] See "gceConnection" in http://reviews.vapour.ws/r/771/diff/#46.

--
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at:
https://lists.ubuntu.com/mailman/listinfo/juju-dev
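For readers following along, the call-tracking base type Eric describes might look roughly like the following Go sketch. All names here (`Fake`, `AddCall`, `fakeConn`) are illustrative, not the actual API proposed in [1]:

```go
package main

import "fmt"

// Call records a single method invocation on a fake.
type Call struct {
	FuncName string
	Args     []interface{}
}

// Fake is a base type to embed in test doubles: it tracks calls and
// holds a canned error to return from faked methods.
type Fake struct {
	Calls []Call
	Err   error
}

// AddCall records one invocation for later inspection by the test.
func (f *Fake) AddCall(name string, args ...interface{}) {
	f.Calls = append(f.Calls, Call{FuncName: name, Args: args})
}

// fakeConn stands in for a provider's low-level API connection.
type fakeConn struct {
	Fake
	Instances []string // canned return value for ListInstances
}

func (c *fakeConn) ListInstances(env string) ([]string, error) {
	c.AddCall("ListInstances", env)
	return c.Instances, c.Err
}

func main() {
	conn := &fakeConn{Instances: []string{"inst-0", "inst-1"}}
	insts, err := conn.ListInstances("test-env")
	fmt.Println(insts, err)
	fmt.Println(conn.Calls[0].FuncName, conn.Calls[0].Args)
}
```

A test would hand `conn` to the code under test, then assert on `conn.Calls` to verify which methods were invoked, in what order, and with what arguments.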
Re: Feedback on a base "fake" type in the testing repo
On Wed, Feb 11, 2015 at 4:53 PM, Eric Snow wrote:
> tl;dr Using fakes for testing works well so I wrote a base fake type. [1]
>
> While working on the GCE provider, Wayne and I started taking a
> different approach to unit testing than the usual 1. expose an
> external dependency as an unexported var; 2. export it in
> export_test.go; 3. patch it out in tests. [2] Instead we wrote an
> interface for the provider's low-level API and implemented the
> provider relative to that. [3] Then in tests we used a fake
> implementation of that interface instead of the concrete one. Instead
> of making the actual API requests, the fake simply tracked method
> calls and controlled return values.

This is a "mock object" under some well-known people's terminology [1].

There are certainly benefits, but there are also very well-known problems with that approach which should be kept in mind. We experienced them very closely in the first Python-based implementation of juju, and I have many other times elsewhere. I would probably not have written [2] if I had a time machine.

The most problematic aspect of this approach is that tests are pretty much always very closely tied to the implementation, to the point that you suddenly cannot touch the implementation anymore without also fixing a vast number of tests to comply. Tests organized in this fashion also tend to obfuscate the important semantics being tested: instead of them, you see a sequence of apparently meaningless calls and results out of context. Understanding and working on these tests just a month after you cooked them is a nightmare.

The perfect test breaks if and only if the real assumption being challenged changes. Anything walking away from this decreases the test quality.
As a recommendation to avoid digging a hole -- one that is pretty difficult to climb out of once you're in -- instead of testing method calls and cooking fake return values in your own test, build a real fake object: one that pretends to be a real implementation of that interface and understands the business logic of it. Then have methods on it that allow tailoring its behavior, but in a high-level way, closer to the problem than to the code.

[1] http://martinfowler.com/articles/mocksArentStubs.html
[2] https://labix.org/mocker

gustavo @ http://niemeyer.net
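A rough Go sketch of the kind of "real fake object" being recommended here: it keeps actual state and enforces the same business rules a real backend would, plus a high-level knob to tailor its behavior. All names (`Environ`, `fakeEnviron`, `Break`) are invented for illustration; juju's dummy provider is the real-world analogue:

```go
package main

import "fmt"

// Environ is the low-level interface the code under test depends on.
type Environ interface {
	StartInstance(id string) error
	StopInstance(id string) error
	Instances() []string
}

// fakeEnviron behaves like a real implementation: it holds state and
// applies consistent rules, so tests can call it as many times as they
// want, in any order, and get realistic behavior.
type fakeEnviron struct {
	running map[string]bool
	broken  bool // high-level tailoring knob for failure scenarios
}

var _ Environ = (*fakeEnviron)(nil)

func newFakeEnviron() *fakeEnviron {
	return &fakeEnviron{running: make(map[string]bool)}
}

// Break makes subsequent operations fail, simulating an outage.
func (e *fakeEnviron) Break() { e.broken = true }

func (e *fakeEnviron) StartInstance(id string) error {
	if e.broken {
		return fmt.Errorf("environ unavailable")
	}
	if e.running[id] {
		return fmt.Errorf("instance %q already running", id)
	}
	e.running[id] = true
	return nil
}

func (e *fakeEnviron) StopInstance(id string) error {
	if e.broken {
		return fmt.Errorf("environ unavailable")
	}
	if !e.running[id] {
		return fmt.Errorf("no such instance %q", id)
	}
	delete(e.running, id)
	return nil
}

func (e *fakeEnviron) Instances() []string {
	var ids []string
	for id := range e.running {
		ids = append(ids, id)
	}
	return ids
}

func main() {
	env := newFakeEnviron()
	fmt.Println(env.StartInstance("inst-0")) // <nil>
	fmt.Println(env.StartInstance("inst-0")) // duplicate is rejected
	env.Break()
	fmt.Println(env.StartInstance("inst-1")) // environ unavailable
}
```

The test tailors behavior through high-level methods like `Break` rather than wiring up per-call return values, so it stays decoupled from the exact sequence of calls the implementation makes.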
Re: Feedback on a base "fake" type in the testing repo
Thanks for the feedback. :) Your points are helpful and I just need some clarification.

Before that, though, I want to make sure we are on the same page with the term "fake". Unfortunately testing nomenclature is a bit muddled, so you might be thinking of something (even slightly) different. What gave me pause is that you contrasted fakes with doubles. So perhaps you could briefly explain what you mean by both in the context of Go.

As for me, by "fake" I mean a struct that implements an interface with essentially no logic other than to keep track of method calls and facilitate controlling their return values explicitly. For examples, see the implementations for GCE and in `testing/fake.go`. Thus in tests a fake may be used in place of the concrete implementation of the interface that would be used in production. The onus is on the test writer to populate the fake with the correct return values such that they match the expected behavior of the concrete implementation.

This contrasts with the approach (e.g. taken by EC2) of plugging in an API server that simulates the necessary portions of the low-level API. At that point you're testing your simulated API server just as much as the code you actually want to test. Furthermore, you still have to explicitly control the return values; you're just doing it at a lower level.

Regardless, I'm convinced that testing needs to include both high coverage via isolated unit tests and good-enough coverage via "full stack" integration tests. Essentially we have to ensure that layers work together properly and that low-level APIs work the way we expect (and don't change unexpectedly). However, the discussion of when fakes are appropriate is somewhat orthogonal to that of how to use fakes in Go (and juju) and what they should look like there. I was hoping to focus on the latter, assuming we agree that fakes are appropriate in at least some situations (which it appears we do). :)

On Fri, Feb 13, 2015 at 5:22 AM, John Meinel wrote:
> High level thoughts:
>
> 1. Using an interface{} instead of patching function pointers sounds like a
> great thing

Exactly! :) Basically that was the original motivator.

> 2. Writing a provider via faking out rather than implementing a double
> means that it is really hard to run the cross-provider interface tests
> (environs/jujutest), which means we run into what we have with Azure, where
> *some* of the functions return a different error than the one we expect (see
> the recent 'juju restore' regression). Our interface tests aren't amazing,
> but if we can push more on that, then we can avoid having implementations
> leak through their abstractions.

Could you elaborate on this? I'm not terribly familiar with environs/jujutest, and hopefully we didn't need to take it into consideration for the GCE provider. It sounds like you are saying we should have tests that tie into the livetests code (I noticed at least the EC2 tests do so). So I'm otherwise not clear on how our use of fakes in the GCE provider has an impact. Wayne and I definitely want it to be correct. Furthermore, one goal we have is for the GCE provider to be the go-to example for a provider implementation. If we're missing something then we want to correct that. [1]

> 3. I do realize that time-to-something-working is a lot faster with a Fake
> implementation, but ongoing support and maintenance might pay off more. (I
> love how quickly the GCE provider was available, I'm concerned if we know
> that it actually acts like EC2/Openstack when it comes to edge cases.)

By "pay off more" I presume you mean "cost more", based on the context. :) Are you saying that there is a uniform mechanism (via environs/jujutest?) for checking all those edge cases for a provider? If so, we certainly need to use it for the GCE provider. However, I'm not clear on how our use of fakes relates to that, other than that it implies we did not use the aforementioned uniform mechanism.

-eric

[1] We've discussed these motivations on numerous occasions, but Wayne can chime in if I've mischaracterized them. :)
Re: Feedback on a base "fake" type in the testing repo
On Fri, Feb 13, 2015 at 2:05 PM, Eric Snow wrote:
> As for me, by "fake" I mean a struct that implements an interface with
> essentially no logic other than to keep track of method calls and
> facilitate controlling their return values explicitly. For examples
> see the implementations for GCE and in `testing/fake.go`. Thus in
> tests a fake may be used in place of the concrete implementation of
> the interface that would be used in production.

To me this is a good fake implementation:

https://github.com/juju/juju/tree/master/provider/dummy

> The onus is on the test writer to populate the fake with the correct
> return values such that they would match the expected behavior of the
> concrete implementation.

That's an optimistic view of it, as I described.

> Regardless, I'm convinced that testing needs to include both high
> coverage via isolated unit tests and good enough coverage via "full
> stack" integration tests. Essentially we have to ensure that layers
> work together properly and that low-level APIs work the way we expect
> (and don't change unexpectedly).

That's globally agreed. What's at stake is how to do these.

gustavo @ http://niemeyer.net
Re: Feedback on a base "fake" type in the testing repo
Thanks for the great feedback, Gustavo! I have some follow-up inline. It reflects my understanding of what you wrote, so don't hate me if any of it reflects a *mis*understanding. :) If anything I say *seems* to imply some deeper criticism or veiled aspersion, just assume I didn't actually mean to imply anything and take my statement/question at face value. I'm a pretty straightforward person and I likely just phrased things poorly. :)

-eric

On Fri, Feb 13, 2015 at 6:00 AM, Gustavo Niemeyer wrote:
> On Wed, Feb 11, 2015 at 4:53 PM, Eric Snow wrote:
>> tl;dr Using fakes for testing works well so I wrote a base fake type. [1]
>>
>> While working on the GCE provider, Wayne and I started taking a
>> different approach to unit testing than the usual 1. expose an
>> external dependency as an unexported var; 2. export it in
>> export_test.go; 3. patch it out in tests. [2] Instead we wrote an
>> interface for the provider's low-level API and implemented the
>> provider relative to that. [3] Then in tests we used a fake
>> implementation of that interface instead of the concrete one. Instead
>> of making the actual API requests, the fake simply tracked method
>> calls and controlled return values.
>
> This is a "mock object" under some well known people's terminology [1].

With all due respect to Fowler, the terminology in this space is fairly muddled still. :)

> There are certainly benefits, but there are also very well known
> problems in that approach which should be kept in mind. We've
> experienced them very closely in the first Python-based implementation
> of juju, and I did many other times elsewhere. I would probably not
> have written [2] if I had a time machine.
>
> The most problematic aspect of this approach is that tests are pretty
> much always very closely tied to the implementation, in a way that you
> suddenly cannot touch the implementation anymore without also fixing a
> vast number of tests to comply.

Let's look at this from the context of "unit" (i.e. function signature) testing. By "implementation" do you mean the function you are testing, or the low-level API the function is using, or both? If the low-level API, then it seems like the "real fake object" you describe further on would help by moving at least part of the test setup out of the test and down into the fake. However, aren't you then just as susceptible to changes in the fake, with the same maintenance consequences?

Ultimately I just don't see how you can avoid depending on low-level details ("closely tied to the implementation") in your tests and still have confidence that you are testing things rigorously. I think the best you can do is define an interface encapsulating the low-level behavior your functions depend on, implement them relative to that interface, and write your tests with that interface faked out.

Also, the testing world puts a lot of emphasis on branch coverage in tests. It almost sounds like you are suggesting that is not such an important goal. Could you clarify? Perhaps I'm inferring too much from what you've said. :)

> Tests organized in this fashion also tend
> to obfuscate the important semantics being tested, and instead of that
> you see a sequence of apparently meaningless calls and results out of
> context. Understanding and working on these tests just a month after
> you cooked them is a nightmare.

I think this is alleviated if you implement and test against an interface that strictly defines the low-level API on which you depend.

> The perfect test breaks if and only if the real assumption being
> challenged changes. Anything walking away from this is decreasing the
> test quality.

Agreed. The challenge is in managing the preconditions of each test in a way that is robust to low-level changes but also doesn't "obfuscate the important semantics being tested". In my experience, this is not solved by any single testing approach, so a judiciously chosen mix is required.

> As a recommendation to avoid digging a hole -- one that is pretty
> difficult to climb out of once you're in -- instead of testing method
> calls and cooking fake return values in your own test, build a real
> fake object: one that pretends to be a real implementation of that
> interface, and understands the business logic of it. Then, have
> methods on it that allow tailoring its behavior, but in a high-level
> way, closer to the problem than to the code.

Ah, I like that! So to rephrase: instead of a type where you just track calls and explicitly control return values, it is better to use a type that implements your expectations about the low-level system, exposed via the same API as the actual one? This would likely still involve both implementing the same interface, right? The thing I like about that approach is that it forces you to "document" your expectations (i.e. dependencies) as code. The problem is that you pay (in development time and in complexity) for an extra layer to engineer.
Re: Feedback on a base "fake" type in the testing repo
On Fri, Feb 13, 2015 at 3:25 PM, Eric Snow wrote:
>> This is a "mock object" under some well known people's terminology [1].
>
> With all due respect to Fowler, the terminology in this space is
> fairly muddled still. :)

Sure, I'm happy to use any terminology, but I'd prefer to not make one up just now.

>> The most problematic aspect of this approach is that tests are pretty
>> much always very closely tied to the implementation, in a way that you
>> suddenly cannot touch the implementation anymore without also fixing a
>> vast number of tests to comply.
>
> Let's look at this from the context of "unit" (i.e. function
> signature) testing. By "implementation" do you mean the
> function you are testing, or the low-level API the function is using,
> or both? If the low-level API then it seems like the "real fake
> object" you describe further on would help by moving at least part of
> the test setup out of the test and down into the fake. However
> aren't you then just as susceptible to changes in the fake with the
> same maintenance consequences?

No, because the fake should behave as a normal type would, instead of expecting a very precisely constrained orchestration of calls into its interface. If we hand the implementation a fake value, it should be able to call that value as many times as it wants, with whatever parameters it wants, in whatever order it wants, and its behavior should be consistent with a realistic implementation. Again, see the dummy provider for a convenient example of that in practice.

> Ultimately I just don't see how you can avoid depending on low-level
> details ("closely tied to the implementation") in your tests and still
> have confidence that you are testing things rigorously. I think the

I could perceive that in your original email, and it's precisely why I'm worried and responding to this thread. If that logic held any ground, we'd never be able to have organizations that certify the quality and conformance of devices based on the device itself. Instead, they'd have to go into the factories to see how each device was manufactured. But that's not how it happens: these organizations get the outcome of the production line, no matter how it worked, because that's the most relevant thing to test. You can change the production line, you can optimize it away, and you can even replace entire components, and it doesn't matter as long as you preserve the quality of the outcome. Of course, on the way to producing a device you'll generally make use of smaller devices, which have their own production lines, and which ensure that the outcome of their own production lines is of high quality.

The same thing is true in code. If you spend a lot of time writing tests for your production line, you are optimizing for the wrong goal. You are spending a lot of time, the outcome can still be of poor quality, and you are making it hard to optimize your production line and potentially replace its components and methods with something completely different. Of course, as in actual devices, code is layered, so sub-components can be tested on their own to ensure their promised interfaces hold water, but even there what matters is ensuring that what they promise is being satisfied, rather than how they are doing it.

> Also, the testing world puts a lot of emphasis on branch coverage in
> tests. It almost sounds like you are suggesting that is not such an
> important goal. Could you clarify? Perhaps I'm inferring too much
> from what you've said. :)

I'd be happy to dive into that, but it's a distraction in this conversation. You can use or not use your coverage tool irrespective of your testing approach.

>> As a recommendation to avoid digging a hole -- one that is pretty
>> difficult to climb out of once you're in -- instead of testing method
>> calls and cooking fake return values in your own test, build a real
>> fake object: one that pretends to be a real implementation of that
>> interface, and understands the business logic of it. Then, have
>> methods on it that allow tailoring its behavior, but in a high-level
>> way, closer to the problem than to the code.
>
> Ah, I like that! So to rephrase, instead of a type where you just
> track calls and explicitly control return values, it is better to use
> a type that implements your expectations about the low-level system,
> exposed via the same API as the actual one? This would likely still
> involve both implementing the same interface, right? The thing I like

That's right.

> about that approach is that it forces you to "document" your
> expectations (i.e. dependencies) as code. The problem is that you pay
> (in development time and in complexity) for an extra layer to engineer

This is irrelevant if you take into account the monumental future cost of mocking everything up.

> Regardless, as I noted in an earlier message, I think testing needs to
> involve:
>
> 1. a mix of high branch coverage through isolated
Re: Feedback on a base "fake" type in the testing repo
Again, thanks for the feedback. Responses inline again.

-eric

On Fri, Feb 13, 2015 at 11:36 AM, Gustavo Niemeyer wrote:
> On Fri, Feb 13, 2015 at 3:25 PM, Eric Snow wrote:
>>> This is a "mock object" under some well known people's terminology [1].
>>
>> With all due respect to Fowler, the terminology in this space is
>> fairly muddled still. :)
>
> Sure, I'm happy to use any terminology, but I'd prefer to not make one
> up just now.

I've decided to just follow the definitions to which Fowler refers (citing Meszaros). That will be my small effort to help standardize the terminology. :) The name, then, for what I'm talking about is "stub".

>>> The most problematic aspect of this approach is that tests are pretty
>>> much always very closely tied to the implementation, in a way that you
>>> suddenly cannot touch the implementation anymore without also fixing a
>>> vast number of tests to comply.
>>
>> Let's look at this from the context of "unit" (i.e. function
>> signature) testing. By "implementation" do you mean the
>> function you are testing, or the low-level API the function is using,
>> or both? If the low-level API then it seems like the "real fake
>> object" you describe further on would help by moving at least part of
>> the test setup out of the test and down into the fake. However
>> aren't you then just as susceptible to changes in the fake with the
>> same maintenance consequences?
>
> No, because the fake should behave as a normal type would, instead of
> expecting a very precisely constrained orchestration of calls into its
> interface. If we hand the implementation a fake value, it should be
> able to call that value as many times as it wants, with whatever
> parameters it wants, in whatever order it wants, and its behavior
> should be consistent with a realistic implementation. Again, see the
> dummy provider for a convenient example of that in practice.

Right, that became clear to me after I sent my message. So it seems to me that fakes should typically only be written by the maintainer of the thing they fake, and they should leverage as much of the real thing as possible. Otherwise you have to write your own fake and try to duplicate all the business logic of the real deal, and run the risk of getting it wrong. Either way you end up with the risk of getting out of sync (if you don't leverage the production code).

So with a fake you still have to manage its state for at least some tests (e.g. stick data into its DB) to satisfy each test's preconditions. That implies you need knowledge of how to manage the fake's state, which isn't free, particularly for a large or complex faked system. So I guess fake vs. stub is still a trade-off.

>> Ultimately I just don't see how you can avoid depending on low-level
>> details ("closely tied to the implementation") in your tests and still
>> have confidence that you are testing things rigorously. I think the
>
> I could perceive that on your original email, and it's precisely why
> I'm worried and responding to this thread.

I'm realizing that we're talking about the same thing from two different points of view. Your objection is to testing a function's code, in contrast to validating its outputs in response to inputs. If that's the case then I think we're on the same page. I've just been describing a function's low-level dependencies in terms of its implementation. I agree that tests focused on the implementation rather than the "contract" are fragile and misguided, though I admit that I do it from time to time. :) My point is just that the low-level API used by a function (regardless of how) is an input for the function, even if often just an implicit input. Furthermore, the state backing that low-level API is necessarily part of that input. Code should be written with that in mind, and so should tests.

Using a fake for that input means you don't have to encode the low-level business logic in each test (just any setup of the fake's state). You can be confident about the low-level behavior during tests matching production operation (as long as the fake implementation is correct and bug-free). The potential downsides are any performance costs of using the fake, maintaining the fake (if applicable), and knowing how to manage the fake's state. Consequently, there should be a mechanism to ensure that the fake's behavior matches the real thing.

Alternately you can use a "stub" (what I was calling a fake) for that input. On the upside, stubs are lightweight, both performance-wise and in terms of engineer time. They also help limit the scope of what executes to just the code in the function under test. The downside is that each test must encode the relevant business logic (mapping low-level inputs to low-level outputs) into the test's setup. Not only is that fragile, but the low-level return values will probably not have a lot of context where they appear in the test setup code (without comments). Still, kee
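The fake-versus-stub trade-off discussed above can be sketched side by side in Go. In this hypothetical snippet (all names invented for illustration), the function under test depends on a one-method interface; the stub hard-codes the low-level outputs per test, while the fake derives them from state the test arranges:

```go
package main

import (
	"errors"
	"fmt"
)

// Lister is the low-level dependency of the code under test.
type Lister interface {
	ListInstances(env string) ([]string, error)
}

// CountInstances is the function being tested.
func CountInstances(l Lister, env string) (int, error) {
	insts, err := l.ListInstances(env)
	if err != nil {
		return 0, err
	}
	return len(insts), nil
}

// stubLister: each test encodes the low-level outputs directly,
// regardless of what arguments the code under test passes.
type stubLister struct {
	insts []string
	err   error
}

func (s *stubLister) ListInstances(env string) ([]string, error) {
	return s.insts, s.err
}

// fakeLister: the test arranges state; the outputs follow from
// behavior that mimics the real backend.
type fakeLister struct {
	byEnv map[string][]string
}

func (f *fakeLister) ListInstances(env string) ([]string, error) {
	insts, ok := f.byEnv[env]
	if !ok {
		return nil, errors.New("unknown environment")
	}
	return insts, nil
}

func main() {
	stub := &stubLister{insts: []string{"a", "b"}}
	fmt.Println(CountInstances(stub, "anything-goes")) // 2 <nil>

	fake := &fakeLister{byEnv: map[string][]string{"test": {"a"}}}
	fmt.Println(CountInstances(fake, "test"))  // 1 <nil>
	fmt.Println(CountInstances(fake, "other")) // 0 unknown environment
}
```

Note how the stub returns its canned values for any environment name, while the fake reacts to the argument the way a real backend would; that difference is exactly what makes fake-based tests more robust to implementation changes and stub-based tests cheaper to write.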
Start/Stop machines
Hello,

Is anybody aware of a blueprint or WIP for implementing cross-provider machine start/stop operations? This would be really useful for suspending/resuming environments.

We are currently using this plugin (https://github.com/niedbalski/juju-suspend), but it would be nice to have a proper implementation accessible through the API for all the providers.

I found this bug (https://bugs.launchpad.net/juju-core/+bug/1252781) associated with this, but it has been sitting in triaged state for a while.

Any ideas?

--
Jorge Niedbalski R.
Re: Feedback on a base "fake" type in the testing repo
On Fri, Feb 13, 2015 at 6:50 PM, Eric Snow wrote:
> Using a fake for that input means you don't have to encode the
> low-level business logic in each test (just any setup of the fake's
> state). You can be confident about the low-level behavior during
> tests as matching production operation (as long as the fake
> implementation is correct and bug free). The potential downsides are
> any performance costs of using the fake, maintaining the fake (if
> applicable), and knowing how to manage the fake's state.
> Consequently, there should be a mechanism to ensure that the fake's
> behavior matches the real thing.

All of these costs exist whether you are dealing with one fake implementation, or with logic spread through five hundred tests which all include details of that interface. Hopefully the saner approach is obvious.

> Alternately you can use a "stub" (what I was calling a fake) for that
> input. On the upside, stubs are lightweight, both performance-wise
> and in terms of engineer time. They also help limit the scope of what
> executes to just the code in the function under test. The downside is
> that each test must encode the relevant business logic (mapping
> low-level inputs to low-level outputs) into the test's setup. Not
> only is that fragile but the low-level return values will probably not
> have a lot of context where they appear in the test setup code
> (without comments).

Precisely. And that also implies the test knows exactly how the function is using the interface, thus leaking its implementation into the test and preventing the implementation from changing even in simple ways without breaking the test.

>> I'd be very careful to not overdo this. Covering a line just for the
>> fun of seeing the CPU passing there is irrelevant. If you fake every
>> single thing around it with no care, you'll have a CPU pointer jumping
>> in and out of it, without any relevant achievement.
>
> I was talking about branches in the source code, not actual CPU branch
> operations. :)

Oh, so you mean you are not testing all CPU branches!? (/me provokes the inner perfectionist spirit ;-)

gustavo @ http://niemeyer.net