Re: A cautionary tale - mgo asserts
Is it mgo/txn that is internally unmarshalling onto that? Let's get that fixed at its heart.

On Jun 8, 2016 12:27 PM, "roger peppe" <roger.pe...@canonical.com> wrote:
> The Assert field in mgo/txn.Op is an interface{}, so when it's marshalled
> and unmarshalled the order can change, because unmarshalling produces a
> bson.M, which does not preserve key order.
>
> https://play.golang.org/p/_1ZPl7iMyn
>
> On 8 June 2016 at 15:55, Gustavo Niemeyer <gustavo.nieme...@canonical.com> wrote:
> > Is it mgo itself that is changing the order internally?
> >
> > It should not do that.
> >
> > On Wed, Jun 8, 2016 at 8:00 AM, roger peppe <rogpe...@gmail.com> wrote:
> >> OK, I understand now, I think.
> >>
> >> The underlying problem is that subdocument searches in MongoDB
> >> are order-sensitive.
> >>
> >> For example, I just tried this in a mongo shell:
> >>
> >>     > db.foo.insert({_id: "one", x: {a: 1, b: 2}})
> >>     > db.foo.find({x: {a: 1, b: 2}})
> >>     { "_id" : "one", "x" : { "a" : 1, "b" : 2 } }
> >>     > db.foo.find({x: {b: 2, a: 1}})
> >>
> >> The second find doesn't return anything, even though it contains
> >> the same fields with the same values as the first.
> >>
> >> Urk. I did not know about that. What a gotcha!
> >>
> >> So it *could* technically be OK if the fields in the struct (and
> >> any bson.D) are lexically ordered to match the bson marshaller,
> >> but it is well worth avoiding.
> >>
> >> I think things would be considerably improved if mgo/bson preserved
> >> order by default (by using bson.D) when unmarshalling. Then at least
> >> you'd know that the assertion you specify is exactly the one that
> >> gets executed.
> >>
> >> cheers,
> >> rog.
> >>
> >> On 8 June 2016 at 10:42, Menno Smits <menno.sm...@canonical.com> wrote:
> >> > On 8 June 2016 at 21:05, Tim Penhey <tim.pen...@canonical.com> wrote:
> >> >> Hi folks,
> >> >>
> >> >> tl;dr: do not use structs in transaction asserts
> >> >>
> >> >> ...
> >> >>
> >> >> The solution is not to use struct field equality, even though it is
> >> >> easy to write, but to use the dotted field notation to check the
> >> >> embedded field values.
> >> >
> >> > To give a more concrete example, asserting on an embedded document
> >> > field like this is problematic:
> >> >
> >> >     ops := []txn.Op{{
> >> >         C:      "collection",
> >> >         Id:     ...,
> >> >         Assert: bson.D{{"some-field", Thing{A: "foo", B: 99}}},
> >> >         Update: ...,
> >> >     }}
> >> >
> >> > Due to the way mgo works[1], the document the transaction operation
> >> > is asserting against may have been written with A and B in reverse
> >> > order, or the Thing struct in the Assert may have A and B swapped by
> >> > the time it's used. Either way, the assertion will fail randomly.
> >> >
> >> > The correct approach is to express the assertion like this:
> >> >
> >> >     ops := []txn.Op{{
> >> >         C:  "collection",
> >> >         Id: ...,
> >> >         Assert: bson.D{
> >> >             {"some-field.A", "foo"},
> >> >             {"some-field.B", 99},
> >> >         },
> >> >         Update: ...,
> >> >     }}
> >> >
> >> > or this:
> >> >
> >> >     ops := []txn.Op{{
> >> >         C:  "collection",
> >> >         Id: ...,
> >> >         Assert: bson.M{
> >> >             "some-field.A": "foo",
> >> >             "some-field.B": 99,
> >> >         },
> >> >         Update: ...,
> >> >     }}
> >> >
> >> >> Yet another thing to add to the list of things to check when doing
> >> >> reviews.
> >> >
> >> > I think we can go a bit further and error out on attempts to use
> >> > structs for comparison in txn.Op asserts in Juju's txn layers in
> >> > state. Just as we already do some munging and checking of database
> >> > operations to ensure correct multi-model behaviour, we should be
> >> > able to do the same for this issue and prevent it from happening
> >> > again.
> >> >
> >> > - Menno
> >> >
> >> > [1] If transaction operations are loaded and used from the DB (more
> >> > likely under load, when multiple runners are acting concurrently),
> >> > the Insert, Update and Assert fields are loaded as bson.M (this is
> >> > what the bson unmarshaller does for interface{}-typed fields). Once
> >> > this happens, field ordering is lost.

--
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
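The order-loss mechanism roger describes can be demonstrated without a MongoDB instance. The sketch below uses `encoding/json` from the standard library instead of mgo/bson (an assumption made to keep it self-contained), but the behaviour is analogous: unmarshalling into an `interface{}` yields a Go map, and maps do not preserve key order, so a marshal/unmarshal/marshal round trip can reorder fields relative to the original struct.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Thing mirrors the struct from the assert example above, with the
// fields deliberately declared B before A.
type Thing struct {
	B int    `json:"b"`
	A string `json:"a"`
}

// roundtrip marshals v, unmarshals into an interface{} (which yields a
// map, just as bson unmarshals interface{} fields as bson.M), and
// marshals again. The two resulting encodings can differ in key order.
func roundtrip(v interface{}) (before, after string) {
	b, _ := json.Marshal(v)
	var anything interface{}
	_ = json.Unmarshal(b, &anything)
	b2, _ := json.Marshal(anything)
	return string(b), string(b2)
}

func main() {
	before, after := roundtrip(Thing{B: 99, A: "foo"})
	fmt.Println(before) // {"b":99,"a":"foo"} -- struct field order
	fmt.Println(after)  // {"a":"foo","b":99} -- map order, not struct order
}
```

Since MongoDB subdocument matches are order-sensitive, the "before" and "after" forms would match different documents, which is exactly why struct-valued asserts fail intermittently.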
Re: fork/exec ... unable to allocate memory
From https://www.kernel.org/doc/Documentation/vm/overcommit-accounting:

    The Linux kernel supports the following overcommit handling modes

    0 - Heuristic overcommit handling. Obvious overcommits of address
        space are refused. Used for a typical system. It ensures a
        seriously wild allocation fails while allowing overcommit to
        reduce swap usage. root is allowed to allocate slightly more
        memory in this mode. This is the default.

    1 - Always overcommit. Appropriate for some scientific applications.
        Classic example is code using sparse arrays and just relying on
        the virtual memory consisting almost entirely of zero pages.

    2 - Don't overcommit. The total address space commit for the system
        is not permitted to exceed swap + a configurable amount (default
        is 50%) of physical RAM. Depending on the amount you use, in
        most situations this means a process will not be killed while
        accessing pages but will receive errors on memory allocation as
        appropriate. Useful for applications that want to guarantee
        their memory allocations will be available in the future without
        having to initialize every page.

On Wed, Jun 3, 2015 at 7:40 AM, John Meinel <j...@arbash-meinel.com> wrote:
> So interestingly we are already fairly heavily overcommitted. We have
> 4GB of RAM and 4GB of swap available, and "cat /proc/meminfo" is saying:
>
>     CommitLimit:  6214344 kB
>     Committed_AS: 9764580 kB
>
> John =:-
>
> On Wed, Jun 3, 2015 at 9:28 AM, Gustavo Niemeyer <gust...@niemeyer.net> wrote:
>> Ah, and you can also suggest increasing the swap. It would not actually
>> be used, but the system would be able to commit to the amount of memory
>> required, if it really had to.
>>
>> On Jun 3, 2015 1:24 AM, Gustavo Niemeyer <gust...@niemeyer.net> wrote:
>>> Hey John,
>>>
>>> It's probably an overcommit issue. Even if you don't have the memory
>>> in use, cloning it would mean the new process would have a chance to
>>> change that memory and thus require real memory pages, which the
>>> system obviously cannot give it.
>>>
>>> You can work around that by explicitly enabling overcommit, which
>>> means the potential to crash late in strange places in the bad case,
>>> but it would be totally okay for the exec situation.
>>>
>>>> So we're running into this failure mode again at one of our sites.
>>>> Specifically, the system is running with a reasonable number of
>>>> nodes (~100) and has been running for a while. It appears that it
>>>> wanted to restart itself (I don't think it restarted jujud, but I do
>>>> think it at least restarted a lot of the workers). Anyway, we have a
>>>> fair number of things that we exec during startup (kvm-ok, restart
>>>> rsyslog, etc.). But when we get into this situation (whatever it
>>>> actually is) we can't exec anything and we start getting failures.
>>>>
>>>> Now, this *might* be a golang bug. When I was trying to debug it in
>>>> the past, I created a small program that just allocated big slices
>>>> of memory (10MB strings, IIRC) and then tried to run "echo hello"
>>>> until it started failing. IIRC the failure point was when I wasn't
>>>> using swap and the allocated memory was 50% of total available
>>>> memory. (I have 8GB of RAM; it would start failing once we had
>>>> allocated 4GB of strings.)
>>>>
>>>> When I tried digging into the golang code, it looked like they use
>>>> clone(2) as the "create a new process for exec" function, and it
>>>> seemed it wasn't playing nicely with copy-on-write. At least, it
>>>> appeared that instead of doing a simple copy-on-write clone without
>>>> allocating any new memory and then exec'ing into a new process, it
>>>> actually required having enough RAM available for the new process.
>>>>
>>>> On the customer site, though, jujud has a RES size of only 1GB, and
>>>> they have 4GB of available RAM, and swap is enabled (2GB of 4GB swap
>>>> currently in use).
>>>>
>>>> The only workaround I can think of is for us to create a "forker"
>>>> process right away at startup that we just send RPC requests to run
>>>> a command for us and return the results. ATM I don't think we do any
>>>> "fork and run interactively" such that we need the stdin/stdout file
>>>> handles inside our process.
>>>>
>>>> I'd rather just have golang fork() work even when the current
>>>> process is using a large amount of RAM. Any of the golang folks know
>>>> what is going on?
>>>>
>>>> John =:-

--
gustavo @ http://niemeyer.net

--
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: Pruning the txns collection
Hey Menno,

I'm copying the list to ensure we have this documented somewhere for
future reference.

You are right that it's not that simple, but it's not that complex either
once you understand the background.

Transactions are applied by the txn package by tagging each one of the
documents that will participate in the transaction with the transaction
id they are participating in. When mgo goes to apply a transaction on
that same document, it will tag the document with the new transaction id,
and then evaluate all the transactions it is part of. If you drop one of
the transactions that a document claims to be participating in, the txn
package will rightfully complain, since it cannot tell the state of a
transaction that explicitly asked to be considered for the given document.

That means the solution is to make sure removed transactions are 1) in a
final state; and 2) not being referenced by any tagged documents.

The txn package itself collects garbage from old transactions as new
transactions are applied, but it doesn't guarantee that a transaction
will be collected right after it reaches a final state. This can lead to
pretty old transactions being referenced, if those documents are never
touched again.

So, you have two choices to collect these old documents:

1. Clean up the transaction references from all documents; or
2. Just make sure the transaction being removed is not referenced anywhere.

I would personally go for 2, as it is a read-only operation everywhere
but in the transactions collection itself, to drop the transaction
document. Note that the same rules apply to the stash collection as well.

Please let me know if you run into any issues there.

On Tue, May 12, 2015 at 9:21 PM, Menno Smits <menno.sm...@canonical.com> wrote:
> Hi again,
>
> In response to the current production Juju issue, I've been tasked with
> adding something to Juju to keep the size of Juju's txns and txns.log
> collections under control so that they don't grow unbounded. The ticket
> is here: https://bugs.launchpad.net/juju-core/+bug/1453785
>
> Naively, one might think that transactions could be removed if they were
> (say) over a week old and marked as either applied or aborted, but of
> course it's not that simple[1]. I must admit that I don't completely
> understand why this is the case, even when I look at the code for
> mgo/txn. How does a pending transaction end up depending on the details
> of an applied transaction?
>
> Given that a typical Juju system has no maintenance window and there's
> (currently) no way to put a Juju system into a read-only mode, can you
> think of any practical way for Juju to prune the txns and txns.stash
> collections? Any ideas would be most helpful.
>
> - Menno
>
> [1] http://grokbase.com/t/gg/mgo-users/13cj7c6kxt/when-to-delete-db-transaction

--
gustavo @ http://niemeyer.net

--
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
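Gustavo's two pruning conditions can be modelled in a few lines. This is a toy model, not the real mgo/txn API: the names (`Txn`, `Doc`, `removable`, the state constants) are invented for illustration, and `Doc.TxnQueue` stands in for the txn-queue tag that mgo/txn keeps on each participating document.

```go
package main

import "fmt"

type txnState int

const (
	statePending txnState = iota
	stateApplied
	stateAborted
)

// Txn is a stand-in for a document in the txns collection.
type Txn struct {
	ID    string
	State txnState
}

// Doc stands in for a collection document; mgo/txn tags each document
// with the ids of the transactions it participates in.
type Doc struct {
	TxnQueue []string
}

// removable reports whether a transaction can be safely pruned:
// it must be in a final state AND no document may still reference it.
func removable(t Txn, docs []Doc) bool {
	if t.State != stateApplied && t.State != stateAborted {
		return false // condition 1: must be in a final state
	}
	for _, d := range docs {
		for _, id := range d.TxnQueue {
			if id == t.ID {
				return false // condition 2: still referenced by a tagged doc
			}
		}
	}
	return true
}

func main() {
	docs := []Doc{{TxnQueue: []string{"t1"}}}
	fmt.Println(removable(Txn{ID: "t1", State: stateApplied}, docs)) // false: still referenced
	fmt.Println(removable(Txn{ID: "t2", State: stateApplied}, docs)) // true: final and unreferenced
	fmt.Println(removable(Txn{ID: "t3", State: statePending}, docs)) // false: not final
}
```

Condition 2 is what makes naive age-based pruning unsafe: an old applied transaction may still be referenced by a document that has simply never been touched again.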
Re: Please, no more types called State
When I was new to juju myself, we only had one State, I believe. That one
golden State was supposed to represent the state of the whole deployment,
so it was indeed The State of the system. Having tons of these indeed
sounds awkward.

On Thu, Mar 12, 2015 at 8:08 AM, Michael Foord <michael.fo...@canonical.com> wrote:
> On 12/03/15 05:01, David Cheney wrote:
>> lucky(~/src/github.com/juju/juju) % pt -i type\ State\  | wc -l
>> 23
>
> Thank you.
>
> When I was new to Juju, the fact that we had a central State, core to
> the Juju model, but we had umpteen types called State (so where you saw
> a State you had no idea what it actually was, and when someone mentioned
> State you couldn't be sure what they meant) was a significant part of
> the learning curve. Perhaps a better solution would have been a better
> name for the core State.
>
> Michael
>
>> Dave

--
gustavo @ http://niemeyer.net

--
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: adding placement directives for ensure-availability
Hi Nate,

On Tue, Feb 24, 2015 at 2:24 PM, Nate Finch <nate.fi...@canonical.com> wrote:
> (...)
>
> To support this, we need a way to say "use the default placement
> policy". For this, we propose the keyword "default". Thus, to fix the
> above example, Bill would type this:
>
>     $ juju ensure-availability --to lxc:1,default
>     success output here

What's the full format of the parameter of --to, with all possible
details?

> Note that this change in no way fixes all of HA's UX problems, and that
> it actually makes some of the problems a lot more obvious (such as the
> fact that the number of placements you need can be different even for
> the same command, depending on the state of the environment). This will
> be fixed when we revamp the CLI, but for now we'll have to live with it.

I don't have much context on the problem, but it seems like the proposal
is a change in the design of the CLI. If there are known problems with
the current design, the change might well fix them instead of making them
worse?

--
gustavo @ http://niemeyer.net

--
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: Feedback on a base fake type in the testing repo
On Fri, Feb 13, 2015 at 2:05 PM, Eric Snow <eric.s...@canonical.com> wrote:
> As for me, by "fake" I mean a struct that implements an interface with
> essentially no logic other than to keep track of method calls and
> facilitate controlling their return values explicitly. For examples see
> the implementations for GCE and in `testing/fake.go`. Thus in tests a
> fake may be used in place of the concrete implementation of the
> interface that would be used in production.

To me this is a good fake implementation:

    https://github.com/juju/juju/tree/master/provider/dummy

> The onus is on the test writer to populate the fake with the correct
> return values such that they would match the expected behavior of the
> concrete implementation.

That's an optimistic view of it, as I described.

> Regardless, I'm convinced that testing needs to include both high
> coverage via isolated unit tests and "good enough" coverage via full
> stack integration tests. Essentially we have to ensure that layers work
> together properly and that low-level APIs work the way we expect (and
> don't change unexpectedly).

That's globally agreed. What's at stake is how to do these.

--
gustavo @ http://niemeyer.net

--
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: Feedback on a base fake type in the testing repo
On Fri, Feb 13, 2015 at 3:25 PM, Eric Snow <eric.s...@canonical.com> wrote:
>> This is a mock object under some well known people's terminology [1].
>
> With all due respect to Fowler, the terminology in this space is fairly
> muddled still. :)

Sure, I'm happy to use any terminology, but I'd prefer to not make one up
just now.

>> The most problematic aspect of this approach is that tests are pretty
>> much always very closely tied to the implementation, in a way that you
>> suddenly cannot touch the implementation anymore without also fixing a
>> vast number of tests to comply.
>
> Let's look at this from the context of unit (i.e. function signature)
> testing. By "implementation" do you mean the function you are testing,
> or the low-level API the function is using, or both? If the low-level
> API, then it seems like the "real fake object" you describe further on
> would help by moving at least part of the test setup out of the test
> and down into the fake. However, aren't you then just as susceptible to
> changes in the fake, with the same maintenance consequences?

No, because the fake should behave as a normal type would, instead of
expecting a very precisely constrained orchestration of calls into its
interface. If we hand the implementation a fake value, it should be able
to call that value as many times as it wants, with whatever parameters it
wants, in whatever order it wants, and its behavior should be consistent
with a realistic implementation. Again, see the dummy provider for a
convenient example of that in practice.

> Ultimately I just don't see how you can avoid depending on low-level
> details (closely tied to the implementation) in your tests and still
> have confidence that you are testing things rigorously.

I could perceive that in your original email, and it's precisely why I'm
worried and responding to this thread. If that logic held any ground,
we'd never be able to have organizations that could certify the quality
and conformance of devices based on the device itself. Instead, they'd
have to go into the industries to see how the device was manufactured.
But that's not what happens: these organizations get the outcome of the
production line, no matter how that worked, because that's the most
relevant thing to test. You can change the production line, you can
optimize it away, and you can even replace entire components, and it
doesn't matter as long as you preserve the quality of the outcome.

Of course, on the way to producing a device you'll generally make use of
smaller devices, which have their own production lines, and which ensure
that the outcome of their own production lines is of high quality. The
same thing is true in code. If you spend a lot of time writing tests for
your production line, you are optimizing for the wrong goal. You are
spending a lot of time, the outcome can still be of poor quality, and you
are making it hard to optimize your production line and potentially
replace its components and methods by something completely different.

Of course, as in actual devices, code is layered, so sub-components can
be tested on their own to ensure their promised interfaces hold water,
but even there what matters is ensuring that what they promise is being
satisfied, rather than how they are doing it.

> Also, the testing world puts a lot of emphasis on branch coverage in
> tests. It almost sounds like you are suggesting that is not such an
> important goal. Could you clarify? Perhaps I'm inferring too much from
> what you've said. :)

I'd be happy to dive into that, but it's a distraction in this
conversation. You can use or not use your coverage tool irrespective of
your testing approach.

>> As a recommendation to avoid digging a hole -- one that is pretty
>> difficult to climb out of once you're in -- instead of testing method
>> calls and cooking fake return values in your own test, build a real
>> fake object: one that pretends to be a real implementation of that
>> interface, and understands the business logic of it. Then, have
>> methods on it that allow tailoring its behavior, but in a high-level
>> way, closer to the problem than to the code.
>
> Ah, I like that! So to rephrase: instead of a type where you just track
> calls and explicitly control return values, it is better to use a type
> that implements your expectations about the low-level system, exposed
> via the same API as the actual one? This would likely still involve
> both implementing the same interface, right?

That's right.

> The thing I like about that approach is that it forces you to document
> your expectations (i.e. dependencies) as code. The problem is that you
> pay (in development time and in complexity) for an extra layer to
> engineer.

This is irrelevant if you take into account the monumental future cost of
mocking everything up.

> Regardless, as I noted in an earlier message, I think testing needs to
> involve:
>
> 1. a mix of high branch coverage through isolated unit tests,

I'd be very careful to not overdo this. Covering a line just for the fun
of seeing the CPU passing there is irrelevant. If you fake every single
thing around it with no care, you'll have a CPU pointer jumping in and
out of it, without any relevant achievement.
Re: Feedback on a base fake type in the testing repo
On Fri, Feb 13, 2015 at 6:50 PM, Eric Snow eric.s...@canonical.com wrote: Using a fake for that input means you don't have to encode the low-level business logic in each test (just any setup of the fake's state). You can be confident about the low-level behavior during tests as matching production operation (as long as the fake implementation is correct and bug free). The potential downsides are any performance costs of using the fake, maintaining the fake (if applicable), and knowing how to manage the fake's state. Consequently, there should be a mechanism to ensure that the fake's behavior matches the real thing. All of these costs exist whether you are dealing with one fake implementation, or with logic spread through five hundred tests which all include details of that interface. Hopefully the saner approach is obvious. Alternately you can use a stub (what I was calling a fake) for that input. On the upside, stubs are lightweight, both performance-wise and in terms of engineer time. They also help limit the scope of what executes to just the code in the function under test. The downside is that each test must encode the relevant business logic (mapping low-level inputs to low-level outputs) into the test's setup. Not only is that fragile but the low-level return values will probably not have a lot of context where they appear in the test setup code (without comments). Precisely. And that also implies the test knows exactly how the function is using the interface, thus leaking its implementation into the test, and preventing the implementation to change even in simple ways without breaking the test. I'd be very careful to not overdo this. Covering a line just for the fun of seeing the CPU passing there is irrelevant. If you fake every single thing around it with no care, you'll have a CPU pointer jumping in and out of it, without any relevant achievement. I was talking about branches in the source code, not actual CPU branch operations. 
:) Oh, so you mean you are not testing all CPU branches!? (/me provokes the inner perfectionist spirit ;-) gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
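The fake-versus-stub distinction discussed above can be made concrete with a small sketch. This is illustrative only: the `KVStore` interface, `Describe` function, and both test doubles are hypothetical names, not Juju code. The fake carries real (if simplified) behavior, so a test only sets up state; the stub just replays whatever answer the test wired in, so the test itself must encode the input-to-output mapping.

```go
package main

import "fmt"

// KVStore is a hypothetical low-level dependency, for illustration only.
type KVStore interface {
	Get(key string) (string, bool)
}

// fakeStore is a fake: a lightweight but genuinely working implementation.
// Tests set up its state; the lookup behavior itself is real.
type fakeStore struct {
	data map[string]string
}

func (f *fakeStore) Get(key string) (string, bool) {
	v, ok := f.data[key]
	return v, ok
}

// stubStore is a stub: it returns whatever the test wired in, regardless
// of the key, so each test must spell out the exact answer the function
// under test is going to need.
type stubStore struct {
	val string
	ok  bool
}

func (s *stubStore) Get(string) (string, bool) {
	return s.val, s.ok
}

// Describe is the function under test.
func Describe(st KVStore, key string) string {
	if v, ok := st.Get(key); ok {
		return "found: " + v
	}
	return "missing"
}

func main() {
	fake := &fakeStore{data: map[string]string{"host": "10.0.0.1"}}
	fmt.Println(Describe(fake, "host")) // real lookup against fake state
	stub := &stubStore{val: "10.0.0.1", ok: true}
	fmt.Println(Describe(stub, "host")) // canned answer wired by the test
}
```

Note how the stub works only because the test "knows" `Describe` will ask for exactly one key; if `Describe` changes how it uses the interface, the stub-based test breaks even though the behavior is still correct, which is the fragility being debated in the thread.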
Re: Juju Resources (a tool / library)
I can, but that's not the right way to proceed if you were in fact trying to implement an important feature of juju that was extensively discussed. 1. The project has a technical lead and a manager which should have the proper information to bootstrap this, or at least know who to talk to. 2. The project has a roadmap. Make sure to talk to the people in (1) to see how this fits in. 3. I'm sure there are previous documents about this, given relevance and prior conversations. You can find these documents yourself by searching through specs, or by asking people that participated in prior conversations and might have a better idea about what to search for. 4. There should be a specification about the feature, before there is an implementation. Stakeholders should review the specification and approve it before there is code. On Wed, Feb 11, 2015 at 6:40 PM, Cory Johns cory.jo...@canonical.com wrote: Can you be more specific on how it differs from the goals of resources streams? As I mentioned in my first email, I asked around to try to get specific information about the proposed feature and wasn't able to get any concrete answers or documentation. So I created this based on what I remembered from the discussions I'd heard (admittedly not much) and what I needed in the charms I was working on. I fully intend for this library to be subsumed / obviated by core as the feature develops, and I tried to make that clear in the library README and documentation. I also intend to update the interface to match the feature as closely as possible as the proposal becomes more concrete. 
On Wed, Feb 11, 2015 at 2:33 PM, Gustavo Niemeyer gust...@niemeyer.net wrote: Hi Cory, While it's fine and welcome to have such test bed features, it feels like the proposal and implementation have quite different goals from the actual resources feature we've been talking about for a while, so as a very early suggestion and request, I would strongly recommend renaming the feature to avoid creating ambiguity with the feature that we intend juju to have in the long run. Having two resource implementations and taking over important namespaces such as resources.yaml might introduce unnecessary confusion down the road. Instead, the project might have a nice non-generic name, and its configuration file could also be named after it. On Wed, Feb 11, 2015 at 4:17 PM, Cory Johns cory.jo...@canonical.com wrote: Per request, the documentation is now also available on ReadTheDocs.org: http://jujuresources.readthedocs.org/ On Wed, Feb 11, 2015 at 11:25 AM, Cory Johns cory.jo...@canonical.com wrote: Hi all, (cross-posting to juju juju-dev) I've created a tool / library for organizing and managing resources (binary blobs, tarballs, Python packages, and, eventually, apt packages) required by a charm. The idea is to be an interim tool, and a test-bed for the resource features that have been discussed for the future in Juju core. It is available on GitHub [1] and PyPI [2], and the full documentation is on PythonHosted.org [3]. The goals of this project are: * Organize all external resource dependencies into a single resources.yaml file * Provide easy, consistent interfaces to manage, install, and mirror resources * CLI and Python bindings * Enforce best practices, such as cryptographic hash validation I asked around to see if there was an existing proposal for a resources.yaml file format, but couldn't find one. If someone is aware of an existing spec / proposal, I'd prefer to match that as much as possible. 
The current version is fully functional, and is currently being used in the framework refactor of the Apache Hadoop charms (e.g., [4]). Note that I created this separately from Charm Helpers primarily because I wanted to use it to bootstrap CH, but this also makes it easier to use in Bash charms. My next step is to add apt-get support, but that will require cleaning up the mirror server (possibly converting it to use squid, but I may want to keep it self-contained), and learning a bit more about how the apt proxy settings work. Advice here is appreciated. [1] https://github.com/juju-solutions/jujuresources [2] https://pypi.python.org/pypi/jujuresources [3] http://pythonhosted.org/jujuresources/ [4] https://code.launchpad.net/~bigdata-dev/charms/trusty/apache-hadoop-hdfs-master/trunk -- gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: supplement open--port/close-port with ensure-these-and-only-these-ports?
Reminding people of everything they should *not be doing *to get a feature to be listed in the release notes is very ineffective. What should they *be doing* instead, and why will the process work in the future when it clearly has failed before, despite the good intention we should assume all trusted developers to have? On Mon Nov 03 2014 at 1:20:30 PM Curtis Hovey-Canonical cur...@canonical.com wrote: On Sat, Nov 1, 2014 at 2:08 PM, Kapil Thangavelu kapil.thangav...@canonical.com wrote: On Sat, Nov 1, 2014 at 12:58 PM, John Meinel j...@arbash-meinel.com wrote: I believe there is already opened-ports to tell you what ports Juju is currently tracking. That's cool and news to me, it looks like it landed in trunk earlier on october 2nd (ie 1.21) and hasn't made release notes or docs yet. Especially for charm environment changes we really need corresponding docs as charm env changes are not easily discover-able otherwise. Really great to see that land as its been a common issue for charms and one that previously forced them into state management. :( How will anything get into the release notes if engineers don't announce the new feature when it merges? It is not in the release notes because engineers haven't described it. Asking questions on this list to discover new features isn't very efficient. -- Curtis Hovey Canonical Cloud Development and Operations http://launchpad.net/~sinzui -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: how to update dependencies.tsv
I have never used upstream as an actual remote name. I see people commonly using the term as a wildcard to refer to the upstream branch whatever it is. The term is also used widely in git itself with the same meaning, including in the command line interface. For example, you set the upstream branch with --set-upstream (or -u for short), and in most cases people set their origin branch as upstream. Most posts in StackOverflow follow that: http://stackoverflow.com/search?q=%5Bgit%5D+upstream This confirms what Roger pointed out: upstream is well established as a concept, not as a remote label, so it's best to use a well defined name that points out where the code was taken from, rather than overloading the term to mean something else. On Thu Oct 30 2014 at 7:47:49 AM Nate Finch nate.fi...@canonical.com wrote: Upstream and origin are very very common in the git world. Most any how to or stack overflow answer uses those by default. Origin is your repo and upstream is the repo you branched from. I started out doing it your way, Roger, since I agree that information does flow both ways, and naming my repo after myself made sense, but I got so annoyed with every answer I looked up using origin and upstream that I changed to just use those terms. Using standard terms is a good thing so we all know what we're talking about. On Oct 30, 2014 4:22 AM, roger peppe roger.pe...@canonical.com wrote: On 29 October 2014 21:03, Tim Penhey tim.pen...@canonical.com wrote: On 30/10/14 01:11, roger peppe wrote: A better solution here, which I've been meaning to do for a while, would be to change godeps so that it can explore all possible targets. I had a go at that this morning (just adding all tags to build.Context) but it's not quite as easy as that. I should be able to fix it soon though. While you are looking at godeps, I don't suppose you can fix it so it looks for the upstream remote? 
As things currently are, godeps doesn't know about any remote in particular, and I think that's probably correct - it just uses git fetch (with no arguments) to fetch, and relies on the defaults for that. I was told that we should have the origin remote being our personal github repo and upstream being the team repo. I actually think that this is not a great way to configure things. When you clone a git repository (for example by doing go get) there is only one remote configured, and that's origin. So if I changed godeps to pull from upstream, it would have to fall back to pulling from origin in this, the most common case. Personally, I find the very word upstream confusing when used in this area - information flows both ways. The one certainty is that everything is destined for the main repo, so naming that origin makes sense to me. I never create a repo named upstream - I have origin and I name other remotes after github users, e.g. rogpeppe, which seems to scale better when I'm collaborating with other people. When godeps tries to pull in new revisions into a repo where I have the remotes set as I was told to, godeps fails to pull in new revisions and I normally do something like: (cd ../names git fetch upstream master) Then run the godeps command again. All the above said, I don't think there's anything stopping you from using this. Just do: git branch --set-upstream-to upstream/master and I think it should work (though I haven't actually tried it) cheers, rog. -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/ mailman/listinfo/juju-dev -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: Actions :: UUID vs. Tag on command line
The tag (which might be better named internal id) looks like an implementation detail which doesn't seem right to expose. I'd suggest either giving it a proper representation that the user can understand (a sequential action number, for example), or use a hash. I'd also not use a UUID, btw, but rather just a unique hash. On Fri Oct 24 2014 at 2:55:45 PM John Weldon johnweld...@gmail.com wrote: Hi; The current actions spec https://docs.google.com/a/canonical.com/document/d/14W1-QqB1pXZxyZW5QzFFoDwxxeQXBUzgj8IUkLId6cc/edit?usp=sharing indicates that the actions command line should return a UUID as the identifier for an action once it's been en-queued using 'juju do action'. Is there a compelling reason to use UUID's to identify actions, versus using the string representation of the Tag? A UUID would require a command something like: juju status action:9e1e5aa0-5b9d-11e4-8ed6-0800200c9a66 which maybe we could shorten to: juju status action:9e1e5aa0 I would prefer something like: juju status action:mysq/0_a_3 which would be the string representation of the actions Tag. Is there a compelling reason to use UUID? Cheers, -- John Weldon -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/ mailman/listinfo/juju-dev -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: Actions :: UUID vs. Tag on command line
It was my mistake to call it a hash.. it may be just a random id, in hex form. Alternatively, use a service-specific sequence number so it's better suited to humans. In the latter case, the sequence number must realistically reflect the sequence in which the actions are submitted to units, otherwise it would be confusing. On Fri Oct 24 2014 at 3:51:04 PM John Weldon johnweld...@gmail.com wrote: Thanks Gustavo; I think a hash would be good too. I'll see what I can find in the juju code base around hash representations of id's, or come up with something. Any suggestions on how to generate and translate the hash are welcome too. Cheers, -- John Weldon On Fri, Oct 24, 2014 at 10:41 AM, Gustavo Niemeyer gustavo.nieme...@canonical.com wrote: The tag (which might be better named internal id) looks like an implementation detail which doesn't seem right to expose. I'd suggest either giving it a proper representation that the user can understand (a sequential action number, for example), or use a hash. I'd also not use a UUID, btw, but rather just a unique hash. On Fri Oct 24 2014 at 2:55:45 PM John Weldon johnweld...@gmail.com wrote: Hi; The current actions spec https://docs.google.com/a/canonical.com/document/d/14W1-QqB1pXZxyZW5QzFFoDwxxeQXBUzgj8IUkLId6cc/edit?usp=sharing indicates that the actions command line should return a UUID as the identifier for an action once it's been en-queued using 'juju do action'. Is there a compelling reason to use UUID's to identify actions, versus using the string representation of the Tag? A UUID would require a command something like: juju status action:9e1e5aa0-5b9d-11e4-8ed6-0800200c9a66 which maybe we could shorten to: juju status action:9e1e5aa0 I would prefer something like: juju status action:mysq/0_a_3 which would be the string representation of the actions Tag. Is there a compelling reason to use UUID? 
Cheers, -- John Weldon -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/ mailman/listinfo/juju-dev -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: Actions :: UUID vs. Tag on command line
I doubt this would work. There's no way in the transaction package for you to generate an id and reference that same id in other fields in one go. In other cases that's not an issue, but having a sequence of numbered actions where 10 is applied before 9 would be awkward. On Fri Oct 24 2014 at 4:10:30 PM John Weldon johnweld...@gmail.com wrote: That's a good question; The sequence relies on the same mechanism in state that is used to generate other sequences. I believe it's done in a transaction using the provided key (in this case the id of the unit). Cheers, -- John Weldon On Fri, Oct 24, 2014 at 11:07 AM, Gustavo Niemeyer gustavo.nieme...@canonical.com wrote: That might be okay, but is the sequence really respected? In other words, what happens if two independent clients attempt to submit an action for the same service? Will the two generated sequences reflect the order in which the actions are submitted to the units at the end of the pipeline? On Fri Oct 24 2014 at 4:05:03 PM John Weldon johnweld...@gmail.com wrote: Sure, that makes sense. Right now the Tag encodes a legitimate sequence. We should probably just clean up the representation so it doesn't expose the internals and just exposes the unit and action sequence number. -- John Weldon On Fri, Oct 24, 2014 at 10:58 AM, Gustavo Niemeyer gustavo.nieme...@canonical.com wrote: It was my mistake to call it a hash.. it may be just a random id, in hex form. Alternatively, use a service-specific sequence number so it's better suited to humans. In the latter case, the sequence number must realistically reflect the sequence in which the actions are submitted to units, otherwise it would be confusing. On Fri Oct 24 2014 at 3:51:04 PM John Weldon johnweld...@gmail.com wrote: Thanks Gustavo; I think a hash would be good too. I'll see what I can find in the juju code base around hash representations of id's, or come up with something. Any suggestions on how to generate and translate the hash are welcome too. 
Cheers, -- John Weldon On Fri, Oct 24, 2014 at 10:41 AM, Gustavo Niemeyer gustavo.nieme...@canonical.com wrote: The tag (which might be better named internal id) looks like an implementation detail which doesn't seem right to expose. I'd suggest either giving it a proper representation that the user can understand (a sequential action number, for example), or use a hash. I'd also not use a UUID, btw, but rather just a unique hash. On Fri Oct 24 2014 at 2:55:45 PM John Weldon johnweld...@gmail.com wrote: Hi; The current actions spec https://docs.google.com/a/canonical.com/document/d/14W1-QqB1pXZxyZW5QzFFoDwxxeQXBUzgj8IUkLId6cc/edit?usp=sharing indicates that the actions command line should return a UUID as the identifier for an action once it's been en-queued using 'juju do action'. Is there a compelling reason to use UUID's to identify actions, versus using the string representation of the Tag? A UUID would require a command something like: juju status action:9e1e5aa0-5b9d-11e4-8ed6-0800200c9a66 which maybe we could shorten to: juju status action:9e1e5aa0 I would prefer something like: juju status action:mysq/0_a_3 which would be the string representation of the actions Tag. Is there a compelling reason to use UUID? Cheers, -- John Weldon -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/ mailman/listinfo/juju-dev -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: Actions :: UUID vs. Tag on command line
As a side note, and a bikeshed-prone rant which I won't embrace, naming it tag feels like a mistake. On Fri Oct 24 2014 at 4:13:14 PM William Reade william.re...@canonical.com wrote: On Fri, Oct 24, 2014 at 8:04 PM, John Weldon johnweld...@gmail.com wrote: Sure, that makes sense. Right now the Tag encodes a legitimate sequence. We should probably just clean up the representation so it doesn't expose the internals and just exposes the unit and action sequence number. Yeah, that works for me. Please don't expose tags in the UI -- as gustavo says, they're implementation details. The only critically important property of a tag is that it be a *unique* entity identifier for API use -- and that requirement is generally at odds with a pleasant UX. But, yes, if the user representation happens to have a clean 2-way mapping with the relevant tags, that makes life easier in some respects, and I certainly won't complain about that. Cheers William -- John Weldon On Fri, Oct 24, 2014 at 10:58 AM, Gustavo Niemeyer gustavo.nieme...@canonical.com wrote: It was my mistake to call it a hash.. it may be just a random id, in hex form. Alternatively, use a service-specific sequence number so it's better suited to humans. In the latter case, the sequence number must realistically reflect the sequence in which the actions are submitted to units, otherwise it would be confusing. On Fri Oct 24 2014 at 3:51:04 PM John Weldon johnweld...@gmail.com wrote: Thanks Gustavo; I think a hash would be good too. I'll see what I can find in the juju code base around hash representations of id's, or come up with something. Any suggestions on how to generate and translate the hash are welcome too. Cheers, -- John Weldon On Fri, Oct 24, 2014 at 10:41 AM, Gustavo Niemeyer gustavo.nieme...@canonical.com wrote: The tag (which might be better named internal id) looks like an implementation detail which doesn't seem right to expose. 
I'd suggest either giving it a proper representation that the user can understand (a sequential action number, for example), or use a hash. I'd also not use a UUID, btw, but rather just a unique hash. On Fri Oct 24 2014 at 2:55:45 PM John Weldon johnweld...@gmail.com wrote: Hi; The current actions spec indicates that the actions command line should return a UUID as the identifier for an action once it's been en-queued using 'juju do action'. Is there a compelling reason to use UUID's to identify actions, versus using the string representation of the Tag? A UUID would require a command something like: juju status action:9e1e5aa0-5b9d-11e4-8ed6-0800200c9a66 which maybe we could shorten to: juju status action:9e1e5aa0 I would prefer something like: juju status action:mysq/0_a_3 which would be the string representation of the actions Tag. Is there a compelling reason to use UUID? Cheers, -- John Weldon -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: Actions :: UUID vs. Tag on command line
Both of these assumptions are incorrect. Please do not assume there's a single person managing an environment, and the fact that the sequence is generated outside of the transaction that adds the action is proof that actions will be arbitrarily executed rather than in the sequence suggested by the numbers. On Fri Oct 24 2014 at 4:21:30 PM John Weldon johnweld...@gmail.com wrote: Forgot to reply-all -- Forwarded message -- From: John Weldon johnweld...@gmail.com Date: Fri, Oct 24, 2014 at 11:19 AM Subject: Re: Actions :: UUID vs. Tag on command line To: Gustavo Niemeyer gustavo.nieme...@canonical.com On Fri, Oct 24, 2014 at 11:14 AM, Gustavo Niemeyer gustavo.nieme...@canonical.com wrote: I doubt this would work. There's no way in the transaction package for you to generate an id and reference that same id in other fields in one go. In other cases that's not an issue, but having a sequence of numbered actions where 10 is applied before 9 would be awkward. Interesting. 1. The sequence is generated in a separate transaction before being used. (state/sequence.go) So I don't think your concern about obtaining and using in one transaction will be an issue. 2. We have not had much discussion around strict ordering of actions being run in the order they were queued. My impression is that two different users interacting with the system at the same time is a bit of an edge case. -- John Weldon -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
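The hazard Gustavo is pointing at, that an id allocated in a separate transaction before the enqueue step need not match the order in which actions actually land in the queue, can be shown with a minimal sketch. The counter, queue, and the interleaving below are all hypothetical, just a deterministic replay of the race.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

var counter int64

// nextID allocates the sequence number, standing in for a separate
// transaction that runs before the action is actually enqueued.
func nextID() int64 { return atomic.AddInt64(&counter, 1) }

type queue struct{ ids []int64 }

func (q *queue) enqueue(id int64) { q.ids = append(q.ids, id) }

func main() {
	q := &queue{}
	// Client A allocates its id first...
	a := nextID() // 1
	// ...but client B allocates AND enqueues before A gets to enqueue.
	b := nextID() // 2
	q.enqueue(b)
	q.enqueue(a)
	// Execution order disagrees with the order the numbers suggest.
	fmt.Println(q.ids) // [2 1]
}
```

Because the id allocation and the enqueue are not one atomic step, action 2 can run before action 1, which is exactly the "10 applied before 9" awkwardness raised earlier in the thread.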
Re: Actions :: UUID vs. Tag on command line
On Fri Oct 24 2014 at 4:30:38 PM John Weldon johnweld...@gmail.com wrote: Ordered execution wasn't addressed in the spec, and we haven't had much discussion about it. I'm not even sure how to enforce ordered execution unless we rely on the creation timestamp. Specifications are guidelines. If there are open issues in the specifications, it does not mean that it is okay to do anything in that sense, but rather that either it should be done in the obviously correct way, or that a conversation should be raised if the correct way is not obvious. If someone sends an action, and then sends another action, to me it's clear that the first action should be executed before the second action. If the implementation is not doing that, it should. If two people send two actions concurrently, by definition there's no order implied by their use of the system, and so it's impossible to guarantee which one will be executed first. Assuming we have a way to enforce ordered execution, and if that ordering is not using the sequence number that is generated, then does exposing that sequence number just introduce confusion? How do you feel about postgres action 103 executing before postgres action 102? I personally feel like it's a bug. i.e. are we back to just showing some sort of hash / hex sequence as the id to avoid implying an order by the sequence number? Either option sounds fine to me. I'm only suggesting that if you do use sequence numbers, you're implying a sequence, and people in general are used to being 35 years old only after they've been 34. -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: Actions :: UUID vs. Tag on command line
For 2, it doesn't matter much if the timestamp is taken into account. The server may simply enqueue the action as it receives it and respond back only afterwards. This will guarantee read-your-writes consistency, and thus proper ordering assuming the server does use a queue rather than an unordered set. On Fri Oct 24 2014 at 4:44:03 PM John Weldon johnweld...@gmail.com wrote: Agreed completely; My take away - 1. Actions en-queued by the same client MUST execute in the order en-queued. 2. Actions en-queued by different clients SHOULD execute in timestamp order? 3. Action IDs should not mislead users by implying sequence that does not exist. 4. ergo Action id's will probably be reflected back to the user in some sort of a manageable hash or hex format -- John Weldon On Fri, Oct 24, 2014 at 11:38 AM, Gustavo Niemeyer gust...@niemeyer.net wrote: On Fri Oct 24 2014 at 4:30:38 PM John Weldon johnweld...@gmail.com wrote: Ordered execution wasn't addressed in the spec, and we haven't had much discussion about it. I'm not even sure how to enforce ordered execution unless we rely on the creation timestamp. Specifications are guidelines. If there are open issues in the specifications, it does not mean that it is okay to do anything in that sense, but rather than either it should be done in the obviously correct way, or that a conversation should be raised if the correct way is not obvious. If someone sends an action, and then sends another action, to me it's clear that the first action should be executed before the second action. If the implementation is not doing that, it should. If two people send two actions concurrently, by definition there's no order implied by their use of the system, and so it's impossible to guarantee which one will be executed first. Assuming we have a way to enforce ordered execution, and if that ordering is not using the sequence number that is generated, then does exposing that sequence number just introduce confusion? 
How do you feel about postgres action 103 executing before postgres action 102? I personally feel like it's a bug. i.e. are we back to just showing some sort of hash / hex sequence as the id to avoid implying an order by the sequence number? Either option sounds fine to me. I'm only suggesting that if you do use sequence numbers, you're implying a sequence, and people in general are used to being 35 years old only after they've been 34. -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
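The read-your-writes guarantee Gustavo describes, where the server enqueues the action as it receives it and only then responds, can be sketched with a simple in-memory queue. The `Server` type and method names are hypothetical, for illustration only.

```go
package main

import (
	"fmt"
	"sync"
)

// Server appends actions in arrival order; the mutex ensures the queue is
// a proper ordered queue rather than an unordered set.
type Server struct {
	mu    sync.Mutex
	queue []string
}

// Enqueue returns only once the action is in the queue. A client that
// waits for this acknowledgment before submitting its next action is
// therefore guaranteed that its own actions execute in submission order.
func (s *Server) Enqueue(action string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.queue = append(s.queue, action)
}

// Drain returns the queued actions in the order they will execute.
func (s *Server) Drain() []string {
	s.mu.Lock()
	defer s.mu.Unlock()
	q := s.queue
	s.queue = nil
	return q
}

func main() {
	s := &Server{}
	s.Enqueue("backup")
	// Submitted only after "backup" was acknowledged, so it must follow it.
	s.Enqueue("restart")
	fmt.Println(s.Drain()) // [backup restart]
}
```

For two independent clients submitting concurrently, arrival order at the server is the only order there is, which matches the thread's conclusion that no stronger guarantee is possible.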
Re: Unit Tests Integration Tests
On Thu, Sep 11, 2014 at 4:06 PM, Mark Ramm-Christensen (Canonical.com) mark.ramm-christen...@canonical.com wrote: But they are not the ONLY reasons why they are valuable. There are plenty of others -- performance, test-code cleanliness/re-use, result granularity, etc.

Performance is the second reason Roger described, and I disagree that mocking code is cleaner.. these are two orthogonal properties, and it's actually pretty easy to have mocked code that is extremely confusing and tightly bound to the implementation. It doesn't _have_ to be like that, but this is not a reason to use it.

Like any tools, developers can over-use, or mis-use them. But, if you don't use them at all,

That's not what Roger suggested either. A good conversation requires properly reflecting the position held by participants.

you often end up with what I call the binary test suite in which one coding error somewhere creates massive test failures.

A coding error that creates massive test failures is not a problem, in my experience using both heavily mocking and heavily non-mocking code bases. It rarely goes into the repository in the first place, because it's a massive breakage, and when it does go in due to differences in environment, it's easy to spot the root of the failure because proper code is layered.

(...) My belief is that you need both small, fast, targeted tests (call them unit tests) and large, realistic, full-stack tests (call them integration tests) and that we should have infrastructure support for both.

Yep, but that's beside the point being made. You can do unit tests which are small, fast, and targeted, both with or without mocking, and without mocking they can be realistic, which is a good thing. If you haven't had a chance to see tests falsely passing with mocking, that's a good thing too.. you haven't abused mocking too much yet. 
Re: Unit Tests Integration Tests
On Thu, Sep 11, 2014 at 10:42 PM, Andrew Wilkins andrew.wilk...@canonical.com wrote: I basically agree with everything below, but strongly disagree that mocking implies you know exactly what the code is doing internally. A good interface I'm also in agreement about your points. But just so you understand where Roger is coming from, the term mocking is often [1] associated with a test style that does bind very closely to what the code does. But you're probably using the term more loosely for test doubles in general, and I'm all for not being pedantic, so yes, +1 to the intention of what you've said. [1] http://martinfowler.com/articles/mocksArentStubs.html gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: Please don't use bash when there are syscalls available
Worth keeping in mind the usual gotcha: the API of syscall is different for different OSes. On Tue, Sep 9, 2014 at 5:45 PM, Nate Finch nate.fi...@canonical.com wrote: A user just complained that he can't bootstrap because Juju is parsing stderr text from flock, and his server isn't in English, so the error message isn't matching. https://github.com/juju/juju/blob/master/environs/sshstorage/storage.go#L254 Now, I think we all know that parsing error text is a bad idea, but I think I understand why it was done - it looks like flock the application only returns 1 on this failure, so it's not exactly a unique error code. However, flock the system call returns several different error codes, which are quite unique and easy to handle in a way that is not dependent on the language of the machine. It also happens to be already implemented in the syscalls package: http://golang.org/pkg/syscall/#Flock So let's fix this, and try not to call out to bash unless there's absolutely no other way. -Nate -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev -- gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: Commented-out tests?
On Fri, Aug 29, 2014 at 4:28 PM, Katherine Cox-Buday katherine.cox-bu...@canonical.com wrote: Hey all, I ran into some commented out tests while making a change: https://github.com/juju/juju/pull/630/files#r16874739 I deleted them since keeping things around that we might need later is the job of source control, not comments ;) If it was a relevant test, removing them generally means they're never coming back. The best course of action might be to Skip it or to use ExpectFailure, providing an appropriate reason string. This makes it visible that there are tests not being run or failing, while still making sure they at least build. Of course, if they're indeed never coming back, then just removing them for good is more honest. gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: First customer pain point pull request - default-hook
On Wed, Aug 20, 2014 at 5:46 AM, Matthew Williams matthew.willi...@canonical.com wrote:

if JUJU_HOOK_NAME == start
    //run start
else if JUJU_HOOK_NAME == config-changed
    //run config-changed
else if JUJU_HOOK_NAME == stop
    //run stop
else
    //unknown hook
    exit 1
fi

I'd expect the else to be exit 0. This is the same behavior you get when juju would execute a hook but it does not exist in the charm. gustavo @ http://niemeyer.net
Re: First customer pain point pull request - default-hook
On Wed, Aug 20, 2014 at 11:08 AM, William Reade william.re...@canonical.com wrote: On Wed, Aug 20, 2014 at 10:46 AM, Matthew Williams matthew.willi...@canonical.com wrote: Gustavo's observation about hooks that the charm might not know about yet means that the else clause is absolutely required. I wonder if that's obvious to someone who's new to charming? I'm pretty much adamant that we shouldn't even run new hooks, or expose new tools, unless the charm explicitly declares it knows about them. But I do imagine that many implementations will want the else anyway: they don't need to provide an implementation for every single hook. But we're talking about default-hook, which is supposed to run when things are missing? Actually, we should probably call this missing-hook as originally suggested, to make it more obvious that this is being called because some arbitrary hook was not found. It'll probably convey the importance of handling unknowns in a sane way more clearly. gustavo @ http://niemeyer.net
Re: First customer pain point pull request - default-hook
On Wed, Aug 20, 2014 at 11:16 AM, Nate Finch nate.fi...@canonical.com wrote: Anyone who has ever written a switch statement should be used to putting in a default clause for something I don't expect... I don't think it should be a big deal. Some charms mentioned in this thread miss the switch altogether. Given the conversation so far, it doesn't feel like we really understand how people are organizing their charms today, nor how they are supposed to be using the missing-hook. For example, you said in the opening message that Many charms these days only contain one real hook script, and the rest are all just symlinks to the real one., and I'm yet to see a charm with *one* hook alone. Marco had a noble offer that we should accept: The majority, if not all, of charms that currently implement this pattern do so by either using charm-helpers or by having a giant if/else case statement at the bottom of the hook which maps which code should execute with each hook that has invoked the symlink'd file. I can take a survey of current charms which use symlinks to see if any don't fit this pattern. Yes, it would be good to have proper data on what charms are doing today, and how they are supposed to work in that new world. It would also be good to understand what using charm-helpers means. The charms discussed above would _not_ work well with a missing-hook implementation that dispatched on every hook. They would have to be adapted to it. Multiple people also mentioned in this thread that maybe it should not dispatch on all hooks. What does that mean? Which hooks would it dispatch on, and where is the line? Why? On Wed, Aug 20, 2014 at 11:50 AM, Nate Finch nate.fi...@canonical.com wrote: I would expect a lot of people will implement their charms as a single script (especially given the number of charms we've seen implemented that way even with minimal support for it). 
If the special hook file is called default-hook, it makes those single-script charms seem like less of a hack than if the single file is called missing-hook. It would also make more sense to a new charm author, I think. It's not a hack... it's subtle, and that's the reason why it should be called missing-hook. It _is_ subtle. People must be aware that there is a multitude of events dispatched to that one executable, potentially with events they do not expect, and they must be aware that by creating a different hook they will prevent that one executable from receiving that event. That's what missing-hook conveys to me. If you think that's too subtle, maybe we need a different proposal. One possibility is to give the charm author the ability to specify the name of the default/missing hook file in the charm metadata... this could serve You mean the same way we have a configuration file in Go that defines how we want our main() function to be called? How reasonable does that feel? gustavo @ http://niemeyer.net
Re: First customer pain point pull request - default-hook
On Wed, Aug 20, 2014 at 3:45 PM, Nate Finch nate.fi...@canonical.com wrote: Here's a proposal that is much simpler: we add a flag to the charm metadata, called something like single_hook. When single_hook is true, all hook events run a file called default-hook (or whatever we want to call it, I don't really care). $JUJU_HOOK_NAME will be set with the name of the hook that is running. That's it. What the charm authors do after the hook file gets run is up to them. That sounds reasonable. We could make both the hook name and the charm metadata flag be single-hook. But does it solve people's problems? Would people that today use half of the hooks symlinked and half of them without symlinks transition to that model, or is symlinking more convenient? What about people using charm helpers without a dispatch table, such as the case Aaron raised in this thread? Their charms would be broken (or will eventually be broken) without a dispatch table. Would they transition or would they stick to current practices? In the bug's comments, there's discussion about a lack of discoverability for what hooks the charm has... but honestly, if you need to know what the hooks do, you have to read the code anyway. Hopefully knowing what hooks a charm has shouldn't be necessary to use the charm (if using Juju requires you to read a charm's code... we're doing something wrong). We're also doing something wrong if knowing what a hook is supposed to do requires reading the code. gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: First customer pain point pull request - default-hook
On Wed, Aug 20, 2014 at 5:05 PM, Nate Finch nate.fi...@canonical.com wrote: I think to answer most of these questions, we need more information about what the existing charms do, and input from the charmers themselves. Here's the info from Marco: http://pastebin.ubuntu.com/8100649/ Thanks. Looking at some entries from that list I can definitely see how single-hook would be useful, and it looks like it would also work well with the defined semantics. Numbers: 56/162 charms use symlinks; 6 of those are only partially symlinked; 50 of those use symlinks for all hooks. Given those numbers, and the pattern described above, I'd definitely try to have the enforced single hook model you described last, which must be explicitly enabled to work, and where everything is run only through it when it is indeed enabled. Easier to implement, and to understand as well. Addressing Aaron's remark, the hook might be called dispatch so that it conveys the intended semantics rather than its uniqueness, and the metadata flag dispatch-hook: bool. gustavo @ http://niemeyer.net
Re: First customer pain point pull request - default-hook
On Tue, Aug 19, 2014 at 9:07 AM, William Reade william.re...@canonical.com wrote: On Mon, Aug 18, 2014 at 9:33 PM, Gustavo Niemeyer gust...@niemeyer.net wrote: I don't think I fully understand the proposal there. To have such a something-changed hook, we ought to have a better mechanism to tell *what* actually changed. In other words, we have a number of hooks that imply a state transition or a specific notification (install, start, config-changed, leader-elected coming, etc). Simply calling out the charm saying stuff changed feels like a bad interface, both in performance terms (we *know* what changed) and in user experience (how do people use that!?). The issue is that as charms increase in sophistication, they seem to find it harder and harder to meaningfully map specific changes onto specific actions. Whether or not to react to a change in one relation depends on the values of a bunch of other units in other relations, to the extent that any individual relation change can have arbitrarily far-reaching consequences, and it ends up being easier to simply write something that maps directly from complete-available-state to desired-config. I have never seen myself a single charm that completely ignores all the action cues to simply re-read the whole state from the ground up, and we've just heard in this thread people claiming that even the charms that use a single hook via symlinks still rely on a dispatching table based on what action is happening, so I'm not ready to accept that claim at face value without some actual data. What percentage of the charms we have completely ignore the actions that are taking place when making decisions? * leader-deposed will completely lack hook tools: we can't run a default-hook there unless we know for sure that the implementation doesn't depend on any hook tools (in general, this is unlikely). Why? People can still run hook tools in leader-deposed, and they will not work. 
The situation is no different with default-hook: they are just two files in the same directory. Run one instead of the other. gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: First customer pain point pull request - default-hook
On Tue, Aug 19, 2014 at 12:41 PM, William Reade william.re...@canonical.com wrote: (out of interest, if started/stopped state were communicated to you any other way, would you still need these?) If you communicate events in a different way, you obviously won't need your previous way of communicating events. gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: First customer pain point pull request - default-hook
On Tue, Aug 19, 2014 at 1:10 PM, Aaron Bentley aaron.bent...@canonical.com wrote: True. At that point, the pattern is not a win, but it's not much of a loss. Changing the web site relation is extremely uncommon, but operations which do require server restarts are quite common. So making an exception for the web site relation can be seen as a micro-optimization. Restarting a process and killing all on-going activity is a big deal more often than not, for realistic services. True, I didn't call out the exceptions for the charmworld charm. For completeness, the exceptions in charmworld are: Yeah, it definitely depends on knowing the events still. On the other hand, it doesn't depend on knowing the events for database relation, search engine relation and configuration changes. The point I was trying to convey is not that you can merge or ignore certain events. The system was designed so that this was possible in the first place. The point is rather that the existing event system is convenient and people rely on it, so I don't buy that a something-changed hook is what most people want at this point. At the same time, that's not an argument _against_ it either. If you're happy with your design, and that'd help you, and William thinks this can be conveniently implemented, I'm all for making people's lives easier. gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: First customer pain point pull request - default-hook
On Tue, Aug 19, 2014 at 6:58 PM, Matthew Williams matthew.willi...@canonical.com wrote: Something to be mindful of is that we will shortly be implementing a new hook for metering (likely called collect-metrics). This hook differs slightly from the others in that it will be called periodically (e.g. once every hour) with the intention of sending metrics for that unit to the state server. I'm not sure it changes any of the details in this feature or the PR - but I thought you should be aware of it. Yeah, that's a good point. I wonder how reliable the use of default-hook will be, as it's supposed to run whenever any given hook doesn't exist, so charms using that feature should expect _any_ hook to be called there, even those they don't know about, or that don't even exist yet. The charms that symlink into a single hook seem to be symlinking a few things, not everything. It may well turn out that default-hook will lead to brittle charms. gustavo @ http://niemeyer.net
Re: First customer pain point pull request - default-hook
Rather than passing it as the first argument, I suggest introducing an environment variable: $JUJU_HOOK_NAME. This would be set irrespective of how the hook is being called, so that the same hook can be used both as a symlink and as a default-hook, unchanged. It also means further spawned processes get a chance to tell the context they're running under. On Fri, Aug 15, 2014 at 5:36 PM, Nate Finch nate.fi...@canonical.com wrote: Just wanted to let people know that Moonstone is ramping up on the customer pain points, even ahead of the full spec and prioritization. I had talked to Jorge and Marco about what they thought was important, and they pointed out a couple of low hanging fruit. This was one of them. Many charms these days only contain one real hook script, and the rest are all just symlinks to the real one (because no one wants to write 20 scripts). This is kind of a pain in the ass for charm writers, and doesn't work well on Windows (Windows symlink support is terrible). So, why not just have a default hook that gets called if the real hook isn't there? That's what I implemented today: https://github.com/juju/juju/pull/528 There's a new hook in town: default-hook. If it exists and a hook gets called that doesn't have a corresponding hook file, default-hook gets called with the name of the original hook as its first argument (arg[1]). That's it. If/when this PR is accepted, Marco is planning to update charmhelpers to make it automatically recognize when the default-hook is called, and get the hook name from arg[1] instead of arg[0], so current scripts wouldn't even need to change - they'd just need the new charmhelpers, rename the one true script to default-hook, and delete all their symlinks. Bam. Moonstone is very excited to be working to make Juju easier for charm developers, and we'll see more improvements coming next week.
-Nate -- gustavo @ http://niemeyer.net
Re: First customer pain point pull request - default-hook
I don't think I fully understand the proposal there. To have such a something-changed hook, we ought to have a better mechanism to tell *what* actually changed. In other words, we have a number of hooks that imply a state transition or a specific notification (install, start, config-changed, leader-elected coming, etc). Simply calling out the charm saying stuff changed feels like a bad interface, both in performance terms (we *know* what changed) and in user experience (how do people use that!?). I understand the underlying problem William is trying to solve, but the current proposal doesn't seem like a complete solution on its own, and it also seems to change the existing understanding of the model completely. The proposed default-hook is a trivial change to the existing well known workflow. On Sun, Aug 17, 2014 at 2:30 AM, John Meinel j...@arbash-meinel.com wrote: I'd just like to point out that William has thought long and hard about this problem, and what semantics make the most sense (does it get called for any hook, does it always get called, does it only get called when the hook doesn't exist, etc). I feel like he had some really good decisions on it: https://docs.google.com/a/canonical.com/document/d/1V5G6v6WgSoNupCYcRmkPrFKvbfTGjd4DCUZkyUIpLcs/edit# default-hook sounds (IMO) like it may run into problems where we do logic based on whether a hook exists or not. There are hooks being designed like leader-election and address-changed that might have side effects, and default-hook should (probably?) not get called for those. I'd just like us to make sure that we actually think about (and document) what hooks will fall into this, and make sure that it always makes sense to rebuild the world on every possible hook (which is how charm writers will be implementing default-hook, IMO).
John =:- On Sat, Aug 16, 2014 at 1:02 AM, Aaron Bentley aaron.bent...@canonical.com wrote: On 14-08-15 04:36 PM, Nate Finch wrote: There's new hook in town: default-hook. If it exists and a hook gets called that doesn't have a corresponding hook file, default-hook gets called with the name of the original hook as its first argument (arg[1]). That's it. Nice! Thank you. Aaron -- gustavo @ http://niemeyer.net
Re: getting rid of all-machines.log
On Thu, Aug 14, 2014 at 1:35 PM, Nate Finch nate.fi...@canonical.com wrote: On Thu, Aug 14, 2014 at 12:24 PM, Gustavo Niemeyer gustavo.nieme...@canonical.com wrote: Why support two things when you can support just one? Just to be clear, you really mean why support two existing and well known things when I can implement a third thing, right? Yes, that is exactly what I mean. Okay, on that basis, and without any better rationale than "12factor says I can do anything", I'd be tempted to request further analysis on the problem if the decision was in my hands. There are more interesting problems to solve than redoing what already exists. gustavo @ http://niemeyer.net
Re: getting rid of all-machines.log
On Thu, Aug 14, 2014 at 3:14 PM, Nate Finch nate.fi...@canonical.com wrote: I didn't bring up 12 factor, it's irrelevant to my argument. Is there someone else sending messages under your name? On Thu, Aug 14, 2014 at 12:23 PM, Nate Finch nate.fi...@canonical.com wrote: The front page of 12factor.net says offering maximum portability between execution environments ... that's exactly what I'm going for. I'm trying to make our product simpler and easier to maintain. That is all. If there's another cross-platform solution that we can use, I'd be happy to consider it. We have to change the code to support Windows. I'd rather the diff be +50 -150 than +75 -0. I don't know how to state it any simpler than that. How about simply allowing people to select their own rsyslog target? gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
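The "select their own rsyslog target" suggestion above could be as small as a drop-in config fragment. This is a sketch assuming standard rsyslog property-based filtering and @@host:port TCP forwarding; the file name, tag match, and collector address are all hypothetical:

```
# Hypothetical /etc/rsyslog.d/25-juju-forward.conf
# Forward juju agent messages to a user-chosen collector over TCP
# instead of (or in addition to) the built-in aggregation.
:syslogtag, startswith, "juju" @@logs.example.com:514
```

The point being that users who already run a log pipeline could plug Juju into it without Juju reimplementing transport and aggregation itself.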
Re: Intentionally introducing failures into Juju
Ah, and one more thing: when developing the chaos-injection mechanism in the mgo/txn package, I also added a chance parameter for either killing or slowing down a given breakpoint. It sounds like it would be useful for juju's mechanism too. If you kill every time, it's hard to tell whether the system would know how to retry properly. Killing or slowing down just sometimes, or perhaps the first 2 times out of every 3, for example, would enable the system to recover itself, and an external agent to ensure it continues to work properly. On Wed, Aug 13, 2014 at 11:25 AM, Gustavo Niemeyer gustavo.nieme...@canonical.com wrote: That's a nice direction, Menno. The main thing that comes to mind is that it sounds quite inconvenient to turn the feature on. It may sound otherwise because it's so easy to drop files at arbitrary places in our local machines, but when dealing with a distributed system that knows how to spawn its own resources up, suddenly "just write a file" becomes surprisingly boring and race prone. What about: juju inject-failure [--unit=unit] [--service=service] failure name? juju deploy [--inject-failure=name] ... On Wed, Aug 13, 2014 at 7:17 AM, Menno Smits menno.sm...@canonical.com wrote: There's been some discussion recently about adding some feature to Juju to allow developers or CI tests to intentionally trigger otherwise hard to induce failures in specific parts of Juju. The idea is that sometimes we need some kind of failure to happen in a CI test or when manually testing, but those failures can often be hard to make happen. For example, for the changes to Juju's upgrade mechanics that I'm working on at the moment, I would like to ensure that an upgrade is cleanly aborted if one of the state servers in a HA environment refuses to start the upgrade. This logic is well unit tested, but there's nothing like seeing it actually work in a real environment to build confidence - however, it isn't easy to make a state server misbehave in this way.
To help with this kind of testing scenario, I've created a new top-level package called wrench which lets us drop a wrench in the works, so to speak. It's very simple, with one main API which can be called from judiciously chosen points in Juju's execution to decide whether some failure should be triggered. The module looks for files in $jujudatadir/wrench (typically /var/lib/juju/wrench) on the local machine. If I wanted to trigger the upgrade failure described above I could drop a file in that directory on one of the state servers named, say, machine-agent with the content: refuse-upgrade Then in some part of jujud's upgrade code there could be a check like: if wrench.IsActive("machine-agent", "refuse-upgrade") { // trigger the failure } The idea is this check would be left in the code to aid CI tests and future manual tests. You can see the incomplete wrench package here: https://github.com/juju/juju/pull/508 There are a few issues to nut out. 1. It needs to be difficult/impossible for someone to accidentally or maliciously activate this feature, especially in production environments. I have almost finished (but not pushed to Github) some changes to the wrench package which make it strict about the ownership and permissions on the wrench files. This should make it harder for the wrong person to drop files in to the wrench directory. The idea has also been floated to only enable this functionality in non-stable builds. This certainly gives a good level of protection, but I'm slightly wary of this approach because it makes it impossible for CI to take advantage of the wrench feature when testing stable release builds. I'm happy to be convinced that the benefit is worth the cost. Other ideas on how to better handle this are very welcome. 2. The wrench functionality needs to be disabled during unit test runs, because we don't want any wrench files a developer may have lying around to affect Juju's behaviour during test runs.
The wrench package has a global on/off switch so I plan on switching it off in BaseSuite's setup or similar. 3. The name is a bikeshedding magnet :) Other names that have been bandied about for this feature are chaos and spanner. I don't care too much so if there's a strong consensus for another name let's use that. I chose wrench over spanner because I believe that's the more common usage in the US and because Spanner is a DB from Google. Let's not get carried away... All comments, ideas and concerns welcome. - Menno -- gustavo @ http://niemeyer.net
Re: Port ranges - restricting opening and closing ranges
Agreed, but I also agree that the error on split ranges is a good simplification to get an implementation in place, and it also doesn't sound super useful, so it sounds okay to fail to begin with. The other cases are easy to handle, though. On Wed, Aug 6, 2014 at 8:26 AM, Kapil Thangavelu kapil.thangav...@canonical.com wrote: agreed. to be clear .. imo, close-port shouldn't error unless there's a type mismatch on inputs. ie none of the posited scenarios in this thread should result in an error. -k On Tue, Aug 5, 2014 at 8:34 PM, Gustavo Niemeyer gust...@niemeyer.net wrote: On Tue, Aug 5, 2014 at 4:18 PM, roger peppe rogpe...@gmail.com wrote: close ports 80-110 - error (mismatched port range?) I'd expect ports to be closed here, and also on 0-65536. gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev -- gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: Port ranges - restricting opening and closing ranges
How many port ranges are typically made available? One.. Two? Sounds like a trivial problem. In terms of concurrency, there are issues either way. Someone can open a port while it is being closed, and whether that works or not depends purely on timing. gustavo @ http://niemeyer.net On Aug 6, 2014 9:41 AM, roger peppe roger.pe...@canonical.com wrote: On 5 August 2014 19:34, Gustavo Niemeyer gust...@niemeyer.net wrote: On Tue, Aug 5, 2014 at 4:18 PM, roger peppe rogpe...@gmail.com wrote: close ports 80-110 - error (mismatched port range?) I'd expect ports to be closed here, and also on 0-65536. I'm not sure. An advantage of requiring that exactly the same ports must be closed as were opened, you can use the port range as a key, which makes for a very simple (and trivially concurrent-safe) implementation in a mongo collection. I'd suggest that this compromise is worth it. We could always make an initial special case for 0-65535 too, if desired. -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: Port ranges - restricting opening and closing ranges
Why would any well-designed application open thousands of ports individually rather than a range? Sounds like an unreasonable use case. I also don't get your point about concurrency. You don't seem to have addressed the point I brought up that opening or closing ports concurrently today already presents undefined behavior. gustavo @ http://niemeyer.net On Aug 6, 2014 2:53 PM, roger peppe roger.pe...@canonical.com wrote: On 6 August 2014 10:32, Gustavo Niemeyer gust...@niemeyer.net wrote: How many port ranges are typically made available? One.. Two? Sounds like a trivial problem. Some applications might open thousands of individual ports. It would be nice if it worked well in that case too. In terms of concurrency, there are issues either way. Someone can open a port while it is being closed, and whether that works or not depends purely on timing. When we've got several units sharing a port space, we'll want to keep a unique owner for each port range. That's trivial if the reference can be keyed by the port range, but not as straightforward if the lookup is two-phase. What we don't want is two units in the same machine to be able to have the same port open at the same time. I suppose we could rely on the fact that hooks do not execute simultaneously, but it would be preferable in my view to keep those concerns separate. In my view, always close the range you've opened is an easy to explain rule, and makes quite a few things simpler, without being overly restrictive. gustavo @ http://niemeyer.net On Aug 6, 2014 9:41 AM, roger peppe roger.pe...@canonical.com wrote: On 5 August 2014 19:34, Gustavo Niemeyer gust...@niemeyer.net wrote: On Tue, Aug 5, 2014 at 4:18 PM, roger peppe rogpe...@gmail.com wrote: close ports 80-110 - error (mismatched port range?) I'd expect ports to be closed here, and also on 0-65536. I'm not sure.
An advantage of requiring that exactly the same ports must be closed as were opened, you can use the port range as a key, which makes for a very simple (and trivially concurrent-safe) implementation in a mongo collection. I'd suggest that this compromise is worth it. We could always make an initial special case for 0-65535 too, if desired. -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: Port ranges - restricting opening and closing ranges
gustavo @ http://niemeyer.net On Aug 6, 2014 3:03 PM, roger peppe roger.pe...@canonical.com wrote: On 6 August 2014 13:57, Gustavo Niemeyer gust...@niemeyer.net wrote: Why would any application well designed open thousands of ports individually rather than a range? Sounds like an unreasonable use case. I don't know. Ok. So let's please move on. I don't see the complexity of listing a few things (even if it is a thousand) and removing them. It's certainly much better than removing a thousand ports individually. I also don't get your point about concurrency. You don't seem to have addressed the point I brought up that opening or closing ports concurrently today already presents undefined behavior. The result is undefined for a unit (a port open can fail if another one already has the port open) Again, let's not argue anymore then. There's no real problem being created or solved either way. -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: Port ranges - restricting opening and closing ranges
On Tue, Aug 5, 2014 at 4:18 PM, roger peppe rogpe...@gmail.com wrote: close ports 80-110 - error (mismatched port range?) I'd expect ports to be closed here, and also on 0-65536. gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: help please: mongo/mgo panic
Alright, the guess last night was correct, and the candidate fix as well. I've managed to reproduce the problem by stressing out the scenario described with 4 concurrent runners running the following two operations, while the chaos mechanism injects random slowdowns in various critical points:

[]txn.Op{{
    C:      "accounts",
    Id:     0,
    Update: M{"$inc": M{"balance": 1}},
}, {
    C:      "accounts",
    Id:     1,
    Update: M{"$inc": M{"balance": 1}},
}}

To reach the bug, the stress test also has to run half of the transactions in this order, and the other half with these same operations in the opposite order, so that dependency cycles are created between the transactions. Note that the txn package guarantees that operations are always executed in the order provided in the transaction. The fix and the complete test are available in this change: https://github.com/go-mgo/mgo/commit/3bc3ddaa The numbers there are lower to run in a reasonable amount of time, but to give some confidence on the fix and the code in general, I've run this test for 100k transactions being concurrently executed with no problems. Also, to give a better perspective of the sort of outcome that the logic for concurrent runners produces, this output was generated by that test while running for 100 transactions: http://paste.ubuntu.com/7906618/ The tokens like a) in these lines are the unique identifiers for a given transaction runner. Note how every single operation is executed in precise lock-step despite the concurrency and the ordering issues, even assigning the same revision to both documents since they were created together.
Also, perhaps most interestingly, note occurrences such as:

    [LOG] 0:00.180 b) Applying 53d92a4bca654539e703_7791e1dc op 0 (update) on {accounts 0} with txn-revno 2: DONE
    [LOG] 0:00.186 d) Applying 53d92a4bca654539e703_7791e1dc op 1 (update) on {accounts 1} with txn-revno 2: DONE

The first one is b) while the second one is d), which means there are two completely independent runners, in different goroutines (they might as well be different machines), collaborating towards the completion of a single transaction.

So, I believe this is sorted. Please let me know how it goes there.

On Wed, Jul 30, 2014 at 4:14 AM, Gustavo Niemeyer <gustavo.nieme...@canonical.com> wrote:

Okay, I couldn't resist investigating a bit. I've been looking at the database dump from earlier today, and it's smelling like a simpler bug in the txn package, and I might have found the cause already. Here is a quick walkthrough while debugging the problem, to also serve as future aid in similar quests.

Enabling full debug for the txn package with SetDebug and SetLogger, and doing a ResumeAll to flush all pending transactions, we can quickly get to the affected document and transaction:

    2014/07/30 02:19:23 Resuming all unfinished transactions
    2014/07/30 02:19:23 Resuming 53d6057930009a01ba0002e7 from prepared
    2014/07/30 02:19:23 a) Processing 53d6057930009a01ba0002e7_dcdbc894
    2014/07/30 02:19:23 a) Rescanning 53d6057930009a01ba0002e7_dcdbc894
    2014/07/30 02:19:23 a) Rescanned queue with 53d6057930009a01ba0002e7_dcdbc894: has prereqs, not forced
    2014/07/30 02:19:23 a) Rescanning 53d6057930009a01ba0002eb_98124806
    2014/07/30 02:19:23 a) Rescanned queue with 53d6057930009a01ba0002eb_98124806: has prereqs, not forced
    2014/07/30 02:19:23 a) Rescanning 53d6057930009a01ba0002ee_a83bd775
    2014/07/30 02:19:23 a) Rescanned document {services ntp} misses 53d6057930009a01ba0002ee_a83bd775 in queue: [53d6057930009a01ba0002eb_98124806 53d6057930009a01ba0002ea_4ca6ed41 53d6057c30009a01ba0002fd_4d8d9123 53d6057e30009a01ba000301_ba0b61dd 53d6057e30009a01ba000303_a26cb429]
    2014/07/30 02:19:23 a) Reloaded 53d6057930009a01ba0002ee_a83bd775: prepared
    panic: rescanned document misses transaction in queue

So this error actually means something slightly different from what I pointed out in the bug. The transaction runner state machine creates transactions in the "preparing" state, and then moves them over to "prepared" once all affected documents have been tagged with the transaction id+nonce. So what this means is that there is a transaction in progress in the "prepared" state while the actual document misses the id in its local queue, which is an impossible situation unless the document was fiddled with, there was corruption, or there is a bug in the code.

So, let's have a look at the affected documents. First, the document being changed:

    > db.services.findOne({_id: "ntp"})
    http://paste.ubuntu.com/7902134/

We can see a few transactions in the queue, but the one raising the issue is not there, as reported by the error. And this is the full transaction raised by the error:

    > db.txns.findOne({_id: ObjectId("53d6057930009a01ba0002ee")})
    http://paste.ubuntu.com/7902095/

One interesting thing we can do from here is verifying
Re: help please: mongo/mgo panic
We got a database dump yesterday, which gives me something to investigate. I'll spend some time on this tomorrow (today) and report back.

On Wed, Jul 30, 2014 at 1:34 AM, Menno Smits <menno.sm...@canonical.com> wrote:
> All,
>
> Various people have been seeing the machine agents panic with the
> following message:
>
>     panic: rescanned document misses transaction in queue
>
> The error message comes from mgo, but the actual cause is unknown.
> There's plenty of detail in the comments for the LP bug that's tracking
> this. If you have any ideas about a possible cause or how to debug this
> further, please weigh in.
>
>     https://bugs.launchpad.net/juju-core/+bug/1318366
>
> Thanks,
> Menno

gustavo @ http://niemeyer.net
Re: Mongo experts - help need please
On Fri, Jul 25, 2014 at 2:37 AM, Ian Booth <ian.bo...@canonical.com> wrote:
> The tests passed for me every time also, with and without independent
> sessions. If I loaded my machine to max out CPU usage at 100%, then the
> tests (different ones each run) would fail intermittently but
> reproducibly every time with session copy, but I could not induce even
> one failure without session copying.

As I mentioned, it sounds like a concurrency or timing issue, which isn't really surprising given that the code at hand is indeed time sensitive, and that session.Copy will significantly alter the timing characteristics of the test. This is at the top of the test file:

    // worstCase is used for timeouts when timing out
    // will fail the test. Raising this value should
    // not affect the overall running time of the tests
    // unless they fail.
    worstCase = testing.LongWait

    // justLongEnough is used for timeouts that
    // are expected to happen for a test to complete
    // successfully. Reducing this value will make
    // the tests run faster at the expense of making them
    // fail more often on heavily loaded or slow hardware.
    justLongEnough = testing.ShortWait

    // fastPeriod specifies the period of the watcher for
    // tests where the timing is not critical.
    fastPeriod = 10 * time.Millisecond

    // slowPeriod specifies the period of the watcher
    // for tests where the timing is important.
    slowPeriod = 1 * time.Second

gustavo @ http://niemeyer.net
Re: Mongo experts - help need please
On Fri, Jul 25, 2014 at 5:29 AM, Stuart Bishop <stuart.bis...@canonical.com> wrote:
> On 25 July 2014 12:05, Gustavo Niemeyer <gustavo.nieme...@canonical.com> wrote:
>
> The bug Ian cites and is trying to work around has sessions failing with
> an i/o error after some time (I'm guessing resource starvation in
> MongoDB or TCP networking issues). session.Copy() is pulling things from
> a pool, so it might be handing out sessions doomed to fail with exactly
> the same issue. The connections in the pool could even be perfectly
> functional when they went in, with no way at the Go level of knowing
> they have failed without trying them.

That's not actually the bug Ian is asking information about in this thread. The reason why the timeouts happen is well understood: MongoDB has a fixed timeout of 10 minutes, and mgo right now does not concurrently ping a socket that was reserved for a session. Using a single session forever and never calling Refresh on it will surely time out if it stays unused for that long.

The solution is simple: call Refresh at a control point (where that is depends on the application shape), or Close a copy of the session and let the pool deal with it internally, and do handle any errors when they happen.

> If this is the case, then Ian would need to handle the failure by
> ensuring the failed connection does not go back in the pool and grabbing
> a new one (the deferred Close() will return it, I think). And repeating
> until it works, or until the pool has been exhausted and we know Mongo
> is actually down rather than just having a polluted pool.

There's no reason to do that. The pool can deal with connection errors and timeouts, and collects bad sockets appropriately. Trying to ensure a bad socket never comes out of the pool is also a bad path. It's impossible to guarantee that a socket obtained from mgo or any other database driver is indeed in perfect state: failures can happen the nanosecond after any tests are made. The reliable way is to handle errors appropriately, fall back to a sane path, and retry from there.

gustavo @ http://niemeyer.net
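The per-request discipline described in the thread (work on a copy, always close it, and on error call Refresh and surface the failure rather than silently retrying) can be sketched as follows. `Copy`, `Close`, and `Refresh` are the real mgo method names, but the `session` type below is a hypothetical stand-in so the pattern can be shown without a live MongoDB:

```go
package main

import (
	"errors"
	"fmt"
)

// session is a stand-in for *mgo.Session; everything except the method
// names Copy/Close/Refresh is made up for illustration.
type session struct {
	broken bool // simulates a socket that timed out server-side
}

func (s *session) Copy() *session { return &session{broken: s.broken} }
func (s *session) Close()         {}

// Refresh acknowledges a connection error: it hands the bad socket back
// to the pool so further operations get a fresh one.
func (s *session) Refresh() { s.broken = false }

// run stands in for an actual database operation.
func (s *session) run() error {
	if s.broken {
		return errors.New("i/o timeout")
	}
	return nil
}

// handleRequest works on a copy of the root session, always closes it,
// and on a connection error calls Refresh and reports the failure
// instead of sweeping it under the rug.
func handleRequest(root *session) error {
	sess := root.Copy()
	defer sess.Close()
	if err := sess.run(); err != nil {
		sess.Refresh() // acknowledge the error; do not hide it
		return fmt.Errorf("request failed: %w", err)
	}
	return nil
}

func main() {
	healthy := &session{}
	fmt.Println(handleRequest(healthy) == nil) // true

	stale := &session{broken: true}
	fmt.Println(handleRequest(stale) != nil) // true
}
```

The point of the sketch is the control flow, not the fake types: the caller sees the error once, acknowledges it, and decides whether retrying is safe for that particular (possibly non-idempotent) operation.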
Re: Mongo experts - help need please
On Fri, Jul 25, 2014 at 1:02 AM, Ian Booth <ian.bo...@canonical.com> wrote:
> We've transitioned to using Session.Copy() to address the situation
> whereby Juju would create a mongo collection instance and then continue
> to make db calls against that collection without realising the
> underlying socket may have become disconnected. This resulted in Juju
> components failing, logging i/o timeout errors talking to mongo, even
> though mongo itself was still up and running.

Sounds sane, as I indicated in previous discussions about the topic these last two weeks, and also about a year ago when we covered it. Serializing every single request of a concurrent server through a single database connection seems like a pretty bad idea for anything but simplistic servers.

> As an aside - I'm wondering whether the mgo driver shouldn't
> transparently catch an i/o error associated with a dead socket and retry
> using a fresh connection rather than imposing that responsibility on the
> caller?

The evidence so far indicates that this will likely not happen. The current design was purposefully put in place so that harsh connection errors are not swept under the rug, and this seems to be working well so far. I'd rather not have juju proceeding over a harsh problem such as a master re-election midway through the execution of an algorithm without any indication that the failure has happened, let alone silently retrying operations that in most cases are not idempotent.

That said, the goal is of course not to make the developer's life miserable. All the driver wants is an acknowledgement that the error was perceived and taken care of. This is done trivially by calling:

    session.Refresh()

Done. The driver will happily drop the error notice and proceed with further operations, blocking if waiting for a re-election to take place is necessary. That said, as stated above, using a single session for _everything_ might not be a good idea for other reasons.

(...)

> If session.Copy() doesn't work here, what's the approach to use to
> ensure the watcher just doesn't become dead because the underlying
> socket dies? Or how can we make the session.Copy() approach work always,
> even when the host machine is under high load? Or maybe the watcher code
> is fine and the tests are wrong?

This feels very much like a concurrency or timing issue. You might also be misunderstanding what session.Copy does... it's not so magic. If session.Copy truly prevented the watcher from working, it wouldn't work at all either way. Every independent process that connects to the database and makes a change is monitored by watchers that live in different sessions. The tests are quite simple. I'm not able to observe the test failure you mention after hacking them to use independent sessions:

    http://paste.ubuntu.com/7852418/

gustavo @ http://niemeyer.net
Re: series-agnostic charm URLs
On Wed, Jul 23, 2014 at 7:35 AM, roger peppe <rogpe...@gmail.com> wrote:
> We want to store charm URLs in mongodb that are agnostic as to whether
> the series is specified or not. For example, in a bundle, a service is
> free to specify a series in the charm name or not.

That sounds slightly surprising. How do we plan to define what the bundle actually means?

While having one or two types to represent the concept may be argued back and forth, there's an underlying concept that is important: one form is a loose wildcard that has to be resolved depending on context before being useful, and was originally designed to be used in command lines and such, while the other is a more formal specification (must have a schema, must have a series).

Accepting the loosely defined form in a bundle seems surprising, even if it just means not having a series, given that deploying the bundle would hopefully be somewhat deterministic in terms of which distributions are being used.

> I'd like to suggest that we remove the Reference type and use the URL
> type throughout, allowing it to have an unspecified series where the
> string form does not specify a series. This means that the URL type
> would be an exact reflection of the string form of a charm URL.

As noted above, a Reference may not have a schema as well, so this suggestion seems to imply that "foo" becomes a valid URL. Maybe having just URL could be made cleaner, though. This should be judged based on a more detailed proposal.

gustavo @ http://niemeyer.net
Re: series-agnostic charm URLs
On Wed, Jul 23, 2014 at 9:13 AM, Richard Harding <rick.hard...@canonical.com> wrote:
> This is driven by requirements from ecosystem and users, where bundles
> define a 'solution'. A mongodb-cluster bundle doesn't need to be updated
> every time a new revision comes out, or even if a new series comes out.
> It is a usable solution regardless. Bundles can be as specific as they
> wish to be, however requiring them to define charms specifically reduces
> their reusability and causes us to be less flexible.

When you design a system there's always a tension between what people need and what they think they need. Speaking of a different area close to our hearts, programming languages such as Perl evolved with the author hearing user requests. Developers, even fairly experienced ones, tend to want to pack as much power into as few keystrokes as possible, and a language that has a very high rate of meaning per keystroke is often deemed an expressive and powerful programming language. That feeling presumes that there is a high cost in typing a bit more, but as time passes we're learning that the semantic load has a more relevant cost in itself, and simpler but consistent primitives often yield better results.

Going back to bundles: not having to update a bundle when a new, entirely different release of Ubuntu comes out is of course much more expressive, and people love expression, but it carries a relevant semantic load with it. It also means neither we nor anybody else has any idea what people actually get when they deploy a bundle, or whether the bundle will even work tomorrow once a new major upgrade is pushed to the repository. Our focus should not be to encourage that, but to help people express what they mean clearly and easily. If they want a new release of the bundle with a slightly different meaning, that should be trivial, but it should not be trivial to express lack of clarity.

> We also have to worry about historical usage, as we've always supported
> the vague behaviour and many of the current bundles take advantage of it.

Yes, bundles were very organically developed. But I won't re-raise that rant.

gustavo @ http://niemeyer.net
Re: series-agnostic charm URLs
On Wed, Jul 23, 2014 at 9:59 AM, roger peppe <roger.pe...@canonical.com> wrote:
> The charm URL in a bundle means exactly what it would mean if you typed
> it in a juju deploy command. That is, it is dependent on the charms
> available at bundle deploy time.

I would fix that instead.

> I do believe having just URL would be significantly cleaner. What area
> would you like to see more detail on?

The code review, but it doesn't have to be me judging it.

gustavo @ http://niemeyer.net
Re: Enhancing our IRC bot?
Great timing, Kate. I was recently asked to take care of mup's deployment again, and I'm about to put live its third incarnation, reviving a hack I started back in 2011 to port the ancient Erlang bot I wrote too many years ago into a Go version. My goal, among other things, is to make plugin writing a lot easier, so this kind of problem fits well. I'm just finishing a few details and will send some notes soon.

On Wed, Jul 23, 2014 at 11:55 AM, Katherine Cox-Buday <katherine.cox-bu...@canonical.com> wrote:
> Hey all, I thought my first post to the list would be something
> relatively innocuous :)
>
> Have we ever considered enhancing our IRC bot to report CI status? Maybe
> start off with important notifications such as job failures? It might
> bring more attention to the health of trunk, and IRC is already a major
> communication hub. Interested in your thoughts!
>
> - Katherine

gustavo @ http://niemeyer.net
Re: Charm store API proposal, new version
On Tue, Jul 15, 2014 at 7:05 PM, Richard Harding <rick.hard...@canonical.com> wrote:
> It is listed under known clients in the spec, and we mentioned your
> request down below. What we lack is your specific use cases, as no one
> working on the spec is knowledgeable about how you're using the api.

Besides what others have said, requiring everyone to not only review their own usage of the existing public APIs, but to justify their cases in a convincing way, as an attempt to prevent you from breaking the existing use cases, is a pretty bad approach to API compatibility.

gustavo @ http://niemeyer.net
Re: RFC: mongo _id fields in the multi-environment juju server world
On Mon, Jul 7, 2014 at 10:09 AM, roger peppe <roger.pe...@canonical.com> wrote:
> I had assumed that because every client needs to see every transaction
> there would likely be no benefit to sharding the log, although
> technically you could shard on transaction id. I'd be

Clients don't need to see every transaction. Only those that affect the documents they are acting on.

> Thanks for pointing this out. If we manage to hugely scale juju using
> mongodb I will be very happy. I still think we should do some
> measurements to convince us that we actually have some hope of doing so,
> though. My own measurements left me less than convinced of the
> possibility, although it's been a while since I did them.

When you measured a sharded setup, what was the outcome?

gustavo @ http://niemeyer.net
Re: RFC: mongo _id fields in the multi-environment juju server world
On Mon, Jul 7, 2014 at 2:03 PM, roger peppe <roger.pe...@canonical.com> wrote:
> The latter might turn out to be quite awkward, though there's probably a
> nice solution I don't see. Suppose we've got three environments, A, B
> and C. We have transactions that span {A, B}, {B, C} and {C, A}. How can
> we choose a consistent shard key for all those transactions?

What is a consistent shard key, and why does it matter? Okay, so the measurements that left you unconvinced that sharding might help to scale up were not using sharding.

> If we struggle to meet the requirements for a single environment, we're
> unlikely to meet them when we're running several environments per shard,
> which is surely necessary if we're to scale up.

That's unsound reasoning for the context. It implies that to be able to meet a load demand with many serving machines we must be able to meet the load demand with a single serving machine. Not true.

> I hope it can work for us. I really do.

I do as well.

> I just worry that without actually doing some measurement in advance, we
> may spend a lot of time working on this stuff and find that it was all
> for nought because we're fundamentally bottlenecked somewhere we didn't
> anticipate.

By all means, please do measure and collect as much data as necessary to have a good design. We won't see any performance improvements without a reasonable understanding of how the system works and performs.

gustavo @ http://niemeyer.net
Re: move towards using gopkg.in
On Mon, Jul 7, 2014 at 6:00 PM, Ian Booth <ian.bo...@canonical.com> wrote:
> I'm somewhat wary of depending on another unknown third party website
> being

That's hilarious. I haven't been pushing for its usage on juju, and I'm still not the one actively pushing it, but that's a pretty bad argument to raise here.

gustavo @ http://niemeyer.net
Re: move towards using gopkg.in
On Mon, Jul 7, 2014 at 7:18 PM, Ian Booth <ian.bo...@canonical.com> wrote:
> It wasn't meant to be funny. I'm unsure why it's a bad argument. It's
> quite prudent to ensure that critical infrastructure on which our
> development depends meets expectations with regard to uptime,
> reliability etc (a case in point being the recent issue with an out of
> date certificate, or so I was told). Sorry if the question caused any
> offence. I raised the question totally independent of the fact that
> someone within Canonical had set up the site.

You can't both say that it is totally independent from someone next to you being responsible for it, and that it's about being an unknown third party.

If your worries are about reliability, there is a public track record of the uptime since it was put online (http://stats.pingdom.com/r29i3cfl66c0), and that uptime is supported by replicated deployments across separate cities with automatic failover. Any other concerns?

gustavo @ http://niemeyer.net
Re: move towards using gopkg.in
On Mon, Jul 7, 2014 at 8:49 PM, David Cheney <david.che...@canonical.com> wrote:
> I don't want to introduce another thing to break CI. We already pull
> from github, which is bad enough, but going via gopkg.in introduces an
> additional point of failure which can further reduce the already
> bullet-ridden credibility of our CI.

Again, gopkg.in sits in a reliable deployment, with a provable track record.

> I also don't want to start introducing versioned import paths into Juju
> without serious discussion of how to prevent two different versions of a
> package being imported transitively.
>
>     go list -f '{{range .Deps}}{{printf "%s\n" .}}{{end}}' | grep gopkg.in |
>         sort -u | sed 's/\.v[0-9]\+$/.vN/' | uniq -c | sed '/ 1 /d'
>
> I am NOT LGTM on any change that introduces gopkg.in redirected import
> paths until the issue above is resolved.

Okay, that's done.

gustavo @ http://niemeyer.net
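For concreteness, here is the same normalization the pipeline above performs, applied to a made-up dependency list (the paths below are examples, not Juju's actual dependencies): any package imported under two different major versions collapses to the same `.vN` line and shows up with a count above one.

```shell
# Hypothetical dependency list; in real use it would come from
# `go list -f '{{range .Deps}}{{printf "%s\n" .}}{{end}}' | grep gopkg.in`.
deps='gopkg.in/mgo.v2
gopkg.in/yaml.v1
gopkg.in/yaml.v2'

# Normalize the major-version suffix and keep only duplicated packages.
conflicts=$(printf '%s\n' "$deps" |
    sort -u |
    sed 's/\.v[0-9][0-9]*$/.vN/' |
    uniq -c |
    awk '$1 > 1 {print $2}')

echo "$conflicts"   # gopkg.in/yaml.vN
```

Here yaml.v1 and yaml.v2 both normalize to `gopkg.in/yaml.vN`, so the package is flagged, while the single mgo.v2 import is not.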
Re: RFC: mongo _id fields in the multi-environment juju server world
On Fri, Jul 4, 2014 at 6:01 AM, roger peppe <roger.pe...@canonical.com> wrote:
> There is another possibility: we could just use a different collection
> name prefix for each environment. There is no hard limit on the number
> of collections in mongo (see
> http://docs.mongodb.org/manual/reference/limits/).

For sharding, and for good space management in general, it's better to have data in a collection that gets automatically managed by the cluster. It's also much simpler to deal with in general, even if it does require code changes to get started.

> - for a small environment, table indexes remain small and lookups fast
>   even though the total number of entries might be huge.

Same as above: when it gets _huge_ you need sharding either way, and it's easier and more efficient to manage a single collection than 10k.

> - each environment could have a separate mongo txn log, so one busy
>   environment that's constantly adding transactions will not
>   necessarily slow down all the others.

There is, in general, no need for sequential consistency between environments. With txn there's no sequential consistency even within the same environment, if you're touching different documents.

> - database isolation between environments is an advantage when things
>   go wrong - it's easier to fix or delete individual environments if
>   their tables are isolated from one another.

Sure, it prevents bad mistakes caused by not taking the environment id into consideration, but deleting foo:* is just as easy.

> I suggest that, at the least, taking this approach would be a quick road
> to making the state work with multiple environments. It would not
> preclude a move to changing to use composite keys in the future.

We already know it's a bad idea today. Let's please not make that mistake.

gustavo @ http://niemeyer.net
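The "foo:*" remark above refers to composite _id values that embed the environment identifier, so every environment's documents live in one shared, shardable collection. A minimal sketch of building and splitting such keys (the ":" separator is an assumption for illustration, not necessarily the exact scheme Juju settled on):

```go
package main

import (
	"fmt"
	"strings"
)

// docID builds a composite _id: the environment identifier joined to the
// document's local key. With this scheme all environments share one
// collection, and wiping an environment is a delete over the "env:" prefix.
func docID(env, localID string) string {
	return env + ":" + localID
}

// splitDocID recovers the environment and local key from a composite _id.
// It assumes a well-formed id; malformed input yields an empty env.
func splitDocID(id string) (env, localID string) {
	parts := strings.SplitN(id, ":", 2)
	if len(parts) != 2 {
		return "", id
	}
	return parts[0], parts[1]
}

func main() {
	id := docID("env-a", "ntp")
	fmt.Println(id) // env-a:ntp
	env, local := splitDocID(id)
	fmt.Println(env, local) // env-a ntp
}
```

The design trade-off debated in the thread is exactly this: composite keys require touching every query, but keep the data in a single collection the cluster can shard and manage, rather than thousands of per-environment collections.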
Re: RFC: mongo _id fields in the multi-environment juju server world
On Fri, Jul 4, 2014 at 10:32 AM, roger peppe <roger.pe...@canonical.com> wrote:
> It won't be possible to shard the transaction log.

Why not?

> The thing I'm trying to get across is: until we know one way or another,
> I believe it would be better to choose the (much) simpler option and use
> the (potential weeks of) dev time for other things.

We know it's a bad idea. Besides everything else I mentioned, there are _huge_ MongoDB databases out there that depend on sharding to scale... we're talking hundreds of machines. It seems very naive to go with a model that loses the benefits of all the lessons the MongoDB development team learned from those use cases, and of the work they have done to support them well.

We have been there in Canonical. Ask folks about the CouchDB story.

gustavo @ http://niemeyer.net
Re: Port ranges - restricting opening and closing ranges
+1 to Mark's point. Handling exact matches is much easier, and does not prevent a fancier feature later, if there's ever the need.

On Thu, Jun 26, 2014 at 3:38 PM, Mark Ramm-Christensen (Canonical.com) <mark.ramm-christen...@canonical.com> wrote:
> My belief is that as long as the error messages are clear, and it is
> easy to close 8000-9000 and then open 8000-8499 and 8600-9000, we are
> fine. Of course it is nicer if we can do that automatically for you, but
> I don't see why we can't add that later, and I think there is value in
> keeping a port-range as an atomic data-object either way.
>
> --Mark Ramm
>
> On Thu, Jun 26, 2014 at 2:11 PM, Domas Monkus <domas.mon...@canonical.com> wrote:
>> Hi, Matthew Williams and I are working on support for port ranges in
>> juju. There is one question that the networking model document does not
>> answer explicitly, and the simplicity (or complexity) of the
>> implementation depends greatly on that. Should we only allow units to
>> close exactly the same port ranges that they have opened? That is, if a
>> unit opens the port range [8000-9000], can it later close ports
>> [8500-8600], effectively splitting the previously opened port range in
>> half?
>>
>> Domas

gustavo @ http://niemeyer.net
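The exact-match rule the thread converged on is easy to sketch. The code below is illustrative only (the names `PortRange` and `closeRange` are made up for this example, not Juju's actual implementation): closing a sub-range of an open range is rejected with a "mismatched port range" error, as in the expected behaviour quoted at the top of this digest.

```go
package main

import "fmt"

// PortRange is an atomic from-to pair, kept whole as Mark suggests.
type PortRange struct {
	From, To int
}

// closeRange only accepts a range that exactly matches a previously
// opened one; closing e.g. 8500-8600 out of an open 8000-9000 fails.
// Note the append trick reuses the input slice's backing array, which
// is fine for a sketch but mutates the caller's slice.
func closeRange(open []PortRange, r PortRange) ([]PortRange, error) {
	for i, o := range open {
		if o == r {
			return append(open[:i], open[i+1:]...), nil
		}
	}
	return open, fmt.Errorf("mismatched port range %d-%d", r.From, r.To)
}

func main() {
	open := []PortRange{{8000, 9000}}

	// Closing a sub-range is an error under the exact-match rule.
	if _, err := closeRange(open, PortRange{8500, 8600}); err != nil {
		fmt.Println(err) // mismatched port range 8500-8600
	}

	// Closing exactly what was opened succeeds.
	open, _ = closeRange(open, PortRange{8000, 9000})
	fmt.Println(len(open)) // 0
}
```

Splitting a range then becomes an explicit two-step operation for the charm (close 8000-9000, open 8000-8499 and 8600-9000), which keeps the model simple while leaving room for an automatic split feature later.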
Re: Thoughts to keep in mind for Code Review
Agreed, but for a slightly different reason. The suggestion is to annotate the patch with the reason for the change, rather than the code itself, which might indeed lead to a different kind of comment.

While this might be useful, one of the interesting outcomes of code reviewing is that it forces the final logic to go through different eyes and mindsets. The "I don't get it" is not always a bad thing in a review... it's rather the reason why simplifications and entirely different approaches are suggested. Many times I consciously avoid reading an ongoing discussion in the review before doing my own review, precisely so I can get a fresh perspective on the code before getting to know everyone else's. Then, with inline reviewing, saying "Please tell me why you did this" is very cheap on both ends.

On Wed, Jun 25, 2014 at 1:42 AM, Ian Booth <ian.bo...@canonical.com> wrote:
> -1 on annotations. If you need to annotate to make it clearer, then that
> should be done as code comments, so the next poor soul who reads the
> code has a clue of what's been done.
>
> On 25/06/14 14:20, John Meinel wrote:
>> An interesting article from IBM:
>> http://www.ibm.com/developerworks/rational/library/11-proven-practices-for-peer-review/
>>
>> There is a pretty strong bias for "we found these results, and look at
>> how our tool makes it easier to follow these guidelines", but the core
>> results are actually pretty good. I certainly recommend reading it and
>> keeping some of it in mind while you're both coding and reviewing.
>> (Particularly how long code review should take, and how much code
>> should be put up for review at a time.)
>>
>> Another trick that we haven't made much use of is to annotate the code
>> we put up for review. We have the summary description, but you can
>> certainly put some inline comments on your own proposal if you want to
>> highlight areas more clearly.
>>
>> John =:-

gustavo @ http://niemeyer.net
Re: Thoughts to keep in mind for Code Review
Thanks, John. Several nice ideas there. I especially like the data backing the first few points... it provides evidence for something we intuitively understand. I also wrote some points about this same topic, but from a slightly different perspective, last year:

    http://blog.labix.org/2013/02/06/ethics-for-code-reviewers

On Wed, Jun 25, 2014 at 1:20 AM, John Meinel <j...@arbash-meinel.com> wrote:
> An interesting article from IBM:
> http://www.ibm.com/developerworks/rational/library/11-proven-practices-for-peer-review/
> [...]

gustavo @ http://niemeyer.net
Re: This is why we should make go get work on trunk
go is the default build tool, and the vast majority of go projects work out of the box with go get. If we cannot make it work, that's fine, but looking at other projects that cannot get it to work is no excuse. If you guys can make it work, even if we continue to support godep(s), by all means do it. Not only it's a better welcome for Go developers, but it also means these pieces can more easily be used in other projects too, without having to import the whole build system. On Fri, Jun 6, 2014 at 6:11 PM, Kapil Thangavelu kapil.thangav...@canonical.com wrote: just as it fails for many other projects.. etcd, docker, serf, consul, etc... most larger projects are going to run afoul of trying to do cowboy dependency management and adopt one of the extant tools for managing deps and have a non standard install explained to users in its readme, else its vendoring its deps. -k On Fri, Jun 6, 2014 at 5:05 PM, Nate Finch nate.fi...@canonical.com wrote: (Resending since the list didn't like my screenshots) https://twitter.com/beyang/statuses/474979306112704512 https://github.com/juju/juju/issues/43 Any tooling that exists for go projects is going to default to doing go get. Developers at all familiar with go, are going to use go get. People are going to do go get github.com/juju/juju and it's going to fail to build, and that's a terrible first impression. Yes, we can update the README to tell people to run godeps after running go get, and many people are not going to read it until after they get the error building. Here's my suggestion: We make go get work on trunk and still use godeps (or whatever) for repeatable builds of release branches. There should never be a time when tip of trunk and all dependent repos don't build. This is exceedingly easy to avoid. Go crypto (which I believe is what is failing above) is one of the few repos we rely on that isn't directly controlled by us. 
We should fork it so we can control when it updates (since the people maintaining it seem to not care about making breaking API changes). -Nate -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev -- gustavo @ http://niemeyer.net
Re: GitHub issues
The comment was made with the understanding that this was your original plan, and the point is to measure engagement before closing it down, or you'll never know whether it makes any difference for juju specifically. Also, isn't Launchpad able to track issues originally filed on other trackers? That used to be one of its big selling points for distro work. gustavo @ http://niemeyer.net On Jun 4, 2014 10:43 PM, Ian Booth ian.bo...@canonical.com wrote: Actually the original plan was not to enable GitHub's issue tracker and to continue using Launchpad's. Having two issue trackers is not optimal and will create too much management overhead and wasted effort. We are continuing to use Launchpad's milestones for scoping and planning releases etc., and of course this all ties in with Launchpad's issue tracker. So I'd prefer to stick with the plan and disable GitHub's tracker. This was meant to be done when the repo was set up. On 05/06/14 00:23, Gustavo Niemeyer wrote: I would keep them around for a while and try to observe how the community reacts to the availability. If people don't care, then just closing it sounds fine. If you start to get engagement there, it might be worth going to the trouble of supporting users that live in that ecosystem. My experience has been that I got significantly more engagement, including bugs, once moving projects over to github. On Wed, Jun 4, 2014 at 10:13 AM, Curtis Hovey-Canonical cur...@canonical.com wrote: On Wed, Jun 4, 2014 at 6:36 AM, Andrew Wilkins andrew.wilk...@canonical.com wrote: What are our options? Is it simplest just to disable GitHub issues, and have the lander pick up fixes lp:NN and add a comment to the bug in Launchpad? I think this is the easiest path.
-- Curtis Hovey Canonical Cloud Development and Operations http://launchpad.net/~sinzui -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: not rebasing after PR?
FWIW, I pretty much never rebase in my usual development workflow. I'm surprised to hear it became a norm somehow. On Thu, Jun 5, 2014 at 2:06 PM, roger peppe rogpe...@gmail.com wrote: I'd love to ditch rebasing if it was reasonable to do so. It just adds overhead to an already tiresome procedure. On 5 June 2014 16:22, Nate Finch nate.fi...@canonical.com wrote: I am far from a git expert, but it sounds like we can get a bzr-like overview of merges to trunk if we give git the right command. This is from the canonical-tech discussion: (from Dimitri John Ledkov) On Thu, Jun 5, 2014 at 2:26 PM, Ian Booth ian.bo...@canonical.com wrote: (from Nate Finch) As for bzr versus git, I honestly don't see much of a difference. I know there are things that bzr does better than git, but they're not features I really ever used, so I don't miss them. What about all the complications, hassle, and extra overhead of needing to rebase all the time due to git's logging model? There's just no need for that in bzr, so the workflow is *much* simpler [0]. bzr defaults to showing just the first parent only, but you can see all the gory details with $ bzr log -n 0. git defaults to the gory details, but you can get the equivalent of bzr's default view as well, e.g. compare the output of: $ git log --oneline --graph --decorate with $ git log --oneline --graph --decorate --first-parent If one consistently merges in individual branches only, git will generate the same graph history as bzr and will be able to present it the same way bzr would. This sounds like it might solve some of the problems we're worrying about that get caused by rebasing, such as losing comments etc. It sounds like this might be a usable workflow: commit several times to your feature branch.
rebase into a single commit; submit the pull request; comment on the pull request; commit patches to the pull request; merge the pull request as-is (with the extra commits made after submitting). This mashes all your pre-PR commits into one, so it hides some commit spam that way, but then keeps the post-PR commits, to preserve comments. It sounds like we can still get a list of just the merges from git, to exclude all the commits made during code review. This sounds like the best of both worlds (or as close as we can get) and removes one more step (rebasing after code review changes), which seems like a good thing. Thoughts? -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev -- gustavo @ http://niemeyer.net
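The --first-parent view Nate quotes above is what makes the merge-only-trunk workflow readable. A throwaway-repo sketch (nothing juju-specific; the repo, file, and commit messages are made up) showing the full log versus the first-parent log after a --no-ff merge:

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
git config user.email "demo@example.com"
git config user.name "demo"

# One commit on trunk, two on a feature branch, then a merge commit.
echo one > file; git add file; git commit -qm "initial"
git checkout -qb feature
echo two >> file;   git commit -qam "feature: step 1"
echo three >> file; git commit -qam "feature: step 2"
git checkout -q -
git merge -q --no-ff -m "merge feature branch" feature

# Full history shows every in-branch commit; --first-parent walks
# only trunk, so it shows just the initial commit and the merge.
full=$(git log --oneline | wc -l | tr -d ' ')
fp=$(git log --oneline --first-parent | wc -l | tr -d ' ')
echo "full history: $full commits; first-parent: $fp commits"
```

With consistent merge-only trunk, the first-parent view stays one line per landed branch no matter how many review-fix commits each PR accumulated.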
Re: GitHub issues
I would keep them around for a while and try to observe how the community reacts to the availability. If people don't care, then just closing it sounds fine. If you start to get engagement there, it might be worth going to the trouble of supporting users that live in that ecosystem. My experience has been that I got significantly more engagement, including bugs, once moving projects over to github. On Wed, Jun 4, 2014 at 10:13 AM, Curtis Hovey-Canonical cur...@canonical.com wrote: On Wed, Jun 4, 2014 at 6:36 AM, Andrew Wilkins andrew.wilk...@canonical.com wrote: What are our options? Is it simplest just to disable GitHub issues, and have the lander pick up fixes lp:NN and add a comment to the bug in Launchpad? I think this is the easiest path. -- Curtis Hovey Canonical Cloud Development and Operations http://launchpad.net/~sinzui -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev -- gustavo @ http://niemeyer.net
Re: Juju, mongo 2.6 and labix.org/v2/mgo issue
It's indeed being updated. The frequent sprints haven't been helping, but I'm hoping to have a new release out next week. gustavo @ http://niemeyer.net On May 28, 2014 8:19 AM, Ian Booth ian.bo...@canonical.com wrote: Hi all I'm testing Juju with Mongo 2.6 to evaluate how that affects our remaining intermittent unit test failures. I've compiled a copy of Mongo 2.6 and have been able to bootstrap an environment with no issues. Great so far. However, the tests aren't happy, e.g. the tests in agent/mongo fail, as do a bunch of others. It seems Mongo 2.4 -> 2.6 has changed the way admin users are created. In Juju, we have an EnsureAdminUser() function. It does this: session.DB("admin").AddUser(p.User, p.Password, false) That fails with: not authorized for upsert on admin.system.users Fine, so the AddUser API doc in the mgo driver says to use UpsertUser for mongo 2.4 or greater: session.DB("admin").UpsertUser(&mgo.User{Username: p.User, Password: p.Password, Roles: []mgo.Role{mgo.RoleUserAdminAny}}) It still fails the same way. So I reverted to calling the createUser command directly as per the Mongo 2.6 docs: session.DB("admin").Run(bson.D{{"createUser", p.User}, {"pwd", p.Password}, {"roles", []mgo.Role{mgo.RoleUserAdminAny}}}, nil) The above works for the initially failing tests in agent/mongo. I haven't re-run the entire suite again though. It may be that further tweaks are required. I can easily continue using the last construct above, but it *seems* that the mgo driver may need updating? Am I missing something? -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: Ensuring tests pass on gccgo
On Wed, May 21, 2014 at 10:43 PM, Ian Booth ian.bo...@canonical.com wrote: We are working to make all juju-core unit tests pass using gccgo. In case you didn't already know, there's a common issue which has caused a lot of the failures to date. Here's a quick heads up on how to deal with it. golang-go and gcc-go have different map implementations which results in ordering differences, affecting things like range etc (simplistically put, gcc-go's map ordering is random whereas currently golang-go is somewhat deterministic). This is changing in the main compiler as well, in Go 1.3: http://tip.golang.org/doc/go1.3#map So it'll become even less deterministic there as well. Now of course, maps are unordered but what we sometimes do in the code is to use a map to hold some data (maybe to eliminate duplicates) and then expose that data via a slice or array. If we then do a c.Assert(v1, gc.DeepEquals, v2), it will fail on gcc-go, since the order of items in the 2 slices is different, even though the values are the same. If that's really the case, it's definitely a bug in gccgo. gocheck's DeepEquals is implemented in terms of reflect.DeepEqual, which should not care about the map order. In the standard library of the main compiler, it clearly does not: for _, k := range v1.MapKeys() { if !deepValueEqual(v1.MapIndex(k), v2.MapIndex(k), visited, depth+1) { return false } } So gocheck's DeepEquals is fine for such map tests, assuming no bugs in the underlying implementation. gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: Implementing Juju Actions
On Thu, Mar 27, 2014 at 12:05 PM, James Solomon binary...@gmail.com wrote: I'd like to clarify what I'm understanding here: we are to implement the new commands alongside deploy and set as verbs belonging to the Charm code. And these commands are implemented separately from the /cmd code tree (I guess the Command and RunCommand interfaces are for the juju run code discussed above.) That's almost right. It does need something analogous to the set command, and that is in fact sitting right next to the set configuration command. This is the do command in juju do ..., and is not a verb belonging to the charm code. In addition to that, it needs action-get and action-set commands, analogous to config-get and config-set, and that is available to the charm hooks. That's surprising, FWIW -- on that side note, one scalable alternative to parallel SSH for remote exec is ZeroMQ, which is really effective in We already have a comprehensive mechanism to distribute requests to the unit agents. The main surprise is that it's not being used in this case. That said, if we are to discuss this, let's please start a new thread as this is a completely independent subject. gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: arresting bootstrap teardown
How about --keep-on-error? On Mon, Mar 24, 2014 at 3:00 PM, roger peppe rogpe...@gmail.com wrote: If anyone, like me, has been frustrated when debugging bootstrap failures by having the bootstrap machine torn down immediately on failure, a quick and relatively easy workaround is to kill -STOP the juju bootstrap process while it's doing the ssh commands. You'll continue to see the ssh commands execute, but the parent process will stop when they finish, allowing you time to ssh into the bootstrap machine and inspect it. kill -CONT to allow the process to complete its cleanup. -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev -- gustavo @ http://niemeyer.net
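The SIGSTOP/SIGCONT trick works on any process; a sketch using a plain sleep as a stand-in for the juju bootstrap process (Linux ps flags assumed):

```shell
# Start a stand-in process in the background.
sleep 60 &
pid=$!
sleep 0.2

# SIGSTOP freezes the process (cannot be caught or ignored),
# so its teardown logic never runs while you inspect things.
kill -STOP "$pid"
sleep 0.2
stopped=$(ps -o stat= -p "$pid" | tr -d ' ')
echo "state after STOP: $stopped"    # contains T (stopped)

# SIGCONT lets it resume and finish its cleanup normally.
kill -CONT "$pid"
sleep 0.2
resumed=$(ps -o stat= -p "$pid" | tr -d ' ')
echo "state after CONT: $resumed"

kill "$pid" 2>/dev/null || true      # tidy up the stand-in
```

The key property for debugging is that a stopped process holds all its state (and its children keep running), so nothing gets torn down until you explicitly send CONT.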
Re: Upcoming import change for loggo
On Wed, Mar 5, 2014 at 5:12 PM, Nate Finch nate.fi...@canonical.com wrote: For the record, I'm not a fan of duplicating the package name of anything in the standard library. Obviously, sometimes collisions will happen if a new package is added to the standard library, but it seems like a bad idea to do it on purpose. When you're deep in the middle of a file, and you see log.Printf() That looks like a pretty interesting example of when a matching package name *is* a good idea. If I was able to just switch a logging package and be able to have things working seamlessly, I'd love it. gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: Go Style Guide
On Thu, Feb 20, 2014 at 5:31 PM, Nate Finch nate.fi...@canonical.com wrote: One thing that I thought was very interesting was using import dot to get around circular references for tests. I actually hit this exact problem just yesterday. https://code.google.com/p/go-wiki/wiki/Style#Import_Dot I prefer to import the package by its own name, even when there are no circular dependencies. gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: Go Style Guide
On Thu, Feb 20, 2014 at 6:00 PM, Nate Finch nate.fi...@canonical.com wrote: Well, nevermind. That's just terrible. It's just black box testing the same as any external tests, except obfuscated because you're not using the package name. I don't know why you'd ever want to do that. Right, exactly. gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: New juju-mongodb package
Thanks for pushing this, James. It would be good to have the mongo binary available and working as well, also under that juju-specific namespace. This is the console client, and it will be useful for connecting to the local juju database when debugging issues. On Thu, Nov 28, 2013 at 11:46 AM, James Page james.p...@canonical.com wrote: Hi Folks I've started working on the new, stripped-down, juju-specific MongoDB package that we have been discussing over the last few weeks. I'm proposing a package structure like this: ./usr/lib/juju/bin/mongos ./usr/lib/juju/bin/mongod No users will be created; it's just the binaries. Upstart and general system configuration, such as creating users, will be the responsibility of juju. The mongod and mongos binaries will be provided in a juju-namespaced location to avoid conflicting with the standard mongodb package; v8 will be linked statically using the embedded copy of v8 in the mongodb source code - this avoids exposing v8 generally in main and allows the security team to manage mongodb/v8 in the context of its use with juju, rather than in more broad general use. The plan is that we will apply for a minor release exception for this package, and that if need be we can update to a new major release (2.6 for example) at some point in the future without impacting the rest of the distro (as bumping the standard mongodb package would). The total compressed package size is about 7MB, expanding to about 23MB on disk. I still need to do some work on getting the embedded v8 copy to build for armhf (MongoDB upstream strip this out) - arm64 has been discussed but that's going to need some work upstream to enable v8 for this architecture. Other bugs pertinent to MongoDB/juju usage include: https://bugs.launchpad.net/juju-core/+bug/1208430 I'm pretty sure that running mongodb as non-root will be part of the security team signoff on the MIR review.
Cheers James - -- James Page Technical Lead Ubuntu Server Team james.p...@canonical.com -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev -- gustavo @ http://niemeyer.net
Deleting code from goyaml
<davecheney> wallyworld_: i fixed the bug, tests all pass
<davecheney> by deleting code
<davecheney> i'm not sure how gustavo will like that :)
<wallyworld_> davecheney: ah, ok. good luck :-)

For the record, please don't delete apparently unused logic from the *c.go files in goyaml, unless you went deep into the subject and justified it accordingly in the proposal. There is certainly a non-trivial number of uncovered paths, because these files were ported from the C libyaml. For that reason, goyaml will definitely have uncovered paths, not only because we may be lacking test paths, but also because we may be lacking the feature itself at the moment (for example, multi-document parsing). We should evolve towards having more tests and more of these features covered, instead of nuking the logic without proper analysis that it was unnecessary in C as well. gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: Deleting code from goyaml
I don't think the facts I brought up were clear, independently of what the MP does (For the record ...). On Thu, Nov 14, 2013 at 9:48 AM, Ian Booth ian.bo...@canonical.com wrote: There was no deleted code in the MP that I saw: https://code.launchpad.net/~dave-cheney/goyaml/goyaml/+merge/195162 Dave may have been referring on irc to an earlier iteration of his work. His approach was also discussed at the Juju team meeting, and unless I mis-remember, there was broad approval of the approach taken. On 14/11/13 21:33, Gustavo Niemeyer wrote: davecheney wallyworld_: i fixed the bug, tests all pass davecheney by deleting code davecheney i'm not sure how gustavo will like that :) wallyworld_ davecheney: ah, ok. good luck :-) For the record, please don't delete apparently unused logic from the *c.go files in goyaml, unless you went deep into the subject and justified accordingly in the proposal. There is certainly a non-trivial number of uncovered paths, because these files were ported from the C libyaml. For that reason, goyaml will definitely have uncovered paths, not only because we may be lacking paths, but also because we may be lacking the feature itself at the moment (for example, multi-document parsing). We should evolve towards having more tests and more of these features covered, instead of nuking the logic without proper analysis that it was unnecessary in C also. gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: High Availability command line interface - future plans.
On Fri, Nov 8, 2013 at 8:31 AM, John Arbash Meinel j...@arbash-meinel.com wrote: I would probably avoid putting such an emphasis on any machine can be a manager machine. But that is my personal opinion. (If you want HA you probably want it on dedicated nodes.) Resource waste holds juju back for the small users. Being able to share a state server with other resources does sound attractive from that perspective. It may be the difference between running 3 machines or 6. I would probably also remove the machine if the only thing on it was the management. Certainly that is how people want us to do juju remove-unit. If there are other units in the same machine, we should definitely not remove the machine on remove-unit. The principle sounds the same with state servers. The main problem with this is that it feels slightly too easy to add just 1 machine and then not actually have HA (mongo stops allowing writes if you have a 2-node cluster and lose one, right?) +1 gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: High Availability command line interface - future plans.
These are *very* good points, Mark. Taking them to heart will definitely lead in a good direction for the overall feature development. It sounds like we should avoid using a management command for anything in juju, though. Most things in juju are about management one way or the other, so juju management becomes very unclear and hard to search for. Instead, the command might be named after what we've been calling them: juju add-state-server -n 2 For implementation convenience's sake, it would be okay to only ever accept -n 2 when this is first released. I can also imagine the behavior of this command resembling add-unit in a few aspects, since a state server is in fact code that just needs a home to run in. This may yield other common options across them, such as machine selection. On Fri, Nov 8, 2013 at 6:47 AM, Mark Canonical Ramm-Christensen mark.ramm-christen...@canonical.com wrote: I have a few high level thoughts on all of this, but the key thing I want to say is that we need to get a meeting set up next week for the solution to get hammered out. First, conceptually, I don't believe the user model needs to match the implementation model. That way lies madness -- users care about the things they care about and should not have to understand how the system works to get something basic done. See: http://www.amazon.com/The-Inmates-Are-Running-Asylum/dp/0672326140 for reasons why I call this madness. For that reason I think the path of adding a --jobs flag to add-machine is not a move forward. It is exposing implementation detail to users and forcing them into a more complex conceptual model. Second, we don't have to boil the ocean all at once. An ensure-ha command that sets up additional server nodes is better than what we have now -- nothing. Nate is right, the box need not be black: we could have a juju ha-status command that just shows the state of HA.
This is fundamentally different than changing the behavior and meaning of add-machines to know about juju jobs and agents and forcing folks to think about that. Third, I think it is possible to chart a course from ensure-ha as a shortcut (implemented first) to the type of syntax and feature set that Kapil is talking about. And let's not kid ourselves, there are a bunch of new features in that proposal:
* Namespaces for services
* support for subordinates to state services
* logging changes
* lifecycle events on juju jobs
* special casing the removal of services that would kill the environment
* special casing the status to know about HA and warn for even state server nodes
I think we will be adding a new concept and some new syntax when we add HA to juju -- so the idea is just to make it easier for users to understand, and to allow a path forward to something like what Kapil suggests in the future. And I'm pretty solidly convinced that there is an incremental path forward. Fourth, the spelling ensure-ha is probably not a very good idea; the cracks in that system (like taking a -n flag, and dealing with failed machines) are already apparent. I think something like Nick's proposal for add-manager would be better, though I don't think that's quite right either. So, I propose we add one new idea for users -- a state-server. Then you'd have:
juju management --info
juju management --add
juju management --add --to 3
juju management --remove-from
I know this is not following the add-machine format, but I think it would be better to migrate that to something more like this: juju machine --add --Mark Ramm On Thu, Nov 7, 2013 at 8:16 PM, roger peppe roger.pe...@canonical.com wrote: On 6 November 2013 20:07, Kapil Thangavelu kapil.thangav...@canonical.com wrote: instead of adding more complexity and concepts, it would be ideal if we could reuse the primitives we already have. ie juju environments have three user exposed services, that users can add-unit / remove-unit etc.
they have a juju prefix and therefore are omitted by default from status listing. That's a much simpler story to document. how do i scale my state server.. juju add-unit juju-db... my provisioner juju add-unit juju-provisioner. I have a lot of sympathy with this point of view. I've thought about it quite a bit. I see two possibilities for implementing it: 1) Keep something like the existing architecture, where machine agents can take on managerial roles, but provide a veneer over the top which specially interprets service operations on the juju built-in services and translates them into operations on machine jobs. 2) Actually implement the various juju services as proper services. The difficulty I have with 1) is that there's a significant mismatch between the user's view of things and what's going on underneath. For instance, with a built-in service, can I: - add a subordinate service to it? - see the relevant log file in the usual place for a unit? - see its
Re: High Availability command line interface - future plans.
On Fri, Nov 8, 2013 at 9:39 AM, Nate Finch nate.fi...@canonical.com wrote: If you only have 3 machines, do you really need HA from juju? You don't have HA from your machines that are actually running your service. Why not? I have three machines.. Yeah, same here. I still think we need a turn on HA mode command that'll bring you to 3 servers. It doesn't have to be the swiss army knife that we said before... just something to go from non-HA to valid HA environment. This looks fine: juju add-state-server -n 2 It's easy to error if current + n is not a good number. gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: High Availability command line interface - future plans.
We'll end up with a command that adds a state server, with a replica of the database and an API server. That's the notion of state server we've been using all along, and sounds quite reasonable, easy to explain and understand. On Fri, Nov 8, 2013 at 10:15 AM, roger peppe roger.pe...@canonical.com wrote: On 8 November 2013 12:03, Gustavo Niemeyer gust...@niemeyer.net wrote: Splitting API and db at some point sounds sensible, but it may be easy and convenient to think about a state server as API+db for the time being. I'd prefer to start with a command name that implies that possibility; otherwise we'll end up either with a command that doesn't describe what it actually does, or more very similar commands where one could be sufficient. Hence my discomfort with add-state-server as a command name. -- gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: High Availability command line interface - future plans.
juju add-state-server --api-only-please-thanks On Fri, Nov 8, 2013 at 11:43 AM, roger peppe roger.pe...@canonical.com wrote: On 8 November 2013 13:33, Gustavo Niemeyer gust...@niemeyer.net wrote: We'll end up with a command that adds a state server, with a replica of the database and an API server. That's the notion of state server we've been using all along, and sounds quite reasonable, easy to explain and understand. And when we want to split API and db, as you thought perhaps might be sensible at some point, what then? -- gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: High Availability command line interface - future plans.
On Fri, Nov 8, 2013 at 12:04 PM, roger peppe roger.pe...@canonical.com wrote: On 8 November 2013 13:51, Gustavo Niemeyer gust...@niemeyer.net wrote: juju add-state-server --api-only-please-thanks And if we want to allow a machine that runs the environment-manager workers but not the api server or mongo server (not actually an unlikely thing given certain future possibilities) then add-state-server is a command that doesn't necessarily add a state server at all... That thought was the source of my doubt. The fact you can organize things a thousand ways doesn't mean we should offer a thousand knobs. A state server is a good abstraction for there are management routines running there. You can define what that means, as long as you don't let things fall down when N/2-1 machines fall down. gustavo @ http://niemeyer.net -- Juju-dev mailing list Juju-dev@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Re: High Availability command line interface - future plans.
It doesn't feel like the difference between juju ensure-ha --prefer-machines 11,37 and juju add-state-server --to 11,37 is worth the amount of reasoning there. I'm clearly in favor of the latter, but I wouldn't argue so much for it. On Fri, Nov 8, 2013 at 2:00 PM, William Reade william.re...@canonical.com wrote: I'm concerned that we're (1) rehashing decisions made during the sprint and (2) deviating from requirements in doing so. In particular, abstracting HA away into management manipulations -- as roger notes, pretty much isomorphic to the jobs proposal -- doesn't give users HA so much as it gives them a limited toolkit with which they can more-or-less construct their own HA; in particular, allowing people to use an even number of state servers is strictly a bad thing [0], and I'm extremely suspicious of any proposal that opens that door. Of course, some will argue that mongo should be able to scale separately from the api servers and other management tasks, and this is a worthy goal; but in this context it sucks us down into the morass of exposing different types of management on different machines, and ends up approaching the jobs proposal still closer, in that it requires users to assimilate a whole load of extra terminology in order to perform a conceptually simple function. Conversely, ensure-ha (with a possible optional --redundancy=N flag, defaulting to 1) is a simple model that can be simply explained: the command's sole purpose is to ensure that juju management cannot fail as a result of the simultaneous failure of <= N machines. It's a *user-level* construct that will always be applicable even in the context of a more sophisticated future language (no matter what's going on with this complicated management/jobs business, you can run that and be assured you'll end up with at least enough manager machines to fulfil the requirement you clearly stated in the command line).
> I haven't seen anything that makes me think that redesigning from
> scratch is in any way superior to refining what we already agreed upon;
> and it's distracting us from the questions of reporting and correcting
> manager failure when it occurs. I assert the following series of
> arguments:
>
> * users may discover at any time that they need to make an existing
>   environment HA, so ensure-ha is *always* a reasonable user action
> * users who *don't* need an HA environment can, by definition, afford to
>   take the environment down and reconstruct it without HA if it becomes
>   unimportant
> * therefore, scaling management *down* is not the highest priority for
>   us (but it is nonetheless easily amenable to future control via the
>   ensure-ha command -- just explicitly set a lower redundancy number)
> * similarly, allowing users to *directly* destroy management machines
>   enables exciting new failure modes that don't really need to exist
> * the notion of HA is of somewhat limited worth when there's no way to
>   make a vulnerable environment robust again
> * the more complexity we shovel onto the user's plate, the less likely
>   she is to resolve the situation correctly under stress
> * the most obvious, and foolproof, command for repairing HA would be
>   ensure-ha itself, which could very reasonably take it upon itself to
>   replace manager nodes detected as down -- assuming a robust presence
>   implementation, which we need anyway, this (1) works trivially for
>   machines that die unexpectedly and (2) allows a backdoor for resolving
>   weird situations: the user can manually shut down a misbehaving
>   manager out-of-band and run ensure-ha to cause a new one to be spun up
>   in its place; once HA is restored, the old machine will no longer be a
>   manager, no longer be indestructible, and can be cleaned up at leisure
> * the notion is even more limited when you can't even tell when
>   something goes wrong
> * therefore, HA state should *at least* be clearly and loudly
>   communicated in status
> * but that's not very proactive, and I'd like to see a plan for how
>   we're going to respond to these situations when we detect them
> * the data accessible to a manager node is sensitive, and we shouldn't
>   generally be putting manager nodes on dirty machines; but density is
>   an important consideration, and I don't think it's confusing to allow
>   preferred machines to be specified in ensure-ha, such that *if*
>   management capacity needs to be added it will be put onto those
>   machines before finding clean ones or provisioning new ones
> * strawman syntax: "juju ensure-ha --prefer-machines 11,37" to place any
>   additional manager tasks that may be required on the supplied machines
>   in order of preference -- but even this falls far behind the essential
>   goal, which is to make HA *easy* for our users
> * (ofc, we should continue not to put units onto manager machines by
>   default, but allow them when forced with --to as before)
>
> I don't believe that any of this precludes more sophisticated management
> of juju's internal functions *when* the need becomes
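(An aside on William's claim that an even number of state servers is strictly a bad thing: this is standard majority-quorum arithmetic, not something specific to juju. A replica set of M voting members keeps a majority through floor((M-1)/2) simultaneous failures, so an even M tolerates no more failures than M-1 would. A quick sketch:)

```shell
# Failures tolerated by a majority-quorum group of M voting members is
# floor((M-1)/2): an even member count buys no extra fault tolerance.
for M in 1 2 3 4 5 6 7; do
  echo "$M servers: tolerates $(( (M - 1) / 2 )) failure(s)"
done
```

Note that 3 and 4 servers both tolerate a single failure, and 5 and 6 both tolerate two; the even counts only add a machine that can itself fail.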
Re: Scale Testing: Now with profiling!
On Mon, Nov 4, 2013 at 12:04 PM, John Arbash Meinel <j...@arbash-meinel.com> wrote:
> On 2013-11-04 17:52, roger peppe wrote:
>> There's no point in salting the agent passwords, and we can't easily
>> change things to salt the user passwords until none of the command line
>> tools talk directly to mongo, so I'm +1 on john's patch for now.
>
> We can absolutely salt both. *Salt* is all about reading the salt from
> what you've stored in the DB and using that to compute the hash. It is
> simply there to prevent rainbow-table attacks (precompute the hashes of
> 1M common user passwords and compare them to the content in the DB).

Roger was talking about the agent passwords, which you described as nice long random strings. There's no common user password in that case.

gustavo @ http://niemeyer.net
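(For readers following along, the salting scheme John describes looks roughly like the sketch below. This is illustrative only, not juju's actual implementation, and it assumes coreutils `sha256sum` is available; a real system should also use a deliberately slow KDF such as bcrypt or PBKDF2 rather than bare SHA-256.)

```shell
# Sketch of salted password hashing: store a random per-entry salt next to
# the hash, and re-read the salt when verifying a login attempt.
password='hunter2'

# Generate a random 16-byte hex salt (avoiding any non-standard tools).
salt=$(head -c16 /dev/urandom | od -An -tx1 | tr -d ' \n')
hash=$(printf '%s%s' "$salt" "$password" | sha256sum | cut -d' ' -f1)
record="$salt:$hash"          # what would be stored in the DB

# Verification: read the salt back out of the stored record and recompute.
# A rainbow table precomputed over unsalted hashes is useless against this.
stored_salt=${record%%:*}
check=$(printf '%s%s' "$stored_salt" "$password" | sha256sum | cut -d' ' -f1)
[ "$check" = "${record##*:}" ] && echo "password ok"
```

As Gustavo points out, none of this helps for agent passwords that are already long random strings: there is nothing common to precompute a table for.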
Re: Notes from Scale testing
On Wed, Oct 30, 2013 at 6:23 AM, John Arbash Meinel <j...@arbash-meinel.com> wrote:
> I'm trying to put together a quick summary of what I've found out so far
> with testing juju in an environment with thousands (5000+) agents.

Great testing, John.

> 2) Agents seem to consume about 17MB resident according to 'top'. That
> should mean we can run ~450 agents on an m1.large. Though in my testing
> I was running ~450 and still had free memory, so I'm guessing there
> might be some copy-on-write pages (17MB is very close to the size of the
> jujud binary).

Yeah, RSS is not straightforward to measure correctly. The crude readings are pretty much always overestimated.

> 4) If I bring up the units one by one
> (for i in `seq 500`; do for j in `seq 10`; do juju add-unit --to $j & done; time wait; done),
> it ends up triggering O(N^2) behavior in the system. Each unit agent
> seems to have a watcher for other units of the same service. So when you
> add 1 unit, it wakes up all existing units to let them know about it. In
> theory this is on a 5s rate limit (only 1 wakeup per 5 seconds). In
> practice it was taking 3s per add-unit call [even when requesting them
> in parallel]. I think this was because of the load on the API server of
> all the other units waking up and asking for details at the same time.

In theory answering all watching questions should eventually be very cheap, if it isn't right now. The yes/no data for several thousand units easily fits in memory, and the API servers also learn about changes as they go through to the database, so there's no big reason to touch the database for these operations. Caching will also play a larger role inside the API servers as juju moves towards scalability.

> From what I can tell, all units take out a watch on their service so
> that they can monitor its Life and CharmURL. However, adding a unit to a
> service triggers a change on that service, even though Life and CharmURL
> haven't changed.
> If we split out watching the units-on-a-service from the lifetime and
> URL of a service, we could avoid the thundering-herd N^2 problem while
> starting up a bunch of units. Though UpgradeCharm is still going to
> thundering-herd.

Where is N^2 coming from?

> It then seems to restart the unit agent, which goes through the steps of
> making all the same requests again. (Get the Life of my unit, get the
> Life of my service, get the UUID of this environment, etc. -- there are
> 41 requests before it gets to APIAddress.)

Ugh!

> I would be fine doing max(1, NumCPUs()-1) or something similar. I'd
> rather do it inside jujud than in the cloud-init script, because
> computing NumCPUs is easier there. But we should have *a* way to scale
> up the central node that isn't just scaling out to more API servers.

NumCPUs sounds like a fine initial setting.

> I certainly think we need a way to scale Mongo as well. If it is just 1
> CPU per connection then scaling horizontally with API servers should get
> us around that limit.

It's one thread per connection.

> 10) Allowing "juju add-unit -n 100 --to X" did make things a lot easier
> to bring up. Though it still takes a while for the request to finish. It
> felt like the api call triggered work that started happening in the
> background, which made the current api call take longer to finally
> complete (as in, minutes once we had 1000 units).

It doesn't sound like optimizing for a huge volume of immediate units is an important goal, other than for streamlining that kind of testing. I have very rarely observed use cases that do that, or that have the resources available for doing that at all. Even large providers will generally deny requests for large deltas at once, due to their controlled growth schedules. It sounds more relevant to have juju able to cope with these loads effectively and in a timely way once deployments do reach such scales.

> Not everything in there is worth landing in trunk (rudimentary API
> caching, etc).
> That's all I can think of for now, though I think there is more to be
> explored.

Again, well done.

gustavo @ http://niemeyer.net
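(An aside on the O(N^2) figure in the thread above: under John's assumption that each add-unit wakes every already-running unit of the service, bringing up N units one at a time costs 0 + 1 + ... + (N-1) = N*(N-1)/2 wakeups in total. A back-of-envelope check:)

```shell
# Sum the per-add-unit wakeups for N units started one at a time: adding
# the k-th unit wakes the k-1 units already running.
N=5000
total=0
i=1
while [ "$i" -lt "$N" ]; do
  total=$((total + i))
  i=$((i + 1))
done
echo "$N units -> $total wakeups"   # 5000 units -> 12497500 wakeups
```

Roughly 12.5 million wakeups for 5000 units, which is why decoupling the units-of-a-service watch from the Life/CharmURL watch matters at this scale.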
Re: Control different relation sequence
Exactly, that's what I would probably do as well. Once you are within a relation and want to wait for further actions, dump the $JUJU_RELATION_ID into a file and wait until you want to wake it up again. Hooks are guaranteed to run in series, so you don't have to worry about concurrency issues around the file.

On Wed, Sep 4, 2013 at 12:01 AM, Mike Sam <mikesam...@gmail.com> wrote:
> Thanks. Does this mean that the charm should cache the relation ids in a
> text file or something?
>
> On Tue, Sep 3, 2013 at 7:33 PM, Gustavo Niemeyer <gustavo.nieme...@canonical.com> wrote:
>> The relation-set command accepts a -r parameter which takes the
>> relation id to act upon. You can pick the relation id of an executing
>> hook from the JUJU_RELATION_ID environment variable. This way you can
>> act across relations. Hopefully this will be better documented at some
>> point.
>>
>> On Tue, Sep 3, 2013 at 11:23 PM, Mike Sam <mikesam...@gmail.com> wrote:
>>> Thanks Gustavo, but I did not quite get your point. The problem is
>>> that for the new unit of service A, the dependent hooks are on two
>>> different, independent relations. I can control when the new unit of
>>> service A has properly established a relation with all the units of
>>> service B in, say, x_relation_changed, but how do I make all the units
>>> of service C now trigger the y_relation_changed hook of the service A
>>> unit, once the unit is ready to process them? How do I make the
>>> y_relation_changed hook get triggered AGAIN (in case it has already
>>> been triggered but was ignored because the relation with service B was
>>> not done setting up) when x_relation_changed sees fit? Would you
>>> please explain your point in the service A, B, C context of my
>>> example?
>>>
>>> On Tue, Sep 3, 2013 at 6:38 PM, Gustavo Niemeyer
>>> <gustavo.nieme...@canonical.com> wrote:
>>>> Hi Mike,
>>>>
>>>> You cannot control the sequence in which the hooks are executed, but
>>>> you have full control over what you do when the hooks do execute. You
>>>> can choose to send nothing to the other side of the relation until
>>>> it's time to report that a connection may now be established, and
>>>> when you do change the relation, the remote hook will run again to
>>>> report the change.
>>>>
>>>> On Tue, Sep 3, 2013 at 10:17 PM, Mike Sam <mikesam...@gmail.com> wrote:
>>>>> Imagine a unit needs to be added to an existing service, service A.
>>>>> Service A is already in relations with other services, service B and
>>>>> service C, on different requires. For the new unit on service A to
>>>>> work, it needs to first process relation_joined and relation_changed
>>>>> with the units of service B before it can process relation_joined
>>>>> and relation_changed with the units of service C. Is there a way to
>>>>> enforce such a sequence of relationship establishment at the charm
>>>>> level? In other words, I don't think we can officially control the
>>>>> hook execution sequence of different relations, so how can we handle
>>>>> a situation like the above nicely?
>>>>>
>>>>> Thanks,
>>>>> Mike

gustavo @ http://niemeyer.net
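(The file-caching approach Gustavo describes might look something like the hook sketch below. The cache filename and the "ready" setting are hypothetical, purely for illustration; the runnable part only caches the id, and the relation-set wake-up is shown as a comment since it needs a live juju environment.)

```shell
#!/bin/sh
# Hypothetical x-relation-changed hook for service A: remember the relation
# id so a later hook can act on this relation once service B is ready.

# juju sets JUJU_RELATION_ID for any hook run in a relation context; a
# default is supplied here only so the sketch runs standalone.
: "${JUJU_RELATION_ID:=db:0}"

CACHE=./pending-relations

# Hooks run in series, so a plain append needs no locking.
echo "$JUJU_RELATION_ID" >> "$CACHE"

# Later, another hook can wake each cached relation; changing settings via
# relation-set -r re-triggers the remote units' *-relation-changed hooks:
#
#   while read -r rid; do
#     relation-set -r "$rid" ready=true
#   done < "$CACHE"
```

The key point is the one from the thread: you cannot reorder hook execution, but you can defer the *effects* until your own bookkeeping says the prerequisite relation is established.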