So after poking around and trying to enable debugging, I did come across
these lines:
2014-06-14 10:04:08 INFO juju.mongo session.go:2191 queryError:
&mgo.queryError{Err:"not authorized for query on local.system.replset",
ErrMsg:"", Assertion:"", Code:16550, AssertionCode:0,
LastError:(*mgo.LastError)(nil)}
2014-06-14 10:04:08 DEBUG juju.mongo session.go:1143 Closing session
0xc21018a160

2014-06-14 10:06:11 DEBUG juju.mongo session.go:2246 Query 0xc21022de00
document unmarshaled: &replicaset.Status{Name:"",
Members:[]replicaset.MemberStatus(nil)}
2014-06-14 10:06:11 INFO juju.mongo session.go:2191 queryError:
&mgo.queryError{Err:"", ErrMsg:"unauthorized", Assertion:"", Code:0,
AssertionCode:0, LastError:(*mgo.LastError)(nil)}
2014-06-14 10:06:11 ERROR juju.worker.peergrouper worker.go:138 peergrouper
loop terminated: cannot get replica set status: cannot get replica set
status: unauthorized
2014-06-14 10:06:11 DEBUG juju.mongo socket.go:350 Socket 0xc21017d540 to
10.0.3.1:37017: serializing op: &mgo.queryOp{collection:"admin.$cmd",
query:bson.D{bson.DocElem{Name:"ping", Value:1}}, skip:0, limit:-1,
selector:interface {}(nil), flags:0x0, replyFunc:(mgo.replyFunc)(0x602080),
options:mgo.queryWrapper{Query:interface {}(nil), OrderBy:interface
{}(nil), Hint:interface {}(nil), Explain:false, Snapshot:false,
ReadPreference:bson.D(nil)}, hasOptions:false, serverTags:[]bson.D(nil)}
...
2014-06-14 10:06:11 ERROR juju.worker runner.go:218 exited "peergrouper":
cannot get replica set status: cannot get replica set status: unauthorized
2014-06-14 10:06:11 INFO juju.worker runner.go:252 restarting "peergrouper"
in 3s
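
For anyone who wants to pull the same errors out of their own machine log,
here is a rough Python sketch. It assumes the queryError entries look like
the two shown above; the function name summarize_query_errors and the regex
are my own illustration, not anything that juju or mgo provides.

```python
import re
from collections import Counter

# Matches the fields printed for a queryError in the log above, e.g.
#   &mgo.queryError{Err:"...", ErrMsg:"...", Assertion:"", Code:16550, ...}
# The field order is an assumption taken from those two log entries.
QUERY_ERROR = re.compile(
    r'&mgo\.queryError\{Err:"(?P<err>[^"]*)", ErrMsg:"(?P<errmsg>[^"]*)", '
    r'Assertion:"[^"]*", Code:(?P<code>\d+)'
)


def summarize_query_errors(log_text):
    """Count queryError entries, keyed by (message, code)."""
    # Mail clients wrap long log lines, so collapse all whitespace runs
    # into single spaces before matching.
    flat = " ".join(log_text.split())
    counts = Counter()
    for match in QUERY_ERROR.finditer(flat):
        # Some entries carry the text in Err, others in ErrMsg.
        message = match.group("err") or match.group("errmsg")
        counts[(message, int(match.group("code")))] += 1
    return counts
```

Run against the excerpt above, it counts each of the two messages once: the
"not authorized for query on local.system.replset" error with code 16550,
and the plain "unauthorized" error with code 0.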

IIRC Ian made some changes to how we handle users in an attempt to make our
code work with Mongo 2.6 (that changed UpsertUser vs AddUser vs whatever
else we were using). I think it is supposed to work with other versions of
Mongo as well, but I'm guessing it isn't quite as compatible as it is
supposed to be, which would explain the "unauthorized" errors above.

John
=:->



On Sat, Jun 14, 2014 at 8:53 AM, John Meinel <j...@arbash-meinel.com> wrote:

> I'm running a test now to see if I can set up HA manually.
> One very surprising thing is that I ran "juju bootstrap" from a Trusty
> machine, and it gave me a Precise bootstrap node. I thought we were trying
> to default to the latest LTS when possible. Did some behavior change there?
> (I'm wondering if somewhere we changed from a single hardcoded value to
> taking a list of all possible LTS targets, and that ended up with us
> picking Precise first.)
>
> I did manage to reproduce the bug, on machine-1 I see an endless series of
> 2014-06-14 04:45:27 INFO juju.mongo open.go:90 dialled mongo successfully
>
> Like, at 5 minutes in I have >1000 lines of "I successfully connected",
> and *no* failure messages indicating why we are trying again.
>
> I'll post more of my findings to the bug.
>
> As a Juju process level thing, when people are changing things around HA,
> are you actually running up a live system and seeing it work before you
> submit your changes to Trunk?
>
> John
> =:->
>
>
>
> On Fri, Jun 13, 2014 at 10:42 PM, Curtis Hovey-Canonical <
> cur...@canonical.com> wrote:
>
>> CI is regularly failing because HA and upgrade tests time out. They do
>> not complete. I have extended timeouts from 5 minutes to 15, but the
>> tests still fail. I appended -devel to some tests to remove their
>> vote. I think that was a mistake...the problem is juju, not the cloud.
>> I reported 2 bugs about upgrade-juju and HA
>>
>> HA performance degradation
>> https://bugs.launchpad.net/juju-core/+bug/1329544
>>
>> major performance degradation upgrading juju
>> https://bugs.launchpad.net/juju-core/+bug/1329899
>>
>> I suspect there is a root cause for both bugs. We saw performance
>> deteriorate because mongodb is doing more work. Maybe HA and upgrades
>> need to take 30 minutes, or an hour, because mongo cannot do what we
>> once required to happen in 5 minutes.
>>
>> On Fri, Jun 13, 2014 at 7:12 AM, CI & CD Jenkins
>> <aaron.bentley+c...@canonical.com> wrote:
>> > Build: #1479 Revision: gitbranch:master:github.com/juju/juju ead2e2d6
>> Version: 1.19.4
>> >
>> > Failed tests
>> > functional-ha-recovery build #357
>> http://juju-ci.vapour.ws:8080/job/functional-ha-recovery/357/console
>> > hp-upgrade-precise-amd64 build #1324
>> http://juju-ci.vapour.ws:8080/job/hp-upgrade-precise-amd64/1324/console
>>
>>
>> --
>> Curtis Hovey
>> Canonical Cloud Development and Operations
>> http://launchpad.net/~sinzui
>>
>> --
>> Juju-dev mailing list
>> Juju-dev@lists.ubuntu.com
>> Modify settings or unsubscribe at:
>> https://lists.ubuntu.com/mailman/listinfo/juju-dev
>>
>
>
