Re: Devel is broken, we cannot release

2014-07-15 Thread Nate Finch
I don't think we need to stop the world to get these things fixed.  It is
the responsibility of the team leads to make sure someone's actively
working on fixes for regressions.  If they're not getting fixed, it's our
fault.  We should have one of the team leads pick up the regression and
assign someone to work on it, just like any other high priority bug.



On Mon, Jul 14, 2014 at 3:05 PM, Curtis Hovey-Canonical 
cur...@canonical.com wrote:

 Devel has been broken for weeks because of regressions. We cannot
 release devel. The stable 1.20.0 that we release is actually older
 than it appears because we had to search CI for an older revision that
 worked.

 We have a systemic problem: once a regression is introduced, it blocks
 the release for weeks, and we build on top of the regression. We often
 see many regressions. The regressions mutate as people merge more
 branches.

 The current two regressions are:
 * win juju client still broken with unknown
   from 2014-06-27, which has varied between a compilation
   problem and a panic during execution.
   https://bugs.launchpad.net/juju-core/+bug/1335328

 * FAIL: managedstorage_test trusty ppc64
   from 2014-06-30 which had a secondary bug that broke compilation.
   https://bugs.launchpad.net/juju-core/+bug/1336089

 I think the problem is that engineers are focused on their feature. They
 don't see the fallout from their changes. They may hope the fix will
 arrive soon, and that maybe someone else will fix it.

 I propose a change in policy. When there is a regression in CI, no
 new branches can be merged except those that link to the blocking bug.
 This will encourage engineers to fix the regression. One way to fix
 the regression is to identify and revert the commit that broke CI.


 --
 Curtis Hovey
 Canonical Cloud Development and Operations
 http://launchpad.net/~sinzui

 --
 Juju-dev mailing list
 Juju-dev@lists.ubuntu.com
 Modify settings or unsubscribe at:
 https://lists.ubuntu.com/mailman/listinfo/juju-dev



Re: Devel is broken, we cannot release

2014-07-15 Thread Wayne Witzel
If we aren't stopping the line when our CI is in the red, then what is the
point of even having CI at all? If we are not prepared to adjust the
culture of our development, to truly halt forward progress in favor of
chasing down regressions, then I struggle to find the benefit that CI and
testing are giving us at all. Just confirming that master is broken and we
are still ignoring it? If master is broken, we are all broken. No
development you are doing should proceed that is based on a broken master.
That work cannot at any point be considered in good working condition. A
problem in master is everyone's problem.

Bugs that are found throughout the normal operations and usage of juju
should be assigned a priority and queued, but regression is a sign of a
greater problem that should be resolved immediately. Allowing regressions
to not stop the line is taking the stance that we don't care about
deterioration in our code base.


On Tue, Jul 15, 2014 at 9:37 AM, Nate Finch nate.fi...@canonical.com
wrote:

 I don't think we need to stop the world to get these things fixed.  It is
 the responsibility of the team leads to make sure someone's actively
 working on fixes for regressions.  If they're not getting fixed, it's our
 fault.  We should have one of the team leads pick up the regression and
 assign someone to work on it, just like any other high priority bug.







-- 
Wayne Witzel III
wayne.wit...@canonical.com


Re: Devel is broken, we cannot release

2014-07-15 Thread William Reade
On Tue, Jul 15, 2014 at 2:51 PM, Wayne Witzel wayne.wit...@canonical.com
wrote:

 If we aren't stopping the line when our CI is in the red, then what is the
 point of even having CI at all? If we are not prepared to adjust the
 culture of our development, to truly halt forward progress in favor of
 chasing down regressions, then I struggle to find the benefit that CI and
 testing are giving us at all. Just confirming that master is broken and we
 are still ignoring it? If master is broken, we are all broken. No
 development you are doing should proceed that is based on a broken master.
 That work cannot at any point be considered in good working condition. A
 problem in master is everyone's problem.

 Bugs that are found throughout the normal operations and usage of juju
 should be assigned a priority and queued, but regression is a sign of a
 greater problem that should be resolved immediately. Allowing regressions
 to not stop the line is taking the stance that we don't care about
 deterioration in our code base.


+100 to this. Regressions are a Big Deal and should be treated as such;
leaving other merges queued until such a time as the regression is fixed
(or backed out for rework) is entirely reasonable (and I think we've got
enough evidence that the alternative really doesn't fly effectively).

Cheers
William






Re: Devel is broken, we cannot release

2014-07-15 Thread Nate Finch
I think that's a fair assessment.

Perhaps the easiest fix is to install a switch QA could throw to change the
required merge message to something like !!ThisFixesCI!!


On Tue, Jul 15, 2014 at 9:57 AM, William Reade william.re...@canonical.com
wrote:



 +100 to this. Regressions are a Big Deal and should be treated as such;
 leaving other merges queued until such a time as the regression is fixed
 (or backed out for rework) is entirely reasonable (and I think we've got
 enough evidence that the alternative really doesn't fly effectively).

 Cheers
 William






Re: Devel is broken, we cannot release

2014-07-15 Thread Curtis Hovey-Canonical
On Tue, Jul 15, 2014 at 1:29 AM, John Meinel j...@arbash-meinel.com wrote:
 It seems worthy to just run go test github.com/juju/... as our CI testing,
 isn't it? (i.e., run all unit tests across all packages that we write on all
 platforms), rather than *just* github.com/juju/juju.

Ah! That looks easy. We could add a test like this in a day.

 I don't think we run into the combinatorial problem here (if we can run all
 of the github.com/juju/juju tests, then we aren't really adding much to run
 the rest of the dependencies as well).



 I think having a full bootstrap, deploy, upgrade, destroy on all platforms
 is necessary as a functional test, I'm not sure that we need to cross
 product it with on-all-environments. (which *would* start to run into
 combinatorial problems)

We are doing some combinatorial testing because we need to ensure
every series+arch combination works. At the Vegas sprint, we settled on
unit tests and lxc tests as the best way to identify issues with arch
or series. We test:

  precise + amd64
  utopic + amd64
  trusty + amd64
  trusty + i386
  trusty + ppc64
  trusty + arm64

Cloud tests always do a deploy and an upgrade because both scenarios
use simple streams, which are also under test. CI is testing
juju-release-tools too, since juju isn't very useful unless it is
packaged and tools are published to the CPCs.

There is a large class of functional tests and some integration tests
with other software that we need to add this cycle.

 I have the feeling, though, that better CI might be making some developers
 a bit more lax and doing less direct testing themselves, because they expect
 that CI will catch things they don't.

I don't feel this. I think the problem is the complexity of Juju.
Mongo changes for HA broke the backup-restore feature. I think these
are different areas of expertise that needed better coordination.

 I like the stop-the-line-when-CI-is-broken, as long as we have reliable ways
 to stop it. Given the timescales we're working on, I'd probably be ok with
 having it be a manual thing, so that when Azure decides to rev their API and
 break everything that used to work, we aren't immediately crippled. Maybe we
 can identify a subset of CI that is reliable (or high priority) enough that
 it really is automatically stop-the-line worthy. (Trusty unit tests, PPC
 unit tests, local provider, ec2 tests come to mind.)

Cloud failures are not regressions in juju code. I spend a day or more
a week tweaking CI to give Juju the best chance of success. I might
change a test, or write a script that cleans up the resources in
cloud/host.

Since I am taking time to give juju more chances to pass, I delay
reporting the bugs. Five revisions might merge while I prove that juju is
really broken. Since the defect can mutate with the extra commits, it
isn't easy to identify the one or more revisions that are at fault.

When we report a CI regression, it is something we genuinely believe
used to work, because it passes when we retest an old revision. I do
provide a list of commits that can be investigated.
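
One way to narrow such a list of commits is a binary search over the candidate revisions. The following is only a sketch, not the CI team's tooling: the revision ids and the `passes` callback are hypothetical, and it assumes that every revision after the culprit also fails. As noted above, a defect can mutate as more commits land, so that assumption may not hold in practice.

```python
from typing import Callable, Optional, Sequence


def first_bad_revision(
    revisions: Sequence[str], passes: Callable[[str], bool]
) -> Optional[str]:
    """Return the earliest failing revision, or None if every revision passes.

    Assumes revisions are ordered oldest to newest and that once the
    regression lands, every later revision fails as well.
    """
    lo, hi = 0, len(revisions)
    while lo < hi:
        mid = (lo + hi) // 2
        if passes(revisions[mid]):
            lo = mid + 1  # the regression landed after mid
        else:
            hi = mid  # mid fails, so the culprit is mid or earlier
    return revisions[lo] if lo < len(revisions) else None


# Hypothetical revision ids; "r3" is where the made-up regression landed.
candidates = ["r1", "r2", "r3", "r4", "r5"]
broken_from = candidates.index("r3")
culprit = first_bad_revision(
    candidates, lambda rev: candidates.index(rev) < broken_from
)
print(culprit)
```

In a real setup `passes` would rerun the failing CI job against the given revision, which is the expensive part.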

As for automating a stop-the-line policy, we might be fine with a
small hack to the git-merge-juju job to check for commits that claim
to fix a regression. When that is not the case, the job fails early with
the reason that we are waiting for a specific fix. Rollback is always an
option.
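
A minimal sketch of what such a check might look like, assuming the job can see the proposed commit message and a list of blocking bug numbers. The function name, the message format it recognises, and the way the blocking list is supplied are all assumptions for illustration; this is not the actual git-merge-juju job.

```python
import re
from typing import Iterable, Tuple


def may_merge(commit_message: str, blocking_bugs: Iterable[int]) -> Tuple[bool, str]:
    """Decide whether a proposed merge may land while CI is blocked."""
    blocking = set(blocking_bugs)
    if not blocking:
        return True, "no blocking regressions"
    # Accept "bug 123", "bug #123", or a Launchpad-style "+bug/123" URL.
    mentioned = {
        int(n)
        for n in re.findall(r"\bbug[\s/#]*(\d+)", commit_message, re.IGNORECASE)
    }
    fixed = sorted(mentioned & blocking)
    if fixed:
        return True, "claims to fix bug(s) " + ", ".join(map(str, fixed))
    waiting = ", ".join(str(b) for b in sorted(blocking))
    return False, "waiting for a fix for bug(s) " + waiting


# The two bug numbers are the regressions named at the start of this thread.
blocking = [1335328, 1336089]
print(may_merge("Fix win client panic. Fixes bug #1335328", blocking))
print(may_merge("Add shiny new feature", blocking))
```

A check like this only verifies that a commit claims to fix a blocking bug; whether it really does is still decided by the next CI run.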



-- 
Curtis Hovey
Canonical Cloud Development and Operations
http://launchpad.net/~sinzui



Re: Devel is broken, we cannot release

2014-07-15 Thread John Meinel
My concern was with the speed of response. I'm happy to have a QA Team
switch that must be fixed (with an associated email to juju-dev so
everyone knows why their patch won't land). I *would* like us to be
tracking stuff like how long we stay in regression mode, etc.
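
As a rough illustration of that kind of tracking, the sketch below computes how many days each regression has been open. The start dates are the ones given in the original report; the function name and the idea of storing them in a dict are assumptions made for the example.

```python
from datetime import date
from typing import Optional


def days_in_regression(opened: date, closed: Optional[date], today: date) -> int:
    """Days a regression has blocked (or did block) the release."""
    end = closed if closed is not None else today
    return (end - opened).days


# Both bugs were still open when the thread started on 2014-07-14.
thread_start = date(2014, 7, 14)
regressions = {
    1335328: date(2014, 6, 27),
    1336089: date(2014, 6, 30),
}
for bug, opened in regressions.items():
    print(bug, days_in_regression(opened, None, thread_start))
```

For these two bugs the result is 17 and 14 days respectively.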

I think ideally the process would be automated, but our current CI seems to
need a fair amount of manual filtering.

On Jul 15, 2014 6:35 PM, Curtis Hovey-Canonical cur...@canonical.com
wrote:


 We are doing some combinatorial testing because we need to ensure
 every series+arch combination works. In Vegas Sprint, we settled on
 unittests and lxc tests as the best way to identify issues with arch
 or series. We test:

   precise + amd64
   utopic + amd64
   trusty + amd64
   trusty + i386
   trusty + ppc64
   trusty + arm64


That looks M+N to me. (All series on amd64 + trusty on all arches). The MxN
would be all series x all arches.
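
To make the difference concrete, here is a small sketch that builds both sets from the series and arches in the list above. Strictly the tested set has M+N-1 entries, because trusty + amd64 is counted once.

```python
from itertools import product

series = ["precise", "trusty", "utopic"]
arches = ["amd64", "i386", "ppc64", "arm64"]

# What the list above covers: every series on amd64, plus trusty on every arch.
tested = {(s, "amd64") for s in series} | {("trusty", a) for a in arches}

# The full cross product: every series on every arch.
full = set(product(series, arches))

print(len(tested), len(full))
```

With three series and four arches this gives 6 tested combinations against 12 in the full cross product.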

...
 I have the feeling, though, that better CI might be making some developers
 a bit more lax and doing less direct testing themselves, because they expect
 that CI will catch things they don't.

 I don't feel this. I think the problem is the complexity of Juju.
 Mongo changes for HA broke the backup-restore feature. I think these
 are different areas of expertise that needed better coordination.

I think there were also some auth changes that meant we couldn't bootstrap
at all.
I really like that CI caught it. I wonder if it had to get that far.


 I like the stop-the-line-when-CI-is-broken, as long as we have reliable ways
 to stop it. Given the timescales we're working on, I'd probably be ok with
 having it be a manual thing, so that when Azure decides to rev their API and
 break everything that used to work, we aren't immediately crippled. Maybe we
 can identify a subset of CI that is reliable (or high priority) enough that
 it really is automatically stop-the-line worthy. (Trusty unit tests, PPC
 unit tests, local provider, ec2 tests come to mind.)

 Cloud failures are not regressions in juju code. I spend a day or more
 a week tweaking CI to give Juju the best chance of success. I might
 change a test, or write a script that cleans up the resources in
 cloud/host.

 Since I am taking time to give juju more chances to pass, I delay
 reporting the bugs. Five revisions might merge while I prove that juju is
 really broken. Since the defect can mutate with the extra commits, it
 isn't easy to identify the one or more revisions that are at fault.

 When we report a CI regression, it is something we genuinely believe
 used to work, because it passes when we retest an old revision. I do
 provide a list of commits that can be investigated.

 As for automating a stop-the-line policy, we might be fine with a
 small hack to the git-merge-juju job to check for commits that claim
 to fix a regression. When that is not the case, the job fails early with
 the reason that we are waiting for a specific fix. Rollback is always an
 option.



I absolutely support trying to find ways to help keep CI blue (green). It's
definitely the background I come from and a culture I want us to have.
I think a difficulty is figuring out who/what is responsible, and the slow
turnaround to unblocking everything. If we make what we think is the fix,
even if it is just reverting a change, doesn't it take hours to run CI
again, and even then some bits may fail spuriously or for a different reason?
If we need manual intervention on both ends, that means a stop-the-line
takes us out of working order for 24 hours. I'm just trying to explore the
consequences. I really do think we need good feedback into keeping CI happy.

John
=:-



Re: Devel is broken, we cannot release

2014-07-15 Thread Tim Penhey
On 16/07/14 01:57, William Reade wrote:
 
 
 +100 to this. Regressions are a Big Deal and should be treated as such;
 leaving other merges queued until such a time as the regression is fixed
 (or backed out for rework) is entirely reasonable (and I think we've got
 enough evidence that the alternative really doesn't fly effectively).

Stop the line doesn't mean that everyone stops work, only that trunk
stops accepting new merges until the critical issue is fixed.

This could mean that one or more people actually work on the critical
issue, and others continue with other work, but there are no other trunk
commits until the bug is fixed.  This means that everyone on the team is
aware of the blocker and the progress to fix it because it directly
affects their ability to land their work. This means that there is
internal team pressure as well, and normally this ends up being more
offers to help, review and get the code landed ASAP so it unblocks people.

In the past, we had the landing bot tweaked so it could be put into an
emergency-only mode, which meant that normal merges were ignored and
only approved critical landings were accepted until the mode was
changed.  This seems like it could be relatively easy to do.

Perhaps a flag like $$ci-fix-merge$$ ?? /me handwaves

Tim




Devel is broken, we cannot release

2014-07-14 Thread Curtis Hovey-Canonical
Devel has been broken for weeks because of regressions. We cannot
release devel. The stable 1.20.0 that we release is actually older
than it appears because we had to search CI for an older revision that
worked.

We have a systemic problem: once a regression is introduced, it blocks
the release for weeks, and we build on top of the regression. We often
see many regressions. The regressions mutate as people merge more
branches.

The current two regressions are:
* win juju client still broken with unknown
  from 2014-06-27, which has varied between a compilation
  problem and a panic during execution.
  https://bugs.launchpad.net/juju-core/+bug/1335328

* FAIL: managedstorage_test trusty ppc64
  from 2014-06-30 which had a secondary bug that broke compilation.
  https://bugs.launchpad.net/juju-core/+bug/1336089

I think the problem is that engineers are focused on their feature. They
don't see the fallout from their changes. They may hope the fix will
arrive soon, and that maybe someone else will fix it.

I propose a change in policy. When there is a regression in CI, no
new branches can be merged except those that link to the blocking bug.
This will encourage engineers to fix the regression. One way to fix
the regression is to identify and revert the commit that broke CI.


-- 
Curtis Hovey
Canonical Cloud Development and Operations
http://launchpad.net/~sinzui



Re: Devel is broken, we cannot release

2014-07-14 Thread Wayne Witzel
+1

I've experienced this type of policy and it leads to a few things: more
tests, better tests, and better self reviews and developer QA.

I believe borrowing the other ideas from lean and agile, but not having the
big stop button when a defect is found, is an unsustainable approach to
development, and with the recent growth in the number of people actively
working on the code base we are now experiencing that first hand.


Re: Devel is broken, we cannot release

2014-07-14 Thread Ian Booth
 
 I think the problem is that engineers are focused on their feature. They
 don't see the fallout from their changes. They may hope the fix will
 arrive soon, and that maybe someone else will fix it.
 
 I propose a change in policy. When there is a regression in CI, no
 new branches can be merged except those that link to the blocking bug.
 This will encourage engineers to fix the regression. One way to fix
 the regression is to identify and revert the commit that broke CI.
 

Agree in principle. However, we have seen some issues on CI whereby the
unreliability of the underlying cloud has caused failures. So long as the issue
identified indeed has a root cause that we can fix in juju itself, then we
should block landings to trunk until it is fixed.




Re: Devel is broken, we cannot release

2014-07-14 Thread Ian Booth
 
 * FAIL: managedstorage_test trusty ppc64
   from 2014-06-30 which had a secondary bug that broke compilation.
   https://bugs.launchpad.net/juju-core/+bug/1336089
 

This bug brings up another issue.
The code concerned has now been moved off to a juju sub project - blobstorage.
So the juju-core ppc64 tests will no longer fail.

Martin is in the process of setting up Jenkins landing jobs for all the sub
projects (there are several). But there won't initially be ppc64 variants of
these jobs. So it will be possible for juju-core to now pass ppc64 testing even
though sub projects it depends on may not.
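
The gap described here can be sketched as a simple lookup: for a given project and architecture, list the dependencies that have no test job on that architecture. The data below is illustrative only. The job list is an assumption, not the real Jenkins setup, and blobstorage is the only sub project named in this thread.

```python
from typing import Dict, List, Set, Tuple


def untested_dependencies(
    project: str,
    arch: str,
    depends_on: Dict[str, List[str]],
    jobs: Set[Tuple[str, str]],
) -> List[str]:
    """Dependencies of `project` that have no test job on `arch`."""
    return [dep for dep in depends_on.get(project, []) if (dep, arch) not in jobs]


# Illustrative data only: which jobs exist is assumed for the example.
depends_on = {"juju-core": ["blobstorage"]}
jobs = {("juju-core", "ppc64"), ("juju-core", "amd64"), ("blobstorage", "amd64")}
print(untested_dependencies("juju-core", "ppc64", depends_on, jobs))
```

With this data, juju-core has a ppc64 job but blobstorage does not, so blobstorage is reported as untested on ppc64.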



Re: Devel is broken, we cannot release

2014-07-14 Thread Tim Penhey
On 15/07/14 15:48, Ian Booth wrote:

 * FAIL: managedstorage_test trusty ppc64
   from 2014-06-30 which had a secondary bug that broke compilation.
   https://bugs.launchpad.net/juju-core/+bug/1336089

 
 This bug brings up another issue.
 The code concerned has now been moved off to a juju sub project - blobstorage.
 So the juju-core ppc64 tests will no longer fail.
 
 Martin is in the process of setting up Jenkins landing jobs for all the sub
 projects (there are several). But there won't initially be ppc64 variants of
 these jobs. So it will be possible for juju-core to now pass ppc64 testing even
 though sub projects it depends on may not.

Surely this just means that we need real end to end tests on all
supported architectures, right?

Tim




Re: Devel is broken, we cannot release

2014-07-14 Thread Ian Booth


On 15/07/14 14:17, Tim Penhey wrote:
 
 Surely this just means that we need real end to end tests on all
 supported architectures, right?
 

In theory. The number of combinations will be large and I'm not sure we
currently have the capacity to do that.

But the issue is also that functional tests may well pass even though some
particular unit tests fail.
