John,

Actually, the goal of my letter is not to promote a new integration scheme,
but to remind us that we need to put some effort into reviewing and
optimizing our internal processes.

But see my answers below (inline):

The integration method I mentioned is often used in open source projects
because it doesn't require any special infrastructure for external committers.
The only thing needed for a safe commit is write access to the integration
(-gate) workspace.

On 2012-01-30 06:35, John Coomes wrote:
We have chosen a model:

build->test->integrate

but we may consider different approach:

integrate->build->test->[backout if necessary]

In that model, you can never rely on the repository having any degree
of stability.  It may not even build at a given moment.

What happens today if Developer A and Developer B change the same line of source?

What happens today if Developer A changes some_func() but Developer B
relies on some_func()?

We would get a fault *after* all integration tests, and SQE would file one
more nightly bug. By the time someone investigates it and provides a fix,
the bad code will have been distributed to all dev workspaces.
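To make that failure mode concrete, here is a toy illustration (not JDK
code, all names invented) of two changesets that each pass their own
pre-integration build and test but break once combined:

# util.py -- the shared helper before either changeset

def some_func(path):
    return open(path).read()

# Developer A's changeset: some_func() now requires an explicit encoding.
# A's workspace has no other callers, so A's build and tests still pass.

def some_func(path, encoding):
    return open(path, encoding=encoding).read()

# Developer B's changeset, written against the old util.py, adds a caller.
# In B's workspace some_func() still takes one argument, so B's tests pass too.

try:
    data = some_func("release.properties")   # how B's new code calls it
except TypeError as e:
    # This surfaces only after *both* changesets have been integrated,
    # i.e. in the nightly, not in either developer's own testing.
    print("fails only when A's and B's changes are combined:", e)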


   Developer (A) integrates his changeset into the integration workspace
   The bot takes a snapshot and starts building/testing
   Developer (B) integrates his changeset into the integration workspace
   The bot takes a snapshot and starts building/testing

   If job A fails, the bot locks the integration workspace, restores it to
   the pre-A state, applies B's patch, and unlocks the workspace
   (a sketch of such a bot loop follows below).
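A minimal sketch of that bot loop, assuming a Mercurial -gate workspace.
The make invocation and the locking step are placeholders, and the pre-A
state is restored here with hg backout rather than strip-and-reapply,
which is just one possible way to do it:

import subprocess

def hg(args, ws):
    """Run an hg command in the -gate workspace; raise on failure."""
    subprocess.run(["hg"] + args, cwd=ws, check=True)

def build_and_test(ws):
    """Placeholder for the real integration build/test job."""
    return subprocess.run(["make", "all", "test"], cwd=ws).returncode == 0

def handle_push(gate_ws, changeset):
    """Update to the pushed changeset, build/test it, back it out on failure."""
    hg(["update", "-r", changeset], gate_ws)   # the snapshot the bot will test
    if build_and_test(gate_ws):
        return True
    # The job failed: revert the faulty changeset so later pushes (B, C, ...)
    # build against a clean base.  Workspace locking is omitted in this sketch.
    hg(["backout", "-r", changeset, "-m",
        "backout: changeset failed the integration build/test"], gate_ws)
    return False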

Don't forget the trusting souls that pulled from the integration repo
after A inflicted the breakage:  they each waste time cleaning up a
copy of A's mess.

Nobody pulls from the -gate repository today, and nobody is expected to.
The -gate to workspace merge continues as usual.

Removing a faulty changeset takes about fifteen minutes for the whole JDK
at worst.

-Dmitry


-John

On 2012-01-29 23:52, Kelly O'Hair wrote:

On Jan 29, 2012, at 10:23 AM, Georges Saab wrote:


I'm missing something. How can everybody using the exact same system
scale to 100's of developers?

System = distributed build and test of OpenJDK

Ah ha...   I'm down in the trenches dealing with dozens of different
OS/arch machine variations.
You are speaking at a higher level; I need to crawl out of the basement.


Developers send in jobs
Jobs are distributed across a pool of (HW/OS) resources
The resources may be divided into pools dedicated to different tasks
(RE/checkin/perf/stress)
The pools are populated initially according to predictions of load and
then increased/rebalanced according to data on actual usage
No assumptions are made about what exists on the machine other than HW/OS
The build and test tasks are self-sufficient, i.e. they bootstrap themselves
The bootstrapping is done in the same way for different build and test
tasks
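The list above is, in effect, a pool-based job dispatcher. A very rough
sketch of the idea, with purely illustrative pool names, platforms and
placeholder bootstrap/build steps:

import queue
import threading

# One queue per (task, platform) pool; real pools would be sized from
# predicted load and rebalanced from observed usage.
pools = {
    ("checkin", "linux-x64"): queue.Queue(),
    ("perf",    "linux-x64"): queue.Queue(),
}

def submit(task, platform, job):
    """A developer (or RE) sends a job into the matching pool."""
    pools[(task, platform)].put(job)

def bootstrap(job):
    print("bootstrapping", job)          # fetch toolchain + sources; placeholder

def build_and_test(job):
    print("building and testing", job)   # the task is self-sufficient; placeholder

def worker(task, platform):
    """One machine in a pool; assumes nothing about the host beyond HW/OS."""
    while True:
        job = pools[(task, platform)].get()
        bootstrap(job)        # same bootstrap for every kind of task
        build_and_test(job)
        pools[(task, platform)].task_done()

# Example: one worker serving the checkin/linux-x64 pool.
threading.Thread(target=worker, args=("checkin", "linux-x64"), daemon=True).start()
submit("checkin", "linux-x64", "8000001-fix")
pools[("checkin", "linux-x64")].join()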

Understood. We have talked about this before.  I have also been on the
search for the Holy Grail. ;^)
This is why I keep working on JPRT.


The only scaling aspect that seems at all challenging is that the
current checkin system is designed to serialize checkins in a way that
apparently does not scale -- here there are some decisions to be made
and tradeoffs but this is nothing new in the world of Open community
development (or any large team development for that matter)

The serialized-checkins issue can be minimized somewhat by using distributed
SCMs (Mercurial, Git, etc.),
using separate forests (fewer developers per source repository means
fewer merge/sync issues),
and having an integrator merge into a master. This has proven to work in
many situations, but it
also creates delivery-to-master delays, especially if the integration
process is too heavyweight.
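As an aside, the integrator step described here is essentially the usual
pull/merge/commit/push cycle; a rough sketch with plain hg commands
(repository paths and URLs are made up):

import subprocess

def hg(args, cwd):
    subprocess.run(["hg"] + args, cwd=cwd, check=True)

def integrate_forest(master_clone, group_repo):
    """Merge one group forest into a local clone of the master, then publish."""
    hg(["pull", group_repo], master_clone)   # bring in the group's changesets
    hg(["merge"], master_clone)              # stops here on conflicts (or if
                                             # there is nothing to merge)
    hg(["commit", "-m", "Merge " + group_repo], master_clone)
    hg(["push"], master_clone)               # deliver to the master repository

# e.g. integrate_forest("/work/jdk-master", "http://example.org/hotspot-gate")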

The JDK projects have been doing this for a long time; I'm sure many
people have opinions as to how
successful it is or isn't.

It is my opinion that merges/syncs are some of the most dangerous things
you can do to a source base,
and anything we can do to avoid them is usually goodness. I don't think
you should scale this up without
very great care.



And that one system will naturally change over time too, so unless
you are able to prevent all change
to a system (impossible with security updates etc) every use of that
'same system' will be different.

Yes, but it is possible to control this update and have a staging
environment so you know that a HW/OS update will not break the
existing successful build when rolled out to the build/test farm.

Possible but not always easy. The auto updating of everything has
increased significantly over the years,
making it harder to control completely.

I've been doing this build&test stuff long enough to never expect
anything to be 100% reliable.
Hardware fails, software updates regress functionality, networks become
unreliable, humans trip over
power cords, virus scanners break things, etc. It just happens, and
often, it's not very predictable or reproducible.
You can do lots of things to minimize issues, but at some point you just
have to accept a few risks because
the alternative just isn't feasible or just can't happen with the
resources we have.

-kto




--
Dmitry Samersoff
Java Hotspot development team, SPB04
* There will come soft rains ...

