Re: [RFE] Add minimal universal release management capabilities to GIT

2017-10-27 Thread nicolas . mailhot


----- Original Message -----
From: "Jacob Keller"

Hi Jacob,


> I think that this could easily be built by a separate script which provides 
> git release command line and uses tags under the hood in a 
> well formed way.

True, the difficulty is not technical, the whole scheme is basic and KISS.
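
To show how basic it is, here is a minimal sketch of such a helper. The
function names, the "v" prefix and the "Release X" message are assumptions
made for illustration; only "git tag -a <name> -m <message> <commit>" is an
existing Git command.

```python
import re
import subprocess
from typing import List

# Hypothetical convention: a release version is dot-separated integers.
_VERSION = re.compile(r"\d+(?:\.\d+)*")


def release_tag_command(version: str, commit: str = "HEAD") -> List[str]:
    """Build the git command that records `version` as an annotated tag."""
    if _VERSION.fullmatch(version) is None:
        raise ValueError(f"not a dot-separated numeric version: {version!r}")
    return ["git", "tag", "-a", f"v{version}", "-m", f"Release {version}", commit]


def create_release(version: str, commit: str = "HEAD") -> None:
    """Run the command inside the current repository (needs git installed)."""
    subprocess.run(release_tag_command(version, commit), check=True)


if __name__ == "__main__":
    # Only print the command here, so the sketch is safe to run anywhere.
    print(" ".join(release_tag_command("2.3.4")))
```

The hard part is not writing this, it is getting everyone to agree on the
same naming rule.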

> I wouldn't say that the method outlined here works for all projects but I do 
> agree it's fairly common and could work for many projects

I would be very surprised if there was a strong technical reason that stopped 
any project from adopting this scheme. Like I already wrote, Linux packaging 
tools work by converting public release naming to this scheme (with some 
additional twists, mostly there to help conversion of terminally broken, 
tooling-hostile and usually legacy projects, not worth the pain to import in 
new tooling).

> I think most large projects already use annotated tags and tho they have 
> their own format it works pretty well. 

Raw tags are useless as release ids for tooling so everyone is forced to invent 
something else as soon as the project complexity passes a threshold (that's the 
point where there is no choice but to redefine tags, not the point where it 
starts being useful). I've already detailed why their laxity makes them useless.
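
A sketch of what "inventing something else" typically looks like: a strict
filter that project-agnostic tooling has to put in front of raw tag names.
The pattern and the name parse_release_tag are assumptions for illustration,
not an existing Git convention.

```python
import re
from typing import Optional, Tuple

# Hypothetical strict convention: "v" followed by dot-separated integers.
_RELEASE_TAG = re.compile(r"v(\d+(?:\.\d+)*)")


def parse_release_tag(tag: str) -> Optional[Tuple[int, ...]]:
    """Return the numeric components of a well-formed tag, or None."""
    match = _RELEASE_TAG.fullmatch(tag)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


if __name__ == "__main__":
    # Raw tags accept all of these names; only some are usable as release ids.
    for tag in ["v2.3.4", "release-2.3.4", "2.3.4-final", "v2.10"]:
        print(tag, "->", parse_release_tag(tag))
```

Every project that writes its own variant of this filter picks slightly
different rules, which is exactly the drift I am complaining about.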

> Showing a tool that could help projects create more standardized release tags 
> would be helpful.
> 
> I think such a tool could already be built, scripted to create annotated tags 
> with a well formed name. I don't think you necessarily
> need to have this in core git, tho I do see that your main goal is to 
> piggyback on git itself's popularity

I see little hope for such a tool. Reimplementation is too trivial and 
convention drift only starts to be acutely painful past a certain size. At that 
size you're almost certain to have already started using a custom 
implementation, with refactoring costs impeding switching to a generic tool.

Basically, it can only be done with a good probability of success by 
piggybacking on something that already federates a large number of Git users:
– Git itself, which is the correct, most productive and least painful place for 
everyone involved
– one of the big Git-based forges, i.e. GitHub or GitLab. I'd expect it would be 
very tempting for one of those to make something that would effectively be a 
better Git than upstream Git, the usual embrace and extend effect.
– development language ecosystems (Python, Ruby, Go, etc). There are already 
many beginnings of such work since build automation needs ids that can be 
processed by tools.

The problem with letting forges or language ecosystems sort it out is that 
you'll end up with functionally equivalent implementations, but divergent 
implementation details that end up wasting everyone's time. Like decimal 
separator differences, deb vs rpm, car driving side: we humans have managed to 
create the same clusterfuck time and time again. And there is much swearing 
every time you have a project that requires bridging those divergences.

It would be worth it if the divergence and competition helped new 
ground-breaking schemes to emerge but really, look at it, it's not rocket 
science. Everyone has been using roughly this scheme for decades with few 
changes. The remaining differences are slowly being eroded by the wish to 
automate everything.

Regards,

-- 
Nicolas Mailhot



Re: [RFE] Add minimal universal release management capabilities to GIT

2017-10-24 Thread Jacob Keller


On October 21, 2017 6:56:51 AM PDT, nicolas.mail...@laposte.net wrote:
>
>
>----- Original Message -----
>From: "Stefan Beller"
>

>> git tags ?
>
>Too loosely defined to be relied on by project-agnostic tools. That's
>why most tools won't ever try to use those. Anything you will define
>around tags as they stand is unlikely to work on the project of someone
>else

I think that this could easily be built by a separate script which provides git 
release command line and uses tags under the hood in a well formed way. I 
wouldn't say that the method outlined here works for all projects but I do 
agree it's fairly common and could work for many projects

I think most large projects already use annotated tags and tho they have their 
own format it works pretty well. 

Showing a tool that could help projects create more standardized release tags 
would be helpful.

I think such a tool could already be built, scripted to create annotated tags 
with a well formed name. I don't think you necessarily need to have this in 
core git, tho I do see that your main goal is to piggyback on git itself's 
popularity

Thanks
Jake


Re: [RFE] Add minimal universal release management capabilities to GIT

2017-10-23 Thread nicolas . mailhot


----- Original Message -----
From: "Randall S. Becker"

>> Git is a wonderful tool, which has transformed how software is created, and 
>> made code sharing and reuse, a lot easier (both
>> between human and software tools).


>> Please please please add release handling and versioning capabilities to Git 
>> itself. Without it some enthusiastic
>> Git adopters are on a fast trajectory to unmanageable hash soup states, even 
>> if they are not realising it yet, because
>> the deleterious side effects of giving up on releases only get clear with 
>> time.
>> Here is what such capabilities could look like (people on this list can 
>> probably invent something better, I don't care as
>> long as something exists).


> Nicolas makes some interesting points, and I do suggest looking at the 
> original post, but there are more factors to consider when dealing with 
> production-grade releases in regulatory environments. And my sincere 
> apologies for what, even in my eyes
> looks like a bit of a soap-box rant. No slight intended, Nicolas.

Hi Randall. I plead guilty to the rant part, I've spent way too many late-night 
hours recently untangling Git projects that couldn't be bothered with stating 
requirements any other way than with a list of commit hashes, when they 
bothered (in one case a dev didn't even realise he had broken some of his other 
projects when changing code, that's how bad the "a commit hash is a sufficient 
coordination point" situation is now getting).

> Possibly most importantly, there are serious distinctions between what is 
> built via CI, what is released, and what is
> installed.

Perhaps I should clarify, my post was about "source" release id since git is a 
"source" code manager. You do need a different id to identify release builds 
(packaged or not). However, any sane system will derive the build id from the 
source id, since most software properties (options, functionalities, security 
problems) are directly caused by what's in the source itself, and changing them 
requires going back to the source.
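
A sketch of that derivation, assuming a hypothetical
"<source release>-<build number>.<platform>" layout (loosely modelled on how
distribution packages append a build counter to the upstream version; the
exact format and the names BuildId and source_of are assumptions).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildId:
    """A build id that embeds the source release id it was built from."""

    source_release: str  # e.g. "2.3.4", the id assigned in the SCM
    build_number: int    # incremented for each rebuild of the same source
    platform: str        # assumed to contain no "-" in this sketch

    def __str__(self) -> str:
        return f"{self.source_release}-{self.build_number}.{self.platform}"


def source_of(build_id: str) -> str:
    """Recover the source release id from a build id string."""
    return build_id.rsplit("-", 1)[0]


if __name__ == "__main__":
    build = BuildId("2.3.4", 1, "x86_64")
    print(build, "was built from source release", source_of(str(build)))
```

Any number of builds can point back to the same source release, which is the
direction audits and security fixes need to follow.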

> In a specific way, source and release commits are required to be time 
> reversible in production, whereby if an installation
> fails, there exist in many environments requirements to be able to fully undo 
> the install action. This is often different
> from the environment artifacts which can be time-forward constrained and 
> reversible only in extreme situations.

Yes the reversibility is often very theoretical, basically requiring losing any 
change and reverting to a pre-change data dump. Automation is getting to the 
point where it's simpler to push a new fixed build than to revert to the 
previous state. But that requires solid handover between source-oriented (git), 
build-oriented, deployment-oriented and audit-oriented tools.

>> So nothing terribly complex, just a lot of small helpers to make releasing 
>> easier, less tedious, and cheaper for developers,
>> that formalize, automate, and make easier existing practices of mature 
>> software projects, making them accessible to
>> smaller projects. They would make releasing more predictable and reliable 
>> for people deploying the code, and easier
>> to consume by higher-level cross-project management tools. That would 
>> transform the deployment stage of software
>> just like Git already transformed early code writing and autotest stages.

> Possibly, but primarily for source releases.

Sure, you need to start somewhere, and git's job is managing sources, so source 
releases are part of its theoretical scope.

Best regards,

-- 
Nicolas Mailhot


Re: [RFE] Add minimal universal release management capabilities to GIT

2017-10-23 Thread nicolas . mailhot
----- Original Message -----
From: "Kaartic Sivaraam"

> Heads up, I'm gonna play the devil's advocate a little, here.

Be my guest, you're not alone.

On Sat, 2017-10-21 at 15:56 +0200, nicolas.mail...@laposte.net wrote:
> No that is not up to the hash function. First because hashes are too
> long to be manipulated by humans, and second no hash will ever
> capture human intent. You need an explicit human action to mark "I
> want others to use this particular state of my project because I am
> confident it is solid".
> 

> I would say you're just limiting your thoughts. There's no strict rule
> saying hash functions should be "incomprehensible" to humans or that
> different hashes should be "uncomparable". No one's going to stop
> someone from creating a hypothetical hash function that's totally
> ordered (!) unless you violate the basis of a "hash". (surprise,
> there's already attempts at it,

I'm not terribly interested in future tech incompatible with existing git 
projects, or with the multiple levels of release numbers projects seem to like.

Besides, even assuming a perfect magical hash, it is not going to replace the 
human act of choosing a project state others can consolidate on. You can search 
and replace numbers with magic ordered hash in my original message if you like, 
the rest does not change.

>> Except, the releasing happens outside git, it's still fairly manual. 

> You seem to be more frustrated by "manual" work.

Git users are frustrated with manual work. They've been "spoilt" by a wonderful 
tool, and want all software engineering actions to be automated as much as 
possible, including releases. They don't want to set up custom project-specific 
helpers around Git, they don't want to deal with the project-specific custom 
helpers others may have invented for generic actions, they want built-in 
commands or nothing.

Just take any random sampling of actual Git projects on GitHub and you'll see 
masses of people not doing releases, or referencing other projects by random 
commit hash. "Random" in the sense that there's no coordination on what hash to 
choose: it may work or not, and once it works locally the commit hash reference 
sediments in the project build env because it is too dangerous to change. There 
is no cooperation to focus on particular commits to make sure they are solid, 
and that's a basic project need, as evidenced by the "RHEL", "stable kernel", 
"Firefox ESR" and "Ubuntu LTS" initiatives.

>> All I'm proposing is to integrate the basic functions in git to
>> simplify the life of those projects and help smaller projects that
>> want completely integrated git workflows.

> Wait, aren't you just trying to make git more than a "version control
> system"? I don't think it's encouraged a lot here given that there have been
> patches that have not been accepted because they might make git
> opinionated and the result "doesn't quite fit into the broad picture of
> a version control system"

No I'm not trying to make git more than a "version control system". That's why 
I don't propose including dependency handling or build rules. However, git 
needs to hand project state to other tools, every other tool but CI works on 
discrete states (releases), you need a robust process to identify those states 
or the handover becomes a pain point. It starts to fail when devs do 
not want to identify project states by anything except git objects (as is 
happening now) and those objects are not adapted to deployment and system-wide 
software management.

Just look at
https://grafeas.io/docs/concepts/what-is-grafeas/overview.html

(brand new Google project)

It is all about 'component identifiers' ie release ids. Most of those 
components now live in Git. Why does Google need half a dozen different ways to 
identify software sourced from git? Because Git does not provide good release 
tracking capabilities. Yet a lot of developers of those projects will insist 
the only correct id is a Git commit hash (conspicuously missing from Google's 
examples, because commit hashes are a terrible release identifier).

It is a huge problem when all the actors gravitating around a software project 
start disagreeing on basic stuff such as how to identify a project state. It is 
a huge problem when you start maintaining complex Rosetta Stone infrastructures 
just because the software layer that should define conventions refuses to do 
it, and upper layers are forced to make choices that end up different. There is 
no inherent advantage to driving left or right. There is a huge advantage when 
everyone drives on the same side.

>> Yes and it is so fun to herd hundreds of management tools with
>> different conventions and quirks. About as much fun as managing
>> dozens of scm before most projects settled on git. All commonalities
>> need to migrate in the common git layer to simplify management and
>> release id is the first of those.

> It's better to have a "good" (generic) release management tool that
> does what you 

Re: [RFE] Add minimal universal release management capabilities to GIT

2017-10-22 Thread Kaartic Sivaraam
Heads up, I'm gonna play the devil's advocate a little, here.

On Sat, 2017-10-21 at 15:56 +0200, nicolas.mail...@laposte.net wrote:
> No that is not up to the hash function. First because hashes are too
> long to be manipulated by humans, and second no hash will ever
> capture human intent. You need an explicit human action to mark "I
> want others to use this particular state of my project because I am
> confident it is solid".
> 

I would say you're just limiting your thoughts. There's no strict rule
saying hash functions should be "incomprehensible" to humans or that
different hashes should be "uncomparable". No one's going to stop
someone from creating a hypothetical hash function that's totally
ordered (!) unless you violate the basis of a "hash". (surprise,
there's already attempts at it,
https://stackoverflow.com/q/28043857/5614968)


> Except, the releasing happens outside git, it's still fairly manual. 

You seem to be more frustrated by "manual" work. I wonder why you
can't automate that. Given all the work done during a release of "Git"
(https://public-inbox.org/git/xmqqr2tygvp4@gitster.mtv.corp.google.com/),
maybe the maintainer could give some good advice on this.

+cc Junio

> All I'm proposing is to integrate the basic functions in git to
> simplify the life of those projects and help smaller projects that
> want completely integrated git workflows.
> 

Wait, aren't you just trying to make git more than a "version control
system"? I don't think it's encouraged a lot here given that there have been
patches that have not been accepted because they might make git
opinionated and the result "doesn't quite fit into the broad picture of
a version control system"

cf. https://public-inbox.org/git/20170711233827.23486-1-sbel...@google.com/

cf. 
https://public-inbox.org/git/cagz79kyarf6r-vx1-lm4x_anlmrxc3vnd2acqmnqq3j6y-s...@mail.gmail.com/


> Yes and it is so fun to herd hundreds of management tools with
> different conventions and quirks. About as much fun as managing
> dozens of scm before most projects settled on git. All commonalities
> need to migrate in the common git layer to simplify management and
> release id is the first of those.

It's better to have a "good" (generic) release management tool that
does what you ask (probably with some help from git) than try to turn
Git into one (which is not possible without making Git opinionated,
more on that later). I guess there should already be one that meets
your expectation and you probably just have to discover it.

Further, if there's no "generic" release management tool in existence,
I suspect that's because there's no such thing as a "generic release
management strategy" and it always depends on context. Otherwise, create one
on your own, in the spirit of letting "git" handle just "version
control" and letting your "generic" tool handle your concerns. Who
knows, if you develop a good enough "generic" tool it might be
used widely for "release management" just as a lot of projects started
using Git for "version control". (I still suspect that there should be
one that already exists)


> > git tags ?
> 
> Too loosely defined to be relied on by project-agnostic tools. That's
> why most tools won't ever try to use those. Anything you will define
> around tags as they stand is unlikely to work on the project of
> someone else
> 

They are loosely defined because you can't define them "tightly" and if
you try to it would make Git opinionated !?

> > > 5. a command, such as "git release", allow a human with control of the 
> > > repo to set an explicit release version to a commit. 
> > 
> > This sounds fairly specific to an environment that you are in, maybe
> > write git-release for your environment and then open source it. The
> > world will love it (assuming they have the same environment and
> > needs).
> 
> If you take the time to look at it it is not specific, it is generic.
> 

I would say that you might not have looked broadly enough.

1) If it's generic, why isn't there any "generic" release management
tool?

2) if it's possible to create a "generic" release management tool and
it just doesn't exist yet, why not try to create instead of trying to
integrate release management into Git ? (you could make it depend on
git, of course)

> You need to identify software during
> its whole lifecycle, and the id needs to start in the scm, because
> that's where the lifecycle starts.

It might not for everyone!

-- 
Kaartic


RE: [RFE] Add minimal universal release management capabilities to GIT

2017-10-21 Thread Randall S. Becker
-----Original Message-----
From: git-ow...@vger.kernel.org [mailto:git-ow...@vger.kernel.org] On Behalf 
Of nicolas.mail...@laposte.net
On October 20, 2017 6:41 AM, nicolas wrote:
To: git@vger.kernel.org
Subject: [RFE] Add minimal universal release management capabilities to GIT

>Git is a wonderful tool, which has transformed how software is created, and 
>made code sharing and reuse, a lot easier (both between human and software 
>tools).


> Please please please add release handling and versioning capabilities to Git 
> itself. Without it some enthusiastic
> Git adopters are on a fast trajectory to unmanageable hash soup states, even 
> if they are not realising it yet, because
> the deleterious side effects of giving up on releases only get clear with 
> time.
> Here is what such capabilities could look like (people on this list can 
> probably invent something better, I don't care as long as something exists).


Nicolas makes some interesting points, and I do suggest looking at the original 
post, but there are more factors to consider when dealing with production-grade 
releases in regulatory environments. And my sincere apologies for what, even in 
my eyes looks like a bit of a soap-box rant. No slight intended, Nicolas.

Possibly most importantly, there are serious distinctions between what is built 
via CI, what is released, and what is installed. Some of these can be 
addressed directly by git, but others require convention, or a meta-system 
spanning platforms. I will abbreviate some of this:

Commits being used to initiate CI cycles are typically based on source commit 
ids (Jenkins, as an example uses this as an initiator). In Open Source 
environments, where source is specifically released, this is a perfectly 
reasonable release point requiring no more than the commit id itself. 
Committers tend to add tags for convention to make identification convenient, 
and git describe is really helpful here for generating identifying information 
(I state the obvious here). This is the beginning of wisdom, not the end (to 
mis-paraphrase).

Release commits, which are not explicitly in a one-to-one relationship with 
source commits, are a different matter. Suppose the target of your Jenkins 
build creates a release of objects packaged in some useful form. The release 
and source commits are somehow related in your repository of record (loads of 
ways to do this). However, in multi-platform situations, you are in a 
many-to-one situation, obviously, since the chances of the release's hash 
matching between two platform builds approach zero. Nonetheless, the 
release's commit id is relevant to what gets installed, but it is not 
sufficient for human identification purposes. The tag comes in nicely here, and 
hopefully is propagated from the dependent source commit. This 
release-to-source commit derivation is implicitly required in some regulatory 
environments (financial institutions, FDA, FAA, as examples where this exists 
for some systems).

But once you are in a production (or QA) environment, the actual install 
package contains artifacts from a release and from the environment into which 
the release is being installed and activated. The artifacts themselves can be 
highly dynamic and changeable on a radically different and independent schedule 
from the code drop. I advocate keeping those in separate repositories and they 
make for hellacious merge/distribution rules - particularly if the environments 
are radically different in structure, platform, and capability. The 
relationship between commits here is if anything specifically mutable. In a 
specific way, source and release commits are required to be time reversible in 
production, whereby if an installation fails, there exist in many environments 
requirements to be able to fully undo the install action. This is often 
different from the environment artifacts which can be time-forward constrained 
and reversible only in extreme situations. This separation, at least in my 
experience, tends to drive how releases are managed in production shops.

> So nothing terribly complex, just a lot of small helpers to make releasing 
> easier, less tedious, and cheaper for developers,
> that formalize, automate, and make easier existing practices of mature 
> software projects, making them accessible to
> smaller projects. They would make releasing more predictable and reliable for 
> people deploying the code, and easier
> to consume by higher-level cross-project management tools. That would 
> transform the deployment stage of software
> just like Git already transformed early code writing and autotest stages.

Possibly, but primarily for source releases. Release management and the related 
practices are production functions that do not map particularly well (by 
assertion) to the git command set or functionality. As an underlying mechanism 
to manage the production artifacts, git 

Re: [RFE] Add minimal universal release management capabilities to GIT

2017-10-21 Thread nicolas . mailhot


----- Original Message -----
From: "Stefan Beller"

>> Unfortunately Git is so good more and more developers start to procrastinate 
>> on any activity that happens outside of GIT,
>> starting with cutting releases. The meme "one only needs a git commit hash" 
>> is going strong, even infecting institutions
>> like lwn and glibc (https://lwn.net/SubscriberLink/736429/e5a8ccc85cc8/)

> For release you would want to include more than just "the code" into
> the hash, such as compiler versions, environment variables, the phase
> of the moon, what have you, that may impact the release build.

Yes and no. Yes because you do want to limit failure cases, and no because it's 
very easy to overspecify and block code reuse possibilities. Anyway I don't see 
a strong consensus on how to do those yet, they are very language-specific, and 
the first step is being able to identify other code you depend on which 
requires some sort of release id, which is what my message was about. You can't 
build any compatibility matrix, before being able to name the dimensions of the 
matrix.

> It sounds to me as if you assume that if X, Y, Z were numbers (or
> rather had some order), this can be easily deduced.

It's a lot easier to use "option foo was introduced in version 2.3.4 and 
takes Y parameters" than "option foo was introduced in commit hash #, you have 
version hash $$", good luck.
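
With ordered release ids, both "do I have option foo" and "am I affected by
the bug introduced in Y and fixed in Z" become a plain comparison that needs
no repository access. A minimal sketch, assuming dot-separated numeric
versions (the function names are hypothetical):

```python
from typing import Tuple

Version = Tuple[int, ...]


def parse(version: str) -> Version:
    """Turn "2.3.4" into (2, 3, 4) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def has_feature(deployed: str, introduced: str) -> bool:
    """True if the deployed version is at or after the introducing one."""
    return parse(deployed) >= parse(introduced)


def is_vulnerable(deployed: str, introduced: str, fixed: str) -> bool:
    """True if deployed falls in the half-open range [introduced, fixed)."""
    return parse(introduced) <= parse(deployed) < parse(fixed)


if __name__ == "__main__":
    print(has_feature("2.10.0", "2.3.4"))
    print(is_vulnerable("2.1.0", "1.8.0.1", "2.1.2.0"))
```

Note that the comparison is on integer tuples, not strings, so 2.10 sorts
after 2.3. Doing the same with two bare commit hashes requires walking the
history of the project.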

> The output of git-describe ought to be sufficient for an ordering
> scheme to rely on?

That relies on git access to the repo of every bit of code your computer runs. 
This is not practical past the deployment phase. For deployment the ordering 
needs to be extracted from all the git data so you only need to manipulate 
short human and tool-friendly ids. You need low coupling not the strong 
coupling of git repo access.
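
A sketch of that extraction step: run git describe once, where the repo is
available, and ship only the short fields downstream. The parser assumes the
default "<tag>-<count>-g<abbreviated hash>" output of git describe, or the
bare tag name when the commit is exactly on a tag; the names Described and
parse_describe are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Described(NamedTuple):
    tag: str                    # nearest tag reachable from the commit
    commits_since_tag: int      # 0 when the commit is the tagged one
    abbrev_hash: Optional[str]  # None when the commit is the tagged one


_DESCRIBE = re.compile(r"(?P<tag>.+)-(?P<count>\d+)-g(?P<hash>[0-9a-f]+)")


def parse_describe(output: str) -> Described:
    """Split `git describe` output into tag, distance and short hash."""
    text = output.strip()
    match = _DESCRIBE.fullmatch(text)
    if match is None:
        # No "-<count>-g<hash>" suffix: the commit sits exactly on the tag.
        return Described(text, 0, None)
    return Described(match["tag"], int(match["count"]), match["hash"])


if __name__ == "__main__":
    print(parse_describe("v2.15.0-rc2-7-g3d9c5b5\n"))
    print(parse_describe("v2.3.4\n"))
```

Once the tag and distance are extracted, the deployment side never needs to
talk to the repository again.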

>> — hashes are not ranked. You can not guess, looking at a hash, if it 
>> corresponds to a project stability point, or is in a
>> middle of a refactoring sequence, where things are expected to break. 
>> Evaluating every hash of every project you use
>> quickly becomes prohibitive, with the only possible strategy being to just 
>> use the latest commit at a given time and pray
>> (and if you are lucky never never update afterwards unless you have lots of 
>> fixing and testing time to waste).

> That is up to the hash function. One could imagine a hash function
> that generates bit patterns that you can use to obtain an order from.

No that is not up to the hash function. First because hashes are too long to be 
manipulated by humans, and second no hash will ever capture human intent. You 
need an explicit human action to mark "I want others to use this particular 
state of my project because I am confident it is solid".

>> – commit mixing is broken by design.

> In Git terms a repository is the whole universe.
> If you want relationships between different projects, you need to
> include these projects e.g. via subtree or submodules.
> It scales even up to linux distributions (e.g.
> https://github.com/gittup/gittup, includes nethack!)

This is still pre-deployment phase. And I wouldn't qualify this as "full linux 
distro", it's very small scale. If anything it demonstrates that even on a 
smallish perimeter relying on git alone as it stands today is too hard (3 
updates in the whole of 2017!).

>> One can not adapt the user of a piece of code to changes in this piece of 
>> code before those changes are committed in the
>> first place. There will always be moments where the latest commit of a 
>> project, is incompatible with the latest commit of
>> downstream users of this project. It is not a problem in developer 
>> environments and automated testers, where you want things to break early 
>> and be fixed early. It is a huge problem when you follow the same early 
>> commit push strategy for actual
>> production code, where failures are not just a red light in a build farm 
>> dashboard, but have real-world consequences. And
>> the more interlinked git repositories you pile on one another, the higher 
>> the probability is two commits won't work with
>> one another with failures cascading down

> That is software engineering in general, I am not sure how Git relates
> to this? Any change that you make (with or without utilizing Git) can
> break the downstream world.

It's a lot easier to manage when you have discrete release synchronisation 
points and not just a flow of commits

>> – commits are a bad inter-project synchronisation point. There are too many 
>> of them, they are not ranked, everyone is
>> choosing a different commit to deploy, that effectively kills the network 
>> effects that helped making traditional releases
>> solid (because distributors used the same release state, and could share 
>> feedback and audit results).

> There are different strategies. Relevant open source projects (kernel,
> glibc, git) are 

Re: [RFE] Add minimal universal release management capabilities to GIT

2017-10-20 Thread Stefan Beller
On Fri, Oct 20, 2017 at 3:40 AM,   wrote:
> Hi,
>
> Git is a wonderful tool, which has transformed how software is created, and 
> made code sharing and reuse, a lot easier (both between human and software 
> tools).
>
> Unfortunately Git is so good more and more developers start to procrastinate 
> on any activity that happens outside of GIT, starting with cutting releases. 
> The meme "one only needs a git commit hash" is going strong, even infecting 
> institutions like lwn and glibc 
> (https://lwn.net/SubscriberLink/736429/e5a8ccc85cc8/)

For release you would want to include more than just "the code" into
the hash, such as compiler versions, environment variables, the phase
of the moon, what have you, that may impact the release build.

> However, the properties that make a hash commit terrific at the local 
> development level, also make it suboptimal as a release ID:
>
> – hashes are not ordered. A human can not guess the sequencing of two hashes, 
> nor can a tool, without access to Git history. Just try to handle "critical 
> security problem in project X, introduced with version Y and fixed in Z" when 
> all you have is some git hashes. hashing-only introduces severe frictions 
> when analysing deployment states.

It sounds to me as if you assume that if X, Y, Z were numbers (or
rather had some order), this can be easily deduced.
The output of git-describe ought to be sufficient for an ordering
scheme to rely on?
However the problem with deployments is that Y might be v1.8.0.1 and Z
might be v2.1.2.0 and X (that you are running) is v2.10.2.0.

> — hashes are not ranked. You can not guess, looking at a hash, if it 
> corresponds to a project stability point, or is in a middle of a refactoring 
> sequence, where things are expected to break. Evaluating every hash of every 
> project you use quickly becomes prohibitive, with the only possible strategy 
> being to just use the latest commit at a given time and pray (and if you are 
> lucky never never update afterwards unless you have lots of fixing and 
> testing time to waste).

That is up to the hash function. One could imagine a hash function
that generates bit patterns that you can use to obtain an order from.
SHA-1 that Git uses is not such a hash, but rather a supposedly secure
hash. One hash value looks like white noise, such that the entropy of
a SHA-1 object name can be estimated with 160 bits.
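
A small sketch of that property using Python's hashlib. The inputs are
stand-in byte strings, not real Git objects (Git hashes a typed object with a
header, which this does not reproduce), and the helper names are made up for
the example.

```python
import hashlib
from typing import List


def sha1_hex(data: bytes) -> str:
    """Hex SHA-1 digest of some bytes."""
    return hashlib.sha1(data).hexdigest()


def digest_bits() -> int:
    """Size of a SHA-1 digest in bits."""
    return hashlib.sha1().digest_size * 8


def is_sorted(items: List[str]) -> bool:
    """True if the items are already in ascending order."""
    return all(a <= b for a, b in zip(items, items[1:]))


if __name__ == "__main__":
    # Five consecutive "revisions"; their digests carry no trace of the order.
    revisions = [f"revision {n}\n".encode() for n in range(1, 6)]
    digests = [sha1_hex(data) for data in revisions]
    for number, digest in enumerate(digests, start=1):
        print(number, digest)
    print("digest size in bits:", digest_bits())
    print("digests follow revision order:", is_sorted(digests))
```

Whether two digests happen to sort in revision order is down to chance, so
nothing can be read from it without the history.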

> – commit mixing is broken by design.

In Git terms a repository is the whole universe.
If you want relationships between different projects, you need to
include these projects e.g. via subtree or submodules.
It scales even up to linux distributions (e.g.
https://github.com/gittup/gittup, includes nethack!)

> One can not adapt the user of a piece of code to changes in this piece of 
> code before those changes are committed in the first place. There will always 
> be moments where the latest commit of a project, is incompatible with the 
> latest commit of downstream users of this project. It is not a problem in 
> developer environments and automated testers, where you want things to break 
> early and be fixed early. It is a huge problem when you follow the same early 
> commit push strategy for actual production code, where failures are not just 
> a red light in a build farm dashboard, but have real-world consequences. And 
> the more interlinked git repositories you pile on one another, the higher the 
> probability is two commits won't work with one another with failures 
> cascading down

That is software engineering in general, I am not sure how Git relates
to this? Any change that you make (with or without utilizing Git) can
break the downstream world.

> – commits are too granular. Even assuming one could build an automated 
> regression farm powerful enough to build and test instantaneously every 
> commit, it is not possible to instantaneously push those rebuilds to every 
> instance where this code is deployed (even with infinite bandwidth, infinite 
> network reach and infinite network availability).

With infinite resources it would be possible, as the computers are
also infinitely fast. ;)

> Computers would be spending their time resetting to the latest build of one 
> component or another, with no real work being done. So there will always be a 
> distance, between the latest commit in a git repo, and what is actually 
> deployed. And we've seen bare hashes make evaluating this distance difficult
>
> – commits are a bad inter-project synchronisation point. There are too many 
> of them, they are not ranked, everyone is choosing a different commit to 
> deploy, that effectively kills the network effects that helped making 
> traditional releases solid (because distributors used the same release state, 
> and could share feedback and audit results).

There are different strategies. Relevant open source projects (kernel,
glibc, git) are pretty good at not breaking the downstream users with
every 

[RFE] Add minimal universal release management capabilities to GIT

2017-10-20 Thread nicolas . mailhot
Hi,

Git is a wonderful tool, which has transformed how software is created, and 
made code sharing and reuse a lot easier (both between humans and software 
tools).

Unfortunately, Git is so good that more and more developers procrastinate on 
any activity that happens outside of Git, starting with cutting releases. The 
meme "one only needs a git commit hash" is going strong, even infecting 
institutions like LWN and glibc 
(https://lwn.net/SubscriberLink/736429/e5a8ccc85cc8/)

However, the properties that make a commit hash terrific at the local 
development level also make it suboptimal as a release ID:

– hashes are not ordered. A human cannot guess the sequencing of two hashes, 
nor can a tool, without access to Git history. Just try to handle "critical 
security problem in project X, introduced with version Y and fixed in Z" when 
all you have is some git hashes. Relying on hashes alone introduces severe 
friction when analysing deployment states.

– hashes are not ranked. You cannot guess, looking at a hash, if it 
corresponds to a project stability point, or is in the middle of a refactoring 
sequence, where things are expected to break. Evaluating every hash of every 
project you use quickly becomes prohibitive, with the only possible strategy 
being to just use the latest commit at a given time and pray (and if you are 
lucky never update afterwards unless you have lots of fixing and testing 
time to waste).

– commit mixing is broken by design. One cannot adapt the user of a piece of 
code to changes in this piece of code before those changes are committed in the 
first place. There will always be moments where the latest commit of a project 
is incompatible with the latest commit of downstream users of this project. It 
is not a problem in developer environments and automated testers, where you 
want things to break early and be fixed early. It is a huge problem when you 
follow the same early commit push strategy for actual production code, where 
failures are not just a red light in a build farm dashboard, but have 
real-world consequences. And the more interlinked git repositories you pile on 
one another, the higher the probability that two commits won't work with one 
another, with failures cascading down.

– commits are too granular. Even assuming one could build an automated 
regression farm powerful enough to build and test instantaneously every commit, 
it is not possible to instantaneously push those rebuilds to every instance 
where this code is deployed (even with infinite bandwidth, infinite network 
reach and infinite network availability). Computers would be spending their 
time resetting to the latest build of one component or another, with no real 
work being done. So there will always be a distance between the latest commit 
in a git repo and what is actually deployed. And we've seen that bare hashes 
make evaluating this distance difficult.

– commits are a bad inter-project synchronisation point. There are too many of 
them, they are not ranked, and everyone chooses a different commit to deploy, 
which effectively kills the network effects that helped make traditional 
releases solid (because distributors used the same release state, and could 
share feedback and audit results).
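
To make the first point concrete, here is a minimal Python sketch of the 
"introduced with version Y and fixed in Z" check when release IDs can be 
ordered on their own. It assumes release IDs are numbers separated by dots; 
the function names and version numbers are made up for illustration.

```python
from typing import Tuple


def parse_release(release_id: str) -> Tuple[int, ...]:
    """Turn "1.10.2" into (1, 10, 2); reject anything but dot-separated numbers."""
    parts = release_id.split(".")
    if not all(p.isdigit() for p in parts):
        raise ValueError(f"not a dotted numeric release id: {release_id!r}")
    return tuple(int(p) for p in parts)


def is_affected(deployed: str, introduced: str, fixed: str) -> bool:
    """True if introduced <= deployed < fixed, using only the IDs themselves."""
    return parse_release(introduced) <= parse_release(deployed) < parse_release(fixed)


# Hypothetical advisory: problem introduced in 1.4, fixed in 1.10.
print(is_affected("1.9.3", "1.4", "1.10"))   # True
print(is_affected("1.10.1", "1.4", "1.10"))  # False
```

No repository access is needed for this check. With bare hashes, the same 
question can only be answered by walking the project history.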

One could mitigate those problems in a Git management overlay (and, indeed, 
many try). The problem with those overlays is that they have variable maturity 
levels, make incompatible choices, cut corners, and are not universal like Git. 
That makes building anything on top of them of dubious value, and people 
quickly fall back to commit hashes, which *are* universal among Git repos. 
Release handling and versioning really need to happen in Git itself to be 
effective.

Please please please add release handling and versioning capabilities to Git 
itself. Without it some enthusiastic Git adopters are on a fast trajectory to 
unmanageable hash soup states, even if they are not realising it yet, because 
the deleterious side effects of giving up on releases only become clear with 
time.

Here is what such capabilities could look like (people on this list can 
probably invent something better, I don't care as long as something exists).

1. "release versions" are first class objects that can be attached to a commit 
(not just freestyle tags that look like versions, but may be something else 
entirely). Tools can identify release IDs reliably.

2. "release versions" have strong format constraints, which allow humans and 
tools to deduce their ordering without needing access to something else (full 
git history or project-specific conventions). The usual string of numbers 
separated by dots is probably simple and universal enough (if you start to 
allow letters, people will try to use clever schemes like alpha or Roman 
numerals, which break automation). There need to be at least two numbers in 
the string to allow tracking patchlevels.

3. several such objects can be attached to a commit (a project may wish to 
promote a minor