On Tue, 30 Dec 2014 02:09:42 +0100, Bernd Eckenfels wrote:
> That thread gets deep. :)
>
> I just wanted to comment on "releasing only
> source is faster because of less checks". I disagree with that: most
> release delay/time is due to preparation work. Failed (binary) checks
> typically have a cause that would also be present in the source
> (especially the POM), so it does not really reduce the amount of
> rework.

RM is a streamlined procedure: so, if you do (say) 10 steps rather
than 15, it will objectively take less time, and this is compounded
by the additional tests which should (ideally) be performed by the
reviewers. [Thus delaying the release.]

> (At least not in most cases, so two votes will actually make
> more work for us, not less).

The additional work exactly amounts to sending _one_ additional mail.

Then, as I noted,
 * some releases will be done as before (same work)
 * some releases will be "source only" (less work)
 * some releases will be two-steps, possibly performed by two different
   people (i.e. less work for each RM)

Of course, each release means some work has to be done; then IIUC your
point, the fewer releases the better. :-}


Regards,
Gilles


Regards
Bernd



On Tue, 30 Dec 2014 02:05:29 +0100, Gilles <gil...@harfang.homelinux.org> wrote:

On Mon, 29 Dec 2014 10:54:59 +0000, sebb wrote:
> On 29 December 2014 at 10:36, Gilles <gil...@harfang.homelinux.org>
> wrote:
>> On Sun, 28 Dec 2014 20:21:32 -0700, Phil Steitz wrote:
>>>
>>> On 12/28/14 11:46 AM, Gilles wrote:
>>>>
>>>> Hi.
>>>>
>>>> On Sun, 28 Dec 2014 09:43:34 +0100, Luc Maisonobe wrote:
>>>>>
>>>>> On 28/12/2014 00:22, sebb wrote:
>>>>>>
>>>>>> On 27 December 2014 at 22:19, Gilles
>>>>>> <gil...@harfang.homelinux.org> wrote:
>>>>>>>
>>>>>>> On Sat, 27 Dec 2014 17:48:05 +0000, sebb wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 24 December 2014 at 15:11, Gilles
>>>>>>>> <gil...@harfang.homelinux.org> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, 24 Dec 2014 15:52:12 +0100, Luc Maisonobe wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 24/12/2014 15:04, Gilles wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, 24 Dec 2014 09:31:46 +0100, Luc Maisonobe wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 24/12/2014 03:36, Gilles wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, 23 Dec 2014 14:02:40 +0100, luc wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is a [VOTE] for releasing Apache Commons Math 3.4
>>>>>>>>>>>>>> from release
>>>>>>>>>>>>>> candidate 3.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Tag name:
>>>>>>>>>>>>>> MATH_3_4_RC3 (signature can be checked from git using
>>>>>>>>>>>>>> 'git tag
>>>>>>>>>>>>>> -v')
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Tag URL:
>>>>>>>>>>>>>> <https://git-wip-us.apache.org/repos/asf?p=commons-math.git;a=commit;h=befd8ebd96b8ef5a06b59dccb22bd55064e31c34>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is there a way to check that the source code referred to
>>>>>>>>>>>>> above
>>>>>>>>>>>>> was the one used to create the JAR of the ".class"
>>>>>>>>>>>>> files? [Out of curiosity, not suspicion, of course...]
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, you can look at the end of the META-INF/MANIFEST.MF
>>>>>>>>>>>> file embedded
>>>>>>>>>>>> in the jar. The second-to-last entry is called
>>>>>>>>>>>> Implementation-Build.
>>>>>>>>>>>> It
>>>>>>>>>>>> is automatically created by maven-jgit-buildnumber-plugin
>>>>>>>>>>>> and contains
>>>>>>>>>>>> the SHA1 identifier of the last commit used for the
>>>>>>>>>>>> build. Here, it is
>>>>>>>>>>>> befd8ebd96b8ef5a06b59dccb22bd55064e31c34, so we can check
>>>>>>>>>>>> it really
>>>>>>>>>>>> corresponds to the expected status of the git repository.
>>>>>>>>>>>>
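As a minimal sketch of the check described above, the following Java reads
the Implementation-Build entry from a jar's manifest and compares it with an
expected commit identifier. It assumes the entry sits in the main section of
the manifest; the class and method names are hypothetical and not part of any
Commons tooling.

```java
import java.io.IOException;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper, for illustration only.
class ManifestBuildCheck {

    /** Returns the Implementation-Build value from the main section, or null if absent. */
    static String implementationBuild(Manifest manifest) {
        return manifest.getMainAttributes().getValue("Implementation-Build");
    }

    /** True if the manifest records the expected commit identifier (case-insensitive). */
    static boolean matchesCommit(Manifest manifest, String expectedSha1) {
        String build = implementationBuild(manifest);
        return build != null && build.trim().equalsIgnoreCase(expectedSha1.trim());
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: ManifestBuildCheck <jar> <expected-sha1>");
            return;
        }
        try (JarFile jar = new JarFile(args[0])) {
            Manifest manifest = jar.getManifest(); // null if the jar has no manifest
            boolean ok = manifest != null && matchesCommit(manifest, args[1]);
            System.out.println(ok
                    ? "Implementation-Build matches"
                    : "Implementation-Build differs or is missing");
        }
    }
}
```

This only reports what the manifest claims; on its own it does not prove that
the .class files were compiled from that commit.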
>>>>>>>>>>>
>>>>>>>>>>> Can this be considered "secure", i.e. can't this entry in
>>>>>>>>>>> the MANIFEST
>>>>>>>>>>> file be modified to be the checksum of the repository but
>>>>>>>>>>> with the
>>>>>>>>>>> .class
>>>>>>>>>>> files being substituted with those coming from another
>>>>>>>>>>> compilation?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Modifying anything in the jar (either this entry within the
>>>>>>>>>> manifest or
>>>>>>>>>> any class) will modify the jar signature. So as long as
>>>>>>>>>> people do check
>>>>>>>>>> the global MD5, SHA1 or gpg signature we provide with our
>>>>>>>>>> build, they
>>>>>>>>>> are safe to assume the artifacts are Apache artifacts.
>>>>>>>>>>
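The digest part of that check can be sketched in Java as follows. This is an
illustration under assumptions: the published checksum is taken to be a plain
hex string, the class and method names are hypothetical, and verification of
the gpg signature is not covered (that would be done with gpg itself).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, for illustration only.
class ChecksumCheck {

    private static MessageDigest newDigest(String algorithm) {
        try {
            return MessageDigest.getInstance(algorithm); // e.g. "SHA-1" or "MD5"
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown digest: " + algorithm, e);
        }
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    /** Hex digest of an in-memory byte array. */
    static String hexDigest(byte[] data, String algorithm) {
        return toHex(newDigest(algorithm).digest(data));
    }

    /** Hex digest of a file, streamed so that large archives are not loaded at once. */
    static String hexDigest(Path file, String algorithm) throws IOException {
        MessageDigest md = newDigest(algorithm);
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                md.update(buffer, 0, n);
            }
        }
        return toHex(md.digest());
    }

    /** Compares a computed digest with the published one, ignoring case and padding. */
    static boolean matches(String computed, String published) {
        return computed.equalsIgnoreCase(published.trim());
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: ChecksumCheck <file> <published-sha1>");
            return;
        }
        String computed = hexDigest(Paths.get(args[0]), "SHA-1");
        System.out.println(matches(computed, args[1]) ? "checksum OK" : "checksum MISMATCH");
    }
}
```

A matching digest shows that the file is the one the checksum was computed
from; it is the signature that ties the file to the person who signed it.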
>>>>>>>>>> This is not different from how releases are done with
>>>>>>>>>> subversion as the
>>>>>>>>>> source code control system, or even in C or C++ as the
>>>>>>>>>> language. At one
>>>>>>>>>> time, the release manager does perform a compilation and
>>>>>>>>>> the fellow
>>>>>>>>>> reviewers check the result. There is no foolproof process
>>>>>>>>>> here, as
>>>>>>>>>> always when security is involved. Even using an automated
>>>>>>>>>> build and
>>>>>>>>>> automatic signing on an Apache server would involve trust
>>>>>>>>>> (i.e. one
>>>>>>>>>> should assume that the server has not been tampered with,
>>>>>>>>>> that the build
>>>>>>>>>> process really does what it is expected to do, that the
>>>>>>>>>> artifacts put to
>>>>>>>>>> review are really the ones created by the automatic process
>>>>>>>>>> ...).
>>>>>>>>>>
>>>>>>>>>> Another point is that what we officially release is the
>>>>>>>>>> source, which
>>>>>>>>>> can be reviewed by external users. The binary parts are
>>>>>>>>>> merely a
>>>>>>>>>> convenience.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> That's an interesting point to come back to since it looks
>>>>>>>>> like the
>>>>>>>>> most time-consuming part of a release is not related to the
>>>>>>>>> sources!
>>>>>>>>>
>>>>>>>>> Isn't it conceivable that a release could just be a commit
>>>>>>>>> identifier
>>>>>>>>> and a checksum of the repository?
>>>>>>>>>
>>>>>>>>> If the binaries are just a convenience, why put so much
>>>>>>>>> effort in it?
>>>>>>>>> As a convenience, the artefacts could be produced after the
>>>>>>>>> release,
>>>>>>>>> accompanied with all the "caveat" notes which you mentioned.
>>>>>>>>>
>>>>>>>>> That would certainly increase the release rate.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Binary releases still need to be reviewed to ensure that the
>>>>>>>> correct N
>>>>>>>> & L files are present, and that the archives don't contain
>>>>>>>> material
>>>>>>>> with disallowed licenses.
>>>>>>>>
>>>>>>>> It's not unknown for automated build processes to include
>>>>>>>> files that
>>>>>>>> should not be present.
>>>>>>>>
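Part of such a review can be sketched in Java: list the entries of an
archive, report required files that are missing, and flag entries outside an
expected layout. The entry names META-INF/LICENSE.txt and META-INF/NOTICE.txt
and the allowed prefixes are assumptions made for illustration, as are the
class and method names; a real review would also inspect the licenses of the
contents, which this does not attempt.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, for illustration only.
class ArchiveContentsCheck {

    // Assumed locations of the N & L files inside a jar.
    static final List<String> REQUIRED =
            Arrays.asList("META-INF/LICENSE.txt", "META-INF/NOTICE.txt");

    /** Names of all entries in a zip or jar archive. */
    static List<String> entryNames(String archivePath) throws IOException {
        try (ZipFile zip = new ZipFile(archivePath)) {
            return zip.stream().map(ZipEntry::getName).collect(Collectors.toList());
        }
    }

    /** Required entries that are absent from the archive. */
    static List<String> missing(Collection<String> entries, Collection<String> required) {
        List<String> result = new ArrayList<>();
        for (String name : required) {
            if (!entries.contains(name)) {
                result.add(name);
            }
        }
        return result;
    }

    /** Entries that do not start with any of the allowed prefixes. */
    static List<String> unexpected(Collection<String> entries, Collection<String> allowedPrefixes) {
        List<String> result = new ArrayList<>();
        for (String name : entries) {
            boolean allowed = false;
            for (String prefix : allowedPrefixes) {
                if (name.startsWith(prefix)) {
                    allowed = true;
                    break;
                }
            }
            if (!allowed) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: ArchiveContentsCheck <archive>");
            return;
        }
        List<String> entries = entryNames(args[0]);
        System.out.println("missing: " + missing(entries, REQUIRED));
        // The allowed prefixes below are an assumption about the jar layout.
        System.out.println("unexpected: "
                + unexpected(entries, Arrays.asList("META-INF/", "org/apache/")));
    }
}
```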
>>>>>>>
>>>>>>> I fail to see the difference of principle between the
>>>>>>> "release" context
>>>>>>> and, say, the daily snapshot context.
>>>>>>
>>>>>>
>>>>>> Snapshots are not (and should not be) promoted to the general
>>>>>> public as
>>>>>> releases of the ASF.
>>>>>>
>>>>>>> What I mean is that there seems to be a contradiction between
>>>>>>> saying that
>>>>>>> a "release" is only about _source_ and the obligation to check
>>>>>>> _binaries_.
>>>>>>
>>>>>>
>>>>>> There is no contradiction here.
>>>>>> The ASF releases source; it is required in a release.
>>>>>> Binaries are optional.
>>>>>> That does not mean that the ASF mirror system can be used to
>>>>>> distribute arbitrary binaries.
>>>>>>
>>>>>>> It can occur that disallowed material is, at some point in
>>>>>>> time, part of
>>>>>>> the repository and/or the snapshot binaries.
>>>>>>> However, what is forbidden is... forbidden, at all times.
>>>>>>
>>>>>>
>>>>>> As with most things, this is not a strict dichotomy.
>>>>>>
>>>>>>> If it is indeed a problem to distribute forbidden material,
>>>>>>> shouldn't
>>>>>>> this be corrected in the repository? [That's indeed what you
>>>>>>> did with
>>>>>>> the blocking of the release.]
>>>>>>
>>>>>>
>>>>>> If the repo is discovered to contain disallowed material, it
>>>>>> needs to
>>>>>> be removed.
>>>>>>
>>>>>>> Then again, once the repository is "clean", it can be tagged
>>>>>>> and that
>>>>>>> tagged _source_ is the release.
>>>>>>
>>>>>>
>>>>>> Not quite.
>>>>>>
>>>>>> A release is a source archive that is voted on and distributed
>>>>>> via the
>>>>>> ASF mirror system.
>>>>>> The contents must agree with the source tag, but the source tag
>>>>>> is not
>>>>>> the release.
>>>>>>
>>>>>>> Non-compliant binaries would thus only be the result of a
>>>>>>> "mistake"
>>>>>>> (if the build system is flawed, it's another problem,
>>>>>>> unrelated to
>>>>>>> the released contents, which is _source_) to be corrected per
>>>>>>> se.
>>>>>>
>>>>>>
>>>>>> Not so. There are other failure modes.
>>>>>>
>>>>>> An automated build obviously reduces the chances of mistakes,
>>>>>> but it
>>>>>> can still create an archive containing files that should not be
>>>>>> there.
>>>>>> [Or indeed, omit files that should be present.]
>>>>>> For example, the workspace contains spurious files which are
>>>>>> implicitly included by the assembly instructions.
>>>>>> Or the build process creates spurious files that are
>>>>>> incorrectly added
>>>>>> to the archive.
>>>>>> Or the build incorrectly includes jars that are supposed to be
>>>>>> provided by the end user
>>>>>> etc.
>>>>>>
>>>>>> I have seen all the above in RC votes.
>>>>>> There are probably other failure modes.
>>>>>>
>>>>>>> My proposition is that it's an independent step: once the
>>>>>>> build system is adjusted to the expectations, "correct"
>>>>>>> binaries can be
>>>>>>> generated from the same tagged release.
>>>>>>
>>>>>>
>>>>>> It does not matter when the binary is built.
>>>>>> If it is distributed by the PMC as a formal release, it must
>>>>>> not contain any surprises, e.g. it must be licensed under the
>>>>>> AL.
>>>>>>
>>>>>> It is therefore vital that the contents are as expected from
>>>>>> the build.
>>>>>>
>>>>>> Note also that a formal release becomes an act of the PMC by
>>>>>> the voting process.
>>>>>> The ASF can then assume responsibility for any legal issues
>>>>>> that may arise.
>>>>>> Otherwise it is entirely the personal responsibility of the
>>>>>> person who
>>>>>> releases it.
>>>>>
>>>>>
>>>>> I think the last two points are really important: binaries must
>>>>> be
>>>>> checked and the foundation provides a legal protection for the
>>>>> project
>>>>> if something weird occurs.
>>>>>
>>>>> I also think another point is important: many if not most users
>>>>> do
>>>>> really expect binaries and not source. From our internal Apache
>>>>> point
>>>>> of view, these are a by-product. For many others it is the
>>>>> important
>>>>> thing. It is mostly true in maven land as dependencies are
>>>>> automatically retrieved in binary form, not source form. So the
>>>>> maven
>>>>> central repository as a distribution system is important.
>>>>>
>>>>> Even if for some security reason it sounds at first thought
>>>>> logical to
>>>>> rely on source only and compile oneself, in an industrial
>>>>> context project teams do not have enough time to do it for all
>>>>> their dependencies, so they use binaries provided by trusted
>>>>> third parties. A
>>>>> long time ago, I compiled a lot of free software tools for the
>>>>> department I worked for at that time. I do not do this anymore,
>>>>> and
>>>>> trust the binaries provided by the packaging team for a
>>>>> distribution
>>>>> (typically Debian). They do rely on source and compile
>>>>> themselves. Hey,
>>>>> I even think Emmanuel here belongs to the Debian java team ;-) I
>>>>> guess
>>>>> such teams that do rely on source are rather the exception than
>>>>> the
>>>>> rule. The other examples I can think of are packaging teams,
>>>>> development teams that need bleeding edge (and will also
>>>>> directly depend on the repository, not even the release),
>>>>> projects that need to
>>>>> introduce their own patches and people who have critical needs
>>>>> (for
>>>>> example when safety of people is concerned or when they need
>>>>> full control for legal or contractual reasons). Many other
>>>>> people download
>>>>> binaries directly and would simply not consider using a project
>>>>> if it
>>>>> is not readily available: they don't have time for this and
>>>>> don't want
>>>>> to learn how to build tens or hundreds of different projects they
>>>>> simply
>>>>> use.
>>>>>
>>>>
>>>> I do not disagree with anything said on this thread. [In
>>>> particular, I
>>>> did not at all imply that any one committer could take
>>>> responsibility
>>>> for releasing unchecked items.]
>>>>
>>>> I'm simply suggesting that what is called the release
>>>> process/management
>>>> could be made simpler (and _consequently_ could lead to more
>>>> regularly
>>>> releasing the CM code), by separating the concerns.
>>>> The concerns are
>>>>  1. "code" (the contents), and
>>>>  2. "artefacts" (the result of the build system acting on the
>>>> "code").
>>>>
>>>> Checking of one of these is largely independent from checking the
>>>> other.
>>>
>>>
>>> Unfortunately, not really. One principle that we have (maybe not
>>> crystal clear in the release doco) is that when we do distribute
>>> binaries, they should really be "convenience binaries" which means
>>> that everything needed to create them is in the source or its
>>> documented dependencies.  What that means is that what we tag as
>>> the
>>> source release needs to be able to generate any binaries that we
>>> subsequently release.  The only way to really test that is to
>>> generate the binaries and inspect them as part of verifying the
>>> release.
>>
>>
>> Only way?  That's certainly not obvious to me: Since a tag/branch
>> uniquely identifies a set of files, that is, the "source release
>> [that
>> is] able to generate any binaries that we subsequently release",
>> if a
>> RM can do it at (source) release time, he (or someone else!) can
>> do it
>> later, too (by running the build from a clone of the repository in
>> its
>> tagged state).
>>
>>> As others have pointed out, anything we release has to be verified
>>> and voted on.  As RM and reviewer, I think it is actually easier
>>> to roll and verify source and binaries together.
>>
>
> +1
>
>>
>> It's precisely my main point.
>> I won't dispute that you can prefer doing both (and nobody would
>> forbid
>> a RM to do just that) but the point is about the possibility to
>> release
>> source-only code (as the first step of a two-step procedure which I
>> described earlier).
>> [IMHO, the two-step one seems easier (both for the RM and the
>> reviewer),
>> (mileage does vary).]
>
> What is easier?
> It seems to me there will be at least one other step in your
> proposed process, i.e. a second VOTE e-mail

Yes, that's obviously what I meant:
Two steps == two votes

[But: source releases need not necessarily be accompanied with
"binaries", which, I imagine, could lead to official releases
occurring more often (due to the reduced number of checks).]

> These will both contain most of the same information.

No.
The first step is about the source, i.e. the code which humans create.
The second step is about the files which a build system creates.

As I indicated previously, the first vote will be about a set of
reviewers being satisfied with the state of the source code, while
the second vote will be about another set of reviewers being satisfied
with the results of the build system ("no glitch", as you described
in an earlier message).

> Is the intention to announce the source release separately from the
> binary release?
> If so, there will need to be 2 announce mails, and 2 updates to the
> download page.

Is there a problem with that?
There are actually several possible cases (depending on the will of
the RM):
  * one-step release (only source code)
  * two-steps (source, then binaries based on that source)
  * combined (as is done up to now)
  * binaries (based on any previously released source)

>> In short is it forbidden (by the official/legal rules of ASF) to
>> proceed
>> as I propose?
>
> Dunno, depends on what exactly you are proposing.

Cf. above (and previous mails).

In practice the release could (IIUC) be like the link provided
by Luc in RC1 of CM 3.4 (whose target was a TAR of the tagged
repository).


>> Is it impossible technically?
>
> Currently the Maven build process creates:
> - Maven source and binary jars
> - ASF source and binary bundles

AFAIU, the JARs (source and binary) are "binaries", the binary
bundles are "binaries". Only the ASF source is "source".

> It's not clear to me what exactly you propose to release in stage
> one,

The ASF source (e.g. in the form of a tarball, or the appropriate
"git clone" command).

> but there will need to be some changes to the process in order to
> release just the ASF source.

I don't see which.
A "source RM" would just stop the process after resolving/postponing
the pending issues, and checking the various reports about the source
code. [Then create the tag, and request a vote.]

A "binary RM" would take on from that point (a tagged repository),
i.e. create all the binaries, sign them, etc.

> There is no point releasing the Maven source jars separately from
> the binary jars; they are not complete as they only contain java
> files for
> use with IDEs.

I don't understand that.
In principle, a JAR with the Java sources is indeed the necessary and
sufficient condition for users to create the executable bytecode, with
whatever build system they wish.
But I agree that it would not be useful to leave out the files needed
to easily run maven. [And, for convenience, a source release would be
accompanied with instructions on how to build a JAR of the compiled
classes, using maven.]

> But in any case, AFAIK it is very tricky to release new files into
> an existing Maven folder, and it may cause problems for end users.

I don't understand what you mean by "release new files into an
existing Maven folder"...

Gilles

>>
>>
>>> Phil
>>>
>>>
>>>> [The more so that, as you said, no fool-proof link between the
>>>> two can
>>>> be ensured: From a security POV, checking the former requires a
>>>> code
>>>> review, while using the latter requires trust in the build
>>>> system.]
>>>>
>>>> Thus we could release the "code", after checking and voting on
>>>> the concerned elements (i.e. the repository state corresponding
>>>> to a specific tag + the web site).
>>>>
>>>> Then we could release the "binaries", as a convenience, after
>>>> checking
>>>> and voting on the concerned elements (i.e. the files about to be
>>>> distributed).
>>>>
>>>> I think that it's an added flexibility that would, for example,
>>>> allow
>>>> the tagging of the repository without necessarily releasing
>>>> binaries (i.e.
>>>> not involving that part of the work); and to release binaries
>>>> (say, at
>>>> regular intervals) based on the latest tagged code (i.e. not
>>>> involving
>>>> the work about solving/evaluating/postponing issues).
>>>>
>>>> [I completely admit that, at first, it might look a little more
>>>> confusing for the plain user, but (IIUC) it would be a better
>>>> representation of the reality covered by stating that the ASF
>>>> releases source code.]
>>>>
>>>>
>>>> Best regards,
>>>> Gilles

