Re: [Rpm-maint] [PATCH 2/2] Add RPMTAG_IDENTITY

2018-03-14 Thread Vladimir D. Seleznev
On Wed, Mar 14, 2018 at 09:44:41PM +0300, Ivan Zakharyaschev wrote:
> Hello!
> 
> On Wed, 14 Mar 2018, Vladimir D. Seleznev wrote:
> 
> > On Wed, Mar 14, 2018 at 10:20:58AM -0400, Jeff Johnson wrote:
> 
> >> I'd also suggest a more specific name than IDENTITY because there are
> >> many definitions of reproducibility, as well as alternative schemes
> >> like building, and there are surely going to be multiple attempts to
> >> Get It Right! that make IDENTITY a misnomer.
> >
> > This tag is not only about reproducibility, it supposes to represent
> > package build result identity, so we can differentiate one build from
> > another from one source package (with same NEVR) if there are
> > significant difference, such as different rundeps, generated binary
> 
> What does "rundeps" mean?

It means runtime dependencies.

> > files, different filelist that packaged to the files, etc. One benefit
> > of this is reproducible build proof, second one is that we can use value
> > from this tag to generate more strict intersubpackage deps, so we can
> > use these for binary package rebuild without release upping (in case new
> > SONAME buildreq or new compiler).
> >
> > We couldn't think of better tag name (you can propose a better name).

-- 
   With best regards,
   Vladimir D. Seleznev
___
Rpm-maint mailing list
Rpm-maint@lists.rpm.org
http://lists.rpm.org/mailman/listinfo/rpm-maint


Re: [Rpm-maint] [PATCH 2/2] Add RPMTAG_IDENTITY

2018-03-14 Thread Ivan Zakharyaschev

Hello!

On Wed, 14 Mar 2018, Vladimir D. Seleznev wrote:

> On Wed, Mar 14, 2018 at 10:20:58AM -0400, Jeff Johnson wrote:

>> I'd also suggest a more specific name than IDENTITY because there are
>> many definitions of reproducibility, as well as alternative schemes
>> like building, and there are surely going to be multiple attempts to
>> Get It Right! that make IDENTITY a misnomer.
>
> This tag is not only about reproducibility, it supposes to represent
> package build result identity, so we can differentiate one build from
> another from one source package (with same NEVR) if there are
> significant difference, such as different rundeps, generated binary

What does "rundeps" mean?

> files, different filelist that packaged to the files, etc. One benefit
> of this is reproducible build proof, second one is that we can use value
> from this tag to generate more strict intersubpackage deps, so we can
> use these for binary package rebuild without release upping (in case new
> SONAME buildreq or new compiler).
>
> We couldn't think of better tag name (you can propose a better name).



--
Best regards,
Ivan


Re: [Rpm-maint] [PATCH 2/2] Add RPMTAG_IDENTITY

2018-03-14 Thread Vladimir D. Seleznev
On Wed, Mar 14, 2018 at 10:20:58AM -0400, Jeff Johnson wrote:
> 
> Afaict, RPMTAG_IDENTITY is an attempt at a reproducible invariant of a
> package header through rebuilding, which is poisoned by a
> RPMTAG_BUILDTIME tag (and likely file stat(2) info) being included in
> the header SHA1 (or SHA256) plaintext.

So there is a blacklist filter for tags during identity calculation.
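
To make the blacklist idea concrete, here is a minimal sketch, not the actual
patch. It assumes a simplified header model (`struct entry`, hypothetical), a
blacklist holding only RPMTAG_BUILDTIME and RPMTAG_BUILDHOST (an assumption;
the real list is not settled in this thread), and FNV-1a as a short stand-in
for the SHA1/SHA256 digest a real implementation would more likely use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified header entry: tag number plus string value. */
struct entry {
    int32_t tag;
    const char *value;
};

/* Tag numbers taken from lib/rpmtag.h. */
enum {
    TAG_NAME      = 1000,
    TAG_VERSION   = 1001,
    TAG_BUILDTIME = 1006,
    TAG_BUILDHOST = 1007
};

/* Assumed blacklist: tags that vary between otherwise identical builds. */
static int is_blacklisted(int32_t tag)
{
    static const int32_t blacklist[] = { TAG_BUILDTIME, TAG_BUILDHOST };
    for (size_t i = 0; i < sizeof(blacklist) / sizeof(blacklist[0]); i++)
        if (blacklist[i] == tag)
            return 1;
    return 0;
}

/* FNV-1a (64-bit), a stand-in for SHA1/SHA256 to keep the sketch short. */
static uint64_t fnv1a(uint64_t hash, const unsigned char *p, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        hash ^= p[i];
        hash *= 0x100000001b3ULL;
    }
    return hash;
}

/* Digest over all entries whose tag is not blacklisted. */
uint64_t identity_digest(const struct entry *entries, size_t n)
{
    uint64_t hash = 0xcbf29ce484222325ULL;

    assert(entries != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        uint32_t t = (uint32_t)entries[i].tag;
        /* Serialize the tag big-endian so the result is host-independent. */
        unsigned char tagbytes[4] = {
            (unsigned char)(t >> 24), (unsigned char)(t >> 16),
            (unsigned char)(t >> 8),  (unsigned char)t
        };

        if (is_blacklisted(entries[i].tag))
            continue;
        hash = fnv1a(hash, tagbytes, sizeof(tagbytes));
        /* Include the terminating NUL so adjacent values cannot run together. */
        hash = fnv1a(hash, (const unsigned char *)entries[i].value,
                     strlen(entries[i].value) + 1);
    }
    return hash;
}
```

With this model, two entry lists that differ only in their build time produce
the same digest, while a change in any non-blacklisted tag changes it.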

> Note also changes in current rpm to pass in a BUILDTIME to preserve
> reproducibility.
> 
> There are huge legacy compatibility problems committing to a
> precomputed static value in a header: consider what happens if/when
> the plaintext definition needs to change.

Sorry, I don't understand this part: what precomputed static value do
you mean?

> I'd suggest using a header tag extension rather than a retrieved value
> so that the plaintext definition can be more easily managed.
> 
> I'd also suggest a more specific name than IDENTITY because there are
> many definitions of reproducibility, as well as alternative schemes
> like building, and there are surely going to be multiple attempts to
> Get It Right! that make IDENTITY a misnomer.

This tag is not only about reproducibility. It is supposed to represent
the identity of a package build result, so that we can tell apart two
builds of the same source package (with the same NEVR) when they differ
significantly: different rundeps, different generated binary files, a
different list of packaged files, etc. One benefit is proof of a
reproducible build. A second is that we can use the value of this tag to
generate stricter inter-subpackage dependencies, which lets us rebuild
binary packages without bumping the release (for example, after a new
SONAME in a build requirement or a new compiler).

We couldn't think of a better tag name (you are welcome to propose one).

-- 
   With best regards,
   Vladimir D. Seleznev


Re: [Rpm-maint] [PATCH 2/2] Add RPMTAG_IDENTITY

2018-03-14 Thread Vladimir D. Seleznev
On Wed, Mar 14, 2018 at 01:45:31PM +0100, Florian Festi wrote:
> On 03/12/2018 10:04 PM, vsele...@altlinux.org wrote:
> > From: "Vladimir D. Seleznev" 
> > 
> > This tag represents binary package build characteristic: if two binary
> > packages have equal RPMTAG_IDENTITY values, it means that these packages
> > have no significant differences.
> > 
> > One of the applications of RPMTAG_IDENTITY is reproducible build
> > verification.
> > 
> > This tag is reserved for ALT Linux Team and marked as unimplemented.
> 
> I really like this idea.
> 
> It still needs some thought on how to actually do that properly and what
> to put into the tag?
>  * URL to the distgit commit
>  * buildsystem name + hash
>  * different thing for each distro?

I think the tag should contain just a specific hash sum of the
significant part of the package header (see the introductory message),
without any additional information.  For other things there are other
tags, such as RPMTAG_BUILDHOST, RPMTAG_DISTRIBUTION, etc.

I'll try to send draft patches for its calculation as soon as possible,
but I'm not sure when.

> But it is fine to figure this out on the go. In the end I would like RPM
> upstream to at least be able give some guide lines on how to use it.
> 
> We also probably need a way to set the tag via cli param or environment
> variable so it can be set automatically by the build systems. This is
> how this should be used IMHO.

I don't understand why it should be set. It is supposed to be a summary
representation of the build result.

-- 
   With best regards,
   Vladimir D. Seleznev


Re: [Rpm-maint] [PATCH 2/2] Add RPMTAG_IDENTITY

2018-03-14 Thread Jeff Johnson

Afaict, RPMTAG_IDENTITY is an attempt at a reproducible invariant of a package 
header through rebuilding, which is poisoned by a RPMTAG_BUILDTIME tag (and 
likely file stat(2) info) being included in the header SHA1 (or SHA256) 
plaintext.

Note also changes in current rpm to pass in a BUILDTIME to preserve 
reproducibility.
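
As an illustration of passing in a build time, here is a small sketch. It
assumes the changes referred to follow the SOURCE_DATE_EPOCH convention from
the reproducible-builds project; the function name `pick_buildtime` is
hypothetical and is not rpm's own code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/*
 * Return the build time to record: the externally supplied epoch value if
 * it parses as a non-negative decimal integer, otherwise the current time
 * passed in as `now`.
 */
time_t pick_buildtime(const char *source_date_epoch, time_t now)
{
    char *end = NULL;
    long long value;

    if (source_date_epoch == NULL || source_date_epoch[0] == '\0')
        return now;

    errno = 0;
    value = strtoll(source_date_epoch, &end, 10);
    /* Reject overflow, trailing garbage and negative timestamps. */
    if (errno != 0 || *end != '\0' || value < 0)
        return now;

    return (time_t)value;
}
```

A caller would typically invoke it as
`pick_buildtime(getenv("SOURCE_DATE_EPOCH"), time(NULL))`, so that rebuilding
with the same environment value records the same build time.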

There are huge legacy compatibility problems committing to a precomputed static 
value in a header: consider what happens if/when the plaintext definition needs 
to change.

I'd suggest using a header tag extension rather than a retrieved value so that 
the plaintext definition can be more easily managed.

I'd also suggest a more specific name than IDENTITY because there are many 
definitions of reproducibility, as well as alternative schemes like building, 
and there are surely going to be multiple attempts to Get It Right! that make 
IDENTITY a misnomer.

hth

73 de Jeff


Re: [Rpm-maint] [PATCH 2/2] Add RPMTAG_IDENTITY

2018-03-14 Thread Florian Festi
On 03/12/2018 10:04 PM, vsele...@altlinux.org wrote:
> From: "Vladimir D. Seleznev" 
> 
> This tag represents binary package build characteristic: if two binary
> packages have equal RPMTAG_IDENTITY values, it means that these packages
> have no significant differences.
> 
> One of the applications of RPMTAG_IDENTITY is reproducible build
> verification.
> 
> This tag is reserved for ALT Linux Team and marked as unimplemented.

I really like this idea.

It still needs some thought on how to actually do that properly and what
to put into the tag:
 * URL to the distgit commit
 * buildsystem name + hash
 * different thing for each distro?

But it is fine to figure this out on the go. In the end I would like RPM
upstream to at least be able to give some guidelines on how to use it.

We also probably need a way to set the tag via cli param or environment
variable so it can be set automatically by the build systems. This is
how this should be used IMHO.
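
One way the environment-variable route could look is sketched below. This is
only an illustration: the variable name `RPM_IDENTITY_OVERRIDE` and both
function names are hypothetical and do not exist in rpm.

```c
#include <stddef.h>
#include <stdlib.h>

/* Prefer an explicitly supplied value; otherwise keep the default one. */
const char *choose_identity(const char *override, const char *fallback)
{
    return (override != NULL && override[0] != '\0') ? override : fallback;
}

/* RPM_IDENTITY_OVERRIDE is a hypothetical variable name, not an rpm feature. */
const char *identity_for_build(const char *fallback)
{
    return choose_identity(getenv("RPM_IDENTITY_OVERRIDE"), fallback);
}
```

A build system would export the variable before starting the build, and the
packaging step would call `identity_for_build()` with whatever default value
it would otherwise store in the tag.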

Florian

-- 

Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Paul Argiry, Charles Cachera, Michael Cunningham,
Michael O'Neill


[Rpm-maint] [PATCH 2/2] Add RPMTAG_IDENTITY

2018-03-12 Thread vseleznv
From: "Vladimir D. Seleznev" 

This tag represents a characteristic of a binary package build: if two
binary packages have equal RPMTAG_IDENTITY values, the packages have no
significant differences.

One of the applications of RPMTAG_IDENTITY is reproducible build
verification.

This tag is reserved for ALT Linux Team and marked as unimplemented.

Signed-off-by: Vladimir D. Seleznev 
---
 lib/rpmtag.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/rpmtag.h b/lib/rpmtag.h
index 395ea8e..973a6b6 100644
--- a/lib/rpmtag.h
+++ b/lib/rpmtag.h
@@ -371,6 +371,7 @@ typedef enum rpmTag_e {
     RPMTAG_PAYLOADDIGEST       = 5092, /* s[] */
     RPMTAG_PAYLOADDIGESTALGO   = 5093, /* i */
     RPMTAG_AUTOINSTALLED       = 5094, /* i reservation (unimplemented) */
+    RPMTAG_IDENTITY            = 5095, /* s reservation (unimplemented) */
 
     RPMTAG_FIRSTFREE_TAG       /*!< internal */
 } rpmTag;
-- 
2.10.4
