Re: [gentoo-portage-dev] [PATCH 1/3] Have repoman check if the packages to unpack rare archive formats from SRC_URI are present in DEPEND (bug #205909).

2014-01-17 Thread Sebastian Luther
Am 16.01.2014 01:07, schrieb Tom Wijsman:

 +
 +archive_formats_eapi_3_to_5 = {
 + \.tar.xz:app-arch/tar,
 + \.xz:app-arch/xz-utils,

Shouldn't .tar.xz require both tar and xz-utils? Other entries may
have the same problem.




Re: [gentoo-portage-dev] [PATCH 1/3] Have repoman check if the packages to unpack rare archive formats from SRC_URI are present in DEPEND (bug #205909).

2014-01-17 Thread Sebastian Luther
Am 16.01.2014 22:40, schrieb Tom Wijsman:
 On Thu, 16 Jan 2014 08:03:03 +0100 Sebastian Luther
 sebastianlut...@gmx.de wrote:
 
 Am 16.01.2014 01:07, schrieb Tom Wijsman:
 --- bin/repoman   | 53 
 +
 man/repoman.1 |  4  2 files changed, 57 insertions(+)
 
 diff --git a/bin/repoman b/bin/repoman index d1542e9..9b703dc
 100755 --- a/bin/repoman +++ b/bin/repoman @@ -36,6 +36,9 @@
 pym_path = 
 osp.join(osp.dirname(osp.dirname(osp.realpath(__file__))),
 pym) sys.path.insert(0, pym_path) import portage 
 portage._internal_caller = True + +from portage._sets.profiles
 import PackagesSystemSet +system_set_atoms = 
 PackagesSystemSet(portage.settings.profiles).getAtoms() 
 portage._disable_legacy_globals()
 
 You should be using repoman_settings instead of
 portage.settings.
 
 If I understand correctly, that is this URL?
 
 http://dev.gentoo.org/~zmedico/portage/doc/api/portage.repository.config-module.html

  How do I get the @system set out of that?

portage.settings and repoman_settings are instances of
portage.package.ebuild.config.config. I just looked at the code and
think now that you should keep using PackagesSystemSet to get the
@system atoms. Just turn them into a set afterwards with set(atom.cp
for atom in system_set_atoms).

The reason to use atom.cp is that =cat/foo-1 could be part of that
set and then 'cat/foo in system_set_atoms' would return False.

 
 Considering the later use
 
 Which use?

The if entry not in system_set_atoms line. You're using __contains__
there (with the 'in'). You don't use the additional magic provided by
PackageSet (which is a super class of PackagesSystemSet).

 
 you don't need PackagesSystemSet set here, just use a set.
 
 Okay, thus I need to create some kind of set object here (I don't
 see one in the list of
 http://dev.gentoo.org/~zmedico/portage/doc/api/ though) and then
 specify that it would be the @system set? Which class?
 

Whenever I say 'set' I mean python's builtin set.

 And use atom.cp instead of the atoms.
 
 So, if I understood correctly; using list comprehension, I
 directly transform the getAtoms() to a list of atom.cp's... Okay,
 good idea.
 
 try: @@ -300,6 +303,7 @@ qahelp = { inherit.missing: Ebuild
 uses functions from an eclass but does not inherit it,
 inherit.unused: Ebuild inherits an eclass but does not use
 it, java.eclassesnotused: With virtual/jdk in DEPEND you
 must inherit a java eclass, +  unpack.DEPEND.missing: A rare
 archive format was used in SRC_URI, but its package to unpack
 it is missing in DEPEND., wxwidgets.eclassnotused: Ebuild
 DEPENDs on x11-libs/wxGTK without inheriting wxwidgets.eclass,
 KEYWORDS.dropped: Ebuilds that appear to have dropped
 KEYWORDS for some arch, KEYWORDS.missing: Ebuilds that have
 a missing or empty KEYWORDS variable, @@ -399,6 +403,7 @@
 qawarnings = set(( metadata.warning, portage.internal, 
 repo.eapi.deprecated, +unpack.DEPEND.missing, 
 usage.obsolete, upstream.workaround, LIVEVCS.stable, @@
 -479,6 +484,25 @@ ruby_deprecated = frozenset([ 
 ruby_targets_ree18, ])
 
 +# TODO: Add functionality to support checking for deb2targz
 on platforms where +#   GNU binutils is absent; see PMS 5,
 section 11.3.3.13. +archive_formats = { +
 \.7[zZ]:app-arch/p7zip, +
 \.(bz2?|tbz2):app-arch/bzip2, + \.jar:app-arch/unzip, +
 \.(LH[aA]|lha|lzh):app-arch/lha, +
 \.lzma:app-arch/lzma-utils, +
 \.(rar|RAR):app-arch/unrar, +
 \.(tar(\.(bz2?|gz|Z))?|tbz2|t[bg]z)?:app-arch/tar, +
 \.(gz|tar\.Z|t[bg]z|[zZ]):app-arch/gzip, +
 \.(zip|ZIP):app-arch/unzip, +} + 
 +archive_formats_eapi_3_to_5 = { +  \.tar.xz:app-arch/tar, +
 \.xz:app-arch/xz-utils, +} + metadata_xml_encoding =
 'UTF-8' metadata_xml_declaration = '?xml version=1.0
 encoding=%s?' % \ (metadata_xml_encoding,) @@ -1559,6
 +1583,7 @@ for x in effective_scanlist: fetchlist_dict =
 portage.FetchlistDict(checkdir, repoman_settings, portdb)
 myfiles_all = [] src_uri_error = False +needed_unpack_depends
 = {} for mykey in fetchlist_dict: try: 
 myfiles_all.extend(fetchlist_dict[mykey]) @@ -1573,7 +1598,22
 @@ for x in effective_scanlist: stats[SRC_URI.syntax] += 1 
 fails[SRC_URI.syntax].append( %s.ebuild SRC_URI: %s % 
 (mykey, e)) + + # Compare each SRC_URI entry against 
 archive_formats; if one of the +# extensions match, we
 remember which archive depends are needed to +  # check them
 later on. + needed_unpack_depends[mykey] = [] + for
 file_extension in archive_formats or \ +
 ((re.match('[345]$',
 eapi) is not None) \
 
 Use portage.eapi for the line above.
 
 Why? 'eapi' is the EAPI of the ebuild, what is wrong with that?

What I want you to do is to change (re.match('[345]$', eapi) into
something like: eapi_has_xz_unpack(eapi). the function
eapi_has_xz_unpack needs to be written. It should be part of portage.eapi.

 
 You may have to add a new function to portage.eapi.
 
 What would the purpose of that function 

Re: [gentoo-portage-dev] [PATCH 1/3] Have repoman check if the packages to unpack rare archive formats from SRC_URI are present in DEPEND (bug #205909).

2014-01-17 Thread Tom Wijsman
On Fri, 17 Jan 2014 09:35:58 +0100
Sebastian Luther sebastianlut...@gmx.de wrote:

 The if entry not in system_set_atoms line. You're using __contains__
 there (with the 'in'). You don't use the additional magic provided by
 PackageSet (which is a super class of PackagesSystemSet).

This is __contains__, which does what I want as far as I can see:
http://dev.gentoo.org/~zmedico/portage/doc/api/portage._sets.base-pysrc.html#PackageSet.__contains__
What is there wrong with this?

As for the additional magic, do you mean containsCPV? Looking at it:
http://dev.gentoo.org/~zmedico/portage/doc/api/portage._sets.base-pysrc.html#PackageSet.containsCPV
It seems more complex than is necessary, is there a benefit to this?

Sorry, I don't understand what is meant here; all the rest of your
feedback is clear and has been implemented and is in the v2 I plan to
send to the mailing list soon.

-- 
With kind regards,

Tom Wijsman (TomWij)
Gentoo Developer

E-mail address  : tom...@gentoo.org
GPG Public Key  : 6D34E57D
GPG Fingerprint : C165 AF18 AB4C 400B C3D2  ABF0 95B2 1FCD 6D34 E57D


signature.asc
Description: PGP signature


Re: [gentoo-portage-dev] [PATCH 1/3] Have repoman check if the packages to unpack rare archive formats from SRC_URI are present in DEPEND (bug #205909).

2014-01-17 Thread Sebastian Luther
Am 18.01.2014 00:00, schrieb Tom Wijsman:
 On Fri, 17 Jan 2014 09:35:58 +0100
 Sebastian Luther sebastianlut...@gmx.de wrote:
 
 The if entry not in system_set_atoms line. You're using __contains__
 there (with the 'in'). You don't use the additional magic provided by
 PackageSet (which is a super class of PackagesSystemSet).
 
 This is __contains__, which does what I want as far as I can see:
 http://dev.gentoo.org/~zmedico/portage/doc/api/portage._sets.base-pysrc.html#PackageSet.__contains__
 What is there wrong with this?

There's nothing wrong with that. It's just that the builtin set type
supports that too.

 
 As for the additional magic, do you mean containsCPV? Looking at it:
 http://dev.gentoo.org/~zmedico/portage/doc/api/portage._sets.base-pysrc.html#PackageSet.containsCPV
 It seems more complex than is necessary, is there a benefit to this?

I thought more about iterAtomsForPackage, which is used in lots of
places in the resolver.

The comment was about the fact that you're using a more complicated
selfmade class instead of a builtin type, even if the builtin type would
suffice. The comment made more sense when I was still suggesting to not
use PackageSystemSet at all.



Re: [gentoo-portage-dev] [PATCH 1/3] Have repoman check if the packages to unpack rare archive formats from SRC_URI are present in DEPEND (bug #205909).

2014-01-16 Thread Tom Wijsman
On Wed, 15 Jan 2014 17:44:15 -0800
Alec Warner anta...@gentoo.org wrote:

 On Wed, Jan 15, 2014 at 4:07 PM, Tom Wijsman tom...@gentoo.org
 wrote:
 
  ---
   bin/repoman   | 53
  + man/repoman.1
  |  4  2 files changed, 57 insertions(+)
 
 
 I urge you to not author new checks like this. /usr/bin/repoman is
 already a mess.
 
 Write these checks as functions, put them in a different file, and
 just call them from the giant messy loop. At least that way the
 checks are self contained (great for avoiding things like variable
 re-use or shadowing).

My plan is to first work a bit on repoman to get to know it, then when
knowing better where everything is work on refactoring it. If I start
to refactor right away, or start adding random files without knowing
where everything is; I would create a more broken design than it is.

Don't worry, I plan on a fully refactored Portage as well as good
practices when adding new checks; which will definitely become
necessary as we go forward. There's ~80 bugs, imagine all the code that
that would bring; definitely don't want those files to grow much longer.

-- 
With kind regards,

Tom Wijsman (TomWij)
Gentoo Developer

E-mail address  : tom...@gentoo.org
GPG Public Key  : 6D34E57D
GPG Fingerprint : C165 AF18 AB4C 400B C3D2  ABF0 95B2 1FCD 6D34 E57D


signature.asc
Description: PGP signature


Re: [gentoo-portage-dev] [PATCH 1/3] Have repoman check if the packages to unpack rare archive formats from SRC_URI are present in DEPEND (bug #205909).

2014-01-16 Thread Alexander Berntsen
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

On 16/01/14 22:18, Tom Wijsman wrote:
 My plan is to first work a bit on repoman to get to know it, then 
 when knowing better where everything is work on refactoring it.
That, along with I'll use this ugly short cut, but only this one
time!, is what they all say. Plans change. Things happen.

 If I start to refactor right away, or start adding random files 
 without knowing where everything is; I would create a more broken 
 design than it is.
Yes. Please think before you start hammering your keyboard. The keyboard
is just a means to an end.

 Don't worry, I plan on a fully refactored Portage as well as good 
 practices when adding new checks; which will definitely become 
 necessary as we go forward.
We have to worry. Because, as mentioned, plans change; things happen.


Please do not push this patch. And please do not carry on with this
mentality of yes, I know what I am doing is wrong, but I'll fix it
eventually, some day, I promise.
- -- 
Alexander
alexan...@plaimi.net
http://plaimi.net/~alexander
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.22 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iF4EAREIAAYFAlLYTd4ACgkQRtClrXBQc7VNmQD+KhIxhOAC4/IPB4hn+ugL1NIH
zNtPtTObzBMIivybstQBAISvE55N7Gb7hh6os+pZcvVilVqhhQP/V3b1RIk9tC2j
=0lAa
-END PGP SIGNATURE-



Re: [gentoo-portage-dev] [PATCH 1/3] Have repoman check if the packages to unpack rare archive formats from SRC_URI are present in DEPEND (bug #205909).

2014-01-16 Thread Tom Wijsman
On Thu, 16 Jan 2014 08:03:03 +0100
Sebastian Luther sebastianlut...@gmx.de wrote:

 Am 16.01.2014 01:07, schrieb Tom Wijsman:
  ---
   bin/repoman   | 53
  + man/repoman.1
  |  4  2 files changed, 57 insertions(+)
  
  diff --git a/bin/repoman b/bin/repoman
  index d1542e9..9b703dc 100755
  --- a/bin/repoman
  +++ b/bin/repoman
  @@ -36,6 +36,9 @@ pym_path =
  osp.join(osp.dirname(osp.dirname(osp.realpath(__file__))), pym)
  sys.path.insert(0, pym_path) import portage
   portage._internal_caller = True
  +
  +from portage._sets.profiles import PackagesSystemSet
  +system_set_atoms =
  PackagesSystemSet(portage.settings.profiles).getAtoms()
  portage._disable_legacy_globals() 
 
 You should be using repoman_settings instead of portage.settings.

If I understand correctly, that is this URL?

http://dev.gentoo.org/~zmedico/portage/doc/api/portage.repository.config-module.html

How do I get the @system set out of that?

 Considering the later use

Which use?

 you don't need PackagesSystemSet set here,
 just use a set.

Okay, thus I need to create some kind of set object here (I don't see
one in the list of http://dev.gentoo.org/~zmedico/portage/doc/api/
though) and then specify that it would be the @system set? Which class?

 And use atom.cp instead of the atoms.

So, if I understood correctly; using list comprehension, I directly
transform the getAtoms() to a list of atom.cp's... Okay, good idea.

   try:
  @@ -300,6 +303,7 @@ qahelp = {
  inherit.missing: Ebuild uses functions from an eclass
  but does not inherit it, inherit.unused: Ebuild inherits an
  eclass but does not use it, java.eclassesnotused: With
  virtual/jdk in DEPEND you must inherit a java eclass,
  +   unpack.DEPEND.missing: A rare archive format was used
  in SRC_URI, but its package to unpack it is missing in DEPEND.,
  wxwidgets.eclassnotused: Ebuild DEPENDs on x11-libs/wxGTK
  without inheriting wxwidgets.eclass, KEYWORDS.dropped: Ebuilds
  that appear to have dropped KEYWORDS for some arch,
  KEYWORDS.missing: Ebuilds that have a missing or empty KEYWORDS
  variable, @@ -399,6 +403,7 @@ qawarnings =
  set(( metadata.warning, portage.internal,
  repo.eapi.deprecated, +unpack.DEPEND.missing,
   usage.obsolete,
   upstream.workaround,
   LIVEVCS.stable,
  @@ -479,6 +484,25 @@ ruby_deprecated = frozenset([
  ruby_targets_ree18,
   ])
   
  +# TODO: Add functionality to support checking for deb2targz on
  platforms where +#   GNU binutils is absent; see PMS 5, section
  11.3.3.13. +archive_formats = {
  +   \.7[zZ]:app-arch/p7zip,
  +   \.(bz2?|tbz2):app-arch/bzip2,
  +   \.jar:app-arch/unzip,
  +   \.(LH[aA]|lha|lzh):app-arch/lha,
  +   \.lzma:app-arch/lzma-utils,
  +   \.(rar|RAR):app-arch/unrar,
  +   \.(tar(\.(bz2?|gz|Z))?|tbz2|t[bg]z)?:app-arch/tar,
  +   \.(gz|tar\.Z|t[bg]z|[zZ]):app-arch/gzip,
  +   \.(zip|ZIP):app-arch/unzip,
  +}
  +
  +archive_formats_eapi_3_to_5 = {
  +   \.tar.xz:app-arch/tar,
  +   \.xz:app-arch/xz-utils,
  +}
  +
   metadata_xml_encoding = 'UTF-8'
   metadata_xml_declaration = '?xml version=1.0 encoding=%s?' %
  \ (metadata_xml_encoding,)
  @@ -1559,6 +1583,7 @@ for x in effective_scanlist:
  fetchlist_dict = portage.FetchlistDict(checkdir,
  repoman_settings, portdb) myfiles_all = []
  src_uri_error = False
  +   needed_unpack_depends = {}
  for mykey in fetchlist_dict:
  try:
  myfiles_all.extend(fetchlist_dict[mykey])
  @@ -1573,7 +1598,22 @@ for x in effective_scanlist:
  stats[SRC_URI.syntax] += 1
  fails[SRC_URI.syntax].append(
  %s.ebuild SRC_URI: %s %
  (mykey, e)) +
  +   # Compare each SRC_URI entry against
  archive_formats; if one of the
  +   # extensions match, we remember which archive
  depends are needed to
  +   # check them later on.
  +   needed_unpack_depends[mykey] = []
  +   for file_extension in archive_formats or \
  +   ((re.match('[345]$', eapi) is not None) \
 
 Use portage.eapi for the line above.

Why? 'eapi' is the EAPI of the ebuild, what is wrong with that?

 You may have to add a new function to portage.eapi.

What would the purpose of that function be?

  +   and file_extension in
  archive_formats_eapi_3_to_5):
  +   for entry in fetchlist_dict[mykey]:
  +   if re.match('.*%s$' %
  file_extension, entry) is not None:
  +   format =
  archive_formats[file_extension]
 
 As these regex are used frequently, they should be compiled using
 re.compile.

I know, but it contains %s; but, I'll look if I can make a list of
regex, one for each file extension. Or rather, I'll first try to instead
match the last characters of the string using a substring without
having to create a regex at all, which should be even faster.

  +

Re: [gentoo-portage-dev] [PATCH 1/3] Have repoman check if the packages to unpack rare archive formats from SRC_URI are present in DEPEND (bug #205909).

2014-01-16 Thread Alexander Berntsen
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Your ill-placed attempts at being clever are missing the point.
Portage is a mess. We don't need it to become more messy to the point
of maintainability.

Yes, no one fixing bugs (because they are all designing a grand
redesign of Portage) would be bad. However, this is not likely to
happen. It hasn't happened before. And now we have a bunch of great
new volunteers.

Sebastian even *told* you specifically how to do this properly. So
please do.

- -- 
Alexander
alexan...@plaimi.net
http://plaimi.net/~alexander
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.22 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iF4EAREIAAYFAlLYW7QACgkQRtClrXBQc7WgOwEAqOuBXIMZyWbXRyY9VQEbwDCg
tXmXa266YsSYgqRNlDMBAJTq9pyOvIhqhRPjLTrlA3EodPUSiDY5wRlMjSZ08xeE
=WljA
-END PGP SIGNATURE-



Re: [gentoo-portage-dev] [PATCH 1/3] Have repoman check if the packages to unpack rare archive formats from SRC_URI are present in DEPEND (bug #205909).

2014-01-16 Thread Tom Wijsman
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Thu, 16 Jan 2014 23:22:44 +0100
Alexander Berntsen alexan...@plaimi.net wrote:

 Your ill-placed attempts at being clever are missing the point.

Why are they missing the point?

 Portage is a mess. We don't need it to become more messy to the point
 of maintainability.

How do we plan to fix that?

 Yes, no one fixing bugs (because they are all designing a grand
 redesign of Portage) would be bad.

Do you agree?

 However, this is not likely to happen.

Why do you think so?

 It hasn't happened before.

It could happen in the future.

 And now we have a bunch of great new volunteers.

Yes, we do.

 Sebastian even *told* you specifically how to do this properly.

Where did he?

 So please do.

Why?

- -- 
With kind regards,

Tom Wijsman (TomWij)
Gentoo Developer

E-mail address  : tom...@gentoo.org
GPG Public Key  : 6D34E57D
GPG Fingerprint : C165 AF18 AB4C 400B C3D2  ABF0 95B2 1FCD 6D34 E57D
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.22 (GNU/Linux)

iQEcBAEBAgAGBQJS2GT9AAoJEJWyH81tNOV95ukH+wZ0yB4KOgfOd6z90cpYC0Ec
4RLK8HbKVYIIytxnnhR5Ny/BR5/ANlOYQDIFUytkJyKNmVPx3nP6kY+wHD5qOYkF
3DlJpxRy6wx/E43+wpvVv+dSNDHxkvyy8OeZ5QuAcFi1oYaeYdctfOU4/URihGzO
MjA5h0ydQ8CcHoTMkJsFjS7wL3HdSy1m1SRh9kiOuOr9hz4HqOImjJQ1/6Yb38uS
MVewW2oMEnEB99UANB5CswZHvvHcxU6d4hSmKlL1DfEuEq8hodM552n+WqpTgRhW
YzLmhUFqe0fkS9p4wAa5tPDB8O34P+nEvSAwtVqDFwqJmX0+H+r2INp7LJmDARo=
=FaJT
-END PGP SIGNATURE-


Re: [gentoo-portage-dev] [PATCH 1/3] Have repoman check if the packages to unpack rare archive formats from SRC_URI are present in DEPEND (bug #205909).

2014-01-16 Thread Tom Wijsman
On Wed, 15 Jan 2014 17:44:15 -0800
Alec Warner anta...@gentoo.org wrote:

 Write these checks as functions

Will do in v2, might also look into whether this part of the code can
be refactored already into its own file; having it similar to the
structure we have in Checks.py.

We could then name Checks.py as SyntaxChecks.py and name this new file
for the metadata checks of the loop as MetadataChecks.py; then later,
we can look into creating VCSDiffChecks.py.

Given that these files could grow, we will probably want to split the
files up and put the split up files into directories like
checks/syntax/, checks/metadata/ and so on.

I really would love to see repoman refactored, redesigned and/or
rewritten; but it is to soon for me to do this the most wise way, as I
need to understand the code. I plan to do this 2 - 4 weeks from now.

-- 
With kind regards,

Tom Wijsman (TomWij)
Gentoo Developer

E-mail address  : tom...@gentoo.org
GPG Public Key  : 6D34E57D
GPG Fingerprint : C165 AF18 AB4C 400B C3D2  ABF0 95B2 1FCD 6D34 E57D


signature.asc
Description: PGP signature


Re: [gentoo-portage-dev] [PATCH 1/3] Have repoman check if the packages to unpack rare archive formats from SRC_URI are present in DEPEND (bug #205909).

2014-01-15 Thread Alec Warner
On Wed, Jan 15, 2014 at 4:07 PM, Tom Wijsman tom...@gentoo.org wrote:

 ---
  bin/repoman   | 53 +
  man/repoman.1 |  4 
  2 files changed, 57 insertions(+)


I urge you to not author new checks like this. /usr/bin/repoman is already
a mess.

Write these checks as functions, put them in a different file, and just
call them from the giant messy loop. At least that way the checks are self
contained (great for avoiding things like variable re-use or shadowing).

-A



 diff --git a/bin/repoman b/bin/repoman
 index d1542e9..9b703dc 100755
 --- a/bin/repoman
 +++ b/bin/repoman
 @@ -36,6 +36,9 @@ pym_path =
 osp.join(osp.dirname(osp.dirname(osp.realpath(__file__))), pym)
  sys.path.insert(0, pym_path)
  import portage
  portage._internal_caller = True
 +
 +from portage._sets.profiles import PackagesSystemSet
 +system_set_atoms = PackagesSystemSet(portage.settings.profiles).getAtoms()
  portage._disable_legacy_globals()

  try:
 @@ -300,6 +303,7 @@ qahelp = {
 inherit.missing: Ebuild uses functions from an eclass but does
 not inherit it,
 inherit.unused: Ebuild inherits an eclass but does not use it,
 java.eclassesnotused: With virtual/jdk in DEPEND you must
 inherit a java eclass,
 +   unpack.DEPEND.missing: A rare archive format was used in
 SRC_URI, but its package to unpack it is missing in DEPEND.,
 wxwidgets.eclassnotused: Ebuild DEPENDs on x11-libs/wxGTK
 without inheriting wxwidgets.eclass,
 KEYWORDS.dropped: Ebuilds that appear to have dropped KEYWORDS
 for some arch,
 KEYWORDS.missing: Ebuilds that have a missing or empty KEYWORDS
 variable,
 @@ -399,6 +403,7 @@ qawarnings = set((
  metadata.warning,
  portage.internal,
  repo.eapi.deprecated,
 +unpack.DEPEND.missing,
  usage.obsolete,
  upstream.workaround,
  LIVEVCS.stable,
 @@ -479,6 +484,25 @@ ruby_deprecated = frozenset([
 ruby_targets_ree18,
  ])

 +# TODO: Add functionality to support checking for deb2targz on platforms
 where
 +#   GNU binutils is absent; see PMS 5, section 11.3.3.13.
 +archive_formats = {
 +   \.7[zZ]:app-arch/p7zip,
 +   \.(bz2?|tbz2):app-arch/bzip2,
 +   \.jar:app-arch/unzip,
 +   \.(LH[aA]|lha|lzh):app-arch/lha,
 +   \.lzma:app-arch/lzma-utils,
 +   \.(rar|RAR):app-arch/unrar,
 +   \.(tar(\.(bz2?|gz|Z))?|tbz2|t[bg]z)?:app-arch/tar,
 +   \.(gz|tar\.Z|t[bg]z|[zZ]):app-arch/gzip,
 +   \.(zip|ZIP):app-arch/unzip,
 +}
 +
 +archive_formats_eapi_3_to_5 = {
 +   \.tar.xz:app-arch/tar,
 +   \.xz:app-arch/xz-utils,
 +}
 +
  metadata_xml_encoding = 'UTF-8'
  metadata_xml_declaration = '?xml version=1.0 encoding=%s?' % \
 (metadata_xml_encoding,)
 @@ -1559,6 +1583,7 @@ for x in effective_scanlist:
 fetchlist_dict = portage.FetchlistDict(checkdir, repoman_settings,
 portdb)
 myfiles_all = []
 src_uri_error = False
 +   needed_unpack_depends = {}
 for mykey in fetchlist_dict:
 try:
 myfiles_all.extend(fetchlist_dict[mykey])
 @@ -1573,7 +1598,22 @@ for x in effective_scanlist:
 stats[SRC_URI.syntax] += 1
 fails[SRC_URI.syntax].append(
 %s.ebuild SRC_URI: %s % (mykey,
 e))
 +
 +   # Compare each SRC_URI entry against archive_formats; if
 one of the
 +   # extensions match, we remember which archive depends are
 needed to
 +   # check them later on.
 +   needed_unpack_depends[mykey] = []
 +   for file_extension in archive_formats or \
 +   ((re.match('[345]$', eapi) is not None) \
 +   and file_extension in
 archive_formats_eapi_3_to_5):
 +   for entry in fetchlist_dict[mykey]:
 +   if re.match('.*%s$' % file_extension,
 entry) is not None:
 +   format =
 archive_formats[file_extension]
 +
 +   if format not in
 needed_unpack_depends[mykey]:
 +
 needed_unpack_depends[mykey].append(format)
 del fetchlist_dict
 +
 if not src_uri_error:
 # This test can produce false positives if SRC_URI could
 not
 # be parsed for one or more ebuilds. There's no point in
 @@ -2010,6 +2050,17 @@ for x in effective_scanlist:
 atoms = None
 badsyntax.append(str(e))

 +   if atoms and mytype == 'DEPEND':
 +   # We check whether the needed archive
 dependencies are present
 +   # in DEPEND, which were determined from
 SRC_URI.
 +   for entry in needed_unpack_depends[catdir
 + '/' + y]:
 +   if entry not in system_set_atoms
 and entry \
 +  

Re: [gentoo-portage-dev] [PATCH 1/3] Have repoman check if the packages to unpack rare archive formats from SRC_URI are present in DEPEND (bug #205909).

2014-01-15 Thread Sebastian Luther
Am 16.01.2014 01:07, schrieb Tom Wijsman:
 ---
  bin/repoman   | 53 +
  man/repoman.1 |  4 
  2 files changed, 57 insertions(+)
 
 diff --git a/bin/repoman b/bin/repoman
 index d1542e9..9b703dc 100755
 --- a/bin/repoman
 +++ b/bin/repoman
 @@ -36,6 +36,9 @@ pym_path = 
 osp.join(osp.dirname(osp.dirname(osp.realpath(__file__))), pym)
  sys.path.insert(0, pym_path)
  import portage
  portage._internal_caller = True
 +
 +from portage._sets.profiles import PackagesSystemSet
 +system_set_atoms = PackagesSystemSet(portage.settings.profiles).getAtoms()
  portage._disable_legacy_globals()
  

You should be using repoman_settings instead of portage.settings.

Considering the later use, you don't need PackagesSystemSet set here,
just use a set. And use atom.cp instead of the atoms.

  try:
 @@ -300,6 +303,7 @@ qahelp = {
   inherit.missing: Ebuild uses functions from an eclass but does not 
 inherit it,
   inherit.unused: Ebuild inherits an eclass but does not use it,
   java.eclassesnotused: With virtual/jdk in DEPEND you must inherit a 
 java eclass,
 + unpack.DEPEND.missing: A rare archive format was used in SRC_URI, 
 but its package to unpack it is missing in DEPEND.,
   wxwidgets.eclassnotused: Ebuild DEPENDs on x11-libs/wxGTK without 
 inheriting wxwidgets.eclass,
   KEYWORDS.dropped: Ebuilds that appear to have dropped KEYWORDS for 
 some arch,
   KEYWORDS.missing: Ebuilds that have a missing or empty KEYWORDS 
 variable,
 @@ -399,6 +403,7 @@ qawarnings = set((
  metadata.warning,
  portage.internal,
  repo.eapi.deprecated,
 +unpack.DEPEND.missing,
  usage.obsolete,
  upstream.workaround,
  LIVEVCS.stable,
 @@ -479,6 +484,25 @@ ruby_deprecated = frozenset([
   ruby_targets_ree18,
  ])
  
 +# TODO: Add functionality to support checking for deb2targz on platforms 
 where
 +#   GNU binutils is absent; see PMS 5, section 11.3.3.13.
 +archive_formats = {
 + \.7[zZ]:app-arch/p7zip,
 + \.(bz2?|tbz2):app-arch/bzip2,
 + \.jar:app-arch/unzip,
 + \.(LH[aA]|lha|lzh):app-arch/lha,
 + \.lzma:app-arch/lzma-utils,
 + \.(rar|RAR):app-arch/unrar,
 + \.(tar(\.(bz2?|gz|Z))?|tbz2|t[bg]z)?:app-arch/tar,
 + \.(gz|tar\.Z|t[bg]z|[zZ]):app-arch/gzip,
 + \.(zip|ZIP):app-arch/unzip,
 +}
 +
 +archive_formats_eapi_3_to_5 = {
 + \.tar.xz:app-arch/tar,
 + \.xz:app-arch/xz-utils,
 +}
 +
  metadata_xml_encoding = 'UTF-8'
  metadata_xml_declaration = '?xml version=1.0 encoding=%s?' % \
   (metadata_xml_encoding,)
 @@ -1559,6 +1583,7 @@ for x in effective_scanlist:
   fetchlist_dict = portage.FetchlistDict(checkdir, repoman_settings, 
 portdb)
   myfiles_all = []
   src_uri_error = False
 + needed_unpack_depends = {}
   for mykey in fetchlist_dict:
   try:
   myfiles_all.extend(fetchlist_dict[mykey])
 @@ -1573,7 +1598,22 @@ for x in effective_scanlist:
   stats[SRC_URI.syntax] += 1
   fails[SRC_URI.syntax].append(
   %s.ebuild SRC_URI: %s % (mykey, e))
 +
 + # Compare each SRC_URI entry against archive_formats; if one of 
 the
 + # extensions match, we remember which archive depends are 
 needed to
 + # check them later on.
 + needed_unpack_depends[mykey] = []
 + for file_extension in archive_formats or \
 + ((re.match('[345]$', eapi) is not None) \

Use portage.eapi for the line above. You may have to add a new function
to portage.eapi.

 + and file_extension in 
 archive_formats_eapi_3_to_5):
 + for entry in fetchlist_dict[mykey]:
 + if re.match('.*%s$' % file_extension, entry) is 
 not None:
 + format = archive_formats[file_extension]

As these regex are used frequently, they should be compiled using
re.compile.

 +
 + if format not in 
 needed_unpack_depends[mykey]:
 + 
 needed_unpack_depends[mykey].append(format)

I'd make needed_unpack_depends[mykey] a set. Then you can just add()
instead of checking and appending.

   del fetchlist_dict
 +
   if not src_uri_error:
   # This test can produce false positives if SRC_URI could not
   # be parsed for one or more ebuilds. There's no point in
 @@ -2010,6 +2050,17 @@ for x in effective_scanlist:
   atoms = None
   badsyntax.append(str(e))
  
 + if atoms and mytype == 'DEPEND':

Use if atoms and buildtime: here.

 + # We check whether the needed archive 
 dependencies are present
 + # in DEPEND, which were determined from SRC_URI.
 + for entry in